Dear netcdf community,
Recently we switched from our own custom file format (data stored linearly
along the "primary" direction) to netCDF for saving 3D CT voxel data, in the
hope of improving performance when accessing the data along the other
dimensions, for example reading YZ slices instead of XY slices. The data is
far too large to fit in memory, so we load it slice by slice using nc_get_vara.
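
For reference, here is a minimal sketch of how we read a single YZ slice
(file name, variable name and dimension order are placeholders, our real code
differs):

    #include <stdio.h>
    #include <stdlib.h>
    #include <netcdf.h>

    #define CHECK(e) do { int s_ = (e); if (s_ != NC_NOERR) { \
        fprintf(stderr, "netCDF error: %s\n", nc_strerror(s_)); exit(1); } } while (0)

    int main(void)
    {
        int ncid, varid;
        /* dimensions assumed to be ordered X, Y, Z = 6000 x 6000 x 3000 */
        size_t start[3] = { 100, 0, 0 };        /* one slice at X = 100 */
        size_t count[3] = { 1, 6000, 3000 };    /* 1 x Y x Z */
        unsigned short *slice = malloc(6000UL * 3000UL * sizeof *slice);

        CHECK(nc_open("volume.nc", NC_NOWRITE, &ncid));
        CHECK(nc_inq_varid(ncid, "voxels", &varid));
        CHECK(nc_get_vara_ushort(ncid, varid, start, count, slice));
        CHECK(nc_close(ncid));
        free(slice);
        return 0;
    }
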
In our recent tests with uint16 voxel data, example dimensions of
6000x6000x3000 and a chunk size of 64x64x64, loading one slice into the chunk
cache took about 5 seconds, and reading the following slices that were already
in the chunk cache (until the next set of chunks had to be fetched from disk)
took about 1 second per slice. The chunk cache is configured to be large
enough to hold at least all the chunks needed for one slice. We are running on
Windows 10 systems with NVMe SSDs (~3200 MB/s sequential read).
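
For completeness, this is roughly how we size the per-variable chunk cache
(the numbers match the 6000x6000x3000 uint16 volume with 64x64x64 chunks
mentioned above; the function name is just for illustration):

    #include <netcdf.h>

    /* chunks touched by one YZ slice: ceil(6000/64) * ceil(3000/64) = 94 * 47 */
    #define CHUNKS_PER_SLICE  (94 * 47)
    #define CHUNK_BYTES       (64UL * 64 * 64 * sizeof(unsigned short))  /* 512 KiB */

    static int set_slice_cache(int ncid, int varid)
    {
        size_t size       = CHUNKS_PER_SLICE * CHUNK_BYTES;  /* about 2.2 GiB */
        size_t nelems     = 2 * CHUNKS_PER_SLICE;            /* hash table slots */
        float  preemption = 0.75f;                           /* library default */
        return nc_set_var_chunk_cache(ncid, varid, size, nelems, preemption);
    }
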
This seems incredibly slow to me, especially for slices that are already in
the chunk cache. CPU utilization appears to be low, and the disk is idle
while slices are being served from the cache.
Is this the performance you would expect, or are we doing something
fundamentally wrong? We have already tried different chunk sizes, and all of
them were even slower. We are using the precompiled C library.
Thanks in advance