With the right settings, netCDF-4/HDF5 performance is better than that of
binary files.
The biggest problem is usually compression. Are you compressing your data
(i.e., did you call nc_def_var_deflate())? If so, turn compression off for
now and get your performance sorted out without it.
Then you can turn compression back on and decide whether the performance
hit is worth the space savings.
Your chunksizes sound small. Try much bigger ones. Also, if you build the C
library with --enable-benchmarks, there is a program, nc_perf/bm_file.c,
which will rewrite any data file into one with different chunksizes,
compression, and other settings. It outputs a line of CSV timings. By
running bm_file from a script, you can try a variety of chunksizes and
other settings and get a nice CSV output file that you can load into Excel
for easy graphing of results.
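
Before benchmarking, it can help to estimate how many chunks one slice
touches for a candidate chunk shape. The sketch below is plain arithmetic,
not part of netCDF; the names slice_cost and slice_read_cost are
hypothetical, and it assumes uncompressed data and a slice that spans the
full extent of the two dimensions that are not fixed.

```c
#include <stdint.h>

/* Chunks needed to cover `len` elements with chunks of `chunk` elements. */
static uint64_t chunks_along(uint64_t len, uint64_t chunk)
{
    return (len + chunk - 1) / chunk;   /* ceiling division */
}

typedef struct {
    uint64_t nchunks;   /* chunks intersected by one slice */
    uint64_t bytes;     /* uncompressed bytes held by those chunks */
} slice_cost;

/*
 * Cost of reading one full slice in which dimension `fixed` (0, 1 or 2)
 * is held at a single index and the other two dimensions are read whole.
 */
slice_cost slice_read_cost(const uint64_t dims[3], const uint64_t chunk[3],
                           int fixed, uint64_t elem_size)
{
    slice_cost c;
    uint64_t chunk_bytes = elem_size;

    c.nchunks = 1;
    for (int d = 0; d < 3; d++) {
        chunk_bytes *= chunk[d];
        if (d != fixed)
            c.nchunks *= chunks_along(dims[d], chunk[d]);
    }
    c.bytes = c.nchunks * chunk_bytes;
    return c;
}
```

For the 6000x6000x3000 uint16 volume with 64x64x64 chunks from the question
(taking the dimensions in the order given, which is an assumption), a slice
that fixes the first dimension touches 94 x 47 = 4418 chunks, about 2.2 GiB
of chunk data, and a slice that fixes the last dimension touches 94 x 94 =
8836 chunks, about 4.3 GiB. The chunk cache needs at least that many bytes
for a slice to stay resident, so these numbers are worth recomputing for
each chunk shape you try.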
If none of this helps, send me a copy of the file and I'll take a look...
Keep on netCDFing!
Ed Hartnett
On Mon, Dec 2, 2019 at 8:02 AM Amr, Mahmoud <mahmoud.amr@xxxxxxxxxxxxxxxxx>
wrote:
> Dear netcdf community,
>
>
>
> Recently we switched from our “own” file format (data saved linearly in
> the “primary” direction) to netCDF for saving 3D CT voxel data, in the
> hope of improving performance when accessing the data along other
> dimensions, for example getting slices in the YZ view instead of XY. The
> data is far too large for memory, so we load it slice by slice using
> nc_get_vara.
>
>
>
> In our recent attempts, using uint16 voxel data with example dimensions of
> 6000x6000x3000 and chunk sizes of 64x64x64, loading one slice into the
> chunk cache took 5 seconds, and loading slices from the chunk cache until
> the next set of chunks had to be read took 1 second per slice. The chunk
> cache is configured to be large enough to hold “at least” enough chunks
> for one slice. We are using Windows 10 systems with NVMe SSDs (~3200 MB/s
> read).
>
>
>
> This seems incredibly slow to me, especially when the data is already in
> the chunk cache. It seems like the CPU utilization is not very good and the
> disk does nothing as long as the chunk cache is filled.
>
>
>
> In your experience, is this the expected performance, or are we doing
> something really wrong? We have already tried different chunk sizes, and
> all of them gave us even worse speed. We are using the precompiled C
> library.
>
>
>
> Thanks in advance
>
>
>
>