Re: [netcdfgroup] read performance slow compared to netCDF on other systems

I definitely appreciate the suggestions and help that folks have provided
thus far. I think I've made some progress narrowing down what the issue
might be. This week I've been trying different chunking policies and maps with
NCO, and I also started converting between file formats. I think I've found
that the slow performance only occurs when using the newer NetCDF4+HDF5
formats.

Using the "classic" and "64-bit-offset" formats, my test ncks operation's
performance on our new cluster is _much_ closer to the other systems.
Interestingly, using the "classic" format in my test ncks operation on our
Cray resulted in cutting the run time almost in half. Here's an example
from the new cluster.

 loforbes$ ls -l test.nc
 -rw------- 1 loforbes staff 661565784 Nov 16 13:48 test.nc

Unconverted:
loforbes$ time ncks test.nc out.nc

real    0m35.895s
user    0m29.272s
sys    0m1.815s

Converted to "classic":
loforbes$ time nccopy -k classic test.nc chinook_classic.nc

real    0m6.185s
user    0m2.170s
sys    0m3.875s
loforbes$ time ncks chinook_classic.nc out_classic.nc

real    0m4.724s
user    0m0.658s
sys    0m3.908s

loforbes$ ls -l *.nc
[...cut...]
-rw------- 1 loforbes staff 661302172 Nov 18 10:38 chinook_classic.nc
[...cut...]
-rw------- 1 loforbes staff 661302240 Nov 18 10:41 out_classic.nc
[...cut...]
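
To put numbers on the difference, the two ncks timings above can be compared with a small shell sketch. The helper names to_seconds and ratio are made up for illustration and are not part of NCO or the netCDF utilities; the durations are copied from the runs shown above.

```shell
#!/bin/sh
# Sketch: compare the `time` figures quoted above.
# to_seconds and ratio are illustrative helper names only.

# Convert a value such as 0m35.895s into plain seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print how many times larger the first duration is than the second.
ratio() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", a / b }'
}

# Unconverted test.nc versus the "classic" copy, both run through ncks.
echo "real: $(ratio 0m35.895s 0m4.724s)x"
echo "user: $(ratio 0m29.272s 0m0.658s)x"
echo "sys:  $(ratio 0m1.815s 0m3.908s)x"
```

Nearly all of the difference is in user time (29.272s against 0.658s), while sys time is actually higher for the classic file. That would be consistent with the extra time being CPU work inside the library rather than waiting on the file system, though that is only a guess at this point.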

I also noticed that no matter which format I converted the file to, the
time to run nccopy was pretty similar, between 6 and 7 seconds.

These two observations make me think the issue is at the HDF5 level.
Could that be right? As with NetCDF when I started exploring this, I don't
know much about HDF5. Is this the right place to ask for further
suggestions? Should I continue trying to modify the chunking, but using
HDF5 utilities instead of NCO utilities?
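
As a possible starting point for looking at the HDF5 side, here is a rough sketch that prints how each variable in the file is stored. show_storage is a made-up helper name, and test.nc is just the file from the example above. It assumes ncdump (and optionally h5dump) is in PATH, and it only reports settings without changing anything.

```shell
#!/bin/sh
# Sketch: show how each variable in a netCDF-4 file is stored.
# show_storage is a hypothetical helper name, not an existing tool.
show_storage() {
    file="$1"
    if [ ! -f "$file" ]; then
        echo "no such file: $file" >&2
        return 1
    fi
    if ! command -v ncdump >/dev/null 2>&1; then
        echo "ncdump not found in PATH" >&2
        return 127
    fi
    # -h prints the header only; -s adds the special virtual attributes
    # (_Storage, _ChunkSizes, _DeflateLevel, _Shuffle, ...).
    ncdump -h -s "$file" \
        | grep -E '_(Storage|ChunkSizes|DeflateLevel|Shuffle)' \
        || echo "no storage attributes reported (not a netCDF-4 file?)"
    # h5dump reports layout and filter details from the HDF5 side.
    if command -v h5dump >/dev/null 2>&1; then
        h5dump -p -H "$file"
    fi
}

# Only run against the test file if it is actually present.
if [ -f test.nc ]; then
    show_storage test.nc
fi
```

If the variables turn out to be compressed, then as I understand the nccopy documentation, "nccopy -d 0 test.nc test_nodeflate.nc" should write an uncompressed netCDF-4 copy, which could be timed with ncks to see how much of the cost is decompression.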

I've already asked the researchers if there's anything about their data
that requires using NetCDF4+HDF5 as the format. From reading the NetCDF
documentation, I assume that because nccopy was able to convert the file to
"classic", the data doesn't use anything specific to the newer format(s).
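
One more check that might help here is "ncdump -k", which prints the format variant of a file. The sketch below maps that output to a short description; describe_kind is a hypothetical helper, and the descriptions are my own reading of the format names rather than anything ncdump prints. Note that a plain "netCDF-4" result does not by itself show that enhanced-model features are in use.

```shell
#!/bin/sh
# Sketch: interpret the format kind reported by `ncdump -k`.
# describe_kind is an illustrative helper, not part of netCDF.
describe_kind() {
    case "$1" in
        "classic"|"64-bit offset")
            echo "netCDF-3 file; HDF5 is not involved" ;;
        "netCDF-4 classic model")
            echo "HDF5 storage, restricted to the classic data model" ;;
        "netCDF-4")
            echo "HDF5 storage; may or may not use enhanced-model features" ;;
        *)
            echo "unrecognized kind: $1" ;;
    esac
}

# Only query the test file if it exists and ncdump is available.
if [ -f test.nc ] && command -v ncdump >/dev/null 2>&1; then
    describe_kind "$(ncdump -k test.nc)"
fi
```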

Have a good weekend.
-- 
Regards,
-liam

-There are uncountably more irrational fears than rational ones. -P. Dolan
Liam Forbes  loforbes@xxxxxxxxxx  ph: 907-450-8618 fax: 907-450-8601
UAF Research Computing Systems Senior HPC Engineer  LPIC1, CISSP