Hello everyone,
My profiling results show that NetCDF Classic is very slow if the
following output scheme is used:
    // case 1
    for var in variables
        define var
        write var
    endfor
or even if something like this is done (assuming that all the
variables are defined already and they depend on an unlimited
dimension):
    // case 2
    for var in variables
        append var
    endfor
It seems to me that case 1 is slow because NetCDF (Classic) keeps the
file header as small as possible (Section 4 of the NetCDF User's Guide
is perfectly clear about this), so each definition added after data
has been written forces that data to be moved to make room. Case 2, on
the other hand, seems to be slow because (please correct me if I'm
wrong) variables are stored contiguously. (In other words: if
variables A and B are defined in this order, then appending X bytes to
A requires moving B over by X bytes.)
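To make my reasoning about case 1 concrete, here is a toy cost model in
Python. It is not NetCDF code and does not call the library; it only
counts the bytes that would be shifted under my assumption that every
definition made after data has been written moves all of that data.
The function names and the variable sizes are made up for illustration.

```python
# Toy cost model, NOT real NetCDF code. Assumption: each variable defined
# after data has been written forces all previously written data to move.

def bytes_moved_interleaved(var_sizes):
    """Case 1: define a variable, write it, then repeat for the next one."""
    moved = 0
    written = 0
    for size in var_sizes:
        moved += written   # the new definition shifts everything written so far
        written += size    # then this variable's data is written
    return moved


def bytes_moved_define_first(var_sizes):
    """Define every variable first, then write: no data exists to be moved."""
    return 0


if __name__ == "__main__":
    # Hypothetical workload: 50 variables of 8 MiB each.
    sizes = [8 * 1024**2] * 50
    print("interleaved :", bytes_moved_interleaved(sizes))
    print("define first:", bytes_moved_define_first(sizes))
```

Under this assumption, n equally sized variables cost n(n-1)/2 times
the size of one variable in moved bytes, so the overhead grows
quadratically with the number of variables.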
My question is:
How does NetCDF-4 compare to NetCDF Classic in this regard? Would
switching to it improve write performance? (This is two questions,
really: I'm interested in cases 1 and 2 separately.)
I would like to avoid refactoring our code to do
    for var in variables
        define var
    endfor
    for var in variables
        write var
    endfor
instead of what is described as "case 1" above.
Thank you!
--
Constantine Khroulev
PISM (www.pism-docs.org) Developer/Maintainer
University of Alaska Fairbanks