Re: [netcdfgroup] Wondering about when NetCDF data hits the disk...

On Thu, 29 Oct 2009 04:53:31 -0600,
Ed Hartnett <ed@xxxxxxxxxxxxxxxx> wrote:

> Can I suggest you install zlib, HDF5, and netCDF-4, and try this
> experiment with netCDF-4/HDF5 files?
> 
> Once you get netCDF-4 installed, change your create mode flag (in
> nc_create) to include NC_NETCDF4, and build and run your program.

OK, here I am, trying the same thing with NetCDF4 / HDF5. NetCDF 4.1 daily 
snapshot of... today.

Sidenote: I got into trouble trying to use dynamic NetCDF libs, matching the 
default install of dynamic libs of HDF5. Reasoning was to have dependencies 
automatically pulled in by -lnetcdf (or better, -lnetcdff), just like 
libhdf5.so pulls in libz ... but the NetCDF .so libs miss the linking to the 
HDF5 libs:-(
Plus, I got this while trying to link to the dynamic libs on one system. We 
have x86-64 systems in AMD and Intel flavor... that might be the cause because 
I built the stuff on Intel, then it broke on AMD -- no issue here with static 
libs, or with the HDF5 libs, for that matter:

ld.so.1: dgmodel.bin: fatal: relocation error: R_AMD64_32: file 
/data/scratch3/torgis/system/x86-64/SunOS/studio12/lib/libnetcdff.so.5: symbol 
(unknown): value 0xfffffd7ffec31898 does not fit
bash-3.00$ file 
/data/scratch3/torgis/system/x86-64/SunOS/studio12/lib/libnetcdff.so.5

Seems like we are not PIC-safe here?

OK, then... I got my build with NetCDF/HDF now ... static NetCDF ... ran the 
same experiment ... 

And it's a lot worse, even! I observe no data update during waiting time of 
several seconds (about ten, with several added records during that time) and 
prompt reaction when issuing sync on the writer machine. That feels just like 
nc_sync(), buffering-wise. Is the HDF documentation also slightly wrong there?
Now I get to the "worse" part: HDF5 storage seems to be a lot less robust 
against this kind of abuse (reading partially written files). With normal 
NetCDF, the data extraction program just gets rubbish / partial data when the 
data is not fully there yet. With HDF-based storage, the read returns a hard 
error and my extraction program bails out. This as such is not really something 
to blame it for; one can argue that it is preferable to return an error (that 
I could ignore and try again after some waiting time in this application). But 
this error situation occurs almost immediately. It looks like HDF5 is more 
concerned with self-consistency, or just more complicated... At least plain 
NetCDF is robust against reading an old record while another process is adding 
a new record... I have the impression that even that is not good with 

Thinking about it... if this error return is reliable, handling it in the data 
extractor would be a solution for the broken plots, yes. Not for the timely 
update, though. Main message is: No, I do not see different caching behaviour 
with NetCDF-4 format.
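
To make the "ignore the error and try again after some waiting time" idea concrete, here is a minimal sketch in C. It assumes the error return can be trusted, which is exactly the open question. The names read_with_retry, read_attempt_fn and fail_n_times are hypothetical, not part of netCDF or HDF5; the callback stands in for whatever the extractor does to open the file and read one record.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical callback type: one read attempt, 0 on success, nonzero on error. */
typedef int (*read_attempt_fn)(void *ctx);

/* Call `attempt` until it succeeds or `max_tries` attempts are used up,
 * sleeping `wait_seconds` between attempts.
 * Returns 0 on success, otherwise the last error code. */
int read_with_retry(read_attempt_fn attempt, void *ctx,
                    int max_tries, unsigned wait_seconds)
{
    int status = -1;

    assert(attempt != NULL && max_tries > 0);
    for (int i = 0; i < max_tries; i++) {
        status = attempt(ctx);
        if (status == 0)
            return 0;               /* read went through */
        if (i + 1 < max_tries)
            sleep(wait_seconds);    /* give the writer time to finish */
    }
    return status;                  /* still failing: report the last error */
}

/* Stand-in for the real extraction step (an assumption for illustration):
 * fails until the counter behind ctx has counted down to zero. */
int fail_n_times(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return -1;
    }
    return 0;
}
```

In the real extractor the callback would wrap the open / read / close sequence and pass the library's status code through. This only addresses the broken plots, not the delay before new data becomes visible.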

Hm... playing around a bit: No, that error return is not reliable. I guess HDF 
only stumbles over the header being updated, but the actual data portion can be 
incomplete, just as with plain NetCDF. So I rather prefer the old behaviour of 
never erroring out, just printing the funny numbers sometimes.

> In netcdf-4 I call H5flush, which promises to flush buffers do disk.

Promise broken?


Alrighty then,

Thomas.


PS: Yes, I could be safe from HDF5 consistency errors by employing some mutex / 
file lock between writer and reader... but well, good old NetCDF seems to be 
more robust as it is, even abused unsafely.
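
For reference, a rough sketch of what such a file lock between writer and reader could look like with POSIX fcntl() advisory locks. The helper names lock_whole_file and unlock_whole_file are made up for illustration. The sketch assumes the lock is taken on a separate lock file next to the data file, because fcntl() locks are dropped as soon as the process closes any descriptor for the locked file. Neither netCDF nor HDF5 takes these locks itself, so both programs have to cooperate, and whether the locks work across machines depends on the network file system in use.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Block until an advisory lock on the whole file is granted.
 * type is F_RDLCK (shared, for the reader) or F_WRLCK (exclusive, for the
 * writer). Returns -1 on error (errno set by fcntl), another value on success. */
int lock_whole_file(int fd, short type)
{
    struct flock fl;

    assert(type == F_RDLCK || type == F_WRLCK);
    memset(&fl, 0, sizeof fl);
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;                   /* 0 = up to end of file, however large */
    return fcntl(fd, F_SETLKW, &fl);
}

/* Release the lock taken by lock_whole_file(). Returns -1 on error. */
int unlock_whole_file(int fd)
{
    struct flock fl;

    memset(&fl, 0, sizeof fl);
    fl.l_type = F_UNLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;
    return fcntl(fd, F_SETLK, &fl);
}
```

The writer would hold F_WRLCK while appending and syncing a record, and the reader would hold F_RDLCK while extracting. The lock file has to be opened for writing to take F_WRLCK and for reading to take F_RDLCK.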

PPS: Do people want to be addressed in To:, with CC to list here? I would 
prefer to get the mail solely over the list, as my mail client and I get 
confused otherwise.

-- 
Dipl. Phys. Thomas Orgis
Atmospheric Modelling
Alfred-Wegener-Institute for Polar and Marine Research


