Re: [netcdfgroup] Wondering about when NetCDF data hits the disk...

Hi all,
Sorry to jump into this discussion so late, but thought I would add some HDF5 information...

On Oct 30, 2009, at 1:00 PM, netcdfgroup-request@xxxxxxxxxxxxxxxx wrote:

----------------------------------------------------------------------

Message: 1
Date: Thu, 29 Oct 2009 19:51:28 +0100
From: Thomas Orgis <thomas.orgis@xxxxxx>
To: netcdfgroup@xxxxxxxxxxxxxxxx
Subject: Re: [netcdfgroup] Wondering about when NetCDF data hits the
        disk...
Message-ID: <20091029195128.000074de@rs2>
Content-Type: text/plain; charset=US-ASCII

On Thu, 29 Oct 2009 04:53:31 -0600, Ed Hartnett <ed@xxxxxxxxxxxxxxxx> wrote:

Can I suggest you install zlib, HDF5, and netCDF-4, and try this
experiment with netCDF-4/HDF5 files?

Once you get netCDF-4 installed, change your create mode flag (in
nc_create) to include NC_NETCDF4, and build and run your program.
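
(For readers trying this at home, a minimal sketch of that one-flag change; the file name and error handling are illustrative, not from the original program:)

    #include <stdio.h>
    #include <netcdf.h>

    int main(void)
    {
        int ncid, status;

        /* NC_NETCDF4 selects HDF5-based storage; the rest of the
         * program can stay exactly as it was for classic files. */
        status = nc_create("experiment.nc", NC_CLOBBER | NC_NETCDF4, &ncid);
        if (status != NC_NOERR) {
            fprintf(stderr, "nc_create: %s\n", nc_strerror(status));
            return 1;
        }
        /* ... define dimensions and variables, write records ... */
        return nc_close(ncid) == NC_NOERR ? 0 : 1;
    }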

OK, then... I got my build with NetCDF/HDF5 now (static NetCDF) and ran the same experiment...

And it's a lot worse, even! I observe no data update during a waiting time of several seconds (about ten, with several records added during that time), and a prompt reaction when issuing sync on the writer machine. That feels just like nc_sync(), buffering-wise. Is the HDF5 documentation also slightly wrong there?

Now I get to the "worse" part: HDF5 storage seems to be a lot less robust against this kind of abuse (reading partially written files). Where the data extraction program just gets rubbish / partial data when the data is not fully there yet for plain NetCDF, it returns a hard error with HDF5-based storage, and so my extraction program bails out. This as such is not really something to blame it for; one can argue that it is preferable to return an error (which I could ignore in this application, trying again after some waiting time). But this error situation occurs almost immediately; it looks like HDF5 is more concerned with self-consistency, or is just more complicated... At least plain NetCDF is robust against reading an old record while another process is adding a new one... I have the impression that even that is not safe with HDF5.

Thinking about it... if this error return is reliable, handling it in the data extractor would be a solution for the broken plots, yes. Not for the timely updates, though. The main message is: no, I do not see different caching behaviour with the NetCDF-4 format.

Hm... playing around a bit: no, that error return is not reliable. I guess HDF5 only stumbles over the header being updated, while the actual data portion can be incomplete just as for plain NetCDF. So I rather prefer the old behaviour of never erroring out, just printing the funny numbers sometimes.
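
(A sketch of the "handle the error and retry" idea from the paragraphs above; the file, dimension, and variable names are hypothetical, and, per the caveat just given, a clean return code does not guarantee that the record data itself is complete:)

    #include <stdio.h>
    #include <unistd.h>
    #include <netcdf.h>

    #define RECLEN 16

    /* Open, read the last record of "temperature", close again. */
    static int read_latest_record(const char *path, double *buf)
    {
        int ncid, varid, dimid, status;
        size_t nrec, start[2], count[2];

        if ((status = nc_open(path, NC_NOWRITE, &ncid)) != NC_NOERR)
            return status;
        if ((status = nc_inq_dimid(ncid, "time", &dimid)) == NC_NOERR &&
            (status = nc_inq_dimlen(ncid, dimid, &nrec)) == NC_NOERR &&
            (status = nc_inq_varid(ncid, "temperature", &varid)) == NC_NOERR)
        {
            if (nrec == 0)
                status = NC_EINVALCOORDS;              /* nothing written yet */
            else {
                start[0] = nrec - 1;  count[0] = 1;       /* last record  */
                start[1] = 0;         count[1] = RECLEN;  /* whole record */
                status = nc_get_vara_double(ncid, varid, start, count, buf);
            }
        }
        nc_close(ncid);
        return status;
    }

    int main(void)
    {
        double rec[RECLEN];
        int status;

        /* Treat any failure as "file is mid-update": wait and retry. */
        while ((status = read_latest_record("experiment.nc", rec)) != NC_NOERR) {
            fprintf(stderr, "read failed (%s), retrying...\n",
                    nc_strerror(status));
            sleep(5);
        }
        printf("got a record, first value: %g\n", rec[0]);
        return 0;
    }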

Under normal operation, HDF5 flushes metadata from its cache under an LRU scheme: metadata that hasn't been used in a long time is flushed to the file to make room for other incoming metadata. Because those pieces of metadata may each update part of a larger data structure (B-tree, heap, etc.), it's easily possible (likely, even) to end up with inconsistent data structures in the file, and the HDF5 library bails out when it detects those circumstances.

We are working on a new feature for "single-writer/multiple-reader" (SWMR) access, which will order the metadata flushes so that applications reading the file while a writer is updating it always get a consistent view (if a potentially somewhat out-of-date one). This currently works for appending records to datasets, a principal netCDF use case, and will eventually be extended to all metadata operations on a file. Snapshots are available now, if people would like to test, but they produce files in an "unstable" file format (i.e., don't keep them!).
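
(For concreteness, a sketch of the application-side SWMR interface; the flag and function names below are those HDF5 eventually shipped in release 1.10, not necessarily those of the 2009 snapshots, and the file name is hypothetical:)

    #include <hdf5.h>

    /* Writer: create the file with the latest file-format version,
     * set up all datasets, then switch into SWMR mode. */
    hid_t start_writer(void)
    {
        hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
        H5Pset_libver_bounds(fapl, H5F_LIBVER_LATEST, H5F_LIBVER_LATEST);
        hid_t fid = H5Fcreate("swmr.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);
        /* ... create the extendible datasets to be appended to ... */
        H5Fstart_swmr_write(fid);     /* readers may attach from here on */
        H5Pclose(fapl);
        return fid;
    }

    /* Reader (a separate process): open for SWMR reading; call
     * H5Drefresh() on a dataset before each re-read to pick up
     * newly appended records. */
    hid_t start_reader(void)
    {
        return H5Fopen("swmr.h5", H5F_ACC_RDONLY | H5F_ACC_SWMR_READ,
                       H5P_DEFAULT);
    }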

In netCDF-4 I call H5Fflush, which promises to flush buffers to disk.

Promise broken?

Well, we do flush our buffers when H5Fflush is called, but if the writer keeps changing the metadata in the file while the reader comes in, the data structures on disk will quickly be out of date again.
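
(The call in question, for reference; file_id stands for a handle obtained from H5Fcreate() or H5Fopen():)

    #include <stdio.h>
    #include <hdf5.h>

    /* Flush everything the library has buffered for a file out to the
     * operating system. H5F_SCOPE_GLOBAL covers the entire virtual
     * file, H5F_SCOPE_LOCAL only the given file. Note that this does
     * not imply an fsync() -- see the exchange further down. */
    void flush_file(hid_t file_id)
    {
        if (H5Fflush(file_id, H5F_SCOPE_GLOBAL) < 0)
            fprintf(stderr, "H5Fflush failed\n");
    }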

Alrighty then,

Thomas.


PS: Yes, I could be saved from HDF5 consistency errors by employing some mutex / file lock between writer and reader... but well, good old NetCDF seems to be more robust as it is, even when abused unsafely.

        Currently, that's the best way to address this issue with HDF5.
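
(A sketch of that workaround using a POSIX advisory lock, one possibility among several; the lock-file name is hypothetical, flock() requires both processes to cooperate, and it may not work across NFS:)

    #include <fcntl.h>
    #include <sys/file.h>
    #include <unistd.h>

    /* Both processes open the same (hypothetical) lock file. */
    static int open_lock(void)
    {
        return open("experiment.nc.lock", O_CREAT | O_RDWR, 0644);
    }

    /* Writer, around each record append: */
    static void writer_update(int lockfd)
    {
        flock(lockfd, LOCK_EX);          /* exclusive: keep readers out */
        /* ... add record, nc_sync(ncid) ... */
        flock(lockfd, LOCK_UN);
    }

    /* Reader, around each open/read/close cycle: */
    static void reader_extract(int lockfd)
    {
        flock(lockfd, LOCK_SH);          /* shared: keep the writer out */
        /* ... nc_open(), extract data, nc_close() ... */
        flock(lockfd, LOCK_UN);
    }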

--
Dipl. Phys. Thomas Orgis
Atmospheric Modelling
Alfred-Wegener-Institute for Polar and Marine Research



------------------------------

Message: 3
Date: Fri, 30 Oct 2009 11:19:35 -0600
From: Orion Poplawski <orion@xxxxxxxxxxxxx>
To: netcdfgroup@xxxxxxxxxxxxxxxx
Subject: Re: [netcdfgroup] Wondering about when NetCDF data hits the
        disk...
Message-ID: <4AEB2027.5050109@xxxxxxxxxxxxx>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

On 10/30/2009 10:26 AM, Ed Hartnett wrote:
Thomas Orgis <thomas.orgis@xxxxxx> writes:

On Thu, 29 Oct 2009 04:53:31 -0600, Ed Hartnett <ed@xxxxxxxxxxxxxxxx> wrote:
In netCDF-4 I call H5Fflush, which promises to flush buffers to disk.

Promise broken?


Oh well, it was worth a try.


I don't see any fsync() calls in hdf5 either.

We don't make any fsync() calls from the H5Fflush() API call (since that seemed beyond the "flush" scope of the routine). We have considered adding a "SYNC" flag to H5Fflush() which would make that call, though - is that something developers would like to see?
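
(Until such a flag exists, an application can make the fsync() call itself; a sketch assuming the default sec2 driver, whose VFD handle is a plain POSIX file descriptor:)

    #include <unistd.h>
    #include <hdf5.h>

    void flush_and_sync(hid_t file_id)
    {
        int *fd = NULL;

        /* Push the library's buffers out to the operating system... */
        H5Fflush(file_id, H5F_SCOPE_GLOBAL);

        /* ...then ask the OS to push its buffers out to the disk. */
        if (H5Fget_vfd_handle(file_id, H5P_DEFAULT, (void **)&fd) >= 0
            && fd != NULL)
            fsync(*fd);
    }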

        Quincey


