Re: [netcdfgroup] Patch for netCDF4 file bit-for-bit reproducibility

Removal or disabling of tracking times would be useful for my simplistic
whole-file checksum procedure. Thanks to both Rimvydas and Unidata for your
efforts on this.

--Dave
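
For anyone following along, a whole-file checksum comparison of the kind
mentioned above might look like the minimal Python sketch below. The names
file_md5 and files_identical are hypothetical helpers, not part of netCDF,
HDF5, or any tool in this thread. The check is only meaningful once the
writer produces bit-for-bit identical output, i.e. with no embedded
timestamps.

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so that ~1 GB diagnostic files are never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_identical(path_a, path_b):
    """True when both files have the same size and the same MD5 digest."""
    a, b = Path(path_a), Path(path_b)
    # Cheap early exit: files of different size cannot be bit-for-bit equal.
    if a.stat().st_size != b.stat().st_size:
        return False
    return file_md5(a) == file_md5(b)
```

Comparing a reference output against the output of a repeated run with
files_identical(ref, new) gives the same answer as comparing md5sum results
by hand, so a full variable-by-variable diff is needed only when it returns
False.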


On Mon, Feb 10, 2014 at 2:55 PM, Russ Rew <russ@xxxxxxxxxxxxxxxx> wrote:

> Hi Rimvydas,
>
> > Recently I started to work with netcdf in Fortran, mainly changing the f77
> > interface to the more flexible f90 one.
> > And I love it! Fantastic API.
> >
> > I am dealing with code that has a huge test suite for regression testing,
> > so I am trying to find a compromise between size and speed.
> > The code was intended to output lots of diagnostics (~1 GB) for every test.
> >
> > The lack of an ncdiff tool made me write my own, but while trying to
> > optimize it for time
> > I learned that I spend half the time in my comparison loops and the
> > other half in swap8b...
> > NetCDF-4 features like compression and native endianness are very appealing,
> > but lack of BFB (even with nccopy) just because of internal timestamping
> > is sad.
> >
> >
> > http://www.unidata.ucar.edu/mailing_lists/archives/netcdfgroup/2008/msg00003.html
> > Is this still valid?
>
> It looks like the fix you've developed and tested (first suggested by
> Quincey Koziol in January 2008) would make bit-for-bit reproducibility
> possible for netCDF-4 files.
>
> We're testing turning off tracking times for HDF5 objects to determine
> if there are any undesirable side effects.  If not, we'll incorporate it
> into the next release.
>
> Thanks for bringing this to our attention!
>
> --Russ
>
> > I am attaching a small patch that I made against netcdf-4.1.3.
> > With just this patch and these configure options I can successfully
> > reproduce identical files
> > using nccopy, without depending on system time or having to rely on some
> > hooks for getting the Unix time.
> > CPPFLAGS="-I${hdf}/include"
> > CFLAGS="-DBFB_MODE"
> > LDFLAGS="-L${hdf}/lib -ldl"
> > ./configure --prefix=${netcdf} --enable-netcdf-4 --disable-hdf4
> > --disable-pnetcdf --enable-cdmremote=no --disable-dap --disable-v2
> > --disable-shared --with-pic
> >
> > In the source code there are no other H5P_[A-Z]+_CREATE calls (except for
> > the ones in tests).
> >
> > Is it safe enough to be used for reproducibility checks, at least with
> > the netcdf3/netcdf4 classic format?
> > All I need is to be able to use md5sum on repeated runs to speed up
> > the process with the same netcdf/hdf lib.
> >
> > Best regards,
> > Rimvydas
>