Re: [netcdfgroup] PDL::NetCDF slow since netcdf-4.1.0

  • To: Heiko Klein <Heiko.Klein@xxxxxx>
  • Subject: Re: [netcdfgroup] PDL::NetCDF slow since netcdf-4.1.0
  • From: Doug Hunt <dhunt@xxxxxxxx>
  • Date: Tue, 5 Mar 2013 10:35:39 -0700 (MST)
Hi Heiko:

Since it seems that nc_sync did not do much before netcdf-4.1, why don't we just get rid of the calls to nc_sync in the put* routines in PDL::NetCDF? That would keep the code a bit cleaner, since no one is really expecting anything from the nc_sync calls.

It would make sense to add a separately callable 'sync' routine that calls nc_sync, so a PDL user can easily get this behavior if needed.

Do you want to do this, or should I?

Thanks,

  Doug

dhunt@xxxxxxxx
Software Engineer
UCAR - COSMIC, Tel. (303) 497-2611

On Tue, 5 Mar 2013, Russ Rew wrote:

Hi Doug,

> Hi Heiko:  I think these calls to nc_sync have been there a long time.
> I don't recall the original reason for them.  Before netcdf version
> 4.1 was nc_sync just a no-op?  If this is the case, then maybe we should
> put in an AUTOSYNC option with the default = 0 (do not sync).
>
> If the netcdf group has ideas about the utility of nc_sync before netcdf
> version 4.1, then perhaps we should add the AUTOSYNC option with default =
> 1 (do sync).
>
> Another alternative would be to remove all calls to nc_sync and then make
> available and advertise a sync method in PDL::NetCDF.
>
> NetCDF group:  Was nc_sync useful before netcdf version 4.1?

Not sure about how useful it was, but there were complaints about not
having a function in the netCDF API that called fsync().  See, for
example, this posting, and other parent postings in the same thread, if
you're interested in the discussion and use case leading up to adding
fsync() to nc_sync() unless configured with "--disable-fsync":

 
http://www.unidata.ucar.edu/mailing_lists/archives/netcdfgroup/2009/msg00411.html

It was added to version 4.1, released on 2010-01-30, after asking for
comments and tests of release candidates.

We welcome comments about whether the default is wrong and why.  It
would be possible to change the upcoming 4.3.0 release to require
configuring with "--enable-fsync" to get the fsync() call in nc_sync(),
if there is a compelling reason why this would improve netCDF for most
users.

--Russ

On Tue, 5 Mar 2013, Heiko Klein wrote:

Hi Doug,

we just upgraded to Ubuntu Precise (12.04) and for the first time have a
netcdf version >= 4.1. With this upgrade PDL::NetCDF became awfully slow when
writing data, in particular when writing small amounts of data.

The reason is that netcdf now calls 'fsync' when nc_sync is called.
Syncing the complete filesystem is very costly and I don't really understand
why the netcdf folks did that by default (it might make sense on some HPC
filesystems - and fsync is not available from FORTRAN). It can be disabled
at build time, but who really does that - most people just don't use nc_sync,
in particular since 'close' does this automatically.
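
To make the cost pattern above concrete, here is a minimal Python sketch. It is not netCDF or PDL::NetCDF code: the function name write_records and the record sizes are invented for illustration, and it only uses the operating system's fsync via os.fsync. It counts how many fsync calls are issued when syncing after every small write versus once at the end.

```python
import os
import tempfile


def write_records(path, records, sync_each):
    """Write records to path and return the number of fsync() calls made.

    Hypothetical helper for illustration only; it mimics the pattern of
    syncing after every put versus syncing once before closing.
    """
    fsync_calls = 0
    with open(path, "wb") as fh:
        for rec in records:
            fh.write(rec)
            if sync_each:
                # Flush Python's buffer, then ask the OS to push the file
                # to stable storage. This is the expensive step.
                fh.flush()
                os.fsync(fh.fileno())
                fsync_calls += 1
        if not sync_each:
            # A single sync after all writes have been issued.
            fh.flush()
            os.fsync(fh.fileno())
            fsync_calls += 1
    return fsync_calls


# 100 small records of 8 bytes each (arbitrary illustrative sizes).
records = [b"x" * 8 for _ in range(100)]
with tempfile.TemporaryDirectory() as tmp:
    n_each = write_records(os.path.join(tmp, "each.bin"), records, sync_each=True)
    n_once = write_records(os.path.join(tmp, "once.bin"), records, sync_each=False)

print(n_each, n_once)
```

With many small writes, the per-write variant issues one fsync per record, so the sync overhead grows with the number of puts rather than with the amount of data written.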

But PDL::NetCDF calls nc_sync automatically after each put*. I would like to
just remove the nc_sync calls from PDL::NetCDF, and let users call them
manually if they really need synchronisation. If you are opposed to that, I
would like to add a flag to new: (AUTOSYNC => 0|1) (defaulting to 1). What do
you think?
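
The two options discussed here (an AUTOSYNC-style flag, or no automatic sync plus an explicit sync method) can be sketched as follows. This is a Python stand-in under stated assumptions, not the PDL::NetCDF interface: the class SketchWriter, its autosync argument, and its put/sync/close methods are all hypothetical names, and os.fsync stands in for nc_sync.

```python
import os
import tempfile


class SketchWriter:
    """Hypothetical writer with an AUTOSYNC-style option (illustration only)."""

    def __init__(self, path, autosync=True):
        # autosync=True mirrors the proposed default of AUTOSYNC => 1.
        self._fh = open(path, "wb")
        self.autosync = autosync
        self.sync_count = 0

    def put(self, data):
        """Write data; sync immediately only if autosync is enabled."""
        self._fh.write(data)
        if self.autosync:
            self.sync()

    def sync(self):
        """Explicit, separately callable sync (stand-in for nc_sync)."""
        self._fh.flush()
        os.fsync(self._fh.fileno())
        self.sync_count += 1

    def close(self):
        self._fh.close()


with tempfile.TemporaryDirectory() as tmp:
    # Caller opts out of automatic syncing and syncs once, when needed.
    manual = SketchWriter(os.path.join(tmp, "manual.bin"), autosync=False)
    for _ in range(10):
        manual.put(b"12345678")
    manual.sync()
    manual.close()
    manual_syncs = manual.sync_count

    # Default behaviour: every put triggers a sync.
    auto = SketchWriter(os.path.join(tmp, "auto.bin"))
    for _ in range(10):
        auto.put(b"12345678")
    auto.close()
    auto_syncs = auto.sync_count

print(manual_syncs, auto_syncs)
```

Either design keeps synchronisation available; the difference is only whether the caller or the library decides when it happens.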

Best regards,

Heiko

--
Dr. Heiko Klein                              Tel. + 47 22 96 32 58
Development Section / IT Department          Fax. + 47 22 69 63 55
Norwegian Meteorological Institute           http://www.met.no
P.O. Box 43 Blindern  0313 Oslo NORWAY





