Re: [netcdfgroup] please try out the netcdf daily snapshot for opendap, hdf4-reading, hdf5-reading, szip compression, etc.

  • To: <stephen.pascoe@xxxxxxxxxx>
  • Subject: Re: [netcdfgroup] please try out the netcdf daily snapshot for opendap, hdf4-reading, hdf5-reading, szip compression, etc.
  • From: Ed Hartnett <ed@xxxxxxxxxxxxxxxx>
  • Date: Mon, 19 Oct 2009 08:36:45 -0600
<stephen.pascoe@xxxxxxxxxx> writes:

> I'd like to voice a word of caution on the inclusion of optional szip
> compression in netCDF4.1.  Although I'm sure szip has technical merit
> it further fragments the "netcdf format".  The NetCDF FAQ lists 4 netCDF
> format variants and szip will make that 6:
>
>  1. netCDF3 format
>  2. netCDF3 64-bit offset format
>  3. netCDF4 format (zlib or no compression)
>  4. netCDF4 classic model (zlib or no compression)
>  5. netCDF4 format (szip)
>  6. netCDF4 classic model (szip)
>
> The reason I lump zlib with no compression is that when building NetCDF
> libraries you need to decide whether to include szip in the build, with
> the inherent license limitations it imposes.  Therefore users will have
> netCDF4 tools with or without szip support.  
>
> This is bound to discourage the adoption of netCDF4 in general and could
> cause many data archival problems down the line.  For instance, from my
> perspective I am now asking myself what the CMIP5 archive's policy
> should be on netCDF4 szip.
>
> I'm not sure how to resolve this problem here but it's worth exposing.  

Howdy Stephen!

The multiplicity of binary formats is certainly a concern, but the HDF5
format (which is the base for all the non-classic formats) is a
well-known and stable software product, trusted in its own right by such
organizations as NASA for long-term archive of scientific data.

Restricting CMIP contributors from using szip makes perfect sense. (But
everyone who asked me for it was from the climate community!) Yet the
szlib library is freely available, and license restrictions on
commercial writers of data will not apply to anyone in the CMIP5
community.

The netCDF APIs (C, Fortran, C++, and Java) can all now read files that
no longer fall into the strict format categories. This blurs the lines
considerably. It is going to be up to managers of archives like CMIP to
decide what formats and conventions are acceptable for each archive.

For example, an HDF4 SD file can now be read exactly as if it were a
classic format netCDF file. Assuming the proper CF conventions were used,
this would then work with all software written for CMIP5 data. But I
assume that you don't want people to send you HDF4 files, even though
they are now, arguably, netCDF files too. 
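
As a rough illustration of how an archive might screen incoming files by
on-disk format, here is a minimal sketch that looks only at the leading
magic bytes. The name sniff_format is hypothetical and not part of any
netCDF API. It assumes the HDF5 signature sits at offset 0 (no user
block), and it cannot tell the netCDF-4 classic model from the enhanced
model, nor detect szip; those checks need the HDF5 or netCDF library.

```python
# Hypothetical helper, not part of the netCDF library: classify a file
# by the magic bytes at the very start of the file.

MAGIC = [
    (b"CDF\x01", "netCDF classic"),
    (b"CDF\x02", "netCDF 64-bit offset"),
    # HDF5 signature; netCDF-4 files are HDF5 files underneath.
    (b"\x89HDF\r\n\x1a\n", "HDF5 (possibly netCDF-4)"),
    (b"\x0e\x03\x13\x01", "HDF4"),
]


def sniff_format(path):
    """Return a coarse format label based on the first 8 bytes of path."""
    with open(path, "rb") as f:
        head = f.read(8)
    for magic, name in MAGIC:
        if head.startswith(magic):
            return name
    return "unknown"
```

An archive could reject anything labelled "HDF4" or "unknown" up front
and then open the remaining files with the netCDF library for the finer
checks.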

I understand your point that this is the first case in which a user can
produce a file with netCDF that cannot be read by other current netCDF
libraries unless they are built with szlib. But why can't they be
rebuilt, if necessary? This seems similar to the issue of netCDF-4/HDF5
files and pre-existing netCDF installations, which must upgrade their
software to read the files. It is all freely available software.

I have also heard that the szlib patent has expired, and that an
unrestricted version may be available - I will try to run down that
rumor...

Thanks,

Ed

-- 
Ed Hartnett  -- ed@xxxxxxxxxxxxxxxx


