[netcdfgroup] Confusing documentation on unlimited dimension and chunk-size

Hi,

The netCDF user guide describes the new chunking options at http://www.unidata.ucar.edu/software/netcdf/docs/netcdf.html#Default-Chunking

I think the paragraph:

'For unlimited dimensions, a chunk size of one is always used. Users are advised to set chunk sizes for large data sets with one or more unlimited dimensions, since a chunk size of one is quite inefficient.'

is very misleading, and contradicts 'In particular, the idea of using 1 for the chunksize of an unlimited dimension works well if the data are being read a record at a time. Any other read access patterns will result in slower performance.'


In netcdf-3, it was usual to use a record-based read access pattern whenever a variable had an unlimited dimension. Advising users now to change the chunk size of the unlimited dimension to something other than 1 is in most cases wrong and will give slower performance. I suggest a sentence like:

'For unlimited dimensions, a chunk size of one is always used. For large datasets, where the size of limited dimensions is small compared to the unlimited dimensions, users are advised to avoid unlimited dimensions or to increase the chunk sizes of the unlimited dimensions. Be aware that an unlimited dimension with chunksize != 1 will result in slower performance for record-oriented access patterns, which were common in netcdf-3.'
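
To make the trade-off concrete, here is a rough back-of-the-envelope sketch in Python. It is not the netCDF library's actual I/O logic: it simply assumes that every chunk overlapping a read must be fetched in full and that there is no chunk cache. The function names and the example variable shape (time, 100, 200), with time unlimited, are made up for illustration.

```python
def chunks_touched(start, count, chunk_shape):
    """Count the chunks overlapped by a hyperslab read (start/count per dimension)."""
    total = 1
    for s, c, ch in zip(start, count, chunk_shape):
        first = s // ch              # index of first chunk hit along this dimension
        last = (s + c - 1) // ch     # index of last chunk hit along this dimension
        total *= last - first + 1
    return total


def values_read(start, count, chunk_shape):
    """Values fetched from disk, assuming whole chunks are read and nothing is cached."""
    chunk_size = 1
    for ch in chunk_shape:
        chunk_size *= ch
    return chunks_touched(start, count, chunk_shape) * chunk_size


# Hypothetical variable with dimensions (time, y, x) = (unlimited, 100, 200).
one_record = ((5, 0, 0), (1, 100, 200))        # read the full record at time step 5
time_series = ((0, 50, 50), (1000, 1, 1))      # read 1000 time steps at one grid point

for chunk_shape in [(1, 100, 200), (10, 100, 200), (100, 10, 20)]:
    print(chunk_shape,
          "record read:", values_read(*one_record, chunk_shape),
          "time-series read:", values_read(*time_series, chunk_shape))
```

Under these assumptions, reading one record of 20000 values fetches exactly 20000 values with chunk shape (1, 100, 200), but 200000 values with (10, 100, 200). The time-series read, on the other hand, touches 1000 chunks with a chunk size of 1 along time and only 10 chunks with (100, 10, 20). So which chunk size is "inefficient" depends entirely on the access pattern.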


Best regards,

Heiko

--
Dr. Heiko Klein                              Tel. + 47 22 96 32 58
Development Section / IT Department          Fax. + 47 22 69 63 55
Norwegian Meteorological Institute           http://www.met.no
P.O. Box 43 Blindern  0313 Oslo NORWAY


