2008 Unidata NetCDF Workshop for Developers and Data Providers > Formats and Performance
7.5 Issues for Discussion
Discussion of some format and performance issues.
Below are a few excerpts from some insightful reviewers' comments on a recent
netCDF standardization proposal. Discussing the issues these
comments raise provides an opportunity to explore some of the
tradeoffs in providing infrastructure for scientific data access.
- "A basic test for data file self-consistency, the absence of a
checksum capability, prevents netCDF from being an acceptable archival
format."
- "... it can be quite time consuming to read
time series that are stored in a record variable in a large
dataset."
- "Another issue is the limitation on variable size (even in the 64
bit offset variant) to ~4GB/record. I believe this will be
problematic in the future as the size of variables (especially
from model data) grows ..."
- "The netCDF convention is built to be very generic, which is great
as a data format. However, as a community standard it may be
overly broad. We need some additional
convention/standard/guidelines on the netCDF file."
- "It is also easy for users to create NetCDF files without
incorporating full metadata information inside the
files. Operationally, this makes it difficult for archival and
distribution."
- "In practice, netCDF depends on community conventions to be
completely useful. For example, the use of coordinate variables
is a convention, rather than being in the specification. I do
think this is a weakness of the standard - newcomers will not
infrequently create netCDF files thorough the API that are
essentially "unusable" because they conform to the library API
but not to the community conventions."