2012 Unidata NetCDF Workshop > Formats and Performance
14.7 Using Less Space for Data
Using less disk space for your data can also reduce access time.
For classic formats:
- Pack floating-point data into a narrower (and lower-precision)
numeric type, for example pack 32-bit floats into 16-bit shorts or
8-bit bytes using the scale_factor and add_offset attributes.
The NCO utilities ncpack or ncpdq will pack variables into smaller
types.
- Compress whole files when they are read through the netCDF Java
library. It uncompresses files with ".Z", ".zip", ".gzip", ".gz", or
".bz2" extensions and caches the expanded file for subsequent reads,
keeping cache size within a specified limit. This works well in
read-only applications like servers.
- Use sparse data structures for sparse grids with lots of
missing data, for example by storing only the array indices and
values of the valid cells. Other sparse formats such as CSR
(Compressed Sparse Row) may also be useful.
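As a rough illustration of the packing and sparse-storage items above, the following Python sketch uses only the standard library rather than a netCDF API. The helper names (packing_params, pack, unpack, to_sparse, from_sparse) and the sample values are made up for this example. The scale_factor and add_offset formulas are one common choice that maps the data range onto the full signed 16-bit range; other choices, such as reserving a packed value for missing data, are equally valid.

```python
import struct

def packing_params(values, nbits=16):
    """Pick scale_factor and add_offset so the data range fills a signed nbits integer."""
    lo, hi = min(values), max(values)
    if hi == lo:                      # constant data: any nonzero scale works
        return 1.0, lo
    scale_factor = (hi - lo) / (2 ** nbits - 1)
    add_offset = lo + 2 ** (nbits - 1) * scale_factor
    return scale_factor, add_offset

def pack(values, scale_factor, add_offset, nbits=16):
    """Convert floats to integers: packed = round((value - add_offset) / scale_factor)."""
    lowest, highest = -(2 ** (nbits - 1)), 2 ** (nbits - 1) - 1
    packed = []
    for v in values:
        p = round((v - add_offset) / scale_factor)
        packed.append(max(lowest, min(highest, p)))  # guard against rounding at the edges
    return packed

def unpack(packed, scale_factor, add_offset):
    """Recover approximate floats: value = packed * scale_factor + add_offset."""
    return [p * scale_factor + add_offset for p in packed]

def to_sparse(grid, missing):
    """Keep only the flat indices and values of cells that are not missing."""
    indices = [i for i, v in enumerate(grid) if v != missing]
    return indices, [grid[i] for i in indices]

def from_sparse(indices, values, size, missing):
    """Rebuild the full grid, filling unlisted cells with the missing value."""
    grid = [missing] * size
    for i, v in zip(indices, values):
        grid[i] = v
    return grid

# Packing: hypothetical temperatures in kelvin
temps = [250.0, 273.15, 300.0, 310.5]
scale_factor, add_offset = packing_params(temps)
packed = pack(temps, scale_factor, add_offset)
as_shorts = struct.pack("<%dh" % len(packed), *packed)  # 2 bytes per value
as_floats = struct.pack("<%df" % len(temps), *temps)    # 4 bytes per value
print(len(as_floats), "bytes as floats,", len(as_shorts), "bytes packed")
print("round trip:", unpack(packed, scale_factor, add_offset))

# Sparse storage: hypothetical grid where -9999.0 marks missing cells
grid = [-9999.0, -9999.0, 281.4, -9999.0, -9999.0, -9999.0, 279.9, -9999.0]
indices, values = to_sparse(grid, missing=-9999.0)
print(indices, values)
```

The round-trip error of packing is at most about half of scale_factor, which is the precision given up in exchange for halving the storage. In a real file, scale_factor and add_offset would be stored as attributes of the packed variable so that readers can unpack it.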
For netCDF-4 formats:
- Use compression on large variables. Test compression parameters and
effectiveness with nccopy.
- Consider chunking to improve compression: choose chunk shapes so
that each chunk holds similar data values.
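To see why chunk shape matters, here is a small Python sketch that stands in for netCDF-4 compression by deflating each chunk separately with the standard zlib module. The grid, its dimensions, and the helper names are invented for the example, and real results depend on the data and on settings such as the deflate level.

```python
import random
import struct
import zlib

NROWS, NCOLS = 64, 64

# Hypothetical test field: every cell in a row shares one value,
# so values are similar along a row and unrelated down a column.
rng = random.Random(42)
row_values = [rng.uniform(250.0, 310.0) for _ in range(NROWS)]
grid = [[row_values[r]] * NCOLS for r in range(NROWS)]

def compressed_size(values, level=6):
    """Bytes used after deflating the values as 32-bit floats."""
    raw = struct.pack("<%df" % len(values), *values)
    return len(zlib.compress(raw, level))

def total_size(chunks, level=6):
    """Sum of per-chunk compressed sizes; each chunk is compressed on its own."""
    return sum(compressed_size(c, level) for c in chunks)

# 64 chunks of shape 1 x 64: each chunk holds one repeated value
row_chunks = grid
# 64 chunks of shape 64 x 1: each chunk holds 64 unrelated values
col_chunks = [[grid[r][c] for r in range(NROWS)] for c in range(NCOLS)]

raw_bytes = NROWS * NCOLS * 4
print("uncompressed:", raw_bytes)
print("row-shaped chunks:", total_size(row_chunks))
print("column-shaped chunks:", total_size(col_chunks))
```

Both layouts store the same grid, but the row-shaped chunks compress far better because each one contains only similar values. With real files, nccopy's -d (deflate level), -s (shuffle), and -c (chunk sizes) options allow the same kind of comparison.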