2012 Unidata NetCDF Workshop > Formats and Performance
14.6 NetCDF-4 Performance Tips
Using the netCDF-4 formats can provide performance benefits.
Avoid premature optimization: worry about performance only after you have
determined where the bottlenecks are. The HDF5 file format is mature, and its
implementation is efficient for many uses.
- A chunk is a fixed-size multidimensional block of data treated as atomic for disk accesses: disk I/O is always in terms of complete chunks. For netCDF-4 variables, chunking should be tailored to make common data access patterns efficient.
- Compression (also called "deflation") can save space and either speed up access significantly or slow access to a crawl, depending on details of use, such as chunk cache configuration.
- "Endianness" (byte order) can be specified to match the platform that accesses the data, which avoids byte-swapping during I/O.
- Parallel I/O can take advantage of efficiencies of parallel file systems.
- Use of the netCDF-4 classic model format can exploit all the performance features above without requiring changes to programs that read the data.
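
The chunking and compression points above can be illustrated with a rough model. This is only a sketch in plain Python, not a use of the netCDF library: the 1000 x 1000 variable, the two chunk shapes, and the helper `chunks_touched` are all hypothetical, and `zlib` merely stands in for the deflate filter. It counts how many whole chunks two common reads would touch under each layout.

```python
import zlib
from array import array


def chunks_touched(start, count, chunk_shape):
    """Count the chunks that overlap a read of `count` values from `start`."""
    total = 1
    for s, c, k in zip(start, count, chunk_shape):
        first = s // k               # first chunk index along this dimension
        last = (s + c - 1) // k      # last chunk index along this dimension
        total *= last - first + 1
    return total


# Hypothetical variable: 1000 time steps x 1000 grid points.
reads = {
    "one time step": ((0, 0), (1, 1000)),
    "time series at one point": ((0, 0), (1000, 1)),
}
layouts = {
    "row chunks (1 x 1000)": (1, 1000),
    "square chunks (100 x 100)": (100, 100),
}

for layout, chunk_shape in layouts.items():
    values_per_chunk = chunk_shape[0] * chunk_shape[1]
    for name, (start, count) in reads.items():
        n = chunks_touched(start, count, chunk_shape)
        # Every touched chunk is read whole, even if one value is wanted.
        print(f"{layout}: {name}: {n} chunks, {n * values_per_chunk} values")

# Compression is applied per chunk, so each touched chunk must also be
# inflated in full. zlib stands in here for the deflate filter.
raw = array("f", [20.0] * 10_000).tobytes()  # one constant 100 x 100 chunk
packed = zlib.compress(raw, 1)               # level 1: fastest setting
print(len(raw), "bytes raw,", len(packed), "bytes deflated")
```

In this model the row layout serves one time step from a single chunk but needs 1000 chunks (a million values) for a 1000-value time series, while the square layout needs 10 chunks for either read. With compression enabled, each of those chunks must also be inflated, which is one reason a poorly matched chunk shape or a too-small chunk cache can slow access so sharply.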
Other netCDF-4 performance features not available in the netCDF-4 classic model:
- Compound data types: may speed up access to structured data, because the fields of each record are stored close together.
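
The locality argument for compound types can be sketched in plain Python with the standard `struct` module. This is an illustration under assumed names, not netCDF code: the observation record (station id, temperature, pressure), the sample values, and the helpers `read_record` and `read_fields` are hypothetical.

```python
import struct

# Hypothetical record: int32 station id, float32 temperature, float32 pressure.
# "<" selects little-endian byte order with no padding, so each record is 12 bytes.
record = struct.Struct("<iff")

observations = [(1, 281.5, 1013.25), (2, 279.0, 1009.5), (3, 284.25, 1001.0)]

# Compound-style layout: all fields of one record are adjacent in the buffer.
compound = b"".join(record.pack(*obs) for obs in observations)


def read_record(buf, index):
    """Fetch every field of one record from a single contiguous span."""
    return record.unpack_from(buf, index * record.size)


# Separate-variable layout: one buffer per field, as with three variables.
n = len(observations)
ids = struct.pack(f"<{n}i", *(o[0] for o in observations))
temps = struct.pack(f"<{n}f", *(o[1] for o in observations))
press = struct.pack(f"<{n}f", *(o[2] for o in observations))


def read_fields(index):
    """Fetch the same record, but from three different locations."""
    return (
        struct.unpack_from("<i", ids, index * 4)[0],
        struct.unpack_from("<f", temps, index * 4)[0],
        struct.unpack_from("<f", press, index * 4)[0],
    )


print(read_record(compound, 1))
print(read_fields(1))
```

Both functions return the same record, but the compound layout reads it from one 12-byte span, whereas the separate layout touches three buffers. On disk, those three locations could fall in three different chunks.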