Version 4.4.8 of the netCDF Operators (NCO) has been released. NCO is an open-source package of a dozen standalone command-line programs that take netCDF files as input, operate on them (e.g., derive new data, average, print, hyperslab, manipulate metadata), and write the results to the screen or to files in text, binary, or netCDF format.
The NCO project is coordinated by Professor Charlie Zender of the Department of Earth System Science, University of California, Irvine. More information about the project, along with binary and source downloads, is available on the SourceForge project page.
From the release message:
NCO now implements a lossy compression feature distinct from the
packing (scale_factor+add_offset) that NCO has long supported.
The new feature is activated by specifying the desired level of precision
in terms of either the total number of significant digits or the
number of significant digits after (or before) the decimal point.
These precision features are lumped together under the generic name
Precision-Preserving Compression (PPC), summarized below.
Specifying more reasonable and optimized chunking maps has been made easier by the addition of a new "best practices" policy, which implements Rew's balanced chunking for three-dimensional variables and Lefter Product (lfp) chunking for all others.
New ncwa/ncra/nces arithmetic operators mabs(), mebs(), and mibs() simplify statistical analysis.
New Features
- NCO will now store data at a per-variable precision level.
We call this Precision-Preserving Compression (PPC). PPC currently
understands two types of precision. Users can specify either the
total Number of Significant Digits (NSD) or the Decimal Significant
Digits (DSD), meaning the number of significant digits after (or
before) the decimal point. For example, NSD=5 tells NCO to retain 5
significant digits. Specifying DSD=3 or DSD=-2 causes NCO to
preserve the number rounded to the nearest thousandth or hundred,
respectively.
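The decimal meaning of the DSD values above can be illustrated with ordinary rounding. This is only a sketch of what DSD=3 and DSD=-2 ask for, not NCO's internal algorithm; the helper name `round_dsd` and the sample value are hypothetical.

```python
def round_dsd(value: float, dsd: int) -> float:
    """Round to `dsd` digits after (dsd > 0) or before (dsd < 0) the decimal point."""
    return round(value, dsd)

sample = 1234.56789  # hypothetical data value

thousandth = round_dsd(sample, 3)   # DSD=3: nearest thousandth
hundred = round_dsd(sample, -2)     # DSD=-2: nearest hundred

print(thousandth)
print(hundred)
```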
Under the hood, NSD uses bitmasking for quantization, while DSD utilizes rounding. The bitmasking/rounding results in consecutive zero-bits ending the IEEE-754 storage of each floating point number. Standard byte-stream compression techniques, such as the DEFLATE compression used by gzip (and in HDF5), compress these zero-bits more efficiently than the trailing bits of unrounded numbers. The net result is that PPC makes netCDF files skinnier when compressed. Compression is internal with netCDF4 and external (e.g., gzip or bzip2) with netCDF3. Space savings can be large.
And face it, how often does your precision exceed 3 digits? And don't worry, coordinate variables are not rounded :) An advantage of PPC is that, unlike packing, it needs no explicit support in other software because the data stays in IEEE format. Thanks to Rich Signell for suggesting DSD compression for NCO.
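The effect of bitmasking on DEFLATE can be sketched with the Python standard library. This is a rough illustration under stated assumptions, not NCO's code: the rule used to pick how many mantissa bits to keep (`ceil(nsd * log2(10))`), the helper names `mask_float32` and `deflated_size`, and the random test data are all hypothetical. NCO's actual choice of bits may differ.

```python
import math
import random
import struct
import zlib

MANTISSA_BITS = 23  # explicit mantissa bits in an IEEE-754 single-precision float

def mask_float32(value: float, nsd: int) -> float:
    """Zero the trailing mantissa bits not needed for roughly `nsd` digits."""
    # Assumed rule: about log2(10) bits per decimal digit.
    keep = min(MANTISSA_BITS, math.ceil(nsd * math.log2(10)))
    drop = MANTISSA_BITS - keep
    mask = (0xFFFFFFFF << drop) & 0xFFFFFFFF
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    (masked,) = struct.unpack("<f", struct.pack("<I", bits & mask))
    return masked

def deflated_size(values) -> int:
    """Size in bytes of the values stored as float32 and DEFLATE-compressed."""
    raw = struct.pack(f"<{len(values)}f", *values)
    return len(zlib.compress(raw, 9))

random.seed(0)
data = [random.uniform(250.0, 310.0) for _ in range(10000)]  # made-up values
quantized = [mask_float32(v, 3) for v in data]

original_size = deflated_size(data)
quantized_size = deflated_size(quantized)
max_rel_err = max(abs(q - v) / abs(v) for q, v in zip(quantized, data))

print("compressed bytes, original :", original_size)
print("compressed bytes, quantized:", quantized_size)
print("largest relative error     :", max_rel_err)
```

The quantized values are still ordinary IEEE floats, so any reader of the data sees plain numbers. Only the compressed size changes.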
ncks --ppc default=5 --ppc temperature=3 in.nc out.nc
ncks --ppc AER.?,AOD.?,ARE.?,AW.?,BURDEN.?=3 in.nc out.nc
ncpdq --ppc default=4 --ppc grid_area=15 in.nc out.nc
http://nco.sf.net/nco.html#ppc has extensive documentation.
- New "nco" chunking policy and modified "rew" chunking map:
Policy "nco" is a virtual option that implements the best
(in the subjective opinion of the authors) policy and map
for typical usage. This combination will evolve with time.
As of NCO version 4.4.8, this virtual policy implements
map_rew for 3-D variables and map_lfp for all other variables.
For the time being, map_rew does the same, i.e., it also
calls map_lfp when variables are not 3-D. This ensures that
Rew's balanced chunking is used on variables for which it
applies, and another sensible default (lfp = Lefter Product)
is used on all other variables big enough to chunk.
ncks --cnk_plc=nco in.nc out.nc
ncks --cnk_map=rew in.nc out.nc
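The selection rule described for the "nco" policy can be written as a tiny dispatch function. This is a sketch of the rule as stated in the text only; the function name `nco_policy_map` is hypothetical, and the size threshold for "big enough to chunk" is not given in the announcement, so it is not modeled.

```python
def nco_policy_map(rank: int) -> str:
    """Return the chunking map the text describes for a variable of this rank."""
    # 3-D variables get Rew's balanced chunking; everything else gets lfp.
    return "rew" if rank == 3 else "lfp"

for rank in (1, 2, 3, 4):
    print(rank, nco_policy_map(rank))
```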
http://nco.sf.net/nco.html#cnk
- NCO dimension-reducing operators (ncra, ncwa, nces) now support
three new arithmetic operations to facilitate statistics:
mabs(), mebs(), and mibs(). These compute the maximum, mean, and
minimum absolute value, respectively. They are invoked with the
-y or --op_typ switch in the same manner as max/min/avg:
ncwa -y mabs in.nc out.nc # Maximum absolute value
ncra -y mebs in.nc out.nc # Mean absolute value
nces -y mibs in.nc out.nc # Minimum absolute value
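What the three operations compute can be shown on a short list of numbers. This is a plain-Python sketch of the definitions given above, not NCO's implementation; it ignores weights and missing values, and the sample data are made up.

```python
def mabs(values):
    """Maximum absolute value."""
    return max(abs(v) for v in values)

def mebs(values):
    """Mean absolute value."""
    return sum(abs(v) for v in values) / len(values)

def mibs(values):
    """Minimum absolute value."""
    return min(abs(v) for v in values)

sample = [-3.0, 1.0, -0.5, 2.0]  # hypothetical data

print(mabs(sample))  # 3.0
print(mebs(sample))  # 1.625
print(mibs(sample))  # 0.5
```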
http://nco.sf.net/nco.html#op_typ
- NCO warns when appended output type differs from input type.
Previously NCO would not warn or die when the user (usually
inadvertently) wrote data of one type into a destination meant
for a different type. These commands would therefore complete
without warning:
ncks -C -O -v double_var ~/nco/data/in.nc ~/foo.nc
ncrename -O -v double_var,float_var ~/foo.nc
ncks -C -A -v float_var ~/nco/data/in.nc ~/foo.nc
Now the user is warned, though the operation is still permitted.
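One reason a type mismatch deserves a warning is that the same number is not stored identically in every type. The sketch below is not NCO code and does not reproduce the commands above; it only round-trips a double-precision value through 4-byte float storage to show that the value changes. The helper name `store_as_float32` is hypothetical.

```python
import struct

def store_as_float32(value: float) -> float:
    """Round-trip a Python float (a C double) through 4-byte IEEE-754 storage."""
    return struct.unpack("<f", struct.pack("<f", value))[0]

double_value = 0.1
stored = store_as_float32(double_value)

print(stored == double_value)      # False: the float32 copy differs
print(abs(stored - double_value))  # size of the discrepancy
```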
http://nco.sf.net/nco.html#-A
Additional details are available in the ChangeLog.