Version 5.1.1 of the netCDF Operators (NCO) has been released. NCO is an Open Source package that consists of a dozen standalone, command-line programs that take netCDF files as input, then operate (e.g., derive new data, average, print, hyperslab, manipulate metadata) and output the results to screen or files in text, binary, or netCDF formats.
The NCO project is coordinated by Professor Charlie Zender of the Department of Earth System Science, University of California, Irvine. More information about the project, along with binary and source downloads, is available on the SourceForge project page.
From the release message:
Version 5.1.1 adds features for NCZarr, regridding, and interpolation.
All operators now support NCZarr I/O and input filenames via stdin.
ncremap supports two new vertical extrapolation methods and 1D files, and
allows flexible masking based on external fields such as sub-gridscale
extent. ncclimo outputs regional averages.
Numerous minor fixes improve codec support and regridding control.
All users are encouraged to upgrade to this feature-rich release.
New Features
-
All operators now support specifying input files via stdin. This capability was implemented with NCZarr in mind, though it can also be used with traditional POSIX files. The ncap2, ncks, ncrename, and ncatted operators accept one or two filenames as positional arguments. If the input file is provided via stdin, then the output file, if any, must be specified with -o so the operators know whether to check stdin. Multi-file operators (ncra, ncrcat, ncecat) will continue to identify the last positional argument as the output file unless -o is used. The best practice is to use -o fl_out to specify output filenames when stdin is used for input filenames:
    echo in.nc | ncks
    echo in.nc | ncks -o out.nc
    echo "in1.nc in2.nc" | ncbo -o out.nc
    echo "in1.nc in2.nc" | ncflint -o out.nc
http://nco.sf.net/nco.html#stdin
-
All NCO operators support NCZarr I/O. This support is currently limited to the "file://" scheme. Support for the S3 scheme is next. All NCO commands should work as expected independent of the back-end storage format of the I/O. Operators can ingest and output POSIX, Zarr, or a mixture of these two file formats:
    in_ncz="file://${HOME}/in_zarr4#mode=nczarr,file"
    in_psx="${HOME}/in_zarr4.nc"
    out_ncz="file://${HOME}/foo#mode=nczarr,file"
    out_psx="${HOME}/foo.nc"
    ncks ${in_ncz} # Print contents of Zarr file
    ncks -O -v var ${in_psx} ${out_psx} # POSIX input to POSIX output
    ncks -O -v var ${in_psx} ${out_ncz} # POSIX input to Zarr output
    ncks -O -v var ${in_ncz} ${out_psx} # Zarr input to POSIX output
    ncks -O -v var ${in_ncz} ${out_ncz} # Zarr input to Zarr output
    ncks -O --cmp='gbr|shf|zst' ${in_psx} ${out_ncz} # Quantize/Compress
    ncks -O --cmp='gbr|shf|zst' ${in_ncz} ${out_ncz} # Quantize/Compress
Commands with Zarr I/O behave mostly as expected. NCO treats Zarr and POSIX files identically once they are "opened" via the netCDF API. Hence the main difference between Zarr and POSIX, from the viewpoint of NCO, is in handling the filenames. By default NCO performs operations in temporary files that it moves to a final destination once the rest of the command succeeds. Supporting Zarr in NCO therefore means applying the procedures to create, copy, move/rename, and delete files and directories that are appropriate to the backend format.
Many NCO users rely on POSIX filename globbing for multi-file operations, e.g., 'ncra in*.nc out.nc'. POSIX globbing returns matches in POSIX format (e.g., 'in1.nc in2.nc in3.nc'), which lacks the "scheme://" indicator and the "#mode=..." fragment that the netCDF API needs to open a Zarr store. There is no perfect solution to this.
A partial solution is available by judiciously using NCO's new stdin capabilities for all operators. The procedure uses the 'ls' command (instead of globbing) to identify the desired Zarr stores, pipes the (POSIX-style) results through the newly supplied NCO filter script (ncz2psx in the examples below), which prepends the desired scheme and appends the desired fragment to each matched Zarr store, and then pipes those results to the NCO operator:
    ncra in*.nc out.nc # POSIX input files via globbing
    ls in*.nc | ncra out.nc # POSIX input files via stdin
    ls in*.nc | ncz2psx | ncra out.nc # Zarr input via stdin
    ls in*.nc | ncz2psx --scheme=file --mode=nczarr,file | ncra out.nc
Thanks to Dennis Heimbigner of Unidata for implementing NCZarr.
http://nco.sf.net/nco.html#nczarr
-
The ncclimo --glb_avg switch has caused the splitter to output global-mean timeseries files since 2019. This switch now causes the splitter to output three horizontally (spatially) averaged timeseries: first the global average (as before), then the northern hemisphere average, then the southern hemisphere average. The three timeseries are now saved in a two-dimensional (time by region) array with a "region dimension" named rgn. Region names are stored in the variable named region_name. The switch is now also available as --rgn_avg, and --glb_avg is retained as a deprecated name:
    ncclimo --split --rgn_avg # Produce regional and global averages
    ncclimo --split --glb_avg # Same (deprecated switch name)
Thanks to Chris Golaz of LLNL for suggesting this feature.
http://nco.sf.net/nco.html#rgn_avg
-
ncremap has long been able to re-normalize and/or mask-out fields in partially unmapped destination gridcells. The --rnr_thr option sets the threshold value for valid cell coverage. However, the implementation considered only the fraction of each gridcell left unmapped due to explicit missing values (i.e., _FillValue). Now the implementation can also mask by the value of a specified sub-gridscale (SGS) variable, e.g., landfrac. The --add_fll switch now sets to _FillValue any gridcell whose sgs_frc < rnr_thr. The --add_fll switch is currently opt-in, except for datasets produced by MPAS and identified as such by the -P option. The new --no_add_fll switch overrides and turns off any automatic --add_fll behavior:
    ncremap ... # No renormalization/masking
    ncremap --rnr=0.1 ... # Mask cells missing > 10%
    ncremap --rnr=0.1 --sgs_frc=sgs ... # Mask missing > 10%
    ncremap --rnr=0.1 --sgs_frc=sgs --add_fll ... # Mask missing > 90% or sgs < 10%
    ncremap -P mpas ... # --add_fll implicit, mask where sgs=0.0
    ncremap -P mpas ... --no_add_fll # --add_fll explicitly turned-off, no masking
    ncremap -P mpas ... --rnr=0.1 # Mask missing > 90% or sgs < 10%
    ncremap -P elm ... # --add_fll not implicit, no masking
Thanks to Jill Zhang of LLNL for suggesting this capability.
http://nco.sf.net/nco.html#add_fll
-
The map checker diagnoses from the global attributes map_method, no_conserve, or noconserve (if present) whether the mapping weights are intended to be conservative (as opposed to, e.g., bilinear). Weights deemed non-conservative by design are no longer flagged with dire WARNING messages. Thanks to Mark Taylor of SNL for this suggestion.
    ncks --chk_map map.nc
http://nco.sf.net/nco.html#chk_map
-
ncremap vertical interpolation supports two new extrapolation methods: linear and zero. Linear extrapolation does exactly what you think: values outside the input domain are linearly extrapolated from the nearest two values inside the input domain. Invoke this with --vrt_xtr=lnr or --vrt_xtr=linear. Zero extrapolation sets values outside the extrapolation domain to 0.0. Invoke this with --vrt_xtr=zero.
    ncremap --vrt_xtr=zero --vrt=vrt.nc in.nc out.nc
    ncremap --vrt_xtr=linear --vrt=vrt.nc in.nc out.nc
    ncks --rgr xtr_mth=linear --vrt=vrt.nc in.nc out.nc
    ncks --rgr xtr_mth=zero --vrt=vrt.nc in.nc out.nc
http://nco.sf.net/nco.html#vrt_xtr
-
All numerical operators offer robust support for Blosc codecs when linked to netCDF 4.9.1+. This includes Blosc Zstandard, LZ, LZ4, and Zlib. Thanks to Dennis Heimbigner of Unidata for upstream fixes.
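The scheme-and-fragment decoration described in the NCZarr item above can be sketched as a small shell function. This is an illustration only, not the filter script shipped with NCO: the function name psx_to_ncz_url, its positional arguments, and its defaults are assumptions made for this example. It reads POSIX-style names from stdin, makes them absolute, prepends the scheme, and appends the mode fragment.

```shell
#!/bin/sh
# Hypothetical helper, NOT the NCO-supplied filter script.
# Reads one POSIX-style store name per line from stdin and prints
# scheme://absolute_path#mode=fragment for each.
# Usage: ls -d in* | psx_to_ncz_url [scheme] [mode]
psx_to_ncz_url() {
    scheme="${1:-file}"        # assumed default: the file:// scheme
    mode="${2:-nczarr,file}"   # assumed default: the fragment used in the examples
    while IFS= read -r path; do
        if [ -z "$path" ]; then
            continue           # skip blank lines
        fi
        case "$path" in
            /*) abs="$path" ;;         # already absolute
            *)  abs="$PWD/$path" ;;    # resolve relative names against the current directory
        esac
        printf '%s://%s#mode=%s\n' "$scheme" "$abs" "$mode"
    done
}

# Example: decorate two store names (output depends on the current directory)
printf 'in1.nc\nin2.nc\n' | psx_to_ncz_url
```

The output of such a function has the same shape as the in_ncz and out_ncz variables shown earlier, so it could in principle be piped to an operator that reads filenames from stdin.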
Additional details are available in the ChangeLog.
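To make the two vertical extrapolation methods described above concrete, the arithmetic can be sketched with awk. This is an illustration of the stated definitions, not NCO code: the function name extrapolate, its argument order, and the sample points are made up for the example. The linear case extends the line through the two input-domain points nearest the target, and the zero case returns 0.0.

```shell
#!/bin/sh
# Illustrative arithmetic only, NOT NCO code.
# extrapolate METHOD X1 Y1 X2 Y2 X
#   (X1,Y1), (X2,Y2): the two input-domain points nearest the target
#   X: target coordinate lying outside the input domain
extrapolate() {
    awk -v mth="$1" -v x1="$2" -v y1="$3" -v x2="$4" -v y2="$5" -v x="$6" 'BEGIN {
        if (mth == "zero") {
            y = 0.0                                     # zero extrapolation
        } else if (mth == "linear" || mth == "lnr") {
            y = y1 + (y2 - y1) * (x - x1) / (x2 - x1)   # extend the line through both points
        } else {
            print "unknown method: " mth > "/dev/stderr"
            exit 1
        }
        printf "%g\n", y
    }'
}

extrapolate linear 1 10 2 20 3   # prints 30
extrapolate zero   1 10 2 20 3   # prints 0
```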