[netcdfgroup] netCDF Operators NCO version 4.4.0 are ready

The netCDF Operators NCO version 4.4.0 are ready.

http://nco.sf.net (Homepage)
http://dust.ess.uci.edu/nco (Homepage "mirror")

This release focuses on stability and speed.
It also addresses previous omissions so that full path names are
accepted by all appropriate options, such as the ncwa weights and
masks (-w and -m), and ncks auxiliary coordinates (-X).
See below for bugfixes, including ncra and ncrcat use-cases involving
strides with superfluous files and/or multi-record output (MRO).

Other significant new features include major improvements to
conversion of HDF4, HDF5, and netCDF4 files to netCDF3 and netCDF4
files. Most of this work is an offshoot of writing an NCO and
CFchecker-based solution to the problem of checking CF-compliance
of datasets in HDF4, HDF5, and netCDF4 formats. The solution
is a script called ncdismember that now works well with most
NASA stewarded datasets I've thrown at it.

For the sake of stability, we postponed the name-change of ncra and
ncwa to ncrs and ncws, respectively, until the next version.
Work on NCO 4.4.1 is underway, focused on stability and speed.
There will be more netCDF4 mop-up (-X and --cnk) and, possibly,
improved HDF4 support, and cache manipulation for chunking.

Enjoy,
Charlie

"New stuff" in 4.4.0 summary (full details always in ChangeLog):

NEW FEATURES:

A. ncrename allows full group pathnames for new_name arguments.
   Previously, ncrename allowed full group pathnames only for old_name
   and the syntax was
   ncrename -v /path/to/old_name,new_name in.nc out.nc
   and new_name was presumed to be on the same path as old_name.
   The new feature means this now works:
   ncrename -v /path/to/old_name,/path/to/new_name in.nc out.nc
   This embodies no new functionality, because the two paths must be
   identical! In other words, /path/to must lead to the same
   location in both arguments. If it did not, you would be both
   _moving_ and _renaming_ the variable, not just renaming it, and
   _moving_ groups and variables is more arduous than renaming them.
   For more on renaming and moving, see
   http://nco.sf.net/nco.html#rename
   http://nco.sf.net/nco.html#move

A. Speed-up due to removing static arrays from the codebase.
   Though not immediately visible to the user, memory access patterns
   play a large role in determining NCO speed. Many static arrays were
   introduced in developing NCO group-enabled features over the past
   year. Most of these have been converted to dynamic arrays now.
   This significantly improves NCO's speed for many use cases,
   including many netCDF3 cases (because NCO uses one code base for
   all filetypes). If you've noticed and lamented NCO becoming more
   sluggish over the past year, you may be pleasantly surprised
   by NCO's new zippiness.

B. ncrename behavior. The underlying netCDF4/HDF5 library on which NCO
   depends has important features and bugfixes in netCDF-4.3.1-rc5,
   now available. Users who build NCO on that version or later gain
   access to group renaming, and to fixes in renaming coordinates.
   These features and fixes are described here:
   http://nco.sf.net/nco.html#ncrename_crd
   http://nco.sf.net/nco.html#ncrename

C. ncks now accepts the "all" argument to the --fix_rec_dmn option.
   ncks --fix_rec_dmn=all
   converts all output record dimensions to fixed dimensions.
   Previously, --fix_rec_dmn only accepted the name of the single
   record dimension to be fixed.
   Now it is simple to fix all record dimensions simultaneously.
   This is useful (and nearly mandatory) when flattening netCDF4
   files that have multiple record dimensions per group into netCDF3
   files (which may have at most one record dimension).
   ncks --fix_rec_dmn=all in.nc out.nc
   ncks -G : -3 --fix_rec_dmn=all in.nc out.nc
   http://nco.sf.net/nco.html#fix_rec_dmn
   http://nco.sf.net/nco.html#autocnv

D. HDF4 behavior: Thanks to recent improvements in netCDF,
   NCO more gracefully handles HDF4 files. When compiled with netCDF
   version 4.3.1-rc7 (20131222) or later, NCO no longer needs the
   --hdf4 switch. NCO uses netCDF to determine automatically whether
   the underlying file is HDF4, then takes appropriate precautions to
   avoid calls not supported by the netCDF4 subset of HDF4.
   ncks fl.hdf
   ncks fl.hdf fl.nc
   http://nco.sf.net/nco.html#hdf4

E. NCO autoconverts HDF4 and HDF5 atomic types (e.g., NC_UBYTE,
   NC_STRING) to netCDF3 atomic types (e.g., NC_SHORT, NC_CHAR) when
   necessary, i.e., when the output file is netCDF3.
   ncks -3 fl.hdf fl.nc
   http://nco.sf.net/nco.html#autocnv
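   As a mental model, the autoconversion can be pictured as a type
   lookup table. Only the NC_UBYTE-to-NC_SHORT and NC_STRING-to-NC_CHAR
   pairs come from this announcement; the other entries in this Python
   sketch are plausible widenings, not NCO's authoritative table:

```python
# Sketch of netCDF4-to-netCDF3 atomic-type autoconversion.
# Only the NC_UBYTE->NC_SHORT and NC_STRING->NC_CHAR pairs are stated
# in the announcement; the remaining pairs are illustrative
# assumptions, not NCO's exact mapping.
NC3_EQUIV = {
    "NC_UBYTE": "NC_SHORT",   # stated above: widen to next signed type
    "NC_STRING": "NC_CHAR",   # stated above
    "NC_USHORT": "NC_INT",    # assumption: widen unsigned 16-bit
    "NC_UINT": "NC_INT",      # assumption: large values may overflow
    "NC_INT64": "NC_INT",     # assumption: large values may overflow
}

def autoconvert_type(nc4_type):
    """Return a netCDF3 type for a netCDF4 atomic type.

    Types already legal in netCDF3 pass through unchanged."""
    return NC3_EQUIV.get(nc4_type, nc4_type)
```
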

F. ncdismember flattens all groups in a file, not only leaf groups.
   Previously ncdismember disaggregated only leaf groups.
   Hierarchical files may contain data and/or metadata at all levels.
   The new behavior disaggregates all groups with data/metadata.
   ncdismember is especially useful for checking CF-compliance using
   the separately installed 'cfchecker' utility.
   Usage:
   ncdismember ~/nco/data/dsm.nc ${DATA}/nco/tmp cf 1.5
   ncdismember automatically appends the CF Conventions attribute to
   all disaggregated files that do not already contain it.
   This considerably reduces CF Warning and Error counts.
   http://nco.sf.net/nco.html#ncdismember
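   The Conventions-attribute step can be sketched as a small helper.
   The function name and the dict-based model of global attributes are
   illustrative, not ncdismember's actual implementation:

```python
def ensure_cf_conventions(attrs, version="CF-1.5"):
    """Add a CF "Conventions" global attribute if one is absent.

    A sketch of what ncdismember does before handing disaggregated
    files to cfchecker; attrs models a file's global attributes."""
    if "Conventions" not in attrs:
        attrs = dict(attrs)  # do not mutate the caller's dict
        attrs["Conventions"] = version
    return attrs
```
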

G. ncdismember just plain works in most real world cases.
   Taken together, NCO's new features (autoconversion to netCDF3
   atomic types, fixing multiple record dimensions, autosensing
   HDF4 input) and bugfixes (allowing whitespace in group and
   filenames, scoping rules for CF conventions) make ncdismember
   more reliable and friendly for both dismembering files and for
   CF-compliance checks. Now most HDF4 and HDF5 datasets can be
   checked for CF-compliance with a one-line command.
   Example compliance checks of common NASA datasets are at
   http://dust.ess.uci.edu/diwg/*.txt
   http://nco.sf.net/nco.html#ncdismember
   http://nco.sf.net/nco.html#autocnv

H. ncks now prints hidden (aka special) attributes when given the
   --hdn or --hidden option. This is equivalent to ncdump -s.
   Hidden attributes include: _Format, _DeflateLevel, _Shuffle,
   _Storage, _ChunkSizes, _Endianness, _Fletcher32, and _NOFILL.
   Previously ncks ignored all these attributes in CDL/XML modes.
   Now it prints these attributes as appropriate.
   http://nco.sf.net/nco.html#hdn

I. ncwa weight and mask (-w and -m) arguments may now be full path
   names to variables nested within a group hierarchy.
   ncwa -a lev -w /g8/lev_wgt in.nc out.nc
   ncwa -a lev -m /g8/lev_msk in.nc out.nc
   http://nco.sf.net/nco.html#ncwa

J. The --cnk_byt option was introduced to allow users to manually
   specify the total desired chunksize (in Bytes). In the absence
   of this parameter, NCO sets the chunksize to the filesystem
   blocksize of the output file (if obtainable via stat()), or else
   to 4096 B, the Linux default blocksize.
   ncks -4 --cnk_byt=8192 in.nc out.nc
   Note that --cnk_dmn arguments are still in elements, not bytes.
   Should we use bytes instead of elements for all chunk arguments?
   Send us your preference to help us decide for 4.4.1.
   http://nco.sf.net/nco.html#cnk
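   The byte/element distinction is easy to trip over. A minimal
   sketch, with a hypothetical helper name, of turning a --cnk_byt
   budget into an element count for a given type size:

```python
def chunk_elements(cnk_byt, element_size):
    """Convert a total chunk size in bytes (as given to --cnk_byt)
    into a count of elements of element_size bytes each.

    Hypothetical helper for illustration only; NCO's actual layout
    logic also distributes the elements across dimensions."""
    if element_size <= 0:
        raise ValueError("element size must be positive")
    return max(1, cnk_byt // element_size)
```

   For example, an 8192 B chunk of NC_DOUBLE (8 B elements) holds
   1024 elements.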

K. New Chunking policies (xst) and maps (xst, lfp)
   These stand for "Existing" and "LeFter Product", respectively.
   The new options allow NCO to retain existing chunking sizes, and/or
   to use the lfp map (suggested by Chris Barker) in many situations.
   ncks -4 --cnk_plc=xst --cnk_map=lfp in.nc out.nc
   http://nco.sf.net/nco.html#cnk

BUG FIXES:

A. De-compressing netCDF4 files/variables by specifying deflation
   level=0 works. This fixes a bug where previously NCO could set the
   deflation level of any variable to any level except zero.
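   Deflate level 0 means "store without compression", the same
   convention the deflate format itself defines. Python's zlib (which
   wraps the same algorithm netCDF uses) illustrates the distinction:

```python
import zlib

payload = b"example data " * 64

# Level 0 stores the data uncompressed (what deflation level=0
# requests); level 9 compresses maximally.
stored = zlib.compress(payload, 0)
packed = zlib.compress(payload, 9)

assert zlib.decompress(stored) == payload  # round-trips unchanged
assert len(stored) >= len(payload)         # level 0 adds only framing
assert len(packed) < len(payload)          # repetitive data shrinks
```
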

B. Dimensions in hyperslabbing arguments are once again checked for
   validity prior to processing and invalid (i.e., non-existent)
   dimensions once-again cause operators to abort.

C. Fix one-line diagnostic bug that caused many OpenMP-enabled
   operators to die when dbg_lvl > 2.

D. Fix ncra/ncrcat bug where extra record used when superfluous input
   files provided and stride places first index of superfluous files
   beyond user-specified last index. An "important corner-case".
   Problem reported by John.
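   The corner case is easier to see with a toy model of the record
   loop. This pure-Python sketch (an illustration, not NCO's code)
   enumerates which (file, record) pairs a start/end/stride hyperslab
   should select; a superfluous trailing file correctly contributes
   nothing:

```python
def selected_records(file_sizes, start, end, stride):
    """Return the (file_index, local_record) pairs a hyperslab selects.

    file_sizes lists records per input file, in order. This models the
    ncra/ncrcat record loop for illustration only. Files whose records
    all fall after global index `end` are superfluous; with a large
    stride the next candidate index can land inside such a file, and
    the fixed behavior is to take no record from it."""
    picks, offset = [], 0
    wanted = set(range(start, end + 1, stride))
    for fl, nrec in enumerate(file_sizes):
        for rec in range(nrec):
            if offset + rec in wanted:
                picks.append((fl, rec))
        offset += nrec
    return picks
```

   With three 5-record files, start=0, end=6, stride=3, only global
   records 0, 3, and 6 are taken, so the third file is superfluous.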

E. Fix ncra/ncrcat bug where no more files were read after all desired
   records of the first record dimension were obtained (i.e., in cases
   where multiple record dimensions exist in multiple files).

F. Versions 4.3.6--4.3.9 of ncra could treat missing values
   incorrectly during double-precision arithmetic. A symptom was that
   missing values could be replaced by strange numbers like, well,
   infinity or zero. This mainly affects ncra in MRO (multi-record
   output) mode, and the symptoms should be noticeable.
   The workaround is to run the affected versions of ncra using the
   --flt switch, so that single-precision floating point numbers are
   not promoted. The solution is to upgrade to NCO 4.4.0.
   Problem reported by Andrew Friedman.
   http://nco.sf.net#bug_ncra_mss_val
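   The fix amounts to keeping fill values out of the arithmetic.
   A pure-Python sketch of the correct behavior (an illustrative
   helper, not ncra's actual code):

```python
def masked_average(records, fill_value):
    """Average values across records, excluding the fill value.

    Sketch of what ncra's missing-value logic must do: if a fill
    value were promoted to double and summed like ordinary data
    (the 4.3.6--4.3.9 bug), the result could be a huge or
    nonsensical number instead of the fill value itself."""
    valid = [v for v in records if v != fill_value]
    if not valid:
        return fill_value  # all missing: propagate the fill value
    return sum(valid) / len(valid)
```
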

G. Versions through 4.3.9 would not always copy/print groups that
   contain _only_ metadata (i.e., contain no variables). Fixed.

H. Sometimes the "coordinates" and "bounds" CF attributes caused
   incorrect matches to out-of-scope variables in hierarchical files.
   Fixed.

I. NCO correctly handles output filenames that contain whitespace.
   Previously, NCO would complain when moving the temporary to the
   final output file (the workaround was to use --no_tmp_fl).

J. ncks XML/NcML no longer creates a _FillValue attribute for unsigned
   types. It did so in NCO 4.3.7--4.3.9 because Unidata toolsUI does
   so, but apparently this is a bug not a feature so NCO no longer
   emulates it. Likewise, ncks emits a _ChunkSizes attribute when
   appropriate, not (like toolsUI) a _ChunkSize attribute.

K. Chunking options were not working as intended for some time. Fixed.

KNOWN ISSUES NOT YET FIXED:

   This section of ANNOUNCE reports and reminds users of the
   existence and severity of known, not yet fixed, problems.
   These problems occur with NCO 4.4.0 built/tested with netCDF
   4.3.1-rc7 snapshot 20131222 on top of HDF5 hdf5-1.8.9 with these
methods:

   cd ~/nco;./configure --enable-netcdf4  # Configure mechanism -or-
   cd ~/nco/bld;make dir;make allinone # Old Makefile mechanism

A. NOT YET FIXED (would require DAP protocol change?)
   Unable to retrieve contents of variables including period '.' in name
   Periods are legal characters in netCDF variable names.
   Metadata are returned successfully, data are not.
   DAP non-transparency: Works locally, fails through DAP server.

   Demonstration:
   # Fails to find variable:
   ncks -O -C -D 3 -v var_nm.dot \
        -p http://thredds-test.ucar.edu/thredds/dodsC/testdods in.nc

   20130724: Verified problem still exists.
   Stopped testing because inclusion of var_nm.dot broke all test scripts.
   NB: Hard to fix since DAP interprets '.' as the structure delimiter
   in HTTP query strings.

   Bug report filed: https://www.unidata.ucar.edu/jira/browse/NCF-47

B. NOT YET FIXED (would require DAP protocol change)
   Correctly read scalar characters over DAP.
   DAP non-transparency: Works locally, fails through DAP server.
   Problem, IMHO, is with DAP definition/protocol

   Demonstration:
   ncks -O -D 1 -H -C -m --md5_dgs -v md5_a \
        -p http://thredds-test.ucar.edu/thredds/dodsC/testdods in.nc

   20120801: Verified problem still exists
   Bug report not filed
   Cause: DAP translates scalar characters into 64-element,
   NUL-terminated strings (the dimension is user-configurable, but
   still...), so MD5 agreement fails
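   The digest mismatch is easy to reproduce with nothing but Python's
   hashlib: the locally read scalar char and a DAP-style 64-byte
   NUL-padded version of it hash differently:

```python
import hashlib

local = b"x"                       # scalar NC_CHAR as read locally
via_dap = b"x".ljust(64, b"\x00")  # DAP-style NUL-padded 64-byte string

# The payloads differ, so the MD5 digests differ, and NCO's --md5_dgs
# comparison fails even though the underlying datum is the same.
assert hashlib.md5(local).hexdigest() != hashlib.md5(via_dap).hexdigest()
```
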

C. NOT YET FIXED (NCO problem)
   Correctly read arrays of NC_STRING with embedded delimiters in
   ncatted arguments

   Demonstration:
   ncatted -D 5 -O \
           -a new_string_att,att_var,c,sng,"list","of","str,ings" \
           ~/nco/data/in_4.nc ~/foo.nc
   ncks -m -C -v att_var ~/foo.nc

   20130724: Verified problem still exists
   TODO nco1102
   Cause: NCO parsing of ncatted arguments is not sophisticated
   enough to handle arrays of NC_STRINGS with embedded delimiters.

D. NOT YET FIXED (netCDF library problem)
   Probe hidden attributes (chunking, compression) of HDF4 files

   Demonstration:
   ncdump -h -s ~/nco/data/hdf.hdf # (dies)
   ncks -m ~/nco/data/hdf.hdf # (works by avoiding fatal calls)

   20131230: Verified problem still exists

   Cause: some libnetCDF library functions fail on HDF4 file inquiries.
   Bug report filed: netCDF #HZY-708311 ncdump/netCDF4 segfaults
   probing HDF4 file
   Tracking tickets NCF-272, NCF-273

E. [FIXED in netCDF 4.3.1-rc5 ... please upgrade]
   netCDF4 library fails when renaming dimension and variable using
   that dimension, in either order. Works fine with netCDF3.
   Also, the library causes a variable rename to imply a dimension
   rename, and vice versa.
   Hence coordinate renaming does not work with netCDF4 files.
   Problem with netCDF4 library implementation.

   Demonstration:
   ncks -O -4 -v lat_T42 ~/nco/data/in.nc ~/foo.nc
   # Breaks with "NetCDF: HDF error":
   ncrename -O -D 2 -d lat_T42,lat -v lat_T42,lat ~/foo.nc ~/foo2.nc
   ncks -m ~/foo.nc

   20130724: FIXED in netCDF 4.3.1-rc5 in 201212. Will be in netCDF 4.3.1.
   Bug report filed: netCDF #YQN-334036: problem renaming dimension
   and coordinate in netCDF4 file
   Workaround: Use ncrename twice; first rename the variable, then
   rename the dimension.
   More Info: http://nco.sf.net/nco.html#ncrename_crd

"Sticky" reminders:

A. Pre-built, up-to-date Debian Sid & Ubuntu packages:
   http://nco.sf.net#debian

B. Pre-built Fedora and CentOS RPMs:
   http://nco.sf.net#rpm

C. Pre-built Windows (native) and Cygwin binaries:
   http://nco.sf.net#windows

D. Pre-built AIX binaries:
   http://nco.sf.net#aix

E. Did you try SWAMP (Script Workflow Analysis for MultiProcessing)?
   SWAMP efficiently schedules/executes NCO scripts on remote servers:

   http://swamp.googlecode.com

   SWAMP can work with command-line operator analysis scripts besides
   NCO's.
   If you must transfer lots of data from a server to your client
   before you analyze it, then SWAMP will likely speed things up.

F. NCO support for netCDF4 features is tracked at

   http://nco.sf.net/nco.html#nco4

   NCO supports netCDF4 atomic data types, compression, chunking,
   and groups.

G. Reminder that NCO works on most HDF4 and HDF5 datasets, e.g.,
   NASA AURA HIRDLS HDF-EOS5
   NASA ICESat GLAS HDF5
   NASA MERRA HDF4
   NASA MODIS HDF4
   NASA SBUV HDF5...

-- 
Charlie Zender, Earth System Sci. & Computer Sci.
University of California, Irvine 949-891-2429 )'(


