As part of the latest netcdf-java 2 library, I am working on handling
scale/offset and missing data attributes in a "standard" way. While the
netcdf manual has recommended standards, these are not always followed,
and I would like to know where the implementation rules below would fail
on existing datasets.
For example, in practice, valid_range seems to be in unpacked units
rather than packed. The manual is not that clear (to me) and I could
imagine it being used both ways.
---------------------------
public class VariableStandardized extends Variable
A "standardized" read-only Variable which implements:
1) packed data using scale_factor and add_offset
2) invalid data using valid_min, valid_max, valid_range, missing_value,
or _FillValue
if those "standard attributes" are present. If they are not present, it
acts just like the original Variable.
Implementation rules for scale/offset:
1) If scale_factor and/or add_offset variable attributes are present,
then this is a "packed" Variable.
2) the Variable element type is converted to double, unless the
scale_factor and add_offset attributes are both of type float, in
which case it is converted to float.
3) packed data is converted to unpacked data transparently during the
read() call (see the sketch after this list).
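
To make the arithmetic concrete, here is a small standalone sketch of
rules 2 and 3. The class and method names are my own for illustration,
not the actual library API:

    // A standalone sketch of the unpacking rules above, not the real
    // netcdf-java code.
    public class UnpackSketch {

      // rule 2: the result type is float only when both scale_factor
      // and add_offset are floats; otherwise it is double
      static boolean resultIsFloat(Number scaleFactor, Number addOffset) {
        return (scaleFactor instanceof Float) && (addOffset instanceof Float);
      }

      // rule 3: unpacked = packed * scale_factor + add_offset; a missing
      // attribute defaults to the identity (scale 1, offset 0)
      static double unpack(double packed, Number scaleFactor, Number addOffset) {
        double scale = (scaleFactor == null) ? 1.0 : scaleFactor.doubleValue();
        double offset = (addOffset == null) ? 0.0 : addOffset.doubleValue();
        return packed * scale + offset;
      }

      public static void main(String[] args) {
        // e.g. temperature packed as short, scale_factor=0.01, add_offset=273.15
        short packed = 1234;
        System.out.println(unpack(packed, 0.01, 273.15)); // prints ~285.49
      }
    }
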
Implementation rules for missing data:
1) if valid_range is present, the valid_min and valid_max attributes
are ignored. Otherwise, valid_min and/or valid_max are used to
construct a valid range.
2) a missing_value attribute may also specify a scalar or vector of
missing values.
3) if there is no missing_value attribute, the _FillValue attribute
can be used to specify a scalar missing value (these three rules are
sketched below).
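
Taken together, those three rules might combine like this; again the
names are hypothetical, not the library's:

    // A sketch of how the three missing-data rules combine; not the
    // actual netcdf-java implementation.
    public class MissingDataSketch {
      Double validMin, validMax;   // from valid_range, or valid_min/valid_max
      double[] missingValues = {}; // from missing_value (scalar or vector)
      Double fillValue;            // from _FillValue

      boolean isMissing(double val) {
        // rule 1: valid_range wins over valid_min/valid_max; either way
        // the result is an optional [validMin, validMax] interval
        if (validMin != null && val < validMin) return true;
        if (validMax != null && val > validMax) return true;
        // rule 2: missing_value may be a scalar or a vector of sentinels
        for (double mv : missingValues)
          if (val == mv) return true;
        // rule 3: _FillValue is consulted only when missing_value is absent
        return missingValues.length == 0 && fillValue != null && val == fillValue;
      }
    }
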
Implementation rules for missing data with scale/offset:
1) valid_range is always in the units of the converted (unpacked) data.
2) _FillValue and missing_value values are always in the units of the
raw (packed) data.
If hasMissingData() is true, then isMissingData(double val) is called
to determine whether a data value is missing. Note that the data is
converted and compared as a double (see the sketch below).
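
One consistent reading of these unit rules is that the packed sentinel
values get run through the same scale/offset conversion, so that
everything is ultimately compared in unpacked units. A sketch of that
reading, simplified to a scalar _FillValue and with names of my own
invention:

    // Sketch of the isMissingData() check under the unit rules above;
    // one possible reading, not the actual implementation.
    public class PackedMissingSketch {
      double scale = 1.0, offset = 0.0; // scale_factor, add_offset
      Double validMin, validMax;        // unpacked units (rule 1)
      Double packedFillValue;           // packed units (rule 2)

      double unpack(double packed) { return packed * scale + offset; }

      boolean isMissingData(double unpackedVal) {
        // valid_range is already in unpacked units, compare directly
        if (validMin != null && unpackedVal < validMin) return true;
        if (validMax != null && unpackedVal > validMax) return true;
        // _FillValue is in packed units: convert it with the same
        // scale/offset so both sides of the comparison are unpacked
        return packedFillValue != null
            && unpackedVal == unpack(packedFillValue);
      }
    }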