[netcdfgroup] clarification on _FillValue, missing_value, valid_xxx

To: NetCDF Group List <netcdfgroup@xxxxxxxxxxxxxxxx>
Subject: [netcdfgroup] clarification on _FillValue, missing_value, valid_xxx
From: Mary Haley <haley@xxxxxxxx>
Date: Thu, 25 Jul 2013 13:54:14 -0600

Hi guys,

Great class yesterday. I hope to fill out the survey soon.

To follow up on my misunderstanding of _FillValue versus missing_value versus
valid_xxxxx: do you think that software packages should be upgraded to handle
all these possible attributes as you describe on this page:

http://www.unidata.ucar.edu/software/netcdf/docs/netcdf/Attribute-Conventions.html

It wasn't clear to me, for example, whether netcdf4-python handles all cases of
these attributes.

I looked for the latest CF document and found this:

http://cf-pcmdi.llnl.gov/documents/cf-conventions/1.6/cf-conventions.html

If you do a browser search for "missing_value", it has some text in section
2.5.1 that looks like it was being edited, but never resolved. It says, in
some now crossed-out text, that missing_value is to be deprecated:

------------------------------------------------------------------------------------------------------------------------------------------------

http://cf-pcmdi.llnl.gov/documents/cf-conventions/1.6/cf-conventions.html#missing-data

2.5.1. Missing Data

The NUG conventions (NUG section 8.1) provide the _FillValue,
missing_value, valid_min, valid_max, and valid_range attributes to indicate
missing data.

The NUG conventions for missing data changed significantly between version 2.3
and version 2.4. Since version 2.4 the NUG defines missing data as all values
outside of the valid_range, and specifies how the valid_range should be defined
from the _FillValue (which has library specified default values) if it hasn't
been explicitly specified. If only one missing value is needed for a variable
then we recommend strongly that this value be specified using the _FillValue
attribute. Doing this guarantees that the missing value will be recognized by
generic applications that follow either the before or after version 2.4
conventions.

The scalar attribute with the name _FillValue and of the same type as its
variable is recognized by the netCDF library as the value used to pre-fill disk
space allocated to the variable. This value is considered to be a special value
that indicates undefined or missing data, and is returned when reading values
that were not written. The _FillValue should be outside the range specified by
valid_range (if used) for a variable. The netCDF library defines a default fill
value for each data type (NUG section 7.16).

The missing_value attribute is considered deprecated by the NUG and we do not
recommend its use. However for backwards compatibility with COARDS this
standard continues to recognize the use of the missing_value attribute to
indicate undefined or missing data.

The missing values of a variable with scale_factor and/or add_offset attributes
(see Section 8.1, “Packed Data”) are interpreted relative to the variable's
external values, i.e., the values stored in the netCDF file (a.k.a. the packed
values or the raw values), not the values that result after the scale and
offset are applied.
Applications that process variables that have attributes to indicate both a
transformation (via a scale and/or offset) and missing values should first
check that a data value is valid, and then apply the transformation. Note that
values that are identified as missing should not be transformed. Since the
missing value is outside the valid range it is possible that applying a
transformation to it could result in an invalid operation. For example, the
default _FillValue is very close to the maximum representable value of IEEE
single precision floats, and multiplying it by 100 produces an "Infinity"
(using single precision arithmetic).

------------------------------------------------------------------------------------------------------------------------------------------------
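
As a rough illustration of the ordering the CF text above asks for (check the
raw value first, then apply scale_factor and add_offset), here is a minimal
Python sketch. The function name unpack and the sample data are made up for
illustration. NC_FILL_FLOAT and the short fill value -32767 are, as far as I
know, the library defaults from netcdf.h. Plain Python floats are doubles, so
the single-precision overflow is shown by comparing against the largest finite
float32 rather than by doing real float32 arithmetic.

```python
# Largest finite IEEE single-precision value: (2 - 2**-23) * 2**127.
FLT_MAX = (2.0 - 2.0 ** -23) * 2.0 ** 127

# Default fill value for NC_FLOAT (NC_FILL_FLOAT in netcdf.h).
NC_FILL_FLOAT = 9.9692099683868690e+36


def unpack(raw_values, fill_value, scale_factor=1.0, add_offset=0.0):
    """Unpack raw (stored) values, leaving missing entries as None.

    The fill value is compared against the raw value first; only values
    that pass that check are scaled and offset.
    """
    unpacked = []
    for raw in raw_values:
        if raw == fill_value:
            unpacked.append(None)  # missing: never transformed
        else:
            unpacked.append(raw * scale_factor + add_offset)
    return unpacked


# Packed 16-bit integers using -32767 as the fill value.
print(unpack([100, -32767, 250], fill_value=-32767,
             scale_factor=0.5, add_offset=10.0))
# -> [60.0, None, 135.0]

# Scaling the default float fill value by 100 would exceed the
# single-precision range, which is why the check has to come first.
print(NC_FILL_FLOAT * 100 > FLT_MAX)
# -> True
```

Only a single fill value is checked here; a fuller version would also apply
valid_range and missing_value to the raw values before unpacking.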

In the COARDS document:

http://ferret.wrc.noaa.gov/noaa_coop/coop_cdf_profile.html

it states:

------------------------------------------------------------------------------------------------------------------------------------------------

• _FillValue - If a scalar attribute with this name is defined for a variable
and is of the same type as the variable, it will be subsequently used as the
fill value for that variable. The purpose of this attribute is to save the
applications programmer the work of prefilling the data and also to eliminate
the duplicate writes that result from netCDF filling in missing data with its
default fill value, only to be immediately overwritten by the programmer's
preferred value. This value is considered to be a special value that indicates
missing data, and is returned when reading values that were not written. The
missing value should be outside the range specified by valid_range (if used)
for a variable. It is not necessary to define your own _FillValue attribute for
a variable if the default fill value for the type of the variable is adequate.

• missing_value - missing_value is a conventional name for a missing value that
will not be treated in any special way by the library, as the _FillValue
attribute is. It is also useful when it is necessary to distinguish between two
kinds of missing values. The netCDF data type of the missing_value attribute
should match the netCDF data type of the data variable that it describes. In
cases where the data variable is packed via the scale_value attribute this
implies that the missing_value flag is likewise packed. The same holds for the
_FillValue attribute. The NOAA cooperative standard does not endorse any
particular interpretation of the distinction between missing_value and
_FillValue.

------------------------------------------------------------------------------------------------------------------------------------------------

Anyway, I just want to make sure that NCL is doing the proper thing with regard
to missing_value, especially since I didn't realize that it could be an array
of values, and not just a scalar.
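
To make the scalar-versus-array question concrete, here is a minimal Python
sketch of the kind of check I have in mind. It is not how NCL or netcdf4-python
actually does it. The function name is_missing and the plain dict of attributes
are assumptions for illustration, and the default valid range that the NUG
derives from _FillValue when no valid_xxx attribute is present is left out to
keep it short.

```python
def is_missing(raw, attrs):
    """Return True if a raw (stored) value should be treated as missing.

    `attrs` is a plain dict of variable attributes; every key is optional.
    """
    # _FillValue is always a scalar.
    if "_FillValue" in attrs and raw == attrs["_FillValue"]:
        return True

    # missing_value may be a scalar or a vector of values.
    if "missing_value" in attrs:
        mv = attrs["missing_value"]
        candidates = mv if isinstance(mv, (list, tuple)) else [mv]
        if raw in candidates:
            return True

    # valid_range is a two-element vector; valid_min/valid_max are scalars.
    # If both forms are present, valid_min/valid_max win in this sketch.
    lo, hi = attrs.get("valid_range", (None, None))
    lo = attrs.get("valid_min", lo)
    hi = attrs.get("valid_max", hi)
    if lo is not None and raw < lo:
        return True
    if hi is not None and raw > hi:
        return True
    return False


attrs = {
    "_FillValue": -999,
    "missing_value": [-998, -997],  # array form
    "valid_range": (0, 500),
}
print([is_missing(v, attrs) for v in (-999, -997, 42, 501)])
# -> [True, True, False, True]
```

All comparisons are against the stored values, so for packed data this check
would run before scale_factor and add_offset are applied.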

Thanks,

--Mary
