Re: standard handling of scale/offset and missing data



Brian Eaton wrote:

Hi John,

The treatment of the valid_range, _FillValue and missing_value attributes
differs substantially between v-2.3 and v-3 of the User's Guide.

In v-2.3 there is no definition of "valid."  It appears that the treatment
of invalid data (i.e., values outside the valid range) is left to domain
specific applications.  In v-3 invalid data is defined to be missing.  This
allows generic applications to treat invalid data in the same way that the
_FillValue is treated.

In v-2.3 the _FillValue is not connected to the definition of valid_range
except that it is recommended that the _FillValue should be outside the
valid_range.  In v-3 if a valid range is not defined then _FillValue is
used to define one.
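
To make the v-3 behaviour concrete, here is a minimal sketch of how a generic reader might derive a valid range from _FillValue and flag missing data. It assumes the sign convention as I recall it from the v-3 guide (a positive _FillValue bounds the range from above, any other value from below) and a margin of 1, which only suits integer types. The function names are hypothetical and not part of any netCDF library.

```python
def derive_valid_range(fill_value=None, valid_min=None, valid_max=None):
    """Return (lo, hi) bounds; None means unbounded on that side."""
    # An explicit valid range always takes precedence over _FillValue.
    if valid_min is not None or valid_max is not None:
        return valid_min, valid_max
    if fill_value is None:
        return None, None
    # Assumed convention: a positive fill value sits just above the valid
    # maximum, any other fill value sits just below the valid minimum.
    # The margin of 1 is only appropriate for integer data.
    if fill_value > 0:
        return None, fill_value - 1
    return fill_value + 1, None


def is_missing(value, fill_value=None, valid_min=None, valid_max=None):
    """Treat the fill value and anything outside the valid range as missing."""
    if fill_value is not None and value == fill_value:
        return True
    lo, hi = derive_valid_range(fill_value, valid_min, valid_max)
    if lo is not None and value < lo:
        return True
    if hi is not None and value > hi:
        return True
    return False
```

With this sketch, a variable carrying only _FillValue = -999 gets a valid minimum of -998, so -999 and anything below it are reported as missing. Note that missing_value plays no part here, matching the v-3 wording quoted below.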

In v-2.3 missing values were specified by both the missing_value and
_FillValue attributes.  The only distinction was that _FillValue was used
by the library for pre-filling.  In v-3 values specified by missing_value
are "not treated in any special way by the library or conforming generic
applications".  This is a rather subtle way of saying the use of
missing_value is deprecated.  I didn't pick up on this until Harvey
clarified that intention in his previous email.

My experience indicates that most generic applications implement the v-2.3
specification rather than v-3.  But I think that v-3 is the more precise
specification that removes ambiguity and redundancy from the v-2.3 spec.  I
would recommend that your java interface implement the v-3 spec.

One other comment:


Implementation rules for scale/offset:
   :
   2) the Variable element type is converted to double, unless the
scale_factor and add_offset variable attributes are both type float, in
which case it is converted to float.


The type of scale_factor and add_offset should determine the unpacked
type.  What if I want to unpack bytes into ints?

Yes, that would be a more general way to do it. I was just fishing to see if anyone objected to the easier way :^}
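
The two candidate rules can be compared side by side in a short sketch. The type names are plain strings standing in for the netCDF external types, and the tie-break for attributes of differing types (take the wider one) is an assumption that neither message specifies. All function names are hypothetical.

```python
# netCDF classic numeric types, ordered narrowest to widest.
# This ordering is an assumption made for the sketch.
WIDTH = ["byte", "short", "int", "float", "double"]


def unpacked_type_rule2(scale_type, offset_type):
    """Rule 2 as proposed: float only if both attributes are float, else double."""
    if scale_type == "float" and offset_type == "float":
        return "float"
    return "double"


def unpacked_type_general(scale_type, offset_type):
    """More general rule: the attribute type determines the unpacked type.

    If the two attributes differ in type, the wider one is used (assumption).
    """
    return max(scale_type, offset_type, key=WIDTH.index)


def unpack(packed, scale_factor=1, add_offset=0):
    """Apply the usual convention: unpacked = packed * scale_factor + add_offset."""
    return [p * scale_factor + add_offset for p in packed]
```

For bytes packed with int-typed attributes, `unpacked_type_rule2("int", "int")` gives "double" while `unpacked_type_general("int", "int")` gives "int", which is the case raised above.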

One question to you as a writer of NetCDF files: would any of these rules fail when reading your files with VariableStandardized? Just to get personal, GDV (the java version of vmd) will use VariableStandardized, and I'd like it to handle as wide a range of datasets as possible.



