RE: ncdigest V1 #589 (standard handling of scale/offset and missi

To: "'netcdfgroup@xxxxxxxxxxxxxxxx'" <netcdfgroup@xxxxxxxxxxxxxxxx>
Subject: RE: ncdigest V1 #589 (standard handling of scale/offset and missi
From: "Davies, Harvey" <harvey.davies@xxxxxxxxxxxx>
Date: Thu, 26 Apr 2001 10:40:26 +1000

> Date: Mon, 23 Apr 2001 22:03:30 -0600 (MDT)
> From: Don Hooper <hoop@xxxxxxxxxxxx>
> Subject: Re: Wanted: examples of uses of missing data AND scale/offset
> 
> John Caron,
> 
> Well, the following data sets at ftp://ftp.cdc.noaa.gov/Datasets/:
> coads cpc_us_precip interp_OLR kaplan_sst msu ncep ncep.pac.ocean
> ncep.reanalysis noaa_hrc nodc.woa94 recon_reynolds_sst reynolds_sst
> udel.airt.precip
> seem to fit the bill that you describe, although it is definitely
> for others to decide if they constitute "important holdings".  They're
> important to CDC, at least.  ;)  You should be able to find them via http
> protocol, starting at:
>       http://www.cdc.noaa.gov/PublicData/
> They are also available via DODS, using URLs that start with:
>       http://www.cdc.noaa.gov/cgi-bin/nph-nc/Datasets/
> Now, when we got started with our conventions, we felt that the
> User's Guide's discussion of conventions was indeed ambiguous regarding
> the type of the valid_range attribute, merely saying it needed to match
> that of its data variable, without mentioning the complication that such
> a variable would have two types (external/internal, AKA packed/unpacked).
> We made the same call that was described earlier in the discussion, namely
> that it should be more of a human-readable concept, as our software
> blithely used the missing_value attribute (in the external/packed type)
> to discern the validity of a value.  I recall hearing a couple of
> complaints about
> our typing of the valid_range attribute.  I doubt it prevented anyone from
> making good use of our data, however.  ;)  I hope this is of some use in
> the on-going discussion.
> 
We did use some of this data.  We had to change the valid_range so our
software could read it without treating everything as missing.
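
A minimal sketch of the kind of failure we hit, assuming short data packed
with a hypothetical scale_factor of 0.01 and add_offset of 0.0, and a
valid_range written in unpacked units.  The attribute values and helper
names below are invented for illustration and are not taken from the CDC
files.  Software that compares packed values against such a range rejects
nearly everything:

```python
# Hypothetical attributes, for illustration only (not from the CDC files).
SCALE_FACTOR = 0.01
ADD_OFFSET = 0.0
VALID_RANGE_UNPACKED = (-50.0, 50.0)   # range stored in unpacked units


def pack(unpacked):
    """Convert an unpacked value to its packed (external) integer form."""
    return round((unpacked - ADD_OFFSET) / SCALE_FACTOR)


def in_range(value, valid_range):
    """Closed-interval test: two comparisons."""
    lo, hi = valid_range
    return lo <= value <= hi


# Packed values as stored on disk; they unpack to 23.15, -12.5 and 49.99.
packed_values = [2315, -1250, 4999]

# Comparing packed values against a range given in unpacked units
# flags every one of them as out of range.
naive = [in_range(p, VALID_RANGE_UNPACKED) for p in packed_values]

# Expressing the range in packed units first gives the intended result.
valid_range_packed = tuple(pack(v) for v in VALID_RANGE_UNPACKED)
fixed = [in_range(p, valid_range_packed) for p in packed_values]
```

Here `naive` is all False and `fixed` is all True, which is why changing
the valid_range was enough to make the data readable.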

> Date: Tue, 24 Apr 2001 15:41:21 -0600 (MDT)
> From: Brian Eaton <eaton@xxxxxxxxxxxxx>
> Subject: Re: ncdigest V1 #588 (standard handling of scale/offset and missing data)
> 
> > > There are two cases.
> > > _FillValue is a valid value if it is within a valid range defined by
> > > valid_range, valid_min, valid_max.
> > > If none of these three attributes are defined then it is assumed that
> > > _FillValue is a missing value and should be used to define one end of
> > > the valid range.
> > >
> > > So if you want to use _FillValue to initialize a variable to a valid
> > > value 0, then you must define at least one of valid_range, valid_min,
> > > valid_max.
> >
> > this seems reasonable to me. Does anyone have datasets where this
> > algorithm would be wrong?
> 
> A dataset that was written with a _FillValue within a specified valid
> range would not have that _FillValue treated as the valid default data
> value by many existing applications, i.e., such a dataset would break most
> existing applications.  It's confusing to have _FillValue take on opposite
> meanings depending on whether or not a valid range is defined.  I'd prefer
> that your interface didn't support this interpretation, especially since
> the UG recommends against it.
> 
I do not understand how this would break most existing applications.  For
example, say we have unscaled float data with:
valid_range = -1000.0,  1000.0
_FillValue = 0.0
Surely 0.0 has to be treated as a valid value, doesn't it?  Otherwise there
would be two valid ranges separated by a gap around 0.0, which would require
four comparisons!
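
To make the comparison count concrete, here is a rough sketch of the two
readings for the example above.  The function names are hypothetical and
the second function only illustrates the alternative being argued against;
neither is taken from any existing netCDF library.

```python
def is_valid(value, valid_range):
    """_FillValue inside valid_range is ordinary data: two comparisons."""
    lo, hi = valid_range
    return lo <= value <= hi


def is_valid_excluding_fill(value, valid_range, fill_value):
    """Alternative reading: the fill value is also missing, which splits
    the valid range into two pieces and needs four comparisons."""
    lo, hi = valid_range
    return (lo <= value < fill_value) or (fill_value < value <= hi)


valid_range = (-1000.0, 1000.0)
fill_value = 0.0

zero_ok = is_valid(0.0, valid_range)                               # True
zero_ok_alt = is_valid_excluding_fill(0.0, valid_range, fill_value)  # False
```

Under the first reading a variable initialized to 0.0 by _FillValue simply
holds valid zeros; under the second, every such zero is reported as missing.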

But I can see there might be some doubt if one had
valid_max = 1000.0
_FillValue = 0.0
In this case it is not clear whether the valid min should be -infinity or a
number slightly larger than 0.0. The U.G. implies it should be -infinity.
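
The two candidate readings can be sketched as follows.  This is only an
illustration under stated assumptions: the function names are hypothetical,
Python floats are doubles rather than netCDF floats, and math.nextafter
(Python 3.9 or later) stands in for "a number slightly larger than 0.0".

```python
import math


def valid_min_candidates(fill_value):
    """Return the two possible valid minima when only valid_max is given."""
    unbounded = -math.inf                                   # no lower bound
    just_above_fill = math.nextafter(fill_value, math.inf)  # excludes fill
    return unbounded, just_above_fill


def is_valid(value, valid_min, valid_max):
    return valid_min <= value <= valid_max


valid_max = 1000.0
fill_value = 0.0
unbounded, just_above_fill = valid_min_candidates(fill_value)

# A negative value is valid only under the -infinity reading.
neg_unbounded = is_valid(-5.0, unbounded, valid_max)
neg_bounded = is_valid(-5.0, just_above_fill, valid_max)

# The fill value itself is likewise valid only under the -infinity reading.
fill_unbounded = is_valid(fill_value, unbounded, valid_max)
fill_bounded = is_valid(fill_value, just_above_fill, valid_max)
```

The choice matters for any data set with negative values: the second
reading would silently discard all of them.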
