> Date: Mon, 23 Apr 2001 22:03:30 -0600 (MDT)
> From: Don Hooper <hoop@xxxxxxxxxxxx>
> Subject: Re: Wanted: examples of uses of missing data AND scale/offset
>
> John Caron,
>
> Well, the following data sets at ftp://ftp.cdc.noaa.gov/Datasets/:
> coads cpc_us_precip interp_OLR kaplan_sst msu ncep ncep.pac.ocean
> ncep.reanalysis noaa_hrc nodc.woa94 recon_reynolds_sst reynolds_sst
> udel.airt.precip
> seem to fit the bill that you describe, although it is definitely
> for others to decide if they constitute "important holdings". They're
> important to CDC, at least. ;) You should be able to find them via http
> protocol, starting at:
> http://www.cdc.noaa.gov/PublicData/
> They are also available via DODS, using URLs that start with:
> http://www.cdc.noaa.gov/cgi-bin/nph-nc/Datasets/
> Now, when we got started with our conventions, we felt that the
> User's Guide's discussion of conventions was indeed ambiguous regarding
> the type of the valid_range attribute, merely saying it needed to match
> that of its data variable, without mentioning the complication that such
> a variable would have two types (external/internal, AKA packed/unpacked).
> We made the same call that was described earlier in the discussion, namely
> that it should be more of a human readable concept, as our software blithely
> used the missing_value attribute (in the external/packed type) to discern
> the validity of a value. I recall hearing a couple of complaints about
> our typing of the valid_range attribute. I doubt it prevented anyone from
> making good use of our data, however. ;) I hope this is of some use in
> the on-going discussion.
>
We did use some of this data. We had to change the valid_range so our
software could read it without treating everything as missing.
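
A minimal sketch of one way such a mismatch can arise, assuming the reader
compares packed (external) values against a valid_range written in unpacked
(internal) units. The scale_factor, add_offset, range, and data values below
are hypothetical and are not taken from any CDC file:

```python
def unpack(packed, scale_factor, add_offset):
    """Standard scale/offset unpacking: internal = external * scale + offset."""
    return packed * scale_factor + add_offset


def in_range(value, valid_range):
    """True if value lies inside the inclusive (lo, hi) range."""
    lo, hi = valid_range
    return lo <= value <= hi


# Hypothetical attributes for a packed temperature variable (kelvin).
scale_factor, add_offset = 0.01, 273.15
valid_range_unpacked = (180.0, 330.0)  # written in unpacked units
packed_values = [-5000, 0, 2500]       # raw short integers as stored

# Comparing raw packed values with an unpacked-unit range rejects them all.
naive = [in_range(p, valid_range_unpacked) for p in packed_values]

# Unpacking first (or rewriting valid_range in packed units) accepts them.
fixed = [in_range(unpack(p, scale_factor, add_offset), valid_range_unpacked)
         for p in packed_values]
```

Either convention works as long as the range and the values being tested are
in the same units; the ambiguity is only about which one a file uses.
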
> Date: Tue, 24 Apr 2001 15:41:21 -0600 (MDT)
> From: Brian Eaton <eaton@xxxxxxxxxxxxx>
> Subject: Re: ncdigest V1 #588 (standard handling of scale/offset and missing data)
>
> > > There are two cases.
> > > _FillValue is a valid value if it is within a valid range defined by
> > > valid_range, valid_min, valid_max.
> > > If none of these three attributes are defined then it is assumed that
> > > _FillValue is a missing value &
> > > should be used to define one end of the valid range.
> > >
> > > So if you want to use _FillValue to initialize a variable to a valid
> > > value 0, then you must define at least one of valid_range, valid_min,
> > > valid_max.
> >
> > this seems reasonable to me. Does anyone have datasets where this
> > algorithm would be wrong?
>
> A dataset that was written with a _FillValue within a specified valid range
> would not have that _FillValue treated as the valid default data value by
> many existing applications, i.e., such a dataset would break most existing
> applications. It's confusing to have _FillValue take on opposite meanings
> depending on whether or not a valid range is defined. I'd prefer that your
> interface didn't support this interpretation, especially since the UG
> recommends against it.
>
I do not understand how this would break most existing applications. For
example, say we have unscaled float data with:
valid_range = -1000.0, 1000.0
_FillValue = 0.0
Surely 0.0 has to be treated as a valid value, doesn't it? Otherwise there
would be two valid ranges separated by a gap around 0.0, which would require
four comparisons!
But I can see there might be some doubt if one had
valid_max = 1000.0
_FillValue = 0.0
In this case it is not clear whether the valid min should be -infinity or a
number slightly larger than 0.0. The U.G. implies it should be -infinity.
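
The two cases quoted above can be sketched as follows. This is only an
illustration of the proposed rule, not any library's implementation; the
function names are hypothetical, and an absent valid_min or valid_max is
taken to be -infinity or +infinity, following the reading that the U.G.
implies. When no range attribute is present, the sketch only treats values
equal to _FillValue as missing and leaves out the further rule that
_FillValue bounds one end of the valid range:

```python
import math


def valid_bounds(valid_range=None, valid_min=None, valid_max=None):
    """Return (lo, hi) from whichever range attributes exist, else None."""
    if valid_range is None and valid_min is None and valid_max is None:
        return None
    if valid_range is not None:
        return valid_range[0], valid_range[1]
    # Assumption: a bound that is not given is unbounded on that side.
    lo = -math.inf if valid_min is None else valid_min
    hi = math.inf if valid_max is None else valid_max
    return lo, hi


def is_missing(value, fill_value=None, **range_attrs):
    """Apply the two-case rule to a single unscaled value."""
    bounds = valid_bounds(**range_attrs)
    if bounds is None:
        # Case 2: no range attributes, so _FillValue marks missing data.
        return fill_value is not None and value == fill_value
    # Case 1: a range is defined, so only the range decides validity,
    # and a _FillValue inside it counts as a valid value.
    lo, hi = bounds
    return not (lo <= value <= hi)
```

With valid_range = -1000.0, 1000.0 and _FillValue = 0.0 this needs just two
comparisons per value, and is_missing(0.0, fill_value=0.0,
valid_range=(-1000.0, 1000.0)) is False, whereas
is_missing(0.0, fill_value=0.0) with no range attribute is True.
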