On Wed, 23 Jul 1997, Jonathan Gregory wrote:

> Section 6: Variable names
>
> > I think there should be a recommendation that names consist of whole words
> > unless there is some strong reason to do otherwise. So 'latitude' would be
> > preferred to 'lat'. Note that such full-word variable names often obviate
> > the need for a 'long_name' attribute. [HD]
>
> I would be happy with such a recommendation. I do not think it would reduce
> the need for long_name, though. The long_name might really be quite detailed,
> for instance "volumetric soil moisture content at wilting point" (or this
> might be the quantity in GDT).

But surely the variable name 'latitude' is adequate!

> > I suppose the ability to define 0-dimensional variables could come in handy,
> > though such a quantity is probably more appropriately stored as a global
> > attribute. [JS]

There seems to be some confusion about what is meant by 0-dimensional. I would
assume it means rank=0. In other words an ordinary scalar value (i.e. no
dimensions). JS seems to mean an array with a dimension of size 0.

> > I wish to propose allowing missing (invalid) values in coordinate variables.
> > All corresponding data in the main variable would also have to be missing.
> > In particular this would simplify the problem of calendar dimensions which
> > GDT discuss. You could simply allocate 31 days to every month and set data
> > for illegal dates (e.g. 30 Feb) to a missing value. [HD]
>
> I am not happy about this idea, myself. To me it would imply that the data
> existed in principle, but was simply unavailable. See also Section 24.

I would argue strongly for a much broader concept of 'missing' or 'invalid'.
I see no reason why some of the missing values specified in the missing_value
vector should not mean things like 'meaningless' and 'undefined'. This is very
similar to having missing values in the ocean for land-only variables like
soil-moisture. How else can such values be represented?
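
To make the padded-calendar proposal quoted above concrete, here is a minimal
Python sketch. It is only an illustration under stated assumptions, not part
of any convention: the names MISSING, padded_day_axis and days_between are
hypothetical, NaN is chosen arbitrarily as the missing value, and month
lengths come from Python's standard calendar module (Gregorian rules).

```python
import calendar
import math

MISSING = math.nan  # hypothetical missing-value marker (assumption)


def padded_day_axis(year, days_per_month=31):
    """Return 12 * days_per_month day-of-year values for `year`.

    Slots for dates that do not exist (e.g. 30 Feb) hold MISSING.
    """
    axis = []
    day_of_year = 0
    for month in range(1, 13):
        # monthrange returns (weekday of first day, number of days in month)
        real_days = calendar.monthrange(year, month)[1]
        for day in range(1, days_per_month + 1):
            if day <= real_days:
                day_of_year += 1
                axis.append(float(day_of_year))
            else:
                axis.append(MISSING)
    return axis


def days_between(axis, start, stop):
    """Count the real (non-missing) days in axis[start:stop]."""
    return sum(1 for value in axis[start:stop] if not math.isnan(value))


axis = padded_day_axis(1997)
print(len(axis))                         # 372 slots (12 * 31)
print(days_between(axis, 0, len(axis)))  # 365 real days in 1997
print(days_between(axis, 0, 62))         # 59 days from 1 Jan up to 1 Mar
```

The interval between two dates is then just a count of non-missing slots, so
no calendar-specific arithmetic is needed once the axis has been written.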

I would also argue strongly for the above proposal to allow missing (invalid)
values in coordinate variables. I feel it is a neat solution to the date
problem and is likely to be useful in other contexts.

> Section 11: Units
>
> > I would like to see "none" added as a legitimate characterization, as it
> > would serve as a definite affirmation that the variable really does have no
> > units. [JS]
>
> Good idea. Perhaps "one" or "unity" would be acceptable, since this could
> perhaps be inserted comfortably into the udunits "constants" section?

You ignored my comment that the required functionality is already provided by
udunits, which allows units=" " for this purpose. If you do not like using a
blank, then udunits also allows units="1".

> I'm afraid I do not understand Harvey Davies's "measurement level" proposal.

Measurement level (measurement scale) describes the valid operations on a
variable and thus determines what statistics are valid. The four levels are:

1. NOMINAL: The only valid operation is '='. A measure of location is the
   MODE (most frequent value).
2. ORDINAL: Comparisons are possible using operations '<' and '>'.
   Non-parametric statistics can be used. The usual measure of location is
   the MEDIAN (value with 50% of cases above & 50% below).
3. INTERVAL: Addition and subtraction are allowed. So the ordinary
   ARITHMETIC-MEAN can be calculated and most standard statistical techniques
   can be used.
4. RATIO: Multiplication and division are allowed. So the GEOMETRIC-MEAN can
   be calculated. Most physical and chemical measurements are at this level.

Here are some meteorological examples:

1. NOMINAL: Cloud Type (e.g. 1=cirrus, 2=nimbus, etc.)
2. ORDINAL: Beaufort Wind Scale (from 0=calm to 12=Hurricane).
3. INTERVAL: Temperature in Celsius.
4. RATIO: Temperature in Kelvin. It makes sense to say that 200K is twice
   the temperature of 100K.

I am trying to think of a better example of an INTERVAL variable.
The above temperature example is confusing in that it is the unit which makes
it INTERVAL, not the nature of the variable itself. Perhaps a better example
would be altitude measured relative to an arbitrary datum whose absolute
altitude (height above standard sea-level) is unknown.

> Section 24: Time axes
>
> > If the unit is a day then there should be a fixed number (31 for 'normal'
> > calendars such as Gregorian) days in each month. The time coordinate
> > variable should have a missing value for each day which does not exist in the
> > calendar used. I think this obviates the need for the 'calendar' global
> > attribute and allows for most kinds of calendars without having to hard-code
> > them into a standard. [HD]
>
> This would deal with the particular case of calculating the interval between
> two dates when a time axis at daily intervals is provided. I am not sure that
> counting the non-missing days between two points in a vector would be more
> convenient than working it out using a calendar-dependent algorithm, although
> it would be more general, I agree. However, it would not help if you did not
> wish to provide time coordinates at daily intervals. What if I have time
> coordinates at monthly intervals? To indicate the lengths of the months, would
> I have to pad out the coordinate vector, and presumably the data too, with
> missing data values at daily intervals i.e. approximately 30 times more missing
> data than genuine data? Not only would wasted space be added to the file, but
> it could easily be misunderstood, no matter how explicit the convention is
> made.

I suggest storing monthly data as follows:

    dimensions:
        month = 120;
    variables:
        length(month);
            length:units="days";
        temperature(month);
    data:
        length = 31, 28, 31, 30, 31, ...

> Certainly, monthly and yearly mean data are among the most important types of
> climate data, so it is crucial to keep the representation of such data as
> simple and natural as possible, while representing them.
> But I think it is good to avoid units of months and years. Although the
> udunits unit of "months" has a precise meaning (30.4368 days), this is
> probably not what you intend, and could lead applications to make mistakes
> if they do not check carefully what the intention is.

I do not see what the problem is with 'year' and 'month' in this context. All
that matters is that there are 12 months in a year, a fact with which udunits
agrees!

> Section 32: Missing values in a data variable
>
> > I think that the data should be checked against the "missing_value" *before*
> > unpacking. [JS]
>
> Yes, you may well be correct. Thanks.

Of course you can use missing_value however you like in SPECIFIC applications.
But the netCDF User's Guide now states that GENERIC applications should use
the valid range (as defined by valid_range or valid_min/max), not
missing_value. (I confess that you have me to blame for this change. You may
want to throw something in the direction of Australia, so I am donning my
helmet as follows: [(:-) )

Harvey Davies, CSIRO Mathematical and Information Sciences,
723 Swanston Street, Carlton, Victoria 3053, Australia
Email: harvey.davies@cmis.csiro.au
Phone: +61 3 9282 2623 or +61 3 9239 4556
Fax: +61 3 9282 2600