John and others interested in extending netCDF coordinate conventions,
> I want to add a few comments to the latest proposals about coordinate
> systems.
>
> In general, I agree with Walker and Waring's post
>
> http://www.unidata.ucar.edu/packages/netcdf/coords/0058.html
>
> In particular their comment about multidimensional coordinate variables
> (as proposed by me and others) that:
>
> "From a purely mathematical (and esthetic) point of view, we also find
> the implied statement that d1, for example, depends on things other than
> d1, is confusing and illogical. There is a real temptation here to
> confuse the role of data dimensions and coordinates."
I'll yield to the temptation, because I think it's useful to extend the
notion of a coordinate to serve as "the value associated with a data
dimension index" in a way that preserves a crucial property of
coordinates: a coordinate value uniquely determines the index of its
associated dimension. I think it would be desirable to extend "value"
in the above definition to include text strings or tuples with a fixed
number of components.
For example, a "time" dimension might have three associated coordinate
variables:
    dimensions:
        time = UNLIMITED;
        ...
    variables:
        int year(time);
        int day_of_year(time);
        float second_of_day(time);
        ... [many other variables that use the time dimension]
where a (year, day_of_year, second_of_day) tuple uniquely determines the
time dimension index. In this case, there is no single coordinate
variable corresponding to the time dimension, but instead, a 3-tuple
that serves as the time coordinate. This relation could be represented
using any one of several proposed conventions:
1. A scalar "time" variable with a "coordinates" attribute:

    variables:
        ...
        int time; // a scalar on which to hang attributes for the time dimension
            time:coordinates = "year day_of_year second_of_day";

2. A global "dimension attribute":

    :time = "year day_of_year second_of_day";

3. A multidimensional time coordinate variable, but now all its values
   have to be of the same type, e.g. float:

    variables:
        ...
        float time(time,3);
            time:components = "year day_of_year second_of_day";
            // no good way to include units for components
I know the above example doesn't follow any approved conventions for
handling time, and I don't want to start another discussion of how to
handle time. It's just an example to illustrate what might be possible
with a more general notion of coordinates.
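
To make the uniqueness property concrete, here is a minimal Python sketch
(not part of any netCDF library; the function name `build_time_index` and
the sample values are made up for illustration). It treats the three
component variables as parallel arrays and builds a lookup from a
(year, day_of_year, second_of_day) tuple to the time dimension index. It
rejects duplicate tuples, since a duplicate would mean the tuple no longer
determines the index.

```python
def build_time_index(year, day_of_year, second_of_day):
    """Map each (year, day_of_year, second_of_day) tuple to its time index.

    Raises ValueError if the arrays differ in length or if a tuple repeats,
    because a repeated tuple could not uniquely determine an index.
    """
    if not (len(year) == len(day_of_year) == len(second_of_day)):
        raise ValueError("component variables must share the time dimension")
    index = {}
    for i, key in enumerate(zip(year, day_of_year, second_of_day)):
        if key in index:
            raise ValueError(f"tuple {key} occurs at indices {index[key]} and {i}")
        index[key] = i
    return index


# Hypothetical sample values for the three component variables.
year = [1997, 1997, 1997, 1998]
day_of_year = [364, 365, 365, 1]
second_of_day = [43200.0, 0.0, 43200.0, 0.0]

time_index = build_time_index(year, day_of_year, second_of_day)
print(time_index[(1997, 365, 43200.0)])  # index along the time dimension
```

The same check would work for text-string coordinates, since any hashable
value can serve as the dictionary key.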
> I have come to the same conclusion by a different route, namely by
> considering coordinate systems in a formal and general way. And so I
> currently am leaning towards leaving coordinate vars 1-dimensional, and
> explicitly specifying coordinate systems in a named attribute.
A named attribute is fine with me; my suggestion for global dimension
attributes with the same names as dimensions has won no other
supporters, so I'm willing to withdraw it.
> I plan to post soon some formal definitions that I hope will be useful,
> also various examples that would be useful for a convention to cover,
> and that I hope others will contribute to.
>
> One part of their example I disagree with is lumping all of the
> coordinate functions together:
> double salt(n,k,j,i);
> salt:long_name = "Salinity";
> salt:units = "1";
> salt:coordinates = "t cell_z cell_y cell_x cell_lat cell_lon";
>
> While there's nothing illegal about it, better is:
> salt:coordinates_xy = "t cell_z cell_y cell_x";
> salt:coordinates_latlon = "t cell_z cell_lat cell_lon";
> emphasizing that you are specifying two coordinate systems.
I agree that coordinates should be "factored out", where possible.
Incidentally, we've found that names of netCDF variables, attributes,
and dimensions can be extended to include internal dots and dashes
(`.' and `-') as well as underscores (`_'), and we've implemented this
for the next minor version; ncgen will properly interpret the resulting
CDL.
So feel free to use names such as the following if it helps make things
clearer:
coordinates.lat.lon
coordinates-lat-lon
coordinates.x-y
coordinates-lat.lon
Since dots haven't been allowed before, this will permit introducing
new conventions that are guaranteed not to clash with any existing
conventions, and may support a new hierarchical naming scheme.
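
As a rough illustration of how dotted names might support a hierarchical
scheme, the following Python sketch groups a variable's attributes by a
dotted prefix. Everything in it is an assumption for illustration only: the
attribute names `coordinates.xy` and `coordinates.latlon`, the choice of `.`
as the level separator, and the helper `group_by_prefix`. No such convention
has been agreed on.

```python
def group_by_prefix(attributes, prefix, sep="."):
    """Collect attributes named prefix + sep + suffix into {suffix: value}."""
    head = prefix + sep
    return {
        name[len(head):]: value
        for name, value in attributes.items()
        if name.startswith(head) and len(name) > len(head)
    }


# Hypothetical attributes of the salt variable, using dotted names.
salt_attributes = {
    "long_name": "Salinity",
    "units": "1",
    "coordinates.xy": "t cell_z cell_y cell_x",
    "coordinates.latlon": "t cell_z cell_lat cell_lon",
}

# Each coordinate system becomes a list of coordinate variable names.
systems = {
    name: value.split()
    for name, value in group_by_prefix(salt_attributes, "coordinates").items()
}
for name, coords in systems.items():
    print(name, coords)
```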
> The proposal by Gregory, Drach and Tett (GDT) is an ambitious effort to
> specify quite a lot of new meaning. It's really too big to chew on all
> at once, but it will be useful to refer back to, and take the issues one
> at a time.
>
> Since I've been thinking a lot about coordinate systems that should be
> useful for any netcdf use, I would like to complete that discussion
> first, then use whatever agreements we might make as a foundation for
> the "higher level" semantics needed for climate data. Of course,
> keeping in mind specific examples from our specialties is of absolute
> importance as we try to come up with a general, abstract solution. That's
> why I want to construct a list of concrete examples to compare any
> proposal against.
>
> In that context I will make a few comments about the GDT proposal as it
> intersects our coordinate system discussion. The numbers refer to their
> section numbers at http://www-pcmdi.llnl.gov/drach/netCDF.html
>
> 8. The notion of "axes", though intuitive, is too imprecise for me. As I
> mentioned above, I will post my own attempt at precision in a few days.
> Requiring a variable's dimensions to be all different solves some
> difficult problems, but I haven't yet decided if it's necessary.
I think you should not require a variable's dimensions to be all
different, because of the example of an auto-correlation matrix which
has two identical dimensions.
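
For instance, a sample auto-correlation matrix over n points has shape
(n, n), so a variable holding it would naturally be declared with the same
dimension twice. A small Python sketch, with a made-up dimension name `x`
and made-up sample data:

```python
def autocorrelation_matrix(samples):
    """Return R with R[i][j] = mean over samples of s[i] * s[j].

    Each sample is a sequence of length n; the result is n-by-n, so both
    of its dimensions are the same dimension.
    """
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("all samples must have the same length")
    count = len(samples)
    return [
        [sum(s[i] * s[j] for s in samples) / count for j in range(n)]
        for i in range(n)
    ]


# Hypothetical data: 3 samples of a field defined on a dimension x of length 2.
samples = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
R = autocorrelation_matrix(samples)

# The variable holding R would use the dimension x twice, e.g. R(x, x).
dims = ("x", "x")
shape = (len(R), len(R[0]))
print(dims, shape)
```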
> [other comments on GDT proposal omitted] ...
--Russ
_____________________________________________________________________
Russ Rew UCAR Unidata Program
russ@xxxxxxxxxxxxxxxx http://www.unidata.ucar.edu