John and others interested in extending netCDF coordinate conventions,

> I want to add a few comments to the latest proposals about coordinate
> systems.
>
> In general, I agree with Walker and Waring's post
>
>   http://www.unidata.ucar.edu/software/netcdf/coords/0058.html
>
> In particular their comment about multidimensional coordinate variables
> (as proposed by me and others) that:
>
> "From a purely mathematical (and esthetic) point of view, we also find
> the implied statement that d1, for example, depends on things other than
> d1, is confusing and illogical. There is a real temptation here to
> confuse the role of data dimensions and coordinates."

I'll yield to the temptation, because I think it's useful to extend the
notion of a coordinate to serve as "the value associated with a data
dimension index" in a way that preserves a crucial property of
coordinates: a coordinate value uniquely determines the index of its
associated dimension.

I think it would be desirable to extend "value" in the above definition
to include text strings or tuples with a fixed number of components.
For example, a "time" dimension might have three associated coordinate
variables:

    dimensions:
        time = UNLIMITED;
        ...
    variables:
        int year(time);
        int day_of_year(time);
        float second_of_day(time);
        ... [many other variables that use the time dimension]

where a (year, day_of_year, second_of_day) tuple uniquely determines the
time dimension index.  In this case, there is no single coordinate
variable corresponding to the time dimension, but instead, a 3-tuple
that serves as the time coordinate.  This relation could be represented
using any one of several proposed conventions:

 1. A scalar "time" variable with a "coordinates" attribute:

        variables:
            ...
            int time;  // a scalar on which to hang attributes for time dimension
                time:coordinates = "year day_of_year second_of_day";

 2. A global "dimension attribute":

        :time = "year day_of_year second_of_day";

 3.
    A multidimensional time coordinate variable, but now all its values
    have to be of the same type, e.g. float:

        variables:
            ...
            float time(time,3);
                time:components = "year day_of_year second_of_day";
                // no good way to include units for components

I know the above example doesn't follow any approved conventions for
handling time, and I don't want to start another discussion of how to
handle time.  It's just an example to illustrate what might be possible
with a more general notion of coordinates.

> I have come to the same conclusion by a different route, namely by
> considering coordinate systems in a formal and general way. And so I
> currently am leaning towards leaving coordinate vars 1-dimensional, and
> explicitly specifying coordinate systems in a named attribute.

A named attribute is fine with me; my suggestion for global dimension
attributes with the same names as dimensions has won no other
supporters, so I'm willing to withdraw it.

> I plan to post soon some formal definitions that I hope will be useful,
> also various examples that would be useful for a convention to cover,
> and that I hope others will contribute to.
>
> One part of their example I disagree with is lumping all of the
> coordinate functions together:
>     double salt(n,k,j,i);
>         salt:long_name = "Salinity";
>         salt:units = "1";
>         salt:coordinates = "t cell_z cell_y cell_x cell_lat
>                             cell_lon";
>
> While there's nothing illegal about it, better is:
>         salt:coordinates_xy = "t cell_z cell_y cell_x";
>         salt:coordinates_latlon = "t cell_z cell_lat cell_lon";
> emphasizing that you are specifying two coordinate systems.

I agree that coordinates should be "factored out", where possible.
Incidentally, we've found that the names of netCDF variables,
attributes, and dimensions can be extended to include internal dots and
dashes (`.' and `-') as well as underscores (`_'). This is implemented
in the next minor version, and ncgen will properly interpret the
resulting CDL.
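
As a rough sketch only, the two factored attributes quoted above might
then be written with the extended names as follows. This assumes an
ncgen that includes the name extension; the file name and dimension
sizes are made up for illustration, and the coordinate variables (t,
cell_z, and so on) are left out for brevity.

```
netcdf salt_example {    // hypothetical file name
dimensions:
    n = UNLIMITED;       // dimension sizes are illustrative assumptions
    k = 3;
    j = 4;
    i = 5;
variables:
    double salt(n,k,j,i);
        salt:long_name = "Salinity";
        salt:units = "1";
        // one attribute per coordinate system, using `.' and `-' in the names
        salt:coordinates.x-y = "t cell_z cell_y cell_x";
        salt:coordinates.lat.lon = "t cell_z cell_lat cell_lon";
}
```
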
So feel free to use names such as the following if it helps make things
clearer:

    coordinates.lat.lon
    coordinates-lat-lon
    coordinates.x-y
    coordinates-lat.lon

Since dots haven't been allowed before, this will permit introducing new
conventions that are guaranteed not to clash with any existing
conventions, and may support a new hierarchical naming scheme.

> The proposal by Gregory, Drach and Tett (GDT) is an ambitious effort to
> specify quite a lot of new meaning. It's really too big to chew on all
> at once, but it will be useful to refer back to, and take the issues one
> at a time.
>
> Since I've been thinking a lot about coordinate systems that should be
> useful for any netcdf use, I would like to complete that discussion
> first, then use whatever agreements we might make as a foundation for
> the "higher level" semantics needed for climate data. Of course,
> keeping in mind specific examples from our specialties is of absolute
> importance as we try to come up with a general, abstract solution. Thats
> why I want to construct a list of concrete examples to compare any
> proposal against.
>
> In that context I will make a few comment about the GDT proposal as it
> intersects our coordinate system discussion. The numbers refer to their
> section numbers at http://www-pcmdi.llnl.gov/drach/netCDF.html
>
> 8. The notion of "axes", though intuitive, is too imprecise for me. As I
> mentioned above, I will post my own attempt at precision in a few days.
> Requiring a variable's dimensions to be all different solves some
> difficult problems, but I havent yet decided if its necessary.

I think you should not require a variable's dimensions to be all
different, because of the example of an auto-correlation matrix which
has two identical dimensions.

> [other comments on GDT proposal omitted] ...

--Russ

_____________________________________________________________________

Russ Rew                                         UCAR Unidata Program
russ@unidata.ucar.edu                     http://www.unidata.ucar.edu