Hi Stephen- ... > I have attached a postscript plot to which the following discussion refers. > Please let me know if you cannot view it or print it. I was able to view it, no sweat. And it illustrates the problem quite well. Is there some way to do this in ASCII so that others could see it as part of some mail? Hmmm, something like... Y | - * 3,4 | | * 2,4 | 3 - * 3,3 * 1,4 | | * 2,3 | 2 - * 3,2 * 1,3 <=== (j,i) (C-wise) | | * 2,2 | 1 - * 3,1 * 1,2 | | * 2,1 | 1,1 +-----|-----|-----*-----|--- X 1 2 3 4 OR Y | - * | | * | 3 - * * <=== (j,i) | | * | 2 - * * j = 3 | i = 4 | * | 1 - * * | | * | 0 +-----|-----|-----*-----|--- X 0 1 2 3 4 > Take a netCDF fragment as follows: > > dimensions: > i = 10; > j = 5; > > variables: > float temperature(j,i); > temperature:coordinates = "x,y"; > float x(j,i); > float y(j,i); > > Note that I have used our suggested "coordinates" attribute, rather than > Jonathan's "associate" one, but this is unimportant to what follows. > > Looking at my plot, the temperature values are on a grid containing 5 rows > and 10 columns, and are located at the black dots. Each black dot has a location > in (i,j) space, and also a location in (x,y) space. For example, the lowest > dot on the page is located at (i=0, j=0), or alternatively (x=5, y=2). > What you seem to want to do is to associate x with i and y with j. > In this case, can you actually do it? Can you associate x with either i or j? > Can you associate y with either i or j? I would argue that you cannot, because > x values vary when you vary either i or j, and so do y values. Precisely! (Thank you!) Now, maybe such an example seems too obvious to everyone because the coordinate variables are named "x" and "y", from which everyone immediately infers a Cartesian location: temperature (2,3) has coordinates X=x(2,3) and Y=y(2,3) "Sure, I know where that goes!" they would say. But if I just change the strings to something less "familiar", what would they say? : dimensions: a = 10; b = 5; variables: float A(a,b); A:coordinates = "abc,def"; float abc(a,b); float def(a,b); I haven't changed the data or "coordinates" a bit, but there's no "obvious" inferences this time. This is what basic application software would be faced with. > Your main point appears to me to be that an application must somehow be able > to gain enough information to be able to produce useful outputs (plots, for > example). I totally agree with this. I would suggest that to do this, the > netCDF fragment above should be extended to give the x and y variables > attributes which provide this data. For example: > > float x(j,i); > x:coordinate_type = "Cartesian X"; > float y(j,i); > y:coordinate_type = "Cartesian Y"; > > The strings "Cartesian X" and "Cartesian Y" would be special strings recognised > by the application. However, please note that these are only suggestions on my > part. Others might want to use different attributes, or existing ones (the units, > or long_name, for example). I suspect that substantial effort is needed to generate > a proposal which formalises such a mechanism. I tend to like some of the suggestions > which identify groupings of coordinate variables into coordinate systems - we have > ourselves been thinking about how best to do this. I also think that these are > somewhat separate issues from the more fundamental identification of coordinate > variables. Of course, by itself, a proposal which only allows you to identify > coordinate variables but not interpret them is not much use. Several useful concepts have been mentioned throughout these discussions, and some of them are applicable here: 1. we need to know what the "correct" (ie, intended) coordinate system is for the variable (eg, Cartesian, cylindrical, etc.) (Not that you could "force" a different interpretation, but one "system" ought to "natural".) 2. to which axis does each "coordinate variable" (however they are identified) apply? 3. if there are several "coordinate variables", which ones make sense to use together as a "system"? There seems to have been many objections to the use of "referential attributes" raised to date, but I'm not sure any of them are legitimate. You recently (in a recent mail) also raised an objection, namely, that such an approach could not handle the case where the dimensionality of the space in which that data resides exceeds the dimensionality of the array containing the data because "you can't have more coordinate variables than you have dimensions". I may be wrong, but I don't think the original idea of referential attributes mandates such a restriction. Take, for example, temperature along a particle trajectory: dimensions: point = 100 ; variables: float temp(point) ; temp:point="lat,lon" ; float lat(point) ; float lon(point) ; Now, that pretty clearly says that "lat" and "lon" contain some sort of coordinate information for "temp", without the need for a new attribute. This is not really *sufficient*, mind you, and it's not as clear an association as this: dimensions: i = 10 ; j = 10 ; variables: float temp(j,i) ; temp:j="lat" ; temp:i="lon" ; float lat(j,i) ; float lon(j,i) ; but it does get you part of the way there. In any case, we still need to answers to questions 1-3 above. We might want to exploit the approach used by the Iris Explorer visualization package. Each "chunk" of data is contained within a "lattice". The data portion of the lattice is an array described by the number of axes and their lengths. In the the "coordinate" section of the lattice, one describes the dimensionality of the space in which the data resides. In the most common case, one might have, say, a 3-dimensional array of temperatures which represent a 3-dimensional spatial distribution with a regular grid. In this case, all one might need is 3 vectors containing the coordinates along the three axes. These are called "perimeter coordinates". However, there is a great deal of flexibility. If our temperature data is 3-D (eg, temp(i,j,k)), but the grid is not regular, you can use "curvilinear coordinates", where the X, Y, and Z coordinates can vary from point to point. This means you are basically storing about 4 times the amount of information, since you now have X(i,j,k), Y(i,j,k), and Z(i,j,k). Or, go back to the trajectory example, where your data is stored as a 1-D array, "temp(i*j*k)". Again, you can use curvilinear coordinates to store the locations of each point. Now, Explorer's view is basically Cartesian, so that simplifies things. It is up to the user to, for instance, project the data onto a sphere. But the separation of the dimensionality of the data from the dimensionality of the coordinate space is potentially a very useful model. Whew, I didn't mean to go on so long about that, but maybe it will initiate some new line of thinking. (That's what most of these exchanges are all about anyhow, right?) If I have time, I'll try to come up with some mapping of these concepts including those from GDT, Gary Granger, and questions 1-3 above, to some examples, including (time permitting) John Caron's set. Cheers- John (jps@gfdl.gov) Geophysical Fluid Dynamics Laboratory/NOAA Princeton University/Forrestal Campus/Rte. 1 P.O. Box 308 Princeton, NJ, USA 08542