John,

> Attached is a long attempt at defining coordinate systems in a
> formalized way, along with proposals for (what else?) netcdf conventions
> on coordinate variables, and generalized coordinate systems.
>
> Im a bit rusty at this sort of thing, so Im hoping others might have a
> look at it and give me some feedback. Perhaps someone somewhere else
> has made a formalized specification in a more succinct way. If so,
> I'd appreciate a pointer to it.

It's not formalized, but there's a `Coordinate Systems Overview' at:

    http://www.utexas.edu/depts/grg/gcraft/notes/coordsys/coordsys.html

that describes lots of coordinate systems used for geography and
geodesy. Some of these seem so complex or ad hoc that we probably would
not want a set of netCDF coordinate conventions extensive enough to
encompass all of them.

> Anyway, I'm muddling around trying to capture what a coordinate system
> is in a precise way, trying to make it as general as possible. I might
> be wrong on some fundamental level, and i'd appreciate understanding
> that if you can explain it. Thanks!

I think you've got rectilinear coordinate systems specified clearly, but
there may be a problem in trying to use vector spaces and linear algebra
terminology to define coordinate systems that aren't vector spaces. What
are the basis vectors for a coordinate system based on (lat, lon,
height)? They can't be (1, 0, 0), (0, 1, 0), and (0, 0, 1), in (radians,
radians, meters) because in a vector space, every vector has a unique
representation as a linear combination of the basis vectors, but (lat,
lon, height) and (lat, lon+2*pi, height) represent the same element.

I would also like to consider the possibility of a more general notion
of coordinates, for example treating climatology data so that `month'
could be a dimension with a corresponding coordinate variable in a
dataset such as:

    ...
    dimensions:
        lat = 19;
        lon = 36;
        month = 12;
    variables:
        float average_temperature(month, lat, lon);
        // coordinate variables
        float lat(lat);
        float lon(lon);
        // `month' doesn't currently qualify as a coordinate variable,
        char month(3,month) = "jan","feb","mar",...,"dec";
    ...

Here `month' might be considered a _nominal_ coordinate variable, from a
useful categorization of value types that Harvey Davies once pointed
out:

  nominal:   Values are not ordered, e.g. `country'. Operations such as
             min, max, and sort are not defined for such data. `Closest
             to' must be an exact match.

  ordinal:   Data can be ordered, but not sensibly subtracted, e.g.
             `house_number' in street addresses or `FAA_level_number'.
             Such data cannot be interpolated.

  interval:  Subtraction of values is meaningful, but ratios are not,
             e.g. Celsius temperatures. Such data can be interpolated.

  ratio:     Ratios of data values are meaningful, e.g. Kelvin
             temperatures. Logarithms and geometric means are possible
             for such data.

Coordinate variables may make sense for all of these categories, but for
nominal or ordinal coordinates, vector spaces don't seem to apply.
Harvey proposed a `measurement_level' attribute to specify the value
type according to this terminology, so an application would not attempt
meaningless operations on inappropriate data values or coordinates.

I agree with others in this thread that the simple netCDF conventions
for coordinates are currently too limited for some uses. Extended
conventions may eliminate some of these limitations, but extensions must
be adopted carefully. A new convention requires support from existing
and future netCDF software, making such software more difficult to
develop and maintain.
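To make the `measurement_level' idea above a little more concrete, here
is a minimal Python sketch of how an application might check whether an
operation is meaningful before applying it to a variable or coordinate.
Only the attribute name and the four category names come from Harvey's
proposal as described above; the operation names, the REQUIRED_LEVEL
table, and the is_meaningful() helper are hypothetical, chosen only for
illustration, and no netCDF library is involved.

```python
from enum import IntEnum


class MeasurementLevel(IntEnum):
    """The four value categories, ordered from least to most structure."""
    NOMINAL = 0
    ORDINAL = 1
    INTERVAL = 2
    RATIO = 3


# Hypothetical table: the weakest measurement level at which each
# operation is meaningful, following the category descriptions above.
REQUIRED_LEVEL = {
    "exact_match": MeasurementLevel.NOMINAL,
    "min": MeasurementLevel.ORDINAL,
    "max": MeasurementLevel.ORDINAL,
    "sort": MeasurementLevel.ORDINAL,
    "subtract": MeasurementLevel.INTERVAL,
    "interpolate": MeasurementLevel.INTERVAL,
    "ratio": MeasurementLevel.RATIO,
    "log": MeasurementLevel.RATIO,
    "geometric_mean": MeasurementLevel.RATIO,
}


def is_meaningful(operation: str, measurement_level: str) -> bool:
    """Return True if `operation` makes sense for data whose
    measurement_level attribute has the given value (assumed here to be
    one of "nominal", "ordinal", "interval", "ratio")."""
    level = MeasurementLevel[measurement_level.upper()]
    return level >= REQUIRED_LEVEL[operation]
```

With this table, is_meaningful("sort", "nominal") is False, so a
`month' coordinate tagged as nominal would only ever be matched
exactly, while a Celsius temperature tagged as interval could be
interpolated but not passed to a logarithm.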

As a small step toward resolving how to extend the netCDF conventions
for coordinates, I have put together and will maintain a Web page
linking to netcdfgroup postings relevant to this subject:

    http://www.unidata.ucar.edu/software/netcdf/coords/

Reading through these, it's clear that some of the older postings
address the same issues as recent postings and propose similar
solutions. For example, Richard Signell's 1992 posting `Suggestion for
Coordinate Mapping convention' and Eric Pepke's 1994 posting `Sigma and
Curvilinear Grids' propose conventions relevant to the current
discussion. Lloyd Treinish's 1992 posting `netCDF and "complex" data'
provides some elaborate examples of the power of well-designed
referential attributes.

I'll also try to maintain proposals for extensions to netCDF coordinate
system conventions on a Web page, so those who are interested can help
to refine them without needing to include all the preceding context in
every posting. (I believe there will always be a need for evolving
other discipline-specific conventions, for which existing mechanisms
seem adequate.)

Current candidates for convention extensions include multidimensional
coordinate variables and referential attributes. If either of these
turns out to be adequate for solving most problems of interest, I'm not
sure we would be better off adopting both of them. It might be better
to just document them more clearly so that datasets can use them,
applications can support them, and future data users have a common
understanding of what these sorts of conventions mean and when they are
useful.

--Russ
_____________________________________________________________________

Russ Rew                                         UCAR Unidata Program
russ@unidata.ucar.edu                     http://www.unidata.ucar.edu