Attached is a long attempt at defining coordinate systems in a formalized way, along with proposals for (what else?) netcdf conventions on coordinate variables and generalized coordinate systems.

I'm a bit rusty at this sort of thing, so I'm hoping others might have a look at it and give me some feedback. Perhaps someone somewhere else has made a formalized specification in a more succinct way. If so, I'd appreciate a pointer to it.

Anyway, I'm muddling around trying to capture what a coordinate system is in a precise way, trying to make it as general as possible. I might be wrong on some fundamental level, and I'd appreciate understanding that if you can explain it. Thanks!

(I couldn't read that attachment, so I'll just resend it here again. Sorry for the duplication.)

--------
Dimension

A _dimension_ is a named range of integers {0, 1, .., size-1}. A dimension is completely specified by the pair (name, size). You can substitute {1..size} in what follows if you prefer 1-based indexing.

--------
Variable

A _variable_ is a function whose domain is D0 x D1 x D2 x .. x Dn = D, where the Di are the dimensions of the variable, and n is its _rank_. To include scalar variables of rank 0, we define D0 = {0}. We can thus write a variable v in functional form as

    v = f(D) -> R,

where f denotes the function, and R is the range. We will use v as identical to f in what follows.

In the context of netcdf files, we represent functions as scalar arrays, and so are limited to directly representing only scalar functions; some further convention is needed for vector functions.
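To make the two definitions above concrete, here is a minimal Python sketch. The names Dimension, domain and temperature, and the sample data, are made up for illustration and are not part of any netcdf library. It treats a dimension as a (name, size) pair and a variable as a function on the product of its dimensions' index ranges.

```python
from itertools import product
from typing import NamedTuple


class Dimension(NamedTuple):
    """A named range of integers {0, 1, .., size-1}."""
    name: str
    size: int

    def indices(self) -> range:
        return range(self.size)


def domain(dims):
    """Enumerate D = D1 x D2 x .. x Dn as index tuples.

    With no dimensions (rank 0) this yields exactly one point, the empty
    tuple, which plays the role of the one-point domain D0 = {0}.
    """
    return product(*(d.indices() for d in dims))


# Hypothetical rank-2 variable, stored as a nested list (a scalar array).
lat = Dimension("lat", 2)
lon = Dimension("lon", 3)
data = [[10.0, 11.0, 12.0],
        [20.0, 21.0, 22.0]]


def temperature(i, j):
    """The variable in functional form, v = f(D) -> R."""
    return data[i][j]


points = list(domain([lat, lon]))           # all index tuples in D
values = [temperature(*p) for p in points]  # v evaluated over its domain
scalar_domain = list(domain([]))            # domain of a rank-0 variable
```

The rank-0 case falls out of itertools.product, which yields a single empty tuple when given no ranges, so a scalar variable still has a one-point domain.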
------------------
Coordinate Variable

A _coordinate variable_ is a variable that assigns physical values to a dimension. It must be a strictly increasing or strictly decreasing function, and its domain consists of a single dimension:

    CVi(Di) -> Ri

so that CVi is said to be a coordinate variable for dimension Di.

-----------------
Coordinate System

If V is a vector space, a _coordinate system_ for V is a set of basis vectors for V, along with units to give each coordinate physical meaning. A _coordinate_ here is a synonym for basis vector.

Let D be a domain, D = D1 x D2 x .. Dn, and define a set of scalar _coordinate functions_ fi(D) -> Ri. Let V be the vector space (R1, R2, .., Rn). Then the vector function Fcs = (f1, f2, ..., fn) is said to be a coordinate system for D, Fcs(D) -> V, if Fcs is invertible. Given the discrete nature of D, Fcs is invertible if it is one-to-one, meaning Fcs maps each point in D to a unique point in V.

Given a coordinate system Fcs for domain Dc, a variable v with domain Dv, and Dc a subset of Dv, then Fcs is a coordinate system for v. If Dc = Dv, then Fcs is a _complete_ coordinate system for v. The value Fcs(di) = vi for a particular value di in the domain is the _position vector_ for di, and the variable is said to be located at vi for point di, with respect to the coordinate system Fcs.

(I think "Dc is a subset of Dv" is not quite right; I probably want to restrict Dc = D1 x D2 x .. Dk to be equal to Dv = D1 x D2 x .. Dn, with just some dimension Di missing.)

A special case of a coordinate system is one where the coordinate functions are coordinate variables, and so each depends on a single dimension Di. Then

    Fcs(D1 x D2 x .. x Dn) = (f1(D1), f2(D2), ... fn(Dn)),

and Fcs is said to be an _independent_ coordinate system.

---------------------------
Coordinate Transformations

A coordinate transformation is an invertible mapping M between two coordinate systems Fcs1 and Fcs2:

    Fcs1 = M * Fcs2,    M-1 * Fcs1 = Fcs2.
Here * is functional composition, and M-1 indicates the inverse of M.

-------------------------------
Georeferencing Coordinate System

In a georeferencing coordinate system, or GCS for short, there are 3 spatial dimensions x, y, z, which correspond as much as possible to the directions "east/west", "north/south" and "up/down", respectively. A GCS is therefore a function

    Fgcs(D) -> (x,y,z)

where x, y, z describe the variable's position or spatial extent in each of the directions. Note that if describing spatial extent, two values are needed for each direction, e.g. x = (xleft,xright) or z = (zhigh,zlow).

============================================
Specifying Coordinate Systems in netcdf files.

We have seen that a general coordinate system is specified by a domain D = D1 x D2 x .. Dn, a vector space V (and associated physical units for the basis functions), and an invertible function Fcs(D) -> V.

Netcdf semantics map domains to named dimensions, and units for coordinates are also handled very well. Variable arrays are fine for describing single-valued functions. All that's really missing are vector valued functions.

Here is a proposal for a netcdf convention for specifying coordinate systems. The goal is to

1) build from existing practices
2) keep simple things simple
3) make it flexible enough to handle any coordinate system.

So the proposal is:

1) coordinate variables remain an elegant way to define the coordinate system when possible.

2) allow the natural extension of coordinate variables to higher dimensions. Formally:

   "A variable with the same name as a dimension is the coordinate variable for that dimension. If V is a variable with domain D1 x D2 .. Dn = D, let Dc be the subset of D with coordinate variables defined. Then a coordinate system is defined on Dc with the function

       Fcs(Dc) = (cv1(D1), cv2(D2) ...)

   where the cvi's are the defined coordinate variables, and the Di's are each subsets of D. For any such Dc, Fcs must be invertible."
You notice that coordinate variables are restricted to mapping D (in index space) to D (in physical coordinate space). This is a Good thing, and we try hard to define our dimensions so that we can do exactly that.

3) more generally, allow the specification of coordinate systems using attributes:

   "A coordinate system can be defined by an attribute whose name starts with the string 'coordinates' (case insensitive, optional trailing description) and whose value is a (comma or blank delimited) list of variable names in the same file that define the coordinate functions. The domain Dc of the coordinate system is found by forming the product of the set of any Di that is contained within the domains of the coordinate functions. The coordinate system is defined by the function

       Fcs(Dc) = (cv1(D1), cv2(D2) ...)

   where the cvi's are the named coordinate functions."

   This is meant to cover William Weibel's case of:

       dimensions:
           npoints = 541;
       variables:
           lon(npoints);
           lat(npoints);
           geopotential(npoints);
               geopotential:coordinates = "lon lat";

   and presumably any other coordinate system (?). It seems likely that the case var(dim, dim) would have to be excluded, i.e. using the same dimension twice in a variable declaration (?).

4) allow vector valued coordinates, to cover the famous (gen_time, valid_time) from NUWG:

   "A vector valued coordinate function can be specified by enclosing in parentheses a list of variables in the same file that define each component of the coordinate function."

   E.g.:

       geopotential:coordinates = "lon lat (gen_time, valid_time)";

I still want to:

5) allow the specification of extents, as well as point positions, for a coordinate function.

6) clarify a number of special things about georeferencing coordinate systems

but I'm running out of gas, and I'm not totally sure this whole thing is solid. So I'll stop and see if anyone can give me feedback one way or the other.
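For anyone who wants to experiment with the definitions, here is a minimal Python sketch of the one-to-one test that makes a set of coordinate functions a coordinate system. It is applied to a short lon/lat list shaped like the npoints example in item 3. The function name is_coordinate_system and the sample values are made up for illustration; this is not an existing netcdf API, only one possible reading of "Fcs must be invertible" for a discrete domain.

```python
def is_coordinate_system(coord_functions, domain_points):
    """Return True if Fcs = (f1, ..., fn) is one-to-one on the domain.

    coord_functions: callables, each taking an index tuple from Dc.
    domain_points:   iterable of the index tuples that make up Dc.
    """
    seen = set()
    for point in domain_points:
        # Position vector Fcs(point) = (f1(point), ..., fn(point)).
        position = tuple(f(point) for f in coord_functions)
        if position in seen:
            return False  # two index points share a position: not invertible
        seen.add(position)
    return True


# Hypothetical station list along a single dimension of size 4.
lon = [-105.0, -105.0, -104.5, -104.5]
lat = [40.0, 40.5, 40.0, 40.5]
npoints = [(i,) for i in range(len(lon))]


def f_lon(p):
    return lon[p[0]]


def f_lat(p):
    return lat[p[0]]


both = is_coordinate_system([f_lon, f_lat], npoints)  # all (lon, lat) pairs distinct
lon_only = is_coordinate_system([f_lon], npoints)     # lon values repeat
```

In this sample, lon and lat together locate every point uniquely, while lon alone does not. That is the sense in which the attribute value "lon lat" names a coordinate system for geopotential but "lon" by itself would not.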