Re: coordinate systems in netcdf (again)

Attached is a long attempt at defining coordinate systems in a
formalized way, along with proposals for (what else?) netcdf conventions
on coordinate variables, and generalized coordinate systems.

I'm a bit rusty at this sort of thing, so I'm hoping others might have a
look at it and give me some feedback.  Perhaps someone somewhere else
has made a formalized specification in a more succinct way.  If so,
I'd appreciate a pointer to it.

Anyway, I'm muddling around trying to capture what a coordinate system
is in a precise way, trying to make it as general as possible.  I might
be wrong on some fundamental level, and I'd appreciate an explanation
if so.  Thanks!

(I couldn't read that attachment, so I'll just resend it here.
Sorry for the duplication.)
--------
Dimension
   A _dimension_ is a named range of integers = {0,1,..size-1}. A dimension
is completely specified by the pair (name, size). You can substitute {1..size}
in what follows if you prefer 1-based indexing. 

--------
Variable

   A _variable_ is a function whose domain is D0 x D1 x D2 x .. x Dn = D,
where the Di are the dimensions of the variable, and n is its _rank_.
To include scalar variables of rank 0, we define D0 = {0}.
We can thus write a variable v in functional form as v = f(D) -> R,
where f denotes the function, and R is the range. We will use v as
identical to f in what follows.

   In the context of netcdf files, we represent functions as scalar arrays,
and so are limited to directly representing only scalar functions; some further
convention is needed for vector functions.
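
As a rough illustration of the two definitions above, here is a minimal
Python sketch.  The names domain, dims, data and v are made up for the
example, and the variable is stored as a nested list rather than read from
a real netcdf file.

```python
from itertools import product

def domain(dims):
    """Cartesian product of the index ranges of a list of (name, size) pairs.

    With no dimensions (rank 0) this is a single empty index tuple, which
    plays the role of D0 = {0} in the text.
    """
    return list(product(*(range(size) for _, size in dims)))

# Hypothetical rank-2 variable over the dimensions (lat, lon).
dims = [("lat", 2), ("lon", 3)]
data = [[10.0, 11.0, 12.0],
        [20.0, 21.0, 22.0]]

def v(index):
    """The variable as a function from a point of the domain to a value."""
    i, j = index
    return data[i][j]

points = domain(dims)          # all 2 * 3 index tuples
values = [v(p) for p in points]
```

The point is only that a variable is fully described by its dimensions and
a value for every index tuple.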

------------------
Coordinate Variable

   A _coordinate variable_ is a variable that assigns physical values to a
dimension.  It must be a strictly increasing or decreasing function, and has
domain consisting of a single dimension:  CVi(Di) -> Ri so that CVi is said
to be a coordinate variable for dimension Di.
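
The monotonicity requirement is easy to test.  A small sketch, assuming the
coordinate variable's values are available as a plain Python list (the
function name is made up for the example):

```python
def is_strictly_monotonic(values):
    """True if values are strictly increasing or strictly decreasing."""
    pairs = list(zip(values, values[1:]))
    increasing = all(a < b for a, b in pairs)
    decreasing = all(a > b for a, b in pairs)
    return increasing or decreasing

lat = [90.0, 45.0, 0.0, -45.0, -90.0]   # strictly decreasing: acceptable
bad = [0.0, 10.0, 10.0, 20.0]           # repeated value: not acceptable
```

A repeated value is rejected because two indices would then share the same
physical position.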

-----------------
Coordinate System

   If V is a vector space, a _coordinate system_ for V is a set of basis
vectors for V, along with units to give each coordinate physical meaning.
A _coordinate_ here is a synonym for basis vector.
   
   Let D be a domain, D = D1 x D2 x .. Dn, and define a set of scalar
_coordinate functions_ fi(D) -> Ri.  Let V be the vector space
(R1, R2, .., Rn).  Then the vector function Fcs = (f1, f2, ..., fn) is said
to be a coordinate system for D, Fcs(D) -> V, if Fcs is invertible.  Given
the discrete nature of D, Fcs is invertible if it is one-to-one, meaning
Fcs maps distinct points in D to distinct points in V.
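
Because D is finite, the one-to-one condition can be checked by brute
force.  The sketch below assumes each coordinate function is a Python
callable taking an index tuple; is_coordinate_system, lat2d and lon2d are
names invented for the example.

```python
from itertools import product

def is_coordinate_system(coord_funcs, sizes):
    """Check that Fcs = (f1, ..., fn) is one-to-one on the index domain."""
    seen = set()
    for index in product(*(range(s) for s in sizes)):
        position = tuple(f(index) for f in coord_funcs)
        if position in seen:
            return False          # two index points share a position
        seen.add(position)
    return True

# Hypothetical 2 x 2 grid whose lat and lon both depend on both dimensions.
lat2d = [[0.0, 0.0], [1.0, 1.0]]
lon2d = [[10.0, 11.0], [10.5, 11.5]]

def f_lat(index):
    return lat2d[index[0]][index[1]]

def f_lon(index):
    return lon2d[index[0]][index[1]]

ok = is_coordinate_system([f_lat, f_lon], (2, 2))   # (lat, lon) together
not_ok = is_coordinate_system([f_lat], (2, 2))      # lat alone repeats
```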

   Given a coordinate system Fcs for domain Dc, a variable v with domain Dv,
and Dc a subset of Dv, then Fcs is a coordinate system for v.  If Dc = Dv,
then Fcs is a _complete_ coordinate system for v.  The value Fcs(di) = vi
for a particular point di in the domain is the _position vector_ for di,
and the variable is said to be located at vi for point di, with respect to
the coordinate system Fcs.  (I think "Dc is a subset of Dv" is not quite
right; I probably want to restrict Dc = D1 x D2 x .. Dk to be equal to
Dv = D1 x D2 x .. Dn, with just some dimension Di missing).

   A special case of a coordinate system is one where the coordinate
functions are coordinate variables, and so each depends on a single
dimension Di.  Then
Fcs(D1 x D2 x .. x Dn) = (f1(D1), f2(D2), ... fn(Dn)), and Fcs is said to be an
_independent_ coordinate system.
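
For the independent case, Fcs and its inverse reduce to per-dimension
lookups.  A minimal sketch, assuming each coordinate variable is a list of
values (fcs, fcs_inverse and the sample values are made up):

```python
lat = [-30.0, 0.0, 30.0]             # coordinate variable for dimension lat
lon = [0.0, 90.0, 180.0, 270.0]      # coordinate variable for dimension lon

def fcs(index, coord_vars):
    """Position vector: component i depends only on index component i."""
    return tuple(cv[i] for cv, i in zip(coord_vars, index))

def fcs_inverse(position, coord_vars):
    """Recover the index tuple; valid because each cv has no repeated values."""
    return tuple(cv.index(p) for cv, p in zip(coord_vars, position))

position = fcs((2, 1), [lat, lon])
index = fcs_inverse(position, [lat, lon])
```

Since every coordinate variable is strictly monotonic, an independent
coordinate system is automatically invertible.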

---------------------------
Coordinate Transformations

A coordinate transformation is an invertible mapping M between two coordinate
systems Fcs1 and Fcs2:
        Fcs1 = M * Fcs2,  M-1 * Fcs1 = Fcs2.
Here * is functional composition, and M-1 indicates the inverse of M.
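
A toy example of the composition, assuming M is nothing more than a change
of units from kilometers to meters (all names and values here are invented
for illustration):

```python
def compose(outer, inner):
    """Functional composition: (outer * inner)(x) = outer(inner(x))."""
    return lambda arg: outer(inner(arg))

x_km = [0.0, 1.5, 3.0]
y_km = [0.0, 2.0]

def fcs2(index):
    """Coordinate system giving positions in kilometers."""
    i, j = index
    return (x_km[i], y_km[j])

def m(position):
    """M: kilometers to meters."""
    return tuple(1000.0 * c for c in position)

def m_inv(position):
    """M-1: meters back to kilometers."""
    return tuple(c / 1000.0 for c in position)

fcs1 = compose(m, fcs2)            # Fcs1 = M * Fcs2
fcs2_again = compose(m_inv, fcs1)  # M-1 * Fcs1 = Fcs2
```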

-------------------------------
Georeferencing Coordinate System

   In a georeferencing coordinate system, or GCS for short, there are 3 spatial
dimensions x,y,z, which correspond as much as possible to the directions
"east/west", "north/south" and "up/down", respectively.  A GCS is therefore
a function
        Fgcs(D) -> (x,y,z)
where x,y,z describe the variable's position or spatial extent in each of the
directions.  Note that if describing spatial extent, two values are needed
for each direction, e.g. x = (xleft,xright) or z = (zhigh,zlow).

===========================================
Specifying Coordinate Systems in netcdf files.

   We have seen that a general coordinate system is specified by a domain
D = D1 x D2 x .. Dn, a vector space V (and associated physical units for the
basis functions), and an invertible function Fcs(D) -> V.  Netcdf semantics
map domains to named dimensions, and units for coordinates are also very
well done.  Variable arrays are fine for describing single-valued functions.
All that's really missing are vector valued functions.

   Here is a proposal for a netcdf convention for specifying coordinate
systems.  The goal is to
        1) build from existing practices.
        2) keep simple things simple
        3) make it flexible enough to handle any coordinate system.

   So the proposal is:

        1) coordinate variables remain an elegant way to define the coordinate
        system when possible.

        2) allow the natural extension of coordinate variables to higher
        dimensions.  Formally:
          "A variable with the same name as a dimension is the coordinate
        variable for that dimension. If V is a variable with domain
        D1 x D2 .. Dn = D, let Dc be the subset of D with coordinate variables
        defined. Then a coordinate system is defined on Dc with the function
                Fcs(Dc) = (cv1(D1), cv2(D2) ...)
        where the cvi's are the defined coordinate variables, and the Di's are
        each subsets of D. For any such Dc, Fcs must be invertible."

        You notice that coordinate variables are restricted to mapping D (in
        index space) to D (in physical coordinate space).  This is a Good
        thing, and we try hard to define our dimensions so that we can do
        exactly that.

        3) more generally, allow the specification of coordinate systems using
        attributes:

            "A coordinate system can be defined by an attribute whose name
        starts with the string 'coordinates' (case insensitive, optional
        trailing description) and whose value is a (comma or blank delimited)
        list of variable names in the same file that define the coordinate
        functions.  The domain Dc of the coordinate system is found by forming
        the product of the set of any Di that is contained within the domains
        of the coordinate functions. The coordinate system is defined by the
        function
                Fcs(Dc) = (cv1(D1), cv2(D2) ...)
        where the cvi's are the named coordinate functions"

        This is meant to cover William Weibel's case of:
                dimensions:
                   npoints = 541;
                variables:
                   lon(npoints);
                   lat(npoints);
                   geopotential(npoints);
                        geopotential:coordinates = "lon lat";

        and presumably any other coordinate system (?). It seems likely that
        the case var(dim, dim) would have to be excluded, i.e. using the same
        dimension twice in a variable declaration (?).

        4) allow vector valued coordinates, to cover the famous (gen_time,
        valid_time) from NUWG:

           "A vector valued coordinate function can be specified by enclosing
        in parentheses a list of variables in the same file that define each
        component of the coordinate function."  E.g.:
               geopotential:coordinates = "lon lat (gen_time, valid_time)";
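
To make proposals 2) through 4) concrete, here is a rough Python sketch of
how a reader might interpret them.  It is only one possible reading, not an
existing netcdf library API: a file is modeled as a dict from variable name
to its tuple of dimension names, all function names are invented, and the
"record" dimension of gen_time and valid_time is an assumption made for the
example.

```python
import re

def named_coordinate_variables(var_name, variables):
    # Proposal 2: dimensions of var_name that have a same-named 1-D variable.
    return [d for d in variables[var_name] if variables.get(d) == (d,)]

def is_coordinates_attribute(attr_name):
    # Proposal 3: the name starts with 'coordinates', case insensitive.
    return attr_name.lower().startswith("coordinates")

def parse_coordinates(value):
    """Split an attribute value into coordinate functions (tuples of names).

    Names are comma or blank delimited (proposal 3); a parenthesized
    group is one vector valued coordinate function (proposal 4).
    """
    functions = []
    for group, name in re.findall(r"\(([^)]*)\)|([^\s,()]+)", value):
        if name:
            functions.append((name,))
        else:
            parts = tuple(p for p in re.split(r"[\s,]+", group) if p)
            functions.append(parts)
    return functions

def coordinate_domain(functions, variables):
    """Dc: every dimension used by some coordinate function, first-seen order."""
    dc = []
    for names in functions:
        for var_name in names:
            for dim in variables[var_name]:
                if dim not in dc:
                    dc.append(dim)
    return dc

# Hypothetical file layout: variable name -> tuple of dimension names.
variables = {
    "lon": ("npoints",),
    "lat": ("npoints",),
    "gen_time": ("record",),      # assumed dimension, not from the proposal
    "valid_time": ("record",),    # assumed dimension, not from the proposal
    "geopotential": ("record", "npoints"),
}

functions = parse_coordinates("lon lat (gen_time, valid_time)")
dc = coordinate_domain(functions, variables)
```

Here functions holds three coordinate functions, the last one vector
valued, and dc lists the dimensions whose product forms Dc.  Whether the
resulting Fcs is actually invertible would still have to be checked against
the data.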


        I still want to:
           5) allow the specification of extents, as well as point positions,
        for a coordinate function.
           6) clarify a number of special things about georeferencing
        coordinate systems

        but I'm running out of gas, and I'm not totally sure this whole thing
        is solid.  So I'll stop and see if anyone can give me feedback one way
        or the other.