Based on some suggestions by Steve Hankin, I have a proposal for an unobtrusive
way to deal with sigma coordinates and general curvilinear grids in NetCDF.
Please read this over and let me know what you think. Obviously, it's better
to get some sort of agreement or at least acquiescence from the community
rather than just go off into ad-hoc land.
THE PROBLEM
Currently, the only standard way to specify the positions of the vertices of a
grid is through the use of dimension variables. These are defined as variables
of a single dimension where the variable has the same name as the dimension.
It is easy to recognize these variables and associate them with variables that
represent fields by comparing the names of the set of dimension variables with
the names of the dimensions over which each variable is defined.
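As a rough sketch of that name comparison, here is a small Python example. It
does not use any NetCDF library; the dictionaries `dimensions` and `variables`
are a hypothetical in-memory stand-in for a file's header, and the function
names are made up for illustration.

```python
# Hypothetical in-memory model of a file header (not a real NetCDF API):
# dimension sizes, plus the ordered dimension names of each variable.
dimensions = {"xax": 4, "yax": 3, "zax": 6}
variables = {
    "xax": ("xax",),
    "zax": ("zax",),
    "u": ("zax", "yax", "xax"),
}


def is_dimension_variable(name, variables, dimensions):
    """True if `name` is a one-dimensional variable named after its dimension."""
    return name in dimensions and variables.get(name) == (name,)


def axes_for(field, variables, dimensions):
    """Map each dimension of `field` to its dimension variable, or None."""
    return {
        dim: dim if is_dimension_variable(dim, variables, dimensions) else None
        for dim in variables[field]
    }


axes = axes_for("u", variables, dimensions)
```

In this made-up header, `yax` has no dimension variable, so a reader would have
no positions for that axis and would presumably fall back to the index itself.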
This means that one can define grids where the positions of vertices can be
expressed as
    x = f(i)
    y = g(j)
    z = h(k)
where i, j, and k are computational coordinates and x, y, and z are spatial
coordinates. This covers all rectilinear grids where the axes of the grids are
aligned with the principal axes of the space, whether the grids are uniform or
nonuniform. The dimension variables sample the axis mapping functions f, g,
and h at integral values of the computational coordinates.
General single-block curvilinear grids need to be specified by
    x = f(i, j, k)
    y = g(i, j, k)
    z = h(i, j, k)
The sigma grid is an important special case which can be specified by
    x = f(i)
    y = g(j)
    z = h(i, j, k)
where h is linear over k and arbitrary over i and j.
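To make the sigma case concrete, here is a small Python sketch of one possible
h. It assumes, purely for illustration, that z runs linearly in k from a top
surface to a bottom surface, each given per (j, i) column. The names
`sigma_heights`, `top`, and `bottom` are hypothetical, and real sigma models
may define the mapping differently.

```python
def sigma_heights(top, bottom, nk):
    """Return z[k][j][i], linear in k between top[j][i] and bottom[j][i].

    Assumes nk >= 2 and that top and bottom are rectangular nested lists
    of the same shape.
    """
    nj, ni = len(top), len(top[0])
    return [
        [
            [
                top[j][i] + (bottom[j][i] - top[j][i]) * k / (nk - 1)
                for i in range(ni)
            ]
            for j in range(nj)
        ]
        for k in range(nk)
    ]


# A 2 x 2 horizontal grid with a flat top and an uneven bottom, three levels.
top = [[0.0, 0.0], [0.0, 0.0]]
bottom = [[-10.0, -20.0], [-30.0, -40.0]]
z = sigma_heights(top, bottom, 3)
```

The result is a full three-dimensional array even though only two
two-dimensional surfaces went in.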
There is no reason that variables cannot be used to sample these
multidimensional functions as well as dimension variables; however, there is no
way, analogous to the dimension variable mechanism, to recognize automatically
that a variable over more than one dimension defines a position within a grid.
In addition, I have the constraint that I want my program to be able to
recognize such a grid automatically without extra information provided by the
user, which of course does not preclude the possibility that the user may want
to change something later.
BACKGROUND
The proposal is in the context of another heuristic that I use for vector data.
I assume that variables are defined over three kinds of dimensions:
    spatial dimensions
    time dimensions
    component dimensions
Component dimensions are used simply to select components of a vector dataset,
for example, wind velocity. Time dimensions are used to specify time steps.
Spatial dimensions are used to specify positions in computational space.
A time dimension is recognized by being the unlimited dimension.
A component dimension is recognized by being either the first or last
dimension (with the exception of a possible unlimited dimension), having
no associated dimension variable, and having a rank less than or equal to
the number of spatial dimensions of the problem (usually 1, 2, or 3).
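The heuristic above might be sketched in Python roughly as follows. Several
details are assumptions made for this sketch, not part of the text: "rank" is
read as the length of the dimension, the last dimension is tried before the
first, at most one component dimension is allowed, and the number of spatial
dimensions is taken to be the non-time dimensions minus the candidate. The
function and argument names are hypothetical.

```python
def classify_dimensions(dims, sizes, unlimited, dimension_variables):
    """Label each dimension of one variable as time, component, or spatial.

    dims                -- ordered dimension names of the variable
    sizes               -- mapping from dimension name to length
    unlimited           -- name of the unlimited dimension, or None
    dimension_variables -- set of dimensions that have a dimension variable
    """
    kinds = {d: ("time" if d == unlimited else "spatial") for d in dims}
    rest = [d for d in dims if d != unlimited]
    if len(rest) >= 2:
        # Assumption: if one dimension is the component dimension,
        # all the other non-time dimensions are spatial.
        n_spatial = len(rest) - 1
        # Assumption: try the last dimension first, then the first.
        for candidate in (rest[-1], rest[0]):
            if (candidate not in dimension_variables
                    and sizes[candidate] <= n_spatial):
                kinds[candidate] = "component"
                break
    return kinds


sizes = {"time": 10, "zax": 6, "yax": 50, "xax": 40, "comp": 3}
vector_kinds = classify_dimensions(
    ("time", "zax", "yax", "xax", "comp"), sizes, "time", {"zax"})
scalar_kinds = classify_dimensions(
    ("time", "zax", "yax", "xax"), sizes, "time", {"zax"})
```

Here `comp` is picked out as a component dimension because it is last, has no
dimension variable, and has length 3 with three spatial dimensions remaining.
In the scalar case nothing qualifies, since `xax` is far too long and `zax`
has a dimension variable.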
For the purposes of the proposal, time and spatial dimensions can be treated
the same, but component dimensions are special.
THE PROPOSAL
Define a family of attribute names:
    location_x
    location_y
    location_z
    location_#
    location
where # stands for a nonnegative integer. When a variable is used to define a
field, give this variable an attribute with one of these names. In the
attribute, store a character string giving the name of an additional
variable that holds the locations.
The location_x, location_y, location_z, and location_# attributes define scalar
locations along one spatial axis, where the spatial axis is given by whatever
comes after the underscore. The location_# form is to provide for an arbitrarily
large number of dimensions (x is 0, y is 1, z is 2, etc.) or for non-Cartesian
coordinate systems where the axes are defined somewhere else. Each of these
attributes must name another variable, which must be a float or double
scalar variable defined over some subset (not necessarily proper) of the spatial
and time dimensions over which the field variable is defined.
The location variable defines vector locations. It works just like the scalar
ones, except there is an additional component dimension to specify components.
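The name pattern matching this requires is small. A minimal Python sketch,
using only the standard `re` module, could look like the following. The
function name and the shape of its return value are hypothetical choices for
illustration.

```python
import re

# x, y, and z are shorthand for axes 0, 1, and 2, as described above.
_AXIS_LETTERS = {"x": 0, "y": 1, "z": 2}
_LOCATION_NAME = re.compile(r"location(?:_([xyz]|[0-9]+))?")


def parse_location_attribute(name):
    """Classify an attribute name from the location family.

    Returns ("vector", None) for `location`, ("scalar", axis_index) for
    location_x, location_y, location_z, or location_#, and None for any
    other attribute name.
    """
    match = _LOCATION_NAME.fullmatch(name)
    if match is None:
        return None
    suffix = match.group(1)
    if suffix is None:
        return ("vector", None)
    if suffix in _AXIS_LETTERS:
        return ("scalar", _AXIS_LETTERS[suffix])
    return ("scalar", int(suffix))
```

With this, `location_z` and `location_2` both map to axis 2, while unrelated
attributes such as `units` are ignored.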
EXAMPLE
A sigma grid, slightly modified from an example Steve sent me
dimensions:
    zax = 6 ;            // or whatever dimension name
    xax = you-name-it ;
    yax = whatever ;
variables:
    float zax(zax) ;
        zax:units = "LEVEL" ;
    float u(zax,yax,xax) ;
        u:units = "meters/second" ;
        u:location_z = "u_depth" ;    // <===!!
    float u_depth(zax,yax,xax) ;
        u_depth:units = "meters" ;
data:
    zax = 1, 2, 3, 4, 5, 6 ;
IMPLEMENTATION
The reader program, when encountering a variable, would generate two sets of
hypothetical spatial locations, one from looking at the attributes of the
variable and another from looking for dimension variables corresponding to the
dimensions of the field. It would then generate a composite set, giving
priority to locations found within the attributes. Depending on the program,
it would store either the composite set or the composite set plus the dimension
variable set in the field for later use.
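A sketch of that merge in Python follows, again on a hypothetical in-memory
header rather than a real NetCDF API. One thing the text leaves open is how a
dimension variable is assigned to an axis. This sketch assumes the field's
dimensions are listed slowest-varying first, so the last dimension is x
(axis 0). It also ignores time dimensions and the vector `location` attribute
for brevity.

```python
AXIS_INDEX = {"x": 0, "y": 1, "z": 2}


def locations_from_attributes(attrs):
    """Collect {axis: variable name} from location_* attributes."""
    found = {}
    for name, value in attrs.items():
        if name.startswith("location_"):
            suffix = name[len("location_"):]
            if suffix in AXIS_INDEX:
                found[AXIS_INDEX[suffix]] = value
            elif suffix.isdecimal():
                found[int(suffix)] = value
    return found


def locations_from_dimensions(dims, variables):
    """Collect {axis: dimension variable}; assumes the last dimension is x."""
    found = {}
    for axis, dim in enumerate(reversed(dims)):
        if variables.get(dim) == (dim,):
            found[axis] = dim
    return found


def composite_locations(field, variables, attributes):
    """Return (composite set, dimension variable set) for one field."""
    from_dims = locations_from_dimensions(variables[field], variables)
    from_attrs = locations_from_attributes(attributes.get(field, {}))
    # Attribute locations take priority over dimension variables.
    return {**from_dims, **from_attrs}, from_dims


# The sigma grid example, written as the hypothetical header model.
variables = {
    "zax": ("zax",),
    "u": ("zax", "yax", "xax"),
    "u_depth": ("zax", "yax", "xax"),
}
attributes = {"u": {"units": "meters/second", "location_z": "u_depth"}}
composite, from_dims = composite_locations("u", variables, attributes)
```

For the example file, the composite set uses `u_depth` for the z axis, while
the dimension variable set still records `zax`, so a program that keeps both
can fall back to level numbers if the user asks for them.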
STRENGTHS
This is automatic enough for a simple robot to figure out. It isn't too hard
to write. It doesn't get in the way of anything. You can always write files
that work, at least partially, whether or not the program understands the
convention, by specifying the locations redundantly, both in the attributes and
in dimension variables. Because connecting the two requires knowing nothing
about the referenced dataset beyond its name, this mechanism can be used to
specify grids to go with datasets when the grids and the datasets are in
different files. Grids can as easily be time-dependent as not.
SHORTCOMINGS
It does require name pattern matching, but that's a fairly minor piece of
voodoo. It doesn't represent the sigma files in the most compact possible
form, but that's not that big a deal.
Eric Pepke INTERNET: pepke@xxxxxxxxxxxx
Supercomputer Computations Research Institute MFENET: pepke@fsu
Florida State University SPAN: scri::pepke
Tallahassee, FL 32306-4052 BITNET: pepke@fsu
Disclaimer: My employers seldom even LISTEN to my opinions.
Meta-disclaimer: Any society that needs disclaimers has too many lawyers.