Based on some suggestions by Steve Hankin, I have a proposal for an unobtrusive way to deal with sigma coordinates and general curvilinear grids in NetCDF. Please read this over and let me know what you think. Obviously, it's better to get some sort of agreement or at least acquiescence from the community rather than just go off into ad-hoc land.

THE PROBLEM

Currently, the only standard way to specify the positions of the vertices of a grid is through the use of dimension variables. These are defined as variables of a single dimension where the variable has the same name as the dimension. It is easy to recognize these variables and associate them with variables that represent fields by comparing the names of the set of dimension variables with the names of the dimensions over which each variable is defined.

This means that one can define grids where the positions of vertices can be expressed as

    x = f(i)
    y = g(j)
    z = h(k)

where i, j, and k are computational coordinates and x, y, and z are spatial coordinates. This covers all rectilinear grids where the axes of the grids are aligned with the principal axes of the space, whether the grids are uniform or nonuniform. The dimension variables sample the axis mapping functions f, g, and h at integer values of the computational coordinates.

General single-block curvilinear grids need to be specified by

    x = f(i, j, k)
    y = g(i, j, k)
    z = h(i, j, k)

The sigma grid is an important special case which can be specified by

    x = f(i)
    y = g(j)
    z = h(i, j, k)

where h is linear over k and arbitrary over i and j.

There is no reason that variables cannot be used to sample these multidimensional functions as well as dimension variables; however, there is no mechanism, analogous to the one for dimension variables, for automatically recognizing that a variable over more than one dimension defines a position within a grid.
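
To make the dimension-variable rule concrete, here is a minimal Python sketch of the name comparison described above. It is only an illustration under stated assumptions: the file is represented as a plain set of dimension names and a dictionary mapping each variable to its tuple of dimension names, and the function names are hypothetical, not part of any NetCDF library.

```python
# Hypothetical in-memory description of a file (not a real NetCDF API):
# a set of dimension names, and each variable's tuple of dimension names.

def dimension_variables(dims, variables):
    """Return the names of variables that qualify as dimension variables:
    defined over exactly one dimension that has the same name."""
    return {name for name, vdims in variables.items()
            if vdims == (name,) and name in dims}


def axes_for(field, dims, variables):
    """Map each dimension of `field` to its dimension variable, or None
    when no dimension variable exists for that dimension."""
    dimvars = dimension_variables(dims, variables)
    return {d: (d if d in dimvars else None) for d in variables[field]}


dims = {"zax", "yax", "xax"}
variables = {
    "zax": ("zax",),
    "yax": ("yax",),
    "xax": ("xax",),
    "u": ("zax", "yax", "xax"),
    "u_depth": ("zax", "yax", "xax"),  # multidimensional: never matched
}

found = dimension_variables(dims, variables)
axes = axes_for("u", dims, variables)
```

A multidimensional variable such as the hypothetical u_depth can never satisfy the name test, which is exactly the gap described above.
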
In addition, I have the constraint that I want my program to be able to recognize such a grid automatically without extra information provided by the user, which of course does not preclude the possibility that the user may want to change something later.

BACKGROUND

The proposal builds on another heuristic that I use for vector data. I assume that variables are defined over three kinds of dimensions:

    spatial dimensions
    time dimensions
    component dimensions

Component dimensions are used simply to select components of a vector dataset, for example, wind velocity. Time dimensions are used to specify time steps. Spatial dimensions are used to specify positions in computational space.

A time dimension is recognized by being the unlimited dimension. A component dimension is recognized by being either the first or last dimension (with the exception of a possible unlimited dimension), having no associated dimension variable, and having a length less than or equal to the number of spatial dimensions of the problem (usually 1, 2, or 3).

For the purposes of the proposal, time and spatial dimensions can be treated the same, but component dimensions are special.

THE PROPOSAL

Define a family of attribute names:

    location_x
    location_y
    location_z
    location_#
    location

where # stands for a nonnegative integer. When a variable is used to define a field, give this variable an attribute with one of these names. In the attribute, store a character string giving the name of an additional variable that is used to store the locations.

The location_x, location_y, location_z, and location_# attributes define scalar locations along one spatial axis, where the spatial axis is given by whatever comes after the underscore. The location_# form provides for an arbitrarily large number of dimensions (x is 0, y is 1, z is 2, etc.) or for non-Cartesian coordinate systems where the axes are defined somewhere else.
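
The name matching for this attribute family might be sketched in Python as follows. This is a hedged illustration, not a reference implementation: the function name parse_location_attribute is hypothetical, the match is assumed to be case-sensitive, and treating location_x and location_0 as the same axis is my reading of "x is 0, y is 1, z is 2".

```python
import re

# Letter suffixes map onto axis indices as stated in the proposal.
_LETTER_AXES = {"x": 0, "y": 1, "z": 2}

# "location" alone, or "location_" followed by x, y, z, or a
# nonnegative integer. Case sensitivity is an assumption.
_PATTERN = re.compile(r"location(?:_(x|y|z|[0-9]+))?")


def parse_location_attribute(name):
    """Classify an attribute name under the proposed convention.

    Returns ("vector", None) for plain "location", ("scalar", axis_index)
    for location_x, location_y, location_z, or location_#, and None for
    any attribute name outside the family.
    """
    m = _PATTERN.fullmatch(name)
    if m is None:
        return None
    suffix = m.group(1)
    if suffix is None:
        return ("vector", None)
    if suffix in _LETTER_AXES:
        return ("scalar", _LETTER_AXES[suffix])
    return ("scalar", int(suffix))
```

A reader would call this on each attribute name of a field variable and ignore every attribute for which it returns None.
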
Each of these attributes must name another variable, which must be a float or double scalar variable defined over some subset (not necessarily proper) of the spatial and time dimensions over which the field variable is defined.

The location attribute (the form with no suffix) defines vector locations. It works just like the scalar ones, except that the named variable has an additional component dimension to specify components.

EXAMPLE

A sigma grid, slightly modified from an example Steve sent me:

    dimensions:
            zax = 6 ;       // or whatever dimension name //
            xax = you-name-it; yax = whatever;

    variables:
            float zax(zax) ;
                    zax:units = "LEVEL";
            float u(zax,yax,xax);
                    u:units = "meters/second";
                    u:location_z = "u_depth";       //<===!!//
            float u_depth(zax,yax,xax);
                    u_depth:units = "meters";

    data:
            zax = 1, 2, 3, 4, 5, 6;

IMPLEMENTATION

The reader program, when encountering a variable, would generate two sets of hypothetical spatial locations, one from looking at the attributes of the variable and another from looking for dimension variables corresponding to the dimensions of the field. It would then generate a composite set, giving priority to locations found within the attributes. Depending on the program, it would store either the composite set or the composite set plus the dimension variable set in the field for later use.

STRENGTHS

This is automatic enough for a simple robot to figure out. It isn't too hard to write. It doesn't get in the way of anything. You can always write files that work, at least partially, whether or not the reading program understands the convention, by specifying the locations redundantly, both in the attributes and in dimension variables. Because connecting the two requires knowing nothing about the referenced dataset beyond its name, this mechanism can be used to specify grids to go with datasets when the grids and the datasets are in different files. Grids can as easily be time-dependent as not.

SHORTCOMINGS

It does require name pattern matching, but that's a fairly minor piece of voodoo.
It doesn't represent sigma files in the most compact form possible, but that's not that big a deal.

Eric Pepke                                     INTERNET: pepke@scri.fsu.edu
Supercomputer Computations Research Institute  MFENET:   pepke@fsu
Florida State University                       SPAN:     scri::pepke
Tallahassee, FL 32306-4052                     BITNET:   pepke@fsu

Disclaimer: My employers seldom even LISTEN to my opinions.
Meta-disclaimer: Any society that needs disclaimers has too many lawyers.