Re: global map attributes

John Caron (caron@ucar.edu)
Mon, 14 Jul 1997 15:51:02 -0600

Hi Gary:
Thanks for your contributions. Below are some questions and comments:

> 
> ----------------------------------------------------------------------------
> 
> A first example is a 3-D temperature grid for an ocean model. (I'm just a
> programmer so the real-life applicability of the examples may be off. :)
> Suppose the grid is curvilinear with sigma levels, defined with dimensions
> (i, j, k). But for analysis or plotting the grid may need to be viewed over
> lat/lon and either sigma or depth for the height, where the lat, lon, sigma,
> and depth mappings are stored directly in the file as variables. Lat and lon
> depend upon the (j,k) coordinate, depth depends upon (i,j,k), sigma depends
> upon (i).
> 
>    * lat(j,k)
>    * lon(j,k)
>    * depth(i,j,k)
>    * sigma(i)
> 
> There are at least four coordinate systems which associate the manifold
> domain with an alternate base domain (using the wording Steve mentioned),
> including the identity coordinate system.
> 
>  (i,j,k) => (depth, lon, lat)
>  (i,j,k) => (sigma, lon, lat)
>  (i,j,k) => (i, lon, lat)
>  (i,j,k) => (i,j,k)
> 
> So to associate each alternate mapping with the temperature variable, each
> mapping is named in the temp variable's temp attribute (except for the
> identity since it's implicit):
> 
>         float temp(i, j, k);
>                 temp:temp = "depth-map sigma-map lat-lon-map";
> 
> The mapping name refers to a global attribute which lists the coordinates of
> the system. I appended -map to the names for clarity; I'm not suggesting the
> convention require it.
> 
>         :depth-map = "depth lon lat";
>         :sigma-map = "sigma lon lat";
>         :lat-lon-map = "lon lat";
>         :sigma = "sigma";
> 
>         float lat(j, k);
>         float lon(j, k);
>         float sigma(i);
>         float depth(i,j,k);
>                 depth:depth = "sigma";
> 
> Each coordinate name in a global map attribute refers to the variable which
> maps to the values of that coordinate. There is some important consistency
> here: a variable is always a mapping (from N-space to the 1-D range of the
> variable), and the global map attributes are always used to specify the
> components or range for a mapping, e.g., the coordinates of a coordinate
> system.

I would restate this as: "a variable and a coord sys for the variable can both be
considered functions on (more or less) the same index domain". I just want to
check whether you are saying more than that.

> 
> Note that the lat-lon-map is independent of the i coordinate, in which case
> the i coordinate is added to the lat-lon-map as an identity mapping of the
> indices along i. E.g., point (1,1,1) of temp is at coordinate (1, -105.0,
> 40.1) in the lat-lon-map coordinate system, point (2,j,k) is at (2,
> lon(j,k), lat(j,k)).
> 
> A user or application can deduce the coordinates on the depth grid of point
> (i,j,k) by calculating (depth(i,j,k), lon(j,k), lat(j,k)). Choosing
> sigma-map yields (sigma(i), lon(j,k), lat(j,k)).
> 
> The sigma global map attribute indicates that it makes sense to plot depth
> along a sigma axis, using the values of the sigma variable as the axis
> coordinates. This is an example of the recursiveness allowed by the
> convention. The temp variable's temp attribute can be followed to the
> depth-map mapping, which points to the depth variable. But the depth
> variable itself has an alternate coordinate system, sigma, whose global map
> attribute refers directly to the sigma variable. Through this recursion an
> application could infer the availability of the sigma-map mapping without it
> being explicitly named in the temp attribute.

I don't understand what the point of the lat-lon-map or the sigma-map coord
systems is. It seems like they are implied by the depth-map and the
sigma-map coord systems?

Also, the need for the identity coord function is not obvious. I would just
say that the lat-lon coord sys is not complete. I also notice that the
identity coord function doesn't really "assign a physical value", which is
my definition of a coord system. I acknowledge you might need it now and
then for ill-specified files, but I don't really think it deserves much of
a place in our conventions.
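
For reference, the lookup rule described in the quoted text (follow the map
name to its global attribute, evaluate each listed variable at the index
point, and fall back to the raw index for any dimension no coordinate
depends on) can be sketched as below. This is only an illustration under
assumptions: the dictionaries, the function name coordinates(), and all
numeric values are hypothetical and not part of any proposed convention.

```python
# Hypothetical stand-in for the file in the first example; all values are
# invented for illustration and are not taken from any real dataset.
GLOBAL_MAPS = {
    "depth-map": "depth lon lat",
    "sigma-map": "sigma lon lat",
    "lat-lon-map": "lon lat",
}

# variable name -> (dimensions it depends on, function of an index dict)
VARIABLES = {
    "lat": (("j", "k"), lambda ix: 40.0 + 0.5 * ix["j"]),
    "lon": (("j", "k"), lambda ix: -105.0 + 0.5 * ix["k"]),
    "sigma": (("i",), lambda ix: 0.25 * ix["i"]),
    "depth": (("i", "j", "k"), lambda ix: 10.0 * ix["i"]),
}


def coordinates(map_name, index, var_dims):
    """Coordinates of one index point under the named global map."""
    covered = set()
    values = []
    for name in GLOBAL_MAPS[map_name].split():
        dims, fn = VARIABLES[name]
        covered.update(dims)
        values.append(fn(index))
    # Dimensions that no listed coordinate depends on keep their raw index
    # (the "identity mapping" discussed above).
    identity = [index[d] for d in var_dims if d not in covered]
    return tuple(identity + values)


point = {"i": 2, "j": 1, "k": 1}
print(coordinates("lat-lon-map", point, ("i", "j", "k")))
print(coordinates("sigma-map", point, ("i", "j", "k")))
```

Only lat-lon-map needs the identity fallback here; depth-map and sigma-map
already cover all three dimensions.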


> 
> Note that this example doesn't change much if some of the variables are
> dependent upon time. If the lat and lon variable mappings are static, the
> coordinate systems would look like below.
> 
> dimensions:
>         time (UNLIMITED);
>         i = 10;
>         j = 100;
>         k = 100;
> 
> variables:
> 
>         float time(time);
>         float temp(time, i, j, k);
>                 temp:temp = "depth-map sigma-map lat-lon-map";
>         float salinity(time, i, j, k);
>                 salinity:salinity = "depth-map sigma-map lat-lon-map";
>         float density(time, i, j, k);
>                 density:density = "depth-map sigma-map lat-lon-map";
>         float depth(time, i,j,k);
>                 depth:depth = "sigma";
>         float lat(j, k);
>         float lon(j, k);
>         float sigma(i);
> 
>         :depth-map = "depth lon lat";
>         :sigma-map = "sigma lon lat";
>         :lat-lon-map = "lon lat";
>         :sigma = "sigma";
>         :time = "time";
> 
> The time mapping attribute is essentially an encoding of the usual
> interpretation of 1-D coordinate variables. For backwards compatibility, the
> existence of a coordinate variable could imply this attribute. Since none of
> the alternate mappings for temp depend upon the time coordinate, time is
> looked up as its own mapping, and the time global attribute refers back to
> the time variable.
> 
> The temp coordinate systems could also be named "more completely" as
> follows:
> 
>         :depth-map = "time depth lon lat";
>         :sigma-map = "time sigma lon lat";
>         :lat-lon-map = "time i lon lat";
> 
> This is perhaps clearer and might be preferred by some users. The i
> component in the lat-lon-map does not refer to another variable or mapping,
> so it reverts to the identity mapping from the domain of the i coordinate.
> (The identity mapping is effectively the default coordinate system for a
> dimension.)
> 
> Comments
> 
> I thought about allowing the mapping names to refer directly to the
> variable's attributes rather than a global attribute, but in my opinion it's
> better to make a mapping name unique for the whole file and avoid allowing
> conflicting uses of map names between variables. I also considered letting a
> mapping name refer directly to a variable name, so that these global
> attributes would be unnecessary.
> 
>         :sigma = "sigma";
>         :time = "time";
> 
> But that violates the orthogonality and consistency of mapping names always
> being global attributes. I think some confusion could be avoided by naming
> every mapping explicitly as a global attribute, even those mappings
> currently implied by the coordinate variable convention.

I think it's important to have global specification with variable override.
Imagine you have a file with 150 variables in it, and you have to specify the coord
sys for each individual variable. Now you have an ncdump output: gotta check each one
to see if they are the same (barf)! On the other hand, if you have three variables
each with a different coord system, it's more compact to specify at the variable level.
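
A minimal sketch of "global specification with variable override", under
assumptions: the global attribute name "default-maps" and the function
coord_systems_for() are hypothetical names chosen for illustration, and the
per-variable attribute is assumed to be named after the variable, as in the
quoted examples.

```python
def coord_systems_for(var_name, var_attrs, global_attrs,
                      default_key="default-maps"):
    """Map names for one variable: its own attribute wins, else the global."""
    own = var_attrs.get(var_name, {}).get(var_name)
    if own is not None:
        return own.split()
    return global_attrs.get(default_key, "").split()


# Hypothetical file contents: one global default, one variable overriding it.
global_attrs = {"default-maps": "depth-map sigma-map lat-lon-map"}
var_attrs = {
    "temp": {},                                   # inherits the global default
    "salinity": {},                               # inherits the global default
    "wire_temp": {"wire_temp": "distance-map"},   # overrides it
}

for name in var_attrs:
    print(name, coord_systems_for(name, var_attrs, global_attrs))
```

With 150 variables sharing one system, only the global attribute needs to be
read; the exceptions stand out because they are the only ones with their own
attribute.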

> 
> Lastly, there is nothing preventing an application from merging alternate
> mappings of a variable. The temperature mappings could be listed as follows:
> 
>         float temp(time, i, j, k);
>                 temp:temp = "time depth-map sigma-map lat-lon-map";
> 
>         :depth-map = "depth";
>         :sigma-map = "sigma";
>         :lat-lon-map = "lon lat";
>         :sigma = "sigma";
>         :time = "time";
> 
> An application could assemble all possible combinations of mappings and
> their coordinates, and in any order, including:
> 
>  (time(time), i, j, k)
>  (time(time), sigma(i), j, k)
>  (time(time), depth(time,i,j,k), lat(j,k), lon(j,k))
>  (lat(j,k), lon(j,k), i, time)
>  (i, j, k, time(time), sigma(i), depth(time,i,j,k), lon(j,k), lat(j,k))
> 
> The last mapping, consisting of the union of all of the possible
> coordinate systems, might be useful for a table. (A really, really, big
> table. :) The mappings which can be described with this convention are
> merely a superset of those possible with the proposed coordinates
> referential attribute, which essentially lists the independent variables of
> each alternate coordinate system for a variable. I'm merely suggesting an
> extension of the referential idea to allow naming specific sets of
> coordinates, so that each coordinate set can completely and more intuitively
> describe a base coordinate system, not just for one variable but for the
> whole file.

I really think this is the wrong way to go, but since I see that you basically argue
against it below, I'll assume you agree. By saying that you would allow it, though, I
think you blur whether you are listing coordinate systems or coordinate
functions (I think "independent variables" is not accurate). All your good examples,
I think, list coord systems.

> 
> ----------------------------------------------------------------------------
> 
> Examples
> 
> Wire Coil
> 
> Here's Steve's wire coil example using the above ideas. I added two
> coordinate systems for the wire temperature: the physical distance along the
> wire and the cartesian coordinates.
> 
>     dimensions:
>         s = 100;
> 
>     variables:
>         float temp(s);  // temperature along spiral
>             temp:temp = "distance-map cylinder-map cartesian-map";
>         float rho(s);   // distance from CCS center axis
>         float theta(s); // CCS azimuth
>         float z(s);     // CCS height
>         float distance(s);
>                 distance:units = "cm";
>         float y(s);
>         float x(s);
> 
>         // physical distance along wire
>         :distance-map = "distance";
>         // cylindrical coordinate system (CCS)
>         :cylinder-map = "z rho theta";
>         // Cartesian map
>         :cartesian-map = "z y x";

this is good.

> 
> Rectilinear Grid
> 
> The simple coordinate variable case for a rectilinear grid
> 
>         float temp(lat,lon);
>         float lat(lat);
>         float lon(lon);
> 
> becomes:
> 
>         float temp(time,lat,lon);
>                 temp:temp = "world-map";
>         float lat(lat);
>         float lon(lon);
>         :world-map = "lon lat";
> 
> The lat and lon dimensions could be renamed without affecting the
> association described. Since the lat and lon mappings are independent of
> each other in this case, this example could boil down to exactly the
> previous referential attribute examples, except the usual coordinate
> variable convention has been made explicit with a global map attribute:
> 
>         float temp(time,lat,lon);
>                 temp:temp = "lat lon";
>         float lat(lat);
>         float lon(lon);
>         :lat = "lat";
>         :lon = "lon";
> 
> Comparisons to coordinates Attribute
> 
> The temp mappings in the wire coil example can be replaced with their
> constituent coordinates (variables), similar to the Walker and Waring
> example and to Signell's independent variables:
> 
> >       double salt(n,k,j,i);
> >             salt:long_name = "Salinity";
> >             salt:units = "1";
> >             salt:coordinates = "t cell_z cell_y cell_x cell_lat
> 
>                 temp:temp = "distance z rho theta y x";
> 
> An application can generate all possible combinations of the mappings,
> "factor out" (Russ' words) coordinates where possible and if desired, and
> allow a user to choose the ones which make most sense. However, using the
> global coordinate attribute allows the sensible mappings to be explicitly
> named once for a file and referred to directly in each variable, and the
> availability of "common" coordinate systems is more immediately evident to
> the user. Also, the first mapping in a variable's referential attribute can
> designate the default coordinate system.
> 
> Also, I'd prefer using a global attribute over using variable attributes
> with the same name as the corresponding dimension. Several fields with the
> same "manifold" domain would each need attributes for each dimension,
> instead of naming a single mapping in a single variable attribute. A global
> map attribute would be unique for the whole file, whereas dimension
> attributes among different variables would be redundant and could
> contradict. Russ gave similar arguments and others in support of the global
> dimension attribute.
> 
> From John Caron's comments about Walker and Waring:
> >
> > One part of their example I disagree with is lumping all of the
> > coordinate functions together:
> >       double salt(n,k,j,i);
> >             salt:long_name = "Salinity";
> >             salt:units = "1";
> >             salt:coordinates = "t cell_z cell_y cell_x cell_lat
> > cell_lon";
> >
> > While there's nothing illegal about it, better is:
> >             salt:coordinates_xy = "t cell_z cell_y cell_x";
> >             salt:coordinates_latlon = "t cell_z cell_lat cell_lon";
> > emphasizing that you are specifying two coordinate systems.
> 
> I would agree with John, but I'd suggest this construction:
> 
>         double salt(n,k,j,i);
>                 salt:long_name = "Salinity";
>                 salt:units = "1";
>                 salt:salt = "lat-lon-map cartesian-map";
> 
>         :lat-lon-map = "t cell_z cell_lat cell_lon";
>         :cartesian-map = "t cell_z cell_y cell_x";
> 
> The coordinate systems are explicit, and they only need to be specified once
> rather than for each variable.

your formulation would be acceptable; i'm not convinced that the temp:temp part
(an attribute named after its own variable) is worth the trouble; there's nothing
to suggest such an attribute should point to a coordinate system. The tradeoff is
explicitness (e.g. an attribute named "coordinate") vs avoiding "english-centricity".

also, i don't really agree with (or understand) the notion of "factoring out".

> 
> Boundary Layers
> 
> Russ' boundary layer example using referential attributes, converted to
> global map attributes:
> 
>     dimensions:
> 
>         bndlay = 5 ;           // boundary layers
>         lon =  93 ;
>         lat =  65 ;
> 
>     variables:
> 
>         float   RH_bndlay(bndlay, lat, lon) ;
>                 RH_bndlay:long_name = "relative humidity in boundary layer" ;
>                 RH_bndlay:units = "percent" ;
>                 RH_bndlay:RH_bndlay = "boundary-map";
> 
>         float   bndlay_bot(bndlay) ;
>                 bndlay_bot:long_name = "bottom of layer" ;
>                 bndlay_bot:units = "hPa" ;
> 
>         float   bndlay_top(bndlay) ;
>                 bndlay_top:long_name = "top of layer" ;
>                 bndlay_top:units = "hPa" ;
> 
>         :boundary-map = "bndlay_bot bndlay_top lat lon";

i've been toying with the idea of introducing a grouping syntax, like:
         :boundary-map = "(bndlay_bot bndlay_top) lat lon";
but i haven't yet thought it through.
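
To make the grouping idea concrete, here is a rough sketch of how an
application might read such a value. Everything about it is an assumption:
the function name parse_map() is hypothetical, groups are taken to be one
level deep, and a group is taken to mean "these variables jointly describe
one coordinate". None of that has been agreed on.

```python
def parse_map(text):
    """Split a map attribute into names and parenthesized groups of names."""
    parts = []
    current = None  # list being filled while inside "( ... )"
    for token in text.replace("(", " ( ").replace(")", " ) ").split():
        if token == "(":
            if current is not None:
                raise ValueError("nested groups are not supported")
            current = []
        elif token == ")":
            if current is None:
                raise ValueError("unbalanced ')'")
            parts.append(tuple(current))
            current = None
        elif current is not None:
            current.append(token)
        else:
            parts.append(token)
    if current is not None:
        raise ValueError("unbalanced '('")
    return parts


print(parse_map("(bndlay_bot bndlay_top) lat lon"))
```

A value without parentheses parses to a flat list of names, so existing
attributes would not need to change.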


> 
> record Attribute
> 
> The NUWG conventions suggest a record attribute for model grids.
> 
>         float u(record, z, x, y);
>                 u:record = "valtime, reftime";
> 
>         double valtime(record);
>                 valtime:long_name = "valid time of model";
>                 valtime:units = "minutes since (1993-1-1 00:00:00.0)";
> 
>         double reftime(record);
>                 reftime:long_name = "reference time of model";
>                 reftime:units = "minutes since (1993-1-1 00:00:00.0)";
> 
> Using map attributes, the multiple time components become a global mapping,
> similar to Russ' example of a dimension attribute for time:
> 
>  2.  A global "dimension attribute":
> 
>      :time = "year day_of_year second_of_day";
> 
> Following are the corresponding examples using global map attributes. The
> time mappings could be combined with lat/lon or other mappings for the other
> dimensions as needed.
> 
>         variables:
>                 float u (record, z, y, x);
>                    u:u = "model-time-map";
>                 float observed_temp (record, z, y, x);
>                    observed_temp:observed_temp =
>                         "time-map model-time-map month-map";
>                 int year(record);
>                 int day_of_year(record);
>                 float second_of_day(record);
>                 char month(record,4);
> 
>         :time-map = "year day_of_year second_of_day";
>         :model-time-map = "valtime reftime";
>         :month-map = "year month";

this is a confusing example, partly because of the indirection,
but once i parse it, i guess it's straightforward
(i'll eliminate the x,y,z dimensions):
we have two variables, u(record) and observed_temp(record).
u has coordinate system model-time-map, and observed_temp
has coordinate systems time-map, model-time-map, and month-map.

it would be surprising (though certainly possible) if, in a real file,
u did not also have the other two time coords.
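
One way to undo the indirection is to invert the per-variable attributes
into a table of coordinate system -> variables. A small sketch, where the
dictionary var_maps simply restates the quoted attributes and the function
name variables_by_map() is hypothetical:

```python
from collections import defaultdict

# The variable-named attributes from the quoted example, restated by hand.
var_maps = {
    "u": "model-time-map",
    "observed_temp": "time-map model-time-map month-map",
}


def variables_by_map(var_maps):
    """Invert variable -> map names into map name -> variables."""
    index = defaultdict(list)
    for var, maps in var_maps.items():
        for map_name in maps.split():
            index[map_name].append(var)
    return dict(index)


for map_name, users in variables_by_map(var_maps).items():
    print(map_name, users)
```

The inverted table shows directly that only model-time-map is shared by both
variables.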

> 
> Grid levels
> 
> Here is an example based on the NUWG conventions for alternate grid level
> coordinates, p or vpt.
> 
>         float u(record, z, x, y);
>                 u:z = "vpt, p";
> 
> Using the global map attributes:
> 
>         float u(record, z, x, y);
>                 u:u = "pres-grid-map vpt-grid-map";
>         float p(record, z, x, y);
>                 p:p = "vpt-grid-map";
>         float vpt(record, z, x, y);
>                 vpt:vpt = "pres-grid-map";
> 
>         :pres-grid-map = "p x y";
>         :vpt-grid-map = "vpt x y";
> 
> The default coordinate system for u would use pressure p for the height
> axis, but availability of vpt is evident.

actually, it's explicit.

> 
> ----------------------------------------------------------------------------
> 
> Extensions
> 
> Set Notation
> 
> Looking to the future, the global mapping attribute could adopt some
> specialized syntax so that simple mappings could be included as a formula
> rather than a whole array of data. If the distance variable in the wire coil
> example above is proportional to s, then distance-map could describe the
> mapping using set notation.
> 
>         :distance-map = "{ (s) : s*1.6 }"

this could be worth exploring. we'd need a syntax definition.
did i say we couldn't embed methods? 
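
As a starting point for that discussion, here is a sketch of an evaluator
for the one form shown, "{ (dims) : expression }". The grammar is an
assumption made up for this sketch (comma-separated dimension names, and an
expression limited to numbers, those names, and + - * /), and the function
name compile_map() is hypothetical.

```python
import ast
import operator
import re

# Only these arithmetic operators are accepted in a formula.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}
_PATTERN = re.compile(r"^\s*\{\s*\(([^)]*)\)\s*:\s*(.+?)\s*\}\s*$")


def _eval(node, env):
    """Evaluate a restricted arithmetic expression tree."""
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left, env),
                                   _eval(node.right, env))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval(node.operand, env)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.Name) and node.id in env:
        return env[node.id]
    raise ValueError("unsupported expression in map formula")


def compile_map(text):
    """Turn '{ (s) : s*1.6 }' into a function of the index s."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError("not a '{ (dims) : expr }' formula")
    dims = [d.strip() for d in match.group(1).split(",") if d.strip()]
    tree = ast.parse(match.group(2), mode="eval").body

    def mapping(*index):
        if len(index) != len(dims):
            raise ValueError("wrong number of indices")
        return _eval(tree, dict(zip(dims, index)))

    return mapping


distance = compile_map("{ (s) : s*1.6 }")
print(distance(10))
```

Walking the parsed tree instead of calling eval() keeps function calls and
attribute access out of the formula, which matters if files come from
untrusted sources.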

> 
> Virtual Vector Variables
> 
> Since the global map attribute essentially lists multiple components of an
> n-space range, it could also be interpreted as a vector. Perhaps call it a
> virtual vector variable:
> 
>         float u_wind (time);
>         float v_wind (time);
>         float w_wind (time);
> 
>         :wind = "u_wind v_wind w_wind";         // 3-D vectors over time
> 
> The problem is how to clearly indicate wind as a virtual variable. Perhaps
> in a global vectors or variables attribute.

how about
	:vector_<name> = "list of components";
e.g.
	:vector_wind = "u_wind v_wind w_wind";         // 3-D vectors over time

although the question of what applications might do with that info is unanswered.
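
One possible use, sketched under assumptions: an application scans the
global attributes for the vector_ prefix and offers derived quantities such
as the vector magnitude. The function names and the sample wind values are
hypothetical.

```python
import math


def vectors(global_attrs, prefix="vector_"):
    """Collect virtual vectors: name -> list of component variable names."""
    return {name[len(prefix):]: value.split()
            for name, value in global_attrs.items()
            if name.startswith(prefix)}


def magnitude(components, data, t):
    """Euclidean length of the vector at time index t."""
    return math.sqrt(sum(data[c][t] ** 2 for c in components))


# Hypothetical file contents, invented for illustration.
global_attrs = {
    "vector_wind": "u_wind v_wind w_wind",
    "depth-map": "depth lon lat",       # not a vector, so it is ignored
}
data = {
    "u_wind": [3.0, 0.0],
    "v_wind": [4.0, 0.0],
    "w_wind": [0.0, 2.0],
}

found = vectors(global_attrs)
print(found)
print(magnitude(found["wind"], data, 0))
```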

> 
> Sub-mapping
> 
> It would be interesting to be able to specify mappings which map directly to
> the coordinates of dimensions in the file. Suppose we wanted to specify a
> subset of the ocean temperature grid in the first example. We could specify
> a coordinate system whose range maps a subset of the domain of the temp
> variable over (i,j,k). As one possibility, specify integer variables which
> map into i, j, and k, and a global mapping which uses a colon (':') to
> indicate that the coordinates map directly into the i, j, and k coordinates
> of temp. The same method could be used for specifying the path of a particle
> through the grid:
> 
>         dimensions:
>                 i = 10;         // Whole grid
>                 j = 100;
>                 k = 100;
>                 i2 = 10;        // Region of grid
>                 j2 = 20;
>                 k2 = 20;
> 
>         float temp (time, i, j, k);
>                 temp:temp = "subgrid-map particle-trace-map";
>         float depth(time, i,j,k);
>                 depth:depth = "depth-estimate particle-trace-map";
>         float lat(j, k);
>                 lat:lat = "particle-trace-map";
>         float lon(j, k);
>                 lon:lon = "particle-trace-map";
> 
>         int a (i2, j2, k2);
>         int b (i2, j2, k2);
>         int c (i2, j2, k2);
> 
>         int particle_i (time);
>         int particle_j (time);
>         int particle_k (time);
> 
>         :subgrid-map = "a:i b:j c:k";
>         :particle-trace-map = "particle_i:i particle_j:j particle_k:k";
> 
> Using a "sub-mapping" implies a different domain for the variable's mapping.
> In the case of plotting the water temperature along the particle track, the
> user asks for temp over the particle-trace-map, which an application can
> directly expand to
> 
> temp (particle_i(time), particle_j(time), particle_k(time))
> 
> The particle's locations would be
> 
>  lat (particle_j(time), particle_k(time))
>  lon (particle_j(time), particle_k(time))
>  depth (time, particle_i(time), particle_j(time), particle_k(time))
> 
> which in turn might be described by a "virtual vector":
> 
>         :particle_locn = "lat lon depth";
> 
> If the sub-grid were sampled along each dimension independent of the other
> dimensions:
> 
>         int a (i2);
>         int b (j2);
>         int c (k2);

sorry, my eyeballs glazed over....
i'll try this example again later....
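
For what it's worth, the particle-trace part of the example can be sketched
in a few lines. This is an interpretation, not a definition: it assumes
"var:dim" means "use the values of var as indices into dim", that temp keeps
its own time index, and that nested lists stand in for the file's arrays.
The function names and all data values are hypothetical.

```python
def parse_submap(text):
    """'particle_i:i particle_j:j' -> {'i': 'particle_i', 'j': 'particle_j'}."""
    pairs = (item.split(":") for item in text.split())
    return {dim: var for var, dim in pairs}


def sample(field, dims, submap, index_vars, t):
    """Value of `field` at time step t along the sub-mapping."""
    value = field
    for dim in dims:
        # 'time' is indexed directly; other dims go through the index variable.
        position = t if dim == "time" else index_vars[submap[dim]][t]
        value = value[position]
    return value


# Tiny 2x2x2x2 grid whose values encode their own (time, i, j, k) position.
temp = [[[[1000 * t + 100 * i + 10 * j + k for k in range(2)]
          for j in range(2)] for i in range(2)] for t in range(2)]
index_vars = {
    "particle_i": [0, 1],
    "particle_j": [1, 0],
    "particle_k": [1, 1],
}

trace = parse_submap("particle_i:i particle_j:j particle_k:k")
track = [sample(temp, ("time", "i", "j", "k"), trace, index_vars, t)
         for t in range(2)]
print(track)
```

The result is a 1-D series over time, which is what "a different domain for
the variable's mapping" amounts to in this reading.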

> 
> Close
> 
> I left out details, like the specific syntax of the referential attribute,
> and I'm concentrating specifically on describing coordinate systems and
> mappings. Other conventions could be added later, such as units, projection
> types, labelling of geographic coordinates so that applications can
> recognize them, orientation of coordinates (e.g. vertical vs horizontal),
> and so on.
> 
> I've been working with netcdf since the beginning, especially with regards
> to visualization software (Zebra) which stores and interprets
> multi-dimensional data. Some netcdf conventions for describing these data
> and their geographic coordinates would be very helpful, so I wanted to
> contribute some ideas. If any of my reasoning is wrong, faulty, or unclear,
> please let me and/or the netcdf mailing list know. Thanks for taking time to
> check this out.
> 
> ----------------------------------------------------------------------------
> 
>                                                                 Gary Granger
>                                                                July 12, 1997

i would be interested in examples from real life that you think a convention should cover,
especially any not already listed in http://acd.ucar.edu/~caron/coordvar.html

Regards,

John.