This document was completed 3/20/95 by Peggy Bruehl of Unidata.
This document was updated 4/21/95 by Peggy Bruehl of Unidata.

Discussion of netCDF Conventions for Georeferenced Gridded Data
NUWG Meeting 2/15/95
Peggy Bruehl

Introduction:

This document is intended to be used in conjunction with the sample gridded data CDL files: Sample Georeferenced Gridded Data CDL's

History:

At the 2/15 NUWG meeting, I presented a summary of our NUWG gridded data netCDF conventions. A few unresolved issues regarding these conventions recently arose during email traffic on the NUWG mailing list. In addition, I am trying to finish a project that adds a netCDF interface for gridded data to GEMPAK, so I had some questions too. I presented the following summary in an effort to get 'approval' from all NUWG committee members present, and to hammer out solutions to the remaining issues. We were reasonably successful, and the summary of our discussion is below. I've added some verbiage from Susan Jesuroga's document "General Conventions approved by NUWG" in order to flesh out some of the topics discussed.

I'm very open to comments and suggestions about the presentation and content of the material below. Also, if I've misrepresented any discussions, topics, or decisions in the following document, please let me know so I may make any necessary corrections.

NUWG netCDF Conventions for Georeferenced Gridded Data

General Summary

In general, the NUWG has decided to require the following minimum set of conventions for data exchange:

1) No naming conventions for variables
   This allows the author of the netCDF file full discretion in naming variables. The NUWG recognizes that truly generic software for reading and accessing netCDF data cannot depend on variable names; it will have to use the netCDF inquire functions or employ some sort of table look-up algorithm.

2) No type conventions
   Exception: time variables must be of type double. This is required in order to support a variety of time scales and precisions.

3) Variables must have a "long_name" attribute
   This convention is for self-documentation, and is particularly important because of our decision not to implement specific naming conventions. The "long_name" attribute gives the human recipient of the netCDF file a description of the quantity, regardless of its name.

4) Where appropriate, variables must have a "units" attribute
   The "units" attribute must conform to the UDUNITS library and be consistent for a given variable. By consistent, we mean that if temperatures are stored in a given variable "temp", and the units attribute is set to "degF", all the temperatures must be in Fahrenheit. Some quantities do not have units, and for those a "units" attribute is not required. The "long_name" attribute is very important in this case.
   The UDUNITS distribution is freely available via anonymous FTP from the Unidata Program Center, unidata.ucar.edu, in the file pub/udunits/udunits.tar.Z. Included in this distribution is the udunits database, udunits.dat, which contains a list of all acceptable units strings.

5) Other attributes follow netCDF documentation
   For example, _FillValue, valid_range, scale_factor, add_offset, etc. A good example of how these attributes may be implemented is contained in the document "Conventions for the standardization of NetCDF files" sponsored by the Cooperative Virtual Data Center (http://ferret.wrc.noaa.gov/noaa_coop/coop_cdf_profile.html).

Again, we stress that all the NUWG conventions should be considered a MINIMUM set of conventions for data exchange. The data provider is always free to add additional fields.

The rest of this document concerns three special topics: time, the representation of vertical coordinates and levels, and navigation issues.

Special Topics: (1) Time

The NUWG agrees that all time fields will be represented so as to be compatible with the UDUNITS library. We agree that time should always be represented as a double, even if the precision of the data does not require it. This gives the most flexibility to the widest range of data sets.

Initially, our gridded data conventions were designed to support model output grids. Model output grids generally require two times to define the grid, a reference time (T0) and a valid (or forecast) time.
Later, we broadened our description to support not only model output grids with reference and valid times, but also simple grids requiring only a single time.

Here is our generic convention for representing time in a gridded data file:

    double reftime(record);            // reference time of the grid
        reftime:long_name = "reference time";
        reftime:units = "hours since 1992-1-1";
    :record = "reftime";

The particular problem associated with storing multiple model output grids in a single netCDF file is the need to specify time as the unlimited dimension. This is complicated by having both a reference time and a valid time associated with each grid. In our NUWG conventions, we introduced the concept of a "record" to be the unlimited dimension. The goal is to define referential attributes of a variable such as:

    double reftime(record);            // reference time of the model
        reftime:long_name = "reference time";
        reftime:units = "hours since 1992-1-1";
    double valtime(record);            // forecast time (for which model
                                       // is valid)
        valtime:long_name = "valid time";
        valtime:units = "hours since 1992-1-1";
    :record = "reftime, valtime" ;     // "dimension attribute" -- means
                                       // reftime and valtime uniquely
                                       // determine record

This indicates that the "nth" record is defined by the "nth" value in both the "valtime" and "reftime" variables, which contain the actual times for the grid. This allows us to refer to both the valid and reference time through a single unlimited dimension. When applied to grids requiring only a single time, this convention reduces to the single-time form shown first.

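As an illustration of how a reader might follow the "record" attribute described above, here is a minimal Python sketch. It is not part of the NUWG conventions: the in-memory dictionaries stand in for a real netCDF file, the helper names decode_time and record_times are hypothetical, and the units parsing handles only the simple "UNIT since YYYY-M-D" form used in the examples (a real reader would hand the units string to the UDUNITS library).

```python
from datetime import datetime, timedelta

# Seconds per unit for the few time units this sketch understands.
# Assumption: a real reader would delegate this to UDUNITS instead.
SECONDS_PER_UNIT = {"seconds": 1.0, "minutes": 60.0, "hours": 3600.0, "days": 86400.0}


def decode_time(value, units):
    """Convert a numeric time to a datetime, given units of the form
    'UNIT since YYYY-M-D' (hypothetical helper, simple form only)."""
    unit, since, origin = units.split(None, 2)
    if since != "since":
        raise ValueError("unsupported units string: %r" % units)
    year, month, day = (int(part) for part in origin.split("-"))
    return datetime(year, month, day) + timedelta(seconds=value * SECONDS_PER_UNIT[unit])


def record_times(variables, global_attrs, record_index):
    """Return {time variable name: datetime} for one record, following the
    global 'record' attribute that names the variables defining a record."""
    names = [name.strip() for name in global_attrs["record"].split(",")]
    return {
        name: decode_time(variables[name]["data"][record_index], variables[name]["units"])
        for name in names
    }


# Hypothetical in-memory stand-in for a file holding two records.
variables = {
    "reftime": {"units": "hours since 1992-1-1", "data": [0.0, 0.0]},
    "valtime": {"units": "hours since 1992-1-1", "data": [6.0, 12.0]},
}
global_attrs = {"record": "reftime, valtime"}

times = record_times(variables, global_attrs, 1)
print(times["reftime"], times["valtime"])
```

Because the reader takes the variable names from the "record" attribute rather than assuming "reftime" or "valtime", the same code handles a single-time grid whose attribute names only one variable.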
Summary of conventions:

1) Time variables are always type double
2) Time variables are indexed by another variable (can be the unlimited dimension)
3) The names given to the time variables and the indexing variables are not subject to convention
4) This convention supports grids requiring any number of times to fully describe the data

Special Topics: (2) Vertical Coordinate Systems/Levels

The simple case of a grid defined in a single vertical coordinate system, by a list of set levels, can be represented as:

    dimensions:
        level = 20;
    variables:
        float T(record, level, y, x) ;
            T:long_name = "temperature" ;
            T:units = "degK" ;
        float level(level) ;
            level:long_name = "level" ;
            level:units = "hectopascals" ;
    data:
        level = 1000, 950, 900, 850, 800, 750, 700, 650, 600, 550, ...

This convention is representative of the way gridded data defined in traditional vertical coordinate systems (pressure, potential temperature, height, etc.) should be stored.

Occasionally, a grid may require two levels to fully specify the data. In this case, the referential concept introduced in the time conventions can be useful. An example of this is a grid defining a quantity over a layer, such as the boundary layer.

    dimensions:
        bndry = 3;
    variables:
        float T-bndry(record, bndry, y, x) ;
            T-bndry:long_name = "temperature in boundary layer" ;
            T-bndry:units = "degK" ;
        :bndry = "bndry_bot, bndry_top";   // (bndry_bot, bndry_top) uniquely
                                           // determine bndry
        float bndry_bot(bndry) ;
            bndry_bot:long_name = "bottom level of boundary layer between 2 levels at specified pressure differences from ground to levels";
            bndry_bot:units = "hectopascals";
        float bndry_top(bndry) ;
            bndry_top:long_name = "top level of boundary layer between 2 levels at specified pressure differences from ground to levels";
            bndry_top:units = "hectopascals";
    data:
        bndry_bot = 0, 60, 150;
        bndry_top = 30, 90, 180;

Finally, the referential concept is also useful for describing hybrid grids. Hybrid grids contain a relative vertical coordinate "z".
Here's an example: the value of the u component of the horizontal wind at a given grid point may be described by a vertical level in the pressure coordinate, the value of which is stored in variable "p", or it may be described by a vertical level in the virtual potential temperature coordinate, the value of which is stored in variable "vpt". The referential attribute is used as follows:

    float u(record, z, x, y );
        u:z = "vpt, p";
        u:long_name = "u component of horizontal wind";
        u:units = "meters/second";

This specifies that for grid point u(1,2,1,1), the value of the "z" dimension (vertical level) can be found in p(1,2,1,1) or in vpt(1,2,1,1).

Summary of conventions:

1) When necessary, a referential variable can be used as an index into associated variables
2) This referential indexing is indicated by a variable or global attribute with the same name as the dimension

Special Topics: (3) Navigation

Of all the special topics, the conventions concerning navigation are the least mature. Thus far, we have agreed that the navigation information associated with a grid will be stored in a suite of navigation variables. These variables are grouped together into a pseudo-structure. Each element of the pseudo-structure is dimensioned by the "nav" dimension. In addition, there is a variable, "nav_model", which describes the source of the navigation parameterization and provides the context in which to interpret all the other variable names. For example:

    char nav_model(nav, nav_len) ;     // navigation parameterization
        nav_model:long_name = "navigation model name";
    ...

For GRIB-centric data, its value is:

    nav_model = "GRIB1" ;

but for parameterizations based on the Federal Geographic Data Committee Content Standards for Digital Geospatial Metadata it could be:

    nav_model = "FGDC-1994" ;

and for parameterizations based on the geo-TIFF model, it might be:

    nav_model = "geo-TIFF version 1" ;

Notice that this mechanism makes it possible to use multiple navigation parameterizations within the same netCDF file. We haven't yet agreed on how many values of nav_model our generic applications should support, but we want to support at least "GRIB1". We will be discussing this in upcoming meetings.

When nav_model is "GRIB1", we take the navigation parameterization from the GRIB Edition 1 document by John Stackpole, in the section on the GDS (Grid Description Section), octets 7-44. This document is available as text (abbreviated version) and as a PostScript file.

The actual set of variables needed for any given navigation parameterization depends on that navigation. For example, the variables needed to describe a polar stereographic grid are different from the variables needed to describe a simple lat/lon grid.

In the case of GRIB Edition 1 navigation parameterizations, each suite of navigation variables must contain a numeric ID holding the grid identification number, and an indication of the originating center, both as assigned by the GRIB Edition 1 document. Missing values may be used if the particular grid is not described by the GRIB document. For example, the navigation variables "grid_number" and "center_id" may be used.

Each variable that is defined on the grid must have the "navigation" variable attribute associated with it. The string defined in this attribute gives the name of the dimension by which all navigation variables are dimensioned. In this way, the "navigation" attribute groups all the navigation variables together (in the same sense that a structure groups quantities of varying types together).
The "navigation" attribute also indicates which variables (temperature, pressure, etc.) in a netCDF file are defined on the grid.

Here is an example taken from the CDL used by Unidata to hold the RUC MAPS model. The RUC MAPS model output is defined on a Lambert Conic Conformal grid projection. Note that the ordering of the dimensions which define the grid variables, (x,y) or (lat,lon), is not subject to convention, nor are the names assigned to these dimensions. However, the names assigned to these dimensions are recorded in the navigation variables "x_dim" and "y_dim".

    dimensions:
        x = 93;
        y = 65;
        nav = 1;         // For navigation. Variables that use
                         // this dimension define a mapping between
                         // (x,y) indices and (lat, lon) coords.
        nav_len = 100 ;  // Max length for navigation character strings

    variables:
        float Z(record, level, y, x) ;
            Z:long_name = "geopotential height" ;
            Z:units = "gp m" ;
            Z:navigation = "nav" ;

        char nav_model(nav, nav_len) ;   // navigation parameterization
            nav_model:long_name = "navigation model name";
        char grid_type(nav, nav_len) ;
            grid_type:long_name = "GRIB-1 grid type" ;
        char grid_name(nav, nav_len) ;
            grid_name:long_name = "grid name" ;
        short grid_number(nav) ;
            grid_number:long_name = "GRIB-1 catalogued grid number" ;
        long center_id(nav) ;
            center_id:long_name = "WMO centers table";
        char earth_shape(nav, nav_len) ;
            earth_shape:long_name = "assumed earth shape" ;
        char x_dim(nav, nav_len) ;
            x_dim:long_name = "x dimension name" ;
        char y_dim(nav, nav_len) ;
            y_dim:long_name = "y dimension name" ;
        short Nx(nav) ;
            Nx:long_name = "number of points along x-axis" ;
        short Ny(nav) ;
            Ny:long_name = "number of points along y-axis" ;
        float La1(nav) ;
            La1:long_name = "latitude of first grid point" ;
            La1:units = "degrees_north" ;
        float Lo1(nav) ;
            Lo1:long_name = "longitude of first grid point" ;
            Lo1:units = "degrees_east" ;
        byte ResCompFlag(nav, rescompdim) ;
            ResCompFlag:long_name = "resolution and component flags" ;
        float Lov(nav) ;
            Lov:long_name = "orientation of the grid" ;
            Lov:units = "degrees_east" ;
        float Dx(nav) ;
            Dx:long_name = "x-direction grid length" ;
            Dx:units = "km" ;
        float Dy(nav) ;
            Dy:long_name = "y-direction grid length" ;
            Dy:units = "km" ;
        byte ProjFlag(nav) ;
            ProjFlag:long_name = "projection center flag" ;
        float Latin1(nav) ;
            Latin1:long_name = "first intersecting latitude" ;
            Latin1:units = "degrees_north" ;
        float Latin2(nav) ;
            Latin2:long_name = "second intersecting latitude" ;
            Latin2:units = "degrees_north" ;
        float Dj(nav) ;
            Dj:long_name = "j grid increment" ;
            Dj:units = "degrees" ;

    data:
        nav_model = "GRIB1" ;            // Navigation
        grid_type = "GRIB1 Lambert conformal" ;
        grid_name = "AWIPS grid 211: Regional CONUS" ;
        grid_number = 211 ;
        center_id = 7 ;
        earth_shape = "oblate spheroid (IAU 1965)" ;
        x_dim = "x";
        y_dim = "y";
        Nx = 93;
        Ny = 65;
        La1 = 12.190;
        Lo1 = 226.541;
        ResCompFlag = 0,0,0,0,1,0,0,0;
        Lov = 265.0;
        Dx = 81.2705;
        Dy = 81.2705;
        ProjFlag = 0;
        Latin1 = 25.0;
        Latin2 = 25.0;

Summary of Conventions:

1) Navigation information is stored in variables dimensioned by the dimension named in the variable attribute "navigation"
2) All grid variables have the "navigation" attribute
3) The "nav_model" variable holds a string containing a key word which indicates the source of the navigation parameterization. Thus far, the NUWG has officially accepted "GRIB1" as the key word for navigation parameterizations defined in the GRIB edition 1 document, GDS octets 7-44, "Grid description" (Table C). That document defines the content of the navigation parameterization, not necessarily the explicit variable names to be used.
4) For GRIB1 parameterizations, a numeric ID giving the Grid Identification number and an originating center ID from the GRIB edition 1 document must be included in the navigation variables. Missing data values are OK for grids not described by a GRIB document.
5) The ordering and naming of grid dimensions are not subject to convention. The names of the dimensions that define grid variables are given by the "x_dim" and "y_dim" navigation variables.
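
The grouping role of the "navigation" attribute summarized above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary below is a hypothetical in-memory stand-in for a few variables of the RUC MAPS example, the function name navigation_groups is invented for this sketch, and treating "leading dimension equals the navigation dimension" as the membership test is one plausible reading of the convention rather than an agreed NUWG rule.

```python
def navigation_groups(variables):
    """Group variables by the dimension named in their 'navigation' attribute.

    Returns {nav dimension: {"grids": [...], "nav_vars": [...]}}.
    """
    groups = {}
    # Pass 1: every variable carrying a "navigation" attribute is a grid variable.
    for name, var in variables.items():
        nav_dim = var.get("attrs", {}).get("navigation")
        if nav_dim is not None:
            groups.setdefault(nav_dim, {"grids": [], "nav_vars": []})
            groups[nav_dim]["grids"].append(name)
    # Pass 2: variables whose leading dimension is a navigation dimension
    # are taken to belong to that navigation suite (assumption of this sketch).
    for name, var in variables.items():
        dims = var["dims"]
        if dims and dims[0] in groups:
            groups[dims[0]]["nav_vars"].append(name)
    for group in groups.values():
        group["grids"].sort()
        group["nav_vars"].sort()
    return groups


# Hypothetical stand-in for part of the RUC MAPS file described above.
variables = {
    "Z": {"dims": ("record", "level", "y", "x"), "attrs": {"navigation": "nav"}},
    "level": {"dims": ("level",), "attrs": {}},
    "nav_model": {"dims": ("nav", "nav_len"), "attrs": {}, "data": ["GRIB1"]},
    "x_dim": {"dims": ("nav", "nav_len"), "attrs": {}, "data": ["x"]},
    "y_dim": {"dims": ("nav", "nav_len"), "attrs": {}, "data": ["y"]},
    "Nx": {"dims": ("nav",), "attrs": {}, "data": [93]},
}

groups = navigation_groups(variables)
# Names of the horizontal dimensions, read from x_dim and y_dim rather than assumed.
horizontal = (variables["x_dim"]["data"][0], variables["y_dim"]["data"][0])
print(groups["nav"], horizontal)
```

Nothing in the sketch depends on the names "Z" or "nav"; a generic application would discover both from the attribute, in keeping with the decision not to impose naming conventions.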