[netcdf-java] GRIB variable name changes in 4.3

To all:

For version 4.3 of the CDM / netCDF-Java library (and also TDS version 4.3), we are considering a radical change in the way that GRIB variables are named. Instead of nice human-readable names like

   float Temperature(time=1, lat=361, lon=720);

the variable names now look like

   float VAR_0-0-0_L6_I6_Hour_S194(time=1, lat=361, lon=720);

with the human-readable name in the long_name attribute:

     :long_name = "Temperature (6_Hour Average) @ Maximum wind level";

The reason for this change is that the "nice human-readable names" come from external GRIB tables; the names are not in the files themselves. GRIB table parameter names are not required to be unique, simple, or unchanging, i.e. they are not required to be suitable as netCDF variable names. Maintainers of GRIB tables often make minor changes to GRIB names, correcting typos or otherwise improving readability. In some cases, the GRIB names are changed completely. When the CDM starts to use new versions of the tables, the variable names can and do change. Since calls to access data use the name of the variable, many things break when the name changes.

Any GRIB-to-netCDF translation software must either hand-maintain the tables to prevent names from changing (and to fix duplicate or unsuitable names), or do something else. Hand-maintaining GRIB tables is not a viable option due to resource constraints. The something else is to give variables unique names based only on the information actually in the file. The NCL package has adopted a similar scheme:

http://www.ncl.ucar.edu/Document/Manuals/Ref_Manual/NclFormatSupport.shtml#GRIB
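To illustrate what "unique names based only on the information actually in the file" can mean, here is a minimal sketch that assembles a name shaped like the example above from numeric codes. The class and method names are hypothetical, and the meaning assigned to each field (discipline-category-parameter, level type, interval, statistic type) is an assumption read off the single example name, not a statement of the library's actual algorithm.

```java
// Hypothetical sketch: not part of netCDF-Java.
class GribNameSketch {

    /**
     * Builds a table-independent variable name from codes found in the file.
     * Field meanings are assumed from the example VAR_0-0-0_L6_I6_Hour_S194.
     *
     * @param interval time interval label such as "6_Hour", or null if the
     *                 variable is not a time-interval statistic
     * @param statType statistic type code; ignored when interval is null
     */
    static String build(int discipline, int category, int parameter,
                        int levelType, String interval, int statType) {
        StringBuilder sb = new StringBuilder("VAR_");
        sb.append(discipline).append('-').append(category).append('-').append(parameter);
        sb.append("_L").append(levelType);
        if (interval != null) {
            // Only interval variables carry the _I and _S parts in this sketch.
            sb.append("_I").append(interval).append("_S").append(statType);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(build(0, 0, 0, 6, "6_Hour", 194));
        System.out.println(build(0, 0, 0, 6, null, 0));
    }
}
```

Because every part of the name comes from codes in the file, a corrected typo in an external table entry would change only the long_name, not the variable name.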

More background on this problem is here:

http://www.unidata.ucar.edu/staff/caron/papers/GRIBarchivals.pdf

Another aspect of this problem is that, in reviewing how variable names are created and how GRIB tables are handled, we found errors in version 4.2: in the GRIB tables, in the handling of GRIB time intervals and ensemble data, and in the algorithm for generating names when multiple variables from the same parameter are in the same file. About 1 in 5 variable names (in the NCEP IDD data) need to change to fix these problems. Rather than fixing the problems piecemeal, we are trying to make one big change all at once, then do our best to not let this happen again.

The main impact this will have is probably on:

1) scripts or IDV bundles that have a GRIB variable name in them; hopefully a one-time change will fix this.

2) interactive applications that are built on top of the CDM. For GRIB, users will need to see the long_name, not the variable name, to know what they want. However, the CDM presents a uniform interface for all files, not just GRIB, so the application can't assume that the long_name is even present. So the application should present both the variable name and the long_name (if it exists) to the user when selecting variables.
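For application developers, the suggestion in item 2 might look roughly like the sketch below. The class and method names are hypothetical; in a real application the variable name and the long_name would come from the CDM's variable and attribute objects, which are left out here so that the example stands alone.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: not part of netCDF-Java.
class VariableLabels {

    /**
     * Returns the text to show in a variable picker: the long_name followed
     * by the variable name in brackets, or just the variable name when the
     * long_name is missing or blank.
     */
    static String label(String name, String longName) {
        if (longName == null || longName.trim().isEmpty()) {
            return name;
        }
        return longName.trim() + " [" + name + "]";
    }

    public static void main(String[] args) {
        // Variable name -> long_name (null when the attribute is absent).
        Map<String, String> vars = new LinkedHashMap<>();
        vars.put("VAR_0-0-0_L6_I6_Hour_S194",
                 "Temperature (6_Hour Average) @ Maximum wind level");
        vars.put("lat", null);

        for (Map.Entry<String, String> e : vars.entrySet()) {
            System.out.println(label(e.getKey(), e.getValue()));
        }
    }
}
```

Showing both pieces keeps the selection list readable for GRIB files while still working for files whose variables have no long_name.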

We think that this change, though painful, is a necessary way forward, but we want to get input from users, and especially from application developers. The latest 4.3 snapshot has these changes; please try it out and let us know what you think and how it will affect you. Post your comments to these email lists so the entire discussion can be public.

thanks

John, Ethan


