NetCDF Status

Russ Rew
August 25, 2006


NetCDF and Unidata

Work in maintaining, supporting, and developing the netCDF data model and software is associated with Endeavor 6: Improved scientific data access infrastructure from the Unidata 2008 proposal. NetCDF has become a key infrastructure element for data providers and users of oceanographic and atmospheric science data, as well as data in other geosciences.

Recent netCDF development, both at Unidata and at other institutions, aims at generalizing the netCDF data model, improving interoperability with other representations for scientific data, making the netCDF interface more suitable for use on high-end parallel platforms with high-resolution models, and providing netCDF software on a wider range of platforms.

NetCDF-4/HDF5 Development

NetCDF-4 is the name of a major NASA-funded project to implement an enhanced netCDF programming interface on top of NCSA's HDF5 format, preserving the desirable common characteristics of netCDF and HDF5 while taking advantage of their separate strengths: the widespread use and simplicity of netCDF and the generality and performance of HDF5. NetCDF-4 development has continued, with an alpha release of netCDF-4.0 improving support for new platforms and compilers, performance, functionality, and documentation.

NetCDF-4 depends on the still unreleased version 1.8 of HDF5, now tentatively scheduled for release in December 2006, if no unanticipated problems are encountered. The delay in releasing HDF5 1.8 continues to be an obstacle in making netCDF-4 functionality available to developers and users. We met with Kent Yang from the HDF Group in July to discuss parallel I/O performance and get an update on HDF5 1.8 progress.

Other NetCDF Developments

Since the last status report, we made available four beta releases of netCDF 3.6.2, with significant improvements to the netCDF configure and build system, fixes for platform-specific build problems, and several complete example programs demonstrating the C, C++, Fortran-77, and Fortran-90 interfaces in a new netCDF tutorial for developers. Advances in the new version of netCDF-3 include support for shared libraries, support for multiple Fortran compilers on the same platform, C++ interface improvements, and better support for building Windows DLLs.

Work continued on developing the Common Data Model and the netCDF-Java library, adding the ability to read BUFR data through a netCDF interface and a framework for users to plug in their own coordinate transforms.

We mentored a SOARS student, Shanna-Shaye Forbes, this summer to design and partially implement a new C++ interface for netCDF-4. We expect to make use of this work when the implementation is completed during the next year.

This fall we will be developing and presenting two new training workshops, NetCDF for Developers and NetCDF Java. The first workshop will provide an overview of the netCDF data model and architecture, comparison of language interfaces, summary of best practices, discussion of conventions, performance, utilities, visualization and analysis applications, and an overview of netCDF-4. The NetCDF Java workshop will cover the Common Data Model (CDM), working with CDM files, use of NcML, use of the I/O Service Provider framework for reading new file formats into the CDM, and other useful plug-ins for adding support for additional coordinate systems, coordinate transforms, and datatypes. We have scheduled multiple sessions of the new netCDF workshops to accommodate the high interest following their announcement.

In response to an invitation from the NASA Earth Science Data Systems Standards Process Group, we provided a review of HDF5 from our perspective of developers attempting to build the netCDF-4 library using HDF5 as a storage layer. We offered detailed but constructive criticism of the HDF5 standards proposal and support for NASA's ultimate standardization of HDF5.

NetCDF Posters, Papers, and Presentations

At a June Global Organization for Earth System Science Portals (GO-ESSP) meeting at Lawrence Livermore, Russ presented an update on netCDF developments and a plan for collaborative development of a new library to support the widely used Climate and Forecast (CF) conventions for netCDF data. The CF conventions are required for Intergovernmental Panel on Climate Change (IPCC) model outputs archived and made available through the Program for Climate Model Diagnosis and Intercomparison (PCMDI). The GO-ESSP meeting advanced a plan for the continued governance and development of the CF conventions under the auspices of the World Climate Research Programme's (WCRP's) Working Group on Coupled Modeling (WGCM). Current and future issues that need addressing under the proposed governance structure include conventions for staggered and unstructured grids, GIS information content, ontologies and nomenclature, role of the netCDF-4 data model, enhanced conventions for in situ observations, discovery metadata, and compliance issues. A candidate final version of a white paper, "Maintaining and Advancing the CF Standard for Earth System Science Community Data", describing the governance and development plan in more detail, including Unidata's role, is included in the meeting materials.

A metric for international use of netCDF

A count of the number of downloads of the netCDF software from the Unidata site is not a reliable measurement of netCDF usage for several reasons:

For what it's worth, we have developed a different metric to compare use of netCDF across top-level domains (for example .edu, .gov, .com, and country domains). For each domain of interest, we propose using the number of web pages that include the word "netCDF" (without regard to capitalization) per million web pages indexed by Google in that domain. Call this metric nGh/Mwp (number of Google hits per million web pages).

Ranking netCDF usage with this metric on data gathered on August 14, 2006, resulted in the following top twenty usage rankings:

NetCDF usage ranking on August 14, 2006
nGh/Mwp  Domain  Country or sector  nGh      Mwp
111      .edu    (Educational)      309000   2780.0
73       .fr     France             40300    550.0
55       .gov    (US Government)    108000   1970.0
53       (all)   (Whole Web)        1330000  25270.0
52       .au     Australia          15900    305.0
42       .ch     Switzerland        11300    270.0
41       .it     Italy              14600    356.0
36       .net    (Network)          43700    1200.0
34       .nl     Netherlands        9520     277.0
29       .org    (Organizations)    143000   4860.0
29       .jp     Japan              23400    798.0
22       .yu     Yugoslavia         291      13.1
21       .sk     Slovak Republic    836      40.6
19       .ca     Canada             10400    544.0
16       .uk     United Kingdom     21100    1350.0
15       .de     Germany            21200    1420.0
12       .si     Slovenia           272      21.9
12       .ro     Romania            636      51.8
11       .gr     Greece             524      45.1
9.8      .dk     Denmark            1270     129.0
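
As a rough illustration of how the nGh/Mwp figures in the table follow from the raw counts, the sketch below recomputes the metric for a handful of rows and sorts them. The function and variable names are hypothetical, chosen only for this example; the numbers are copied from the table, with Mwp read as millions of indexed web pages. It is not the script used to produce the original ranking.

```python
# Illustrative sketch only: names are hypothetical, data copied from the table.
# Each row is (domain, nGh, Mwp), with Mwp in millions of indexed web pages.
ROWS = [
    (".au", 15900, 305.0),
    (".dk", 1270, 129.0),
    (".edu", 309000, 2780.0),
    (".fr", 40300, 550.0),
    (".gov", 108000, 1970.0),
    ("(all)", 1330000, 25270.0),
]


def hits_per_million_pages(ngh, mwp):
    """Return nGh/Mwp: Google hits per million indexed web pages."""
    if mwp <= 0:
        raise ValueError("Mwp must be positive")
    return ngh / mwp


def rank(rows):
    """Return (score, domain) pairs sorted by descending nGh/Mwp."""
    scored = [(hits_per_million_pages(ngh, mwp), domain)
              for domain, ngh, mwp in rows]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    for score, domain in rank(ROWS):
        print(f"{score:6.1f}  {domain}")
```

Dividing by the size of each domain is what lets a small domain such as .dk be compared with a large one such as .edu; raw hit counts alone would simply track domain size.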

A more complete ranking of the top 71 domains using the nGh/Mwp metric is available at http://www.unidata.ucar.edu/staff/russ/status/nc-usage.html, which also includes enumeration by subject area of published books that mention netCDF.

This metric may also be used to track whether mentions of netCDF on Web pages are growing in various sectors. The chart below plots the growth in the .edu and .gov domains since March 2004.

[Figure: NetCDF usage growth in .edu and .gov sectors since March 2004]
