NOTE: The netcdf-hdf mailing list is no longer active. The list archives are made available for historical reasons.
Hi,

I'm forwarding this summary of the Langley meeting on the "National Grid Project" that occurred during our workshop, for background information.

--Russ

Introduction:

On May 1, 1992 there was a meeting at NASA Langley to discuss possible choices of a library and application programming interface for reading, writing, and exchanging scientific data for use in the National Grid Project at MSU. In this case, the area of interest was mesh information and variables on a mesh. Another purpose of this meeting was to participate in and contribute to the ongoing efforts to establish a de facto scientific database library.

The people attending were:

  Stewart Brown     LLNL - PDBLib
  Linnea Cook       LLNL
  Mike Folk         NCSA - netCDF/HDF merge
  Adam Gaither      NGP
  Chris Houck       NCSA - netCDF/HDF merge
  Robert Jackson    NARL
  Jeff Long         LLNL - SILO
  Michael McLay     NIST
  Bob Weston        NASA Langley

I (Linnea Cook) agreed to write up the results of this meeting. The results and action items are in the section with that title which follows later. I have also included the following background section to define all the acronyms used and to give some context for the decisions which were made. Most of this background information was covered during the Friday meeting at Langley.

Background:

For some time there has been interest and work in the scientific community toward a standard library and application programming interface for reading, writing, and exchanging scientific data. Three established de facto standards for this are:

  o HDF (Hierarchical Data Format) from NCSA (National Center for Supercomputing Applications),
  o netCDF (network Common Data Form) from the Unidata Program Center, and
  o CDF (Common Data Format) from NASA Goddard.

CDF and netCDF are very similar; in fact, netCDF is a spin-off from CDF.

Several important steps have recently been taken to further the establishment of a single, more widely used I/O library for scientific data.
NSF has recently decided to fund NCSA to put the netCDF interface on top of its HDF library and to convert all of its public domain tools (Image, Layout, etc.) to use the netCDF interface. NCSA is in touch with Unidata regarding this merge project. This should help to unify two widely used standards.

The Earth Observing System (EOS) project is expected to receive $3 billion in funding over the next decade. The Earth Observing System Data and Information System (EOSDIS) part of the EOS project has selected NCSA's netCDF/HDF merge product as its scientific data I/O library.

NCSA does not know how many people currently use HDF. However, when the latest version of HDF was released to its users, 2500 people downloaded a copy of this library within the first month.

As part of this netCDF/HDF merge project, Mike Folk (who is in charge of HDF and the netCDF/HDF merge project) is also interested in some work which has been done by Jeff Long at Lawrence Livermore National Laboratory (LLNL) on the SILO library. SILO is a library which implements an application program interface for reading and writing scientific data. SILO uses the calling sequence of the netCDF library but makes two extensions to the netCDF interface: objects and directories. Objects allow a set of related variables and other data to be grouped together. Directories in SILO allow the user to structure a database into a hierarchy that is analogous to a UNIX file system. It is these two extensions to netCDF (objects and directories) which NCSA is interested in including in its netCDF/HDF merge.

At a meeting between NCSA and LLNL it was determined that the SILO extensions appear to be very compatible with the NCSA merge of netCDF and HDF. NCSA wants to add these extensions and will do so provided its user community approves and provided it has funding to do this work. NCSA estimated that it will take one month to finish the prototype version of the netCDF/HDF merge work.
They would want to allow six months to do the SILO extensions once this work is started.

Russ Rew (who leads the netCDF project at Unidata) and Jeff Long (the author of SILO) are currently corresponding to refine the SILO extensions to netCDF. They hope to agree upon these extensions and cooperate with NCSA and Unidata so that the same extensions are put into both the netCDF/HDF merge and netCDF.

Two other topics were mentioned but not resolved at the NCSA / LLNL meeting. These topics were the 'standard' definition of some objects and the use of a socket library interface for reading data across a network.

The SILO object extension (by itself) allows users to define their own objects but does not assign meaning to the objects. SLIDE (a companion library to SILO) has, however, defined objects for mesh data commonly found in physics simulations. An example is a 'quadmesh' (quadrilateral mesh): this object must include the dimension and coordinate data and also typically includes the mesh's labelling and unit information. NCSA and LLNL thought it was desirable to use these mesh object definitions as a starting point for the scientific community to define 'standard' mesh objects for use in the netCDF/HDF merge product. However, since NCSA is not yet funded to do the two primitive extensions (objects and directories), it is somewhat premature to plan this.

HDF currently has a socket library interface for reading HDF data across a network. SILO has a similar interface for reading SILO objects across a network. In both cases the code which uses HDF or SILO does not know whether the data is coming from a disk file or a network connection. NCSA and LLNL think it may be possible to combine the two socket libraries some time in the future, since each addresses a different data type, but that it is premature to evaluate this now.
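As a rough illustration of the two SILO extensions and the 'quadmesh' object described above, the following Python sketch models an object that groups related mesh data and a database organized into UNIX-like directories. It is not the SILO or SLIDE interface: the names QuadMesh, Database, put, get and listdir, and the path "/run1/mesh", are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class QuadMesh:
    """Hypothetical mesh object: groups related data under one name."""
    dims: Tuple[int, ...]             # node count along each axis (required)
    coords: Dict[str, List[float]]    # coordinate array per axis (required)
    labels: Dict[str, str] = field(default_factory=dict)  # optional labels
    units: Dict[str, str] = field(default_factory=dict)   # optional units

    def __post_init__(self) -> None:
        # Dimension and coordinate data must both be present and agree.
        if len(self.coords) != len(self.dims):
            raise ValueError("one coordinate array is required per dimension")
        for n, (axis, values) in zip(self.dims, self.coords.items()):
            if len(values) != n:
                raise ValueError(
                    f"axis {axis!r} has {len(values)} values, expected {n}")


class Database:
    """Hypothetical in-memory database with a directory hierarchy."""

    def __init__(self) -> None:
        self._root: dict = {}

    @staticmethod
    def _split(path: str) -> List[str]:
        return [part for part in path.split("/") if part]

    def _walk(self, parts: List[str], create: bool = False) -> dict:
        node = self._root
        for part in parts:
            if part not in node:
                if not create:
                    raise KeyError(part)
                node[part] = {}          # create missing directories
            node = node[part]
            if not isinstance(node, dict):
                raise NotADirectoryError(part)
        return node

    def put(self, path: str, obj) -> None:
        *dirs, name = self._split(path)
        self._walk(dirs, create=True)[name] = obj

    def get(self, path: str):
        *dirs, name = self._split(path)
        return self._walk(dirs)[name]

    def listdir(self, path: str = "/") -> List[str]:
        return sorted(self._walk(self._split(path)))


# Usage: store one mesh object under a directory, then look it up by path.
db = Database()
mesh = QuadMesh(
    dims=(3, 2),
    coords={"x": [0.0, 0.5, 1.0], "y": [0.0, 1.0]},
    labels={"x": "X axis", "y": "Y axis"},
    units={"x": "cm", "y": "cm"},
)
db.put("/run1/mesh", mesh)
```

The object carries meaning only because the QuadMesh class fixes which members are required; the directory layer stores whatever it is given, which mirrors the split between SILO's generic object primitive and SLIDE's mesh definitions.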
The PDBLib (Portable Database Library) scientific database library is of considerable interest to the National Grid Project because of its speed and flexibility. PDBLib is similar to HDF in that both the library and the file it produces are portable. One major difference between PDBLib and HDF is that PDBLib allows the user to define C-like structures, then read and write these structures in one operation. The structures can contain primitive data, pointers to primitive data, other structures, and pointers to other structures. PDBLib also has a more general conversion model: it can write a file in native format and then read that file on any other machine, or it can create a file on one machine in any other machine's format. HDF can read and write data in a machine's native format, but such a file cannot be moved to a machine which uses a different format. HDF can also read and write IEEE format on any machine, and an IEEE format file is portable to any computer. PDBLib was developed at LLNL by Stewart Brown. The SILO interface is currently implemented on top of PDBLib.

Results and Action Items from the May 1 Meeting with the National Grid Project:

The following consensus was achieved during this meeting:

1. The National Grid Project would like to use the netCDF/HDF merge product being developed at NCSA and is also interested in the SILO extensions and SLIDE object definitions.

2. The attendees agreed to send Mike Folk requirements for HDF - some of this will be based on experience with the current HDF library. Also, there was concern that there is functionality in PDBLib which is not in HDF. Stewart Brown will be sending out the PDBLib manual to interested parties. Those not familiar with PDBLib said they would examine its functionality and give Mike feedback on which PDBLib capabilities they would like to see added. All of this should be written input.

3. Since NCSA is interested in adding the SILO extensions to their netCDF/HDF merge product, the people at this meeting agreed to examine the SILO extensions and get back to Mike Folk with their feedback on these extensions (opinions, changes, additions). These two SILO extensions are the directory and object primitives. This should be written input.

4. Mike Folk will send the SILO extensions out to his user community for feedback.

5. Part of the SILO document includes the definition of higher level mesh objects on top of the netCDF interface (with the two SILO primitives). We agreed to look at these objects as a possible starting point for defining 'standard' mesh objects for use across many sites.

6. Mike Folk agreed to send us copies of all written input sent to him by this group (with possibly some editing). He also agreed to put together a list of additional requirements for the netCDF/HDF merge product and send them to us.

7. The next step which was mentioned was that NCSA would need funding to do additional work beyond the netCDF/HDF merge.

8. The netCDF/HDF merge product will be available as a beta product by the end of July and as a fully released product by May 1993. The National Grid Project will need a library sooner than that. The SILO/netCDF interface on top of PDBLib may be used as an interim solution, allowing the netCDF/HDF merge library to replace it later with little or no change to the application programming interface.

Other Information:

One desire expressed at this meeting was for a way of writing data to disk without going through the translation to IEEE floating point format, while still being able to read this data and translate it later if necessary. Another desire was to be able to write a code's internal data structures directly to disk. Some subsequent discussion indicated that being able to write data quickly and with little overhead (little extra information written to disk) was the basic requirement.
Another part of this requirement seems to be the ability to write any data to disk without first getting an 'approved' tag or data type implemented. This was for use during the development stage. All agreed that eventually the tags would be officially requested, granted and documented.

Since the issue of writing a code's internal data structures directly to disk received considerable comment, we should specifically address this in our feedback to Mike Folk. Related to this is the question of whether it is important to be able to use other tools (such as graphics codes) to read (and display or do other operations on) this data.

Adam Gaither from NGP volunteered to be a contact point for disseminating information to this group. NGP is another good forum for pushing a standard scientific database library, and Adam wants NGP to help with and participate in this process.

Michael McLay from NIST talked about several other related standards in the scientific community (STEP, IGES, OMG, OSI). He will be forwarding information on relevant standards to the attendees.

Unless requested otherwise, all attendees were added to NCSA's mailing list for the discussion of the netCDF/HDF merge project. Send mail to hdf-netcdf-request@xxxxxxxxxxxxx if you wish to be removed.

If you have any questions, corrections or additions, please call me at the phone number listed below. Or you can send email to me through Jeff Long. I will send out my own email address as soon as I have one. My name and address were left off the list which Adam Gaither sent out, so I would appreciate it if you would add them to your copy of the list.

--------------------------------------
Linnea Cook
Lawrence Livermore National Laboratory
B Group Leader
P. O. Box 808, L-35
Livermore, California 94550
Desk: 510-422-1686
FAX: 510-422-3389
E-Mail: (you can reach me through Jeff Long's E-Mail) jwlong@xxxxxxxx