
[netCDF #RZN-157444]: NetCDF technology roadmap



Hi Randy,

> I am a system engineer working on the GOES-R Ground Segment. This is the
> next-generation GOES system that goes operation around 2015. The plan
> is to use the NetCDF file format and API for the GOES-R imagery and
> environmental variable product files.
> 
> I have some general questions about the technology roadmap for NetCDF.
> The current version is NetCDF4.
> 
> (1) Is there a NetCDF5 being developed ?

No, not currently.  We have another format variant we are
experimenting with for streaming netCDF data, but if we release it, it
will still be in a netCDF 4.x version.  Also, the "Parallel netCDF"
project at Argonne/Northwestern has developed a netCDF format variant
that uses 64-bit sizes for things (dimensions, dimension lengths,
attributes, etc.) and has named the format "netcdf-5", but we haven't
integrated it into our reference distribution.
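
As an aside, the format variants mentioned above can usually be told
apart by the first few bytes of a file. A rough sketch in Python,
assuming the magic numbers as commonly documented ("CDF" followed by
a version byte of 1, 2, or 5, and the 8-byte HDF5 signature for
netCDF-4 files); the function name is made up for illustration and is
not part of any netCDF API:

```python
# Illustrative sketch only; not part of the netCDF distribution.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"

# Version byte that follows the three characters "CDF".
CDF_VERSIONS = {
    1: "classic",
    2: "64-bit offset",
    5: "64-bit data (netcdf-5 variant)",
}

def guess_netcdf_variant(header: bytes) -> str:
    """Guess the format variant from the first bytes of a file."""
    if header.startswith(HDF5_SIGNATURE):
        # Any HDF5 file matches here, not only netCDF-4 files.
        return "HDF5-based (possibly netCDF-4)"
    if len(header) >= 4 and header[:3] == b"CDF":
        return CDF_VERSIONS.get(header[3], "unknown")
    return "unknown"
```

This only looks at offset 0. HDF5 also permits its signature at later
offsets when a user block is present, which the sketch does not handle.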
 
> (2) If there is a NetCDF5 in works, when will the Java API be available ?

The Java API typically precedes the C/Fortran API in terms of
features.  The new streaming experimental format for netCDF is
currently only being developed in Java.
 
> (3) Can you provide us with your technology roadmap ?

This is a small project (about 3 software engineers, on and off, each
also working on other projects), and the roadmap is relatively
informal.  So our current plans are fluid, and periodically get
revised, especially if we get proposals for further development
funded.  However, here's a snapshot of our current plans:

=== Planning for Data Access Infrastructure 2010-2014 (DRAFT) ===

Over the next 5 years, Unidata's data access infrastructure efforts
will be challenged to

* Manage a graceful transition from a simple data model (netCDF-3) to
  the enhanced Common Data Model of netCDF-4 
* Provide better support for remote access and server-side data
  manipulation 
* Respond to the need to faithfully represent observational data as
  well as gridded data 
* Scale up to handle larger volumes of data efficiently
* Provide effective support to a larger netCDF user community as users
  of satellite products, GIS, and other analysis and visualization
  software make use of growing archives of netCDF data. 

'''Target audience:''' Unidata will develop and maintain data access
infrastructure for modelers, data providers, tool developers, and
users of data to support the domestic and international geosciences
research, education, and operational communities.

==== NetCDF and the Common Data Model ====

The 5 year strategy for netCDF is to improve the functionality,
performance, documentation, and utility of netCDF software, while
providing adequate support, backward compatibility, and
interoperability with other standards. Refinement and full
implementation of the Common Data Model will guide development,
supporting the coordinate systems and scientific features layers in
C-based APIs, reconciling differences in C and Java data models, and
improving support for conventions compliance for data providers.

===== 6 months: =====

* Create netCDF-4 test file collection to support netCDF-4 tool
  developers, with more nested type tests (compound, vlen, and array).
* 4.2 Windows port
* Prepare and run netCDF workshops in October 2010.
* Write guide and FAQ section for netCDF-4 chunking and compression.
* NCDC project: NOMADS/NCMP TDS communication with CLASS OPeNDAP
  server (working with NCDC personnel)
* NCDC project: aggregation of long time series from model outputs
  (NcML for 10 years of Climate Forecast System Reanalysis (CFSR)
  monthly means of single variable, GFS, NAM)
* Implement and test netCDF simple record streaming as specified in
  standards RFC submission.
* Migrate netCDF file and program examples, contributed programs, and
  current distribution sources to RAMADDA netCDF Group repository.
* Investigate h5repack utility, nccopy enhancements for modifying
  chunking and compression of existing data
* Enhance ncdump to optionally generate NcML output for the netCDF-4
  files.
* Enhance ncdump to support selection of specified groups.
* Enhance ncgen4 to generate Fortran-90 output for netCDF-4 files.
* Incorporate contributed pnetcdf format variant exploiting 64-bit
  size_t, permitting dimensions larger than 2<sup>32</sup> and no
  practical limits on variable size
* Update
  [http://www.unidata.ucar.edu/netcdf/papers/nc4_conventions.html
  Developing Conventions for netCDF-4] document and submit what's
  ready to CF
* NCDC project: IOSP work for ON29 BUFR, GRIB2 binary, IEEE binary
* Release netCDF 4.2 C-based libraries, Windows port, new netCDF-4 C++
  API, bug fixes, support for tool developers
* Disciplined thread safety in C-based interfaces, assuming each
  thread opens file(s) separately
* Upgrade Fortran support for F03 features, especially use of
  Fortran-to-C interfaces.
* Provide examples that use new observational conventions.
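
To illustrate why the chunking guide item above matters, here is a
minimal sketch in Python that counts how many chunks a hyperslab read
overlaps. The function name and the example shapes are hypothetical,
and the sketch ignores chunk caching and compression cost; it only
shows how the chunk shape interacts with the access pattern.

```python
from math import prod

def chunks_touched(start, count, chunk_shape):
    """Number of chunks overlapped by a hyperslab read.

    start:       index of the first element read, per dimension
    count:       number of elements read, per dimension (each >= 1)
    chunk_shape: chunk length, per dimension
    """
    per_dim = []
    for s, c, k in zip(start, count, chunk_shape):
        first = s // k              # chunk index holding the first element
        last = (s + c - 1) // k     # chunk index holding the last element
        per_dim.append(last - first + 1)
    return prod(per_dim)
```

For a hypothetical (time, lat, lon) variable of shape (1000, 180, 360),
reading a full time series at one grid point touches 1000 chunks when
each chunk holds one whole time step, but only 1 chunk when each chunk
holds the full time axis for a single point. Reading one whole time
step reverses the picture.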

===== 1 year (July 2011): =====

* Help define usable partitioning and filtering representations for
  DAP4.
* NCDC project: support CFSR intercomparison with CMIP5
* Define canonical architectures for streaming infrastructure for CDM.
* Design and document hooks in C-based netCDF to allow plug-in
  extensions for other formats, as supported by IOSP interfaces in
  netCDF-Java.
* Release C-based DAP client implementation that will handle the full
  enhanced data model of netCDF-4.
* Make all examples generate CF-compliant files.
* Update and enhance the attribute conventions in the Users Guide for
  netCDF-4.
* Enforce that all string values must be UTF-8, checked when stored.
* Hoist ncdump, ncgen, nccopy utilities to make use of common library
  for in-memory schema and for in-memory data block access that works
  for variables of any shape
* Investigate extending types supported by classic format to include
  unsigned and 64-bit numeric types, a bounded-length string type, and
  an opaque type, using plug-in extensions.
* Investigate whether to add remote access via http range facility to
  C-based software, as in netCDF-Java
* Consider allowing user-supplied insertions at the beginning of
  netCDF-4 files, as HDF5 does with its user block, of length
  2<sup>9</sup>, 2<sup>10</sup>, 2<sup>11</sup>, ...
* Register with IANA a MIME media type for netCDF, following the
  RFC4288 and RFC4289 process.
* Implement interpretation of the _Unsigned variable attribute when
  doing conversions of byte data.
* NCDC project: IOSP work on additional formats (GHCN-monthly,
  GHCN-daily, USHCN, IGRA, ICOADS, IBTrACS, ...)
* Utility or API to provide sizeof and offset_of for members of
  "dynamic" compound types, with member size determined at run-time
  (needed by Lynton Appel and others).
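
As a sketch of what the _Unsigned item above would mean for byte data,
the following Python reinterprets signed 8-bit values as unsigned when
the attribute is set. The helper names, the dictionary standing in for
a variable's attributes, and the assumption that the attribute value is
the string "true" are all illustrative, not the library's interface.

```python
def as_unsigned_byte(value: int) -> int:
    """Reinterpret a signed 8-bit value (-128..127) as unsigned (0..255)."""
    if not -128 <= value <= 127:
        raise ValueError("not a signed byte: %r" % (value,))
    return value & 0xFF  # keep the low 8 bits of the two's-complement form

def read_bytes(values, attributes):
    """Return byte values, honoring a hypothetical _Unsigned attribute."""
    flag = str(attributes.get("_Unsigned", "false")).lower()
    if flag == "true":
        return [as_unsigned_byte(v) for v in values]
    return list(values)
```

With the attribute set, stored values -1 and -128 would be read back as
255 and 128; without it they are returned unchanged.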

===== 2-3 years (June 2011 - June 2012): =====

* Support automatic data packing using _Scale and _Offset attributes
  within C library.
* Finish C++ API, including use of exceptions, templates, and name
  spaces and full support for the CDM.
* Support HDF5 references and thus reading NPOESS satellite data
  through netCDF interface.
* With other OPeNDAP developers, investigate addition of asynchronous
  DAP facilities
* Support enhanced streaming for optimized read access to netCDF
  subsets requested by coordinate system bounding boxes.
* Add netCDF error info to the HDF5 error stack.
* Add group support for classic format by using "/" character in
  names.
* Create language plug-in framework for ncgen, to support output of
  Java, C++, Python, etc.
* Modify ncgen to generate reading as well as writing code.
* Support read-only access through NcML in C-based APIs, including
  support for subsetting, augmentation, and aggregation.
* Add bit-packing support for n-bit data through HDF5 filter.
* Support reading other CDM data (e.g. GRIB) through plug-in
  extensions for netCDF C-based API.
* Write white papers on data access infrastructure, providing Unidata
  perspective, guidance, and leadership on specific issues.
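
The automatic packing item above amounts to a linear transform between
stored integers and floating-point values. A minimal sketch in Python,
using the _Scale and _Offset names from the list as plain parameters
(the CF conventions call the corresponding attributes scale_factor and
add_offset); the function names are hypothetical and no range checking
against the stored integer type is done:

```python
def pack(value: float, scale: float, offset: float) -> int:
    """Convert a floating-point value to its packed integer form."""
    # round() picks the nearest integer (ties go to the even one in Python).
    return round((value - offset) / scale)

def unpack(packed: int, scale: float, offset: float) -> float:
    """Recover the approximate floating-point value from the packed form."""
    return packed * scale + offset
```

Packing is lossy: a round trip through pack and unpack returns a value
within half of the scale of the original, so the scale sets the
precision that survives.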

===== 5 years (June 2014): =====

* Make the ncgen utility read NcML as well as CDL.
* Improve integration with parallel netCDF for classic format files,
  providing merged parallel interface for netCDF-4 files also.
* Make testing for other language APIs as thorough as C-based tests.
* Continue to improve access to geoscience data to make it easy,
  reliable, flexible, and efficient.

==== Metadata conventions and libcf ====

Work must continue on developing and promoting open standards,
conventions, and protocols for publishing and accessing data and
metadata to enhance interoperability. Development and deployment of a
library facilitating the creation and maintenance of CF-compliant data
is a high priority. The library should support access to CF-compliant
data that takes advantage of the CF metadata, providing data access by
coordinate referenced bounding boxes, for example. Some of the
advanced features in the Java interface, such as support for the
scientific features layer, should be in libcf rather than the netCDF
libraries.

===== 6 months: =====

* '''Done:''' Develop and submit an RFC for CF Metadata Conventions
  for the NASA Earth Science Standards group, referencing web site
  documents.
* '''Done:''' Integrate Balaji's GridSpec library into libcf.

===== 1 year (December 2010): =====

* Develop and implement the Princeton API for GridSpec for use in
  libcf
* Help AR5 modeling data centers come online with GridSpec
* Get approval for CF Conventions extensions for CF observations data
  and coordinate reference systems.
* Implement CF observations data extensions in libcf.
* Implement coordinate reference systems extensions in libcf.
* Move the effort forward to agree on a CF convention for model
  ensembles.
* Move time-date handling to libcf, from udunits. Handle climate
  calendars according to the CF spec.
* Development of new F90 API to better support data producers.
* Support CF cell methods.
* Work with modeling groups to get feedback on libcf functionality.
* Use package such as GDAL for implementing projection library support
  in libcf.
* Create compliance testing utility that uses libcf.

===== 2-3 years (June 2011 - June 2012): =====

* Enhance libcf with support for namespaces and standard names.
* Provide iterator-like interfaces to C-based software for access to
  variable-length data types
* Support CF conventions for model ensembles.

===== 5 years (June 2014): =====

* Support libcf as a comprehensive reference implementation for
  evolving CF conventions, including support for structured grids, new
  observational data conventions, standard name attributes for
  quantities, climate modeling calendars, and interoperability with
  other data models.

==== Other areas and special projects ====

* Provide support for use of netCDF in CMIP5 model output standards
  for next IPCC assessment
* Provide support to IOOS ocean community in use of netCDF+CF+OPeNDAP,
  as resources permit
* Provide support for developing GOES-R ground segment product
  standards.
* Provide support to Matlab and IDL developers for integrating
  netCDF-4 into Matlab and IDL
* Provide feedback and support to HDF5 developers in testing releases

--Russ

Russ Rew                                         UCAR Unidata Program
address@hidden                      http://www.unidata.ucar.edu



Ticket Details
===================
Ticket ID: RZN-157444
Department: Support netCDF
Priority: Normal
Status: Closed