I am about to commit a pull request for the netcdf-c library that
identifies the provenance and format of netcdf-4 files, and is
specifically aimed at distinguishing netcdf-4 files from other HDF5 files.
This provenance consists of the following information. First, there is
a hidden, persistent attribute named _NCProperties. It specifies the
versions of the netcdf library and the hdf5 library used to
create the file. This attribute never changes during the lifetime of the
file (unless modified deliberately thru the hdf5 API).
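Once the _NCProperties value has been read as a string, picking out one of the library versions is a small parsing job. The sketch below assumes the value is a list of key=value pairs such as "version=1|netcdflibversion=4.4.1|hdf5libversion=1.8.17". That layout, the key names, and the "|" separator are my assumptions and are not stated in this post, so the separator is left as a parameter. ncprop_lookup is a hypothetical helper, not part of the netcdf-c API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: look up `key` in a property string made of
 * key=value pairs separated by `sep` (assumed layout, see above).
 * Copies the value into `out` (at most outlen-1 chars, NUL-terminated).
 * Returns 1 if the key was found, 0 otherwise. */
int ncprop_lookup(const char *props, char sep, const char *key,
                  char *out, size_t outlen)
{
    size_t keylen;
    const char *p = props;

    assert(props != NULL && key != NULL && out != NULL);
    if (outlen == 0) return 0;
    keylen = strlen(key);

    while (*p != '\0') {
        /* Find the end of the current "key=value" pair. */
        const char *end = strchr(p, sep);
        size_t pairlen = end ? (size_t)(end - p) : strlen(p);

        /* The pair matches only if it starts with exactly "key=". */
        if (pairlen > keylen && strncmp(p, key, keylen) == 0
            && p[keylen] == '=') {
            size_t vallen = pairlen - keylen - 1;
            if (vallen >= outlen) vallen = outlen - 1; /* truncate to fit */
            memcpy(out, p + keylen + 1, vallen);
            out[vallen] = '\0';
            return 1;
        }
        if (end == NULL) break;  /* that was the last pair */
        p = end + 1;
    }
    return 0;
}
```

For example, ncprop_lookup(props, '|', "hdf5libversion", buf, sizeof buf) would fill buf with the hdf5 version if the string really has the assumed layout.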
Second, there are two special, non-persistent, attributes that are
computed from information already in the file.
1. _SuperblockVersion
2. _IsNetcdf4
Non-persistence means these attributes do not actually appear in the
file; they are computed from other information already in the file.
The _SuperblockVersion attribute is a single integer giving the version
number (currently 0-3) of the superblock in the hdf5/netcdf-4 file.
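To illustrate what "computed from information already in the file" can mean for the superblock version, here is a rough sketch that reads it straight from the raw bytes. It relies on the HDF5 file format, where the 8-byte signature is immediately followed by a one-byte superblock version and may sit at offset 0, 512, 1024, 2048, and so on. This is only an illustration; the pull request may well obtain the value thru the HDF5 API instead, and superblock_version is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: scan an in-memory image of a file for the HDF5
 * signature and return the superblock version byte that follows it,
 * or -1 if no signature is found at any candidate offset. */
int superblock_version(const unsigned char *buf, size_t len)
{
    static const unsigned char sig[8] =
        { 0x89, 'H', 'D', 'F', '\r', '\n', 0x1a, '\n' };
    size_t off = 0;

    assert(buf != NULL);
    /* Need the full signature plus one version byte at this offset. */
    while (off + sizeof sig < len) {
        if (memcmp(buf + off, sig, sizeof sig) == 0)
            return (int)buf[off + sizeof sig];
        /* Candidate offsets are 0, 512, 1024, 2048, ... */
        off = (off == 0) ? 512 : off * 2;
    }
    return -1;
}
```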
The _IsNetcdf4 attribute is a single integer 0/1 indicating if the file
has various tags indicating it was produced thru the netcdf-4 API. This
is computed by using the HDF5 API to walk the file to look for
attributes specific to netcdf-4. False negatives are possible for a
small subset of netcdf-4 files, especially those not containing
dimensions. False positives are (I think) only possible by deliberate
modifications to an existing HDF5 file thru the HDF5 API. For files with
the _NCProperties attribute, _IsNetcdf4 is redundant. For files
created prior to the introduction of the _NCProperties attribute, it
may be a useful indicator of the provenance of the file.
These three attributes are hidden in the sense that they can only be
accessed by name thru the netcdf-C API. They have no
attribute number and are not counted in the number of global
attributes in the root group.
The simplest way to view these attributes is to use the -s flag to the
ncdump command.
Comments are welcome.
=Dennis Heimbigner
Unidata