Hmmm. Is there any big reason NOT to try to read a netCDF-produced HDF5 file
with the native HDF5 library if someone so chooses?
As far as detecting the data producer goes, I have a similar problem with my
Silo library. Silo can write to HDF5. It can also write to PDB (that's
'Portable Database', https://wci.llnl.gov/codes/pact/pdb.html, not Protein
Data Bank). And attempting to read an HDF5 file produced by Silo using just
the HDF5 library (i.e., without Silo) is a major pain.
To handle detection of Silo/HDF5 and Silo/PDB files, there are a couple of
things I do. First, augment the Linux 'file' utility, calling the result
'silofile'...
#!/bin/sh
#
# Use the octal dump (od) command to examine the first few bytes of each file.
# If we do not find the expected bytes of any of the formats we'd like
# to identify here, fall back to using the good ole' file command.
#
for f in "$@"; do
    if test -f "$f"; then
        headerBytes=$(od -a -N 10 "$f")
        if test -n "$(echo "$headerBytes" | tr -d ' ' | grep '<<PDB:')"; then
            echo "$f: Portable Database (PDB) data"
        elif test -n "$(echo "$headerBytes" | tr -d ' \\' | grep 'HDFcrnl')"; then
            echo "$f: Hierarchical Data Format version 5 (HDF5) data"
        else
            headerBytes=$(od -t x1 -N 4 "$f")
            if test -n "$(echo "$headerBytes" | grep '0000000 0e 03 13 01')"; then
                echo "$f: Hierarchical Data Format version 4 (HDF4) data"
            else
                file "$f"
            fi
        fi
    else # not a regular file
        file "$f"
    fi
done
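For anyone who wants the same check from inside a program rather than a shell, here is a minimal C sketch of the same header-byte detection. Only the magic numbers come from the script above; the function name and return codes are made up for this example. (Note the HDF5 signature is the 8-byte sequence \x89 H D F \r \n \x1a \n, which is what the od-based "HDFcrnl" grep is matching.)

```c
#include <stdio.h>
#include <string.h>

enum filekind { FK_UNKNOWN = 0, FK_PDB, FK_HDF5, FK_HDF4 };

/* Classify a file by its first few bytes, mirroring the silofile script. */
enum filekind detect_kind(const char *path)
{
    unsigned char buf[10] = {0};
    FILE *fp = fopen(path, "rb");
    if (!fp) return FK_UNKNOWN;
    size_t n = fread(buf, 1, sizeof buf, fp);
    fclose(fp);

    /* HDF5 files begin with the 8-byte signature \x89 H D F \r \n \x1a \n. */
    static const unsigned char h5sig[8] =
        { 0x89, 'H', 'D', 'F', '\r', '\n', 0x1a, '\n' };
    if (n >= 8 && memcmp(buf, h5sig, 8) == 0) return FK_HDF5;

    /* HDF4 files begin with the 4 bytes 0e 03 13 01. */
    static const unsigned char h4sig[4] = { 0x0e, 0x03, 0x13, 0x01 };
    if (n >= 4 && memcmp(buf, h4sig, 4) == 0) return FK_HDF4;

    /* PDB files carry "<<PDB:" in their first few bytes. */
    for (size_t i = 0; i + 6 <= n; i++)
        if (memcmp(buf + i, "<<PDB:", 6) == 0) return FK_PDB;

    return FK_UNKNOWN;
}
```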
Now, this won't tell a user whether the file was produced by Silo, but it will
tell them whether the file appears to be HDF5, PDB, or HDF4, and that is
usually sufficient for Silo users.
Now, from within C code, it's sufficient for me to just attempt to open the
file using Silo's open routines. That process involves looking for telltale
signs that the file was produced by Silo. It turns out the Silo library
creates a couple of somewhat uniquely named char datasets in the root group of
the file, "_silolibinfo" and "_hdf5libinfo". So, if Silo's open succeeds, it's
a fairly certain sign the file was actually produced by Silo.
In a cursory look over the libsrc4 sources in the netCDF distro, I see a few
things that might hint that a file was created with netCDF...
First, in NC_CLASSIC_MODEL, an attribute named "_nc3_strict" gets attached to
the root group. So, the existence of an attribute by that name on the root
group would suggest the HDF5 file was generated by netCDF.
Also, I tested a simple case of nc_create, nc_def_dim, etc., then nc_close to
see what it produced.
It appears to produce a dataset for each 'dimension' defined, with two
attributes named "CLASS" and "NAME". The value of "CLASS" is the 16-char
null-terminated string "DIMENSION_SCALE", and the value of "NAME" is a 64-char
null-terminated string of the form
"This is a netCDF dimension but not a netCDF variable. %d".
Finally, if someone does an nc_create followed immediately by nc_close, then I
don't think the resulting HDF5 file has anything to suggest it might have been
created by netCDF. OTOH, the file is also devoid of any objects in that case,
so who cares whether netCDF produced it.
Hope that helps.
Mark
From: Hdf-forum <hdf-forum-bounces@xxxxxxxxxxxxxxxxxx>
on behalf of John Shalf <jshalf@xxxxxxx>
Reply-To: HDF Users Discussion List <hdf-forum@xxxxxxxxxxxxxxxxxx>
Date: Wednesday, March 2, 2016 1:02 PM
To: HDF Users Discussion List <hdf-forum@xxxxxxxxxxxxxxxxxx>
Cc: "netcdfgroup@xxxxxxxxxxxxxxxx" <netcdfgroup@xxxxxxxxxxxxxxxx>, Ward
Fisher <wfisher@xxxxxxxx>
Subject: Re: [Hdf-forum] Detecting netCDF versus HDF5
Perhaps NetCDF (and other higher-level APIs that are built on top of HDF5)
should include an attribute attached to the root group that identifies the
name and version of the API that created the file? (I.e., adopt this as a
convention.)
-john
On Mar 2, 2016, at 12:55 PM, Pedro Vicente
<pedro.vicente@xxxxxxxxxxxxxxxxxx> wrote:
Hi Ward
As you know, Data Explorer is going to be a general purpose data reader for
many formats, including HDF5 and netCDF.
Here
http://www.space-research.org/
Regarding the handling of both HDF5 and netCDF, it seems there is a potential
issue: how do you tell whether a given HDF5 file was saved by the HDF5 API or
by the netCDF API?
It seems to me that this is not possible. Is this correct?
netCDF uses an internal function, NC_check_file_type, to examine the first few
bytes of a file; for example, for any HDF5 file the test is

    /* Look at the magic number */
    /* Ignore the first byte for HDF */
    if(magic[1] == 'H' && magic[2] == 'D' && magic[3] == 'F') {
        *filetype = FT_HDF;
        *version = 5;
The problem is that this test succeeds for any HDF5 file and for any netCDF-4
file, which makes it impossible to tell which is which.
That, in turn, makes it impossible for any general purpose data reader to
decide whether to use the netCDF API or the HDF5 API.
I have a possible solution for this, but before going any further, I would
just like to confirm:
1) that it is indeed not possible;
2) whether you have a solid workaround for this, excluding the dumb ones, for
example deciding based on a .nc or .h5 extension, or traversing the HDF5 file
to see whether it is a non-netCDF-conforming one. Yes, to further complicate
things, it is possible that the above test says OK for an HDF5 file, but then
a read by the netCDF API fails because the file is HDF5 but not netCDF
conformant.
Thanks
----------------------
Pedro Vicente
pedro.vicente@xxxxxxxxxxxxxxxxxx
http://www.space-research.org/
_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@xxxxxxxxxxxxxxxxxx
http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
Twitter: https://twitter.com/hdf5