Re: [netcdfgroup] netcdfgroup Digest, Vol 1126, Issue 2

Since Pedro asked earlier about how NCL distinguishes between NetCDF4
and HDF5, I'm going to add my 2 cents to what now appears to be the
longest thread ever on this mailing list.

First a bit of background. Traditionally NCL has distinguished among
file formats based solely on file extensions. If a file name ends with
".nc" then it is considered to be a NetCDF file and will be opened
using the NetCDF library calls. Additionally there is an idiosyncratic
feature where you can add a "virtual" extension to a file name to
specify the format you want to use. For example, if the file is named
"test", you can open it as "test.h5" to open it using HDF5 calls.
Given this name NCL will look first for a file called "test.h5" and if
that is not found then it will look for "test". You can even add
extensions to files that already have them to open a file using
another format: e.g. test.hdf.nc.
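
To make the lookup order concrete, here is a minimal Python sketch of the
rule described above. It is not NCL's actual code: the extension table and
the function name resolve() are hypothetical, and NCL's real list of
recognized extensions is longer.

```python
import os

# Hypothetical extension table for illustration only; NCL's real list differs.
KNOWN_FORMATS = {".nc": "netcdf", ".h5": "hdf5", ".hdf": "hdf4"}


def resolve(name, exists=os.path.exists):
    """Return (path_to_open, format), or None if the name cannot be resolved.

    The final extension selects the format. A file with exactly the
    requested name is preferred; if it does not exist, the extension is
    treated as "virtual" and the name without it is tried instead.
    """
    root, ext = os.path.splitext(name)
    fmt = KNOWN_FORMATS.get(ext.lower())
    if fmt is None:
        return None  # unrecognized extension: left to other detection logic
    if exists(name):
        return name, fmt  # a real file with that exact name wins
    if root and exists(root):
        return root, fmt  # fall back to the name minus the virtual extension
    return None
```

With only a file called "test.hdf" on disk, resolve("test.hdf.nc") gives
("test.hdf", "netcdf"), which is the "open it using another format" case.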

But recent versions of NCL attempt to figure out the format of files
that do not have recognized extensions. And that means we have
definitely run into the issue that Pedro originally brought up. We
want our HDF5 module to handle HDF5 files on their own terms,
including, e.g., recognizing reference types. For now, we first try to
see if the file can be opened using the NetCDF library, and if not, we
try various versions of HDF. But this is not ideal, because we only
want to open files that are explicitly written using NetCDF4 as
NetCDF. So it is indeed welcome news that there will be global
attributes added to explicitly identify the file as NetCDF4. However,
it also would be nice if nc_inq_format or nc_inq_format_extended could
be adjusted to give a definitive answer as to whether the file was
created as NetCDF4. I have to admit I was quite surprised to discover
that nc_inq_format_extended would not answer this seemingly obvious
(to me at least) question.
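
For what it is worth, here is a rough Python sketch of how a tool might
combine a magic-number check with the proposed attribute. It is only an
illustration under stated assumptions: the function names are hypothetical,
the key names follow the _NCProperties form quoted later in this thread
(which Dennis says may still change), and the attribute value is assumed to
have been read already by whatever HDF5 binding the application uses.

```python
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"  # first 8 bytes of an HDF5 file


def sniff(header: bytes) -> str:
    """Classify a file from its leading bytes.

    Simplification: an HDF5 signature placed after a user block
    (at offset 512, 1024, ...) is not searched for here.
    """
    if header[:3] == b"CDF" and header[3:4] in (b"\x01", b"\x02"):
        return "netcdf-classic"  # classic or 64-bit offset format
    if header.startswith(HDF5_SIGNATURE):
        return "hdf5-container"  # plain HDF5 or NetCDF4; cannot tell yet
    return "unknown"


def parse_ncproperties(value: str) -> dict:
    """Split 'key=value|key=value|...' into a dictionary."""
    return dict(item.split("=", 1) for item in value.split("|") if "=" in item)


def choose_api(header: bytes, ncproperties=None) -> str:
    """Pick 'netcdf' or 'hdf5' for a file, given its header bytes and the
    value of its root-group _NCProperties attribute (None if absent)."""
    kind = sniff(header)
    if kind == "netcdf-classic":
        return "netcdf"
    if kind == "hdf5-container":
        props = parse_ncproperties(ncproperties) if ncproperties else {}
        # Files from older netCDF versions lack the attribute, so they
        # land in the HDF5 branch unless another heuristic is added.
        return "netcdf" if "netcdflibversion" in props else "hdf5"
    return "unknown"
```

The comment in choose_api() is the catch: without the attribute, an older
NetCDF4 file is indistinguishable from plain HDF5 at this level.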
 -Dave Brown
  NCL technical architect


On Sat, Apr 23, 2016 at 10:21 AM,  <netcdfgroup-request@xxxxxxxxxxxxxxxx> wrote:
> Today's Topics:
>
>    1. Re: [CF-metadata] [Hdf-forum] Detecting netCDF versus HDF5 --
>       PROPOSED SOLUTIONS --REQUEST FOR COMMENTS (John Caron)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Fri, 22 Apr 2016 21:57:51 -0600
> From: John Caron <jcaron1129@xxxxxxxxx>
> To: Pedro Vicente <pedro.vicente@xxxxxxxxxxxxxxxxxx>
> Cc: cf-metadata@xxxxxxxxxxxx,   NetCDF-Java community
>         <netcdf-java@xxxxxxxxxxxxxxxx>, netcdfgroup@xxxxxxxxxxxxxxxx
> Subject: Re: [netcdfgroup] [CF-metadata] [Hdf-forum] Detecting netCDF
>         versus HDF5 -- PROPOSED SOLUTIONS --REQUEST FOR COMMENTS
> Message-ID:
>         <CAN1vDkp3iYVaBcEvoC8irp83AVKT85Mq+h75PWU_L-dExjWcMA@xxxxxxxxxxxxxx>
> Content-Type: text/plain; charset="utf-8"
>
> Here are the blogs:
>
> http://www.unidata.ucar.edu/blogs/developer/en/entry/dimensions_scales
> http://www.unidata.ucar.edu/blogs/developer/en/entry/dimension_scale2
> http://www.unidata.ucar.edu/blogs/developer/en/entry/dimension_scales_part_3
> http://www.unidata.ucar.edu/blogs/developer/en/entry/netcdf4_shared_dimensions
> http://www.unidata.ucar.edu/blogs/developer/en/entry/netcdf4_use_of_dimension_scales
>
> On Fri, Apr 22, 2016 at 7:57 AM, Pedro Vicente <
> pedro.vicente@xxxxxxxxxxxxxxxxxx> wrote:
>
>> John
>>
>> >>>i have written various blogs on the unidata site about why netcdf4 !=
>> hdf5, and what the unique signature for shared dimensions looks like, in
>> >>>case you want details.
>>
>> yes, I am interested, I had the impression by looking at the code some
>> years ago that netCDF writes some unique name attributes somewhere
>>
>> ----------------------
>> Pedro Vicente
>> pedro.vicente@xxxxxxxxxxxxxxxxxx
>> https://twitter.com/_pedro__vicente
>> http://www.space-research.org/
>>
>>
>>
>>
>> ----- Original Message -----
>> *From:* John Caron <jcaron1129@xxxxxxxxx>
>> *To:* Pedro Vicente <pedro.vicente@xxxxxxxxxxxxxxxxxx>
>> *Cc:* cf-metadata@xxxxxxxxxxxx ; Discussion forum for the NeXus data
>> format <nexus@xxxxxxxxxxxxxxx> ; netcdfgroup@xxxxxxxxxxxxxxxx ; Dennis
>> Heimbigner <dmh@xxxxxxxx> ; NetCDF-Java community
>> <netcdf-java@xxxxxxxxxxxxxxxx>
>> *Sent:* Thursday, April 21, 2016 11:11 PM
>> *Subject:* Re: [CF-metadata] [netcdfgroup] [Hdf-forum] Detecting netCDF
>> versus HDF5 -- PROPOSED SOLUTIONS --REQUEST FOR COMMENTS
>>
>> 1) I completely agree with the idea of adding system metadata that
>> indicates the library version(s) that wrote the file.
>>
>> 2) the way shared dimensions are implemented by netcdf4 is a unique
>> signature that would likely identify (100 - epsilon) % of real data files
>> in the wild. One could add such detection to the netcdf4 and/or hdf5
>> libraries, and/or write a utility program to detect.
>>
>> there are 2 variants:
>>
>> 2.1) one could write a netcdf4 file without shared dimensions, though im
>> pretty sure no one does. but you could argue then that its fine to just
>> treat it as an hdf5 file and read through hdf5 library
>>
>> 2.2) one could write a netcdf4 file with hdf5 library, if you knew what
>> you are doing. i have heard of this happening. but then you could argue
>> that its really a netcdf4 file and you should use netcdf library to read .
>>
>> i have written various blogs on the unidata site about why netcdf4 !=
>> hdf5, and what the unique signature for shared dimensions looks like, in
>> case you want details.
>>
>> On Thu, Apr 21, 2016 at 4:18 PM, Pedro Vicente <
>> pedro.vicente@xxxxxxxxxxxxxxxxxx> wrote:
>>
>>> If you have hdf5 files that should be readable, then I will undertake to
>>>> look at them and see what the problem is.
>>>>
>>>
>>>
>>> ok, thank you
>>>
>>> WRT to old files:  We could produce a utility that would redef the file
>>>> and insert the
>>>>      _NCProperties attribute. This would allow someone to wholesale
>>>>      mark old files.
>>>>
>>>
>>>
>>> Excellent idea , Dennis
>>>
>>> ----------------------
>>> Pedro Vicente
>>> pedro.vicente@xxxxxxxxxxxxxxxxxx
>>> https://twitter.com/_pedro__vicente
>>> http://www.space-research.org/
>>>
>>>
>>> ----- Original Message ----- From: <dmh@xxxxxxxx>
>>> To: "Pedro Vicente" <pedro.vicente@xxxxxxxxxxxxxxxxxx>; <
>>> cf-metadata@xxxxxxxxxxxx>; "Discussion forum for the NeXus data format" <
>>> nexus@xxxxxxxxxxxxxxx>; <netcdfgroup@xxxxxxxxxxxxxxxx>
>>> Sent: Thursday, April 21, 2016 5:02 PM
>>> Subject: Re: [netcdfgroup] [Hdf-forum] Detecting netCDF versus HDF5 --
>>> PROPOSED SOLUTIONS --REQUEST FOR COMMENTS
>>>
>>>
>>> If you have hdf5 files that should be readable, then I will undertake to
>>>> look at them and see what the problem is.
>>>> WRT to old files:  We could produce a utility that would redef the file
>>>> and insert the
>>>>      _NCProperties attribute. This would allow someone to wholesale
>>>>      mark old files.
>>>> =Dennis Heimbigner
>>>>   Unidata
>>>>
>>>>
>>>> On 4/21/2016 2:17 PM, Pedro Vicente wrote:
>>>>
>>>>> Dennis
>>>>>
>>>>> I am in the process of adding a global attribute in the root group
>>>>>>>>>
>>>>>>>> that captures both the netcdf library version and the hdf5 library
>>>>>> version
>>>>>> whenever a netcdf file is created. The current  form is
>>>>>> _NCProperties="version=...|netcdflibversion=...|hdflibversion=..."
>>>>>>
>>>>>
>>>>>
>>>>> ok, good to know, thank you
>>>>>
>>>>>
>>>>> > 1. I am open to suggestions about changing the format or adding
>>>>>>>> info > to it.
>>>>>>>>
>>>>>>>
>>>>>
>>>>> I personally don't care, anything that uniquely identifies a netCDF
>>>>> file (HDF5 based) as such will work
>>>>>
>>>>>
>>>>> 2. Of course this attribute will not exist in files written using older
>>>>>>>>
>>>>>>> versions of the netcdf library, but at least the process will have
>>>>>> begun.
>>>>>>
>>>>>
>>>>> yes
>>>>>
>>>>>
>>>>> 3. This technically does not address the original issue because there
>>>>>> exist
>>>>>>      hdf5 files  not written by netcdf that are still compatible with
>>>>>> and can be
>>>>>>      read by netcdf. Not sure this case is important or not.
>>>>>>
>>>>>
>>>>> there will always be HDF5 files  not written by netcdf that netCDF will
>>>>> read as we are now.
>>>>>
>>>>> this is not really the issue, but you just made a further issue :-)
>>>>>
>>>>> the issue is that I would like an application that reads a netCDF (HDF5
>>>>> based) file to decide to use the netCDF or HDF5 API.
>>>>> your attribute writing will do , for future files.
>>>>> for older netCDF files there may be a way to detect the current
>>>>> attributes and data structures to see if we can make it "identify itself"
>>>>> as netCDF. A bit of debugging will confirm that, since Dimension Scales
>>>>> are used, that would be an (imperfect maybe) way to do it
>>>>>
>>>>> regarding the "further issue " above
>>>>>
>>>>> you could go one step further and for any HDF5 files  not written by
>>>>> netcdf , you could make netCDF reject the file reading,
>>>>> because it's not "netCDF compliant".
>>>>> Since having netCDF read pure HDF5 files is not a problem (at least for
>>>>> me), I don't know if you would want to do this, just an idea.
>>>>> In my mind taking complexity and ambiguities out of problems is always a
>>>>> good thing
>>>>>
>>>>>
>>>>> ah, I forgot one thing, related to this
>>>>>
>>>>>
>>>>> In the past I have found several pure HDF5 files that netCDF failed in
>>>>> reading.
>>>>> Since netCDF is HDF5 binary compatible, one would expect that all HDF5
>>>>> files will be read by netCDF.
>>>>> Except if you specifically wrote something in the code that makes it to
>>>>> fail if some condition is not met,
>>>>> This was a while ago, I'll try to find those cases and I'll send a bug
>>>>> report to the bug report email
>>>>>
>>>>> ----------------------
>>>>> Pedro Vicente
>>>>> pedro.vicente@xxxxxxxxxxxxxxxxxx
>>>>> https://twitter.com/_pedro__vicente
>>>>> http://www.space-research.org/
>>>>>
>>>>> ----- Original Message ----- From: <dmh@xxxxxxxx>
>>>>> To: "Pedro Vicente" <pedro.vicente@xxxxxxxxxxxxxxxxxx>; "HDF Users
>>>>> Discussion List" <hdf-forum@xxxxxxxxxxxxxxxxxx>; <
>>>>> cf-metadata@xxxxxxxxxxxx>; "Discussion forum for the NeXus data
>>>>> format" <nexus@xxxxxxxxxxxxxxx>; <netcdfgroup@xxxxxxxxxxxxxxxx>
>>>>> Cc: "John Shalf" <jshalf@xxxxxxx>; <Richard.E.Ullman@xxxxxxxx>;
>>>>> "Marinelli, Daniel J. (GSFC-5810)" <daniel.j.marinelli@xxxxxxxx>;
>>>>> "Miller, Mark C." <miller86@xxxxxxxx>
>>>>> Sent: Thursday, April 21, 2016 2:30 PM
>>>>> Subject: Re: [netcdfgroup] [Hdf-forum] Detecting netCDF versus HDF5 --
>>>>> PROPOSED SOLUTIONS --REQUEST FOR COMMENTS
>>>>>
>>>>>
>>>>> I am in the process of adding a global attribute in the root group
>>>>>> that captures both the netcdf library version and the hdf5 library
>>>>>> version
>>>>>> whenever a netcdf file is created. The current  form is
>>>>>> _NCProperties="version=...|netcdflibversion=...|hdflibversion=..."
>>>>>> Where version is the version of the _NCProperties attribute and the
>>>>>> others
>>>>>> are e.g. 1.8.18 or 4.4.1-rc1.
>>>>>> Issues:
>>>>>> 1. I am open to suggestions about changing the format or adding info
>>>>>> to it.
>>>>>> 2. Of course this attribute will not exist in files written using
>>>>>> older versions
>>>>>>     of the netcdf library, but at least the process will have begun.
>>>>>> 3. This technically does not address the original issue because there
>>>>>> exist
>>>>>>      hdf5 files  not written by netcdf that are still compatible with
>>>>>> and can be
>>>>>>      read by netcdf. Not sure this case is important or not.
>>>>>> =Dennis Heimbigner
>>>>>>    Unidata
>>>>>>
>>>>>>
>>>>>> On 4/21/2016 9:33 AM, Pedro Vicente wrote:
>>>>>>
>>>>>>> DETECTING HDF5 VERSUS NETCDF GENERATED FILES
>>>>>>> REQUEST FOR COMMENTS
>>>>>>> AUTHOR: Pedro Vicente
>>>>>>>
>>>>>>> AUDIENCE:
>>>>>>> 1) HDF, netcdf developers,
>>>>>>> Ed Hartnett
>>>>>>> Kent Yang
>>>>>>> 2) HDF, netcdf users, that replied to this thread
>>>>>>> Miller, Mark C.
>>>>>>> John Shalf
>>>>>>> 3 ) netcdf tools developers
>>>>>>> Mary Haley  , NCL
>>>>>>> 4) HDF, netcdf managers and sponsors
>>>>>>> David Pearah  , CEO HDF Group
>>>>>>> Ward Fisher, UCAR
>>>>>>> Marinelli, Daniel J. , Richard Ullman, Christopher Lynnes, NASA
>>>>>>> 5)
>>>>>>> [CF-metadata] list
>>>>>>> After this thread started 2 months ago, there was an announcement on
>>>>>>> the [CF-metadata] mail list
>>>>>>> about
>>>>>>> "a meeting to discuss current and future netCDF-CF efforts and
>>>>>>> directions.
>>>>>>> The meeting will be held on 24-26 May 2016 in Boulder, CO, USA at the
>>>>>>> UCAR Center Green facility."
>>>>>>> This would be a good topic to put on the agenda, maybe?
>>>>>>> THE PROBLEM:
>>>>>>> Currently it is impossible to detect if an HDF5 file was generated by
>>>>>>> the HDF5 API or by the netCDF API.
>>>>>>> See previous email about the reasons why.
>>>>>>> WHY THIS MATTERS:
>>>>>>> Software applications that need to handle both netCDF and HDF5 files
>>>>>>> cannot decide which API to use.
>>>>>>> This includes popular visualization tools like IDL, Matlab, NCL, HDF
>>>>>>> Explorer.
>>>>>>> SOLUTIONS PROPOSED: 2
>>>>>>> SOLUTION 1: Add a flag to HDF5 source
>>>>>>> The hdf5 format specification, listed here
>>>>>>> https://www.hdfgroup.org/HDF5/doc/H5.format.html
>>>>>>> describes a sequence of bytes in the file layout that have special
>>>>>>> meaning for the HDF5 API. It is common practice, when designing a data
>>>>>>> format,
>>>>>>> to leave some fields "reserved for future use".
>>>>>>> This solution makes use of one of these empty  "reserved for future
>>>>>>> use" spaces to save a byte (for example) that describes an enumerator
>>>>>>> of "HDF5 compatible formats".
>>>>>>> An "HDF5 compatible format" is a data format that uses the HDF5 API
>>>>>>> at a lower level (usually hidden from the user of the upper API),
>>>>>>> and providing its own API.
>>>>>>> This category can still be divided into 2 formats:
>>>>>>> 1) A "pure HDF5 compatible format". Example, NeXus
>>>>>>> http://www.nexusformat.org/
>>>>>>> NeXus just writes some metadata (attributes) on top of the HDF5 API,
>>>>>>> that has some special meaning for the NeXus community
>>>>>>> 2) A "non pure HDF5 compatible format". Example, netCDF
>>>>>>> Here, the format adds some extra feature besides HDF5. In the case of
>>>>>>> netCDF, these are shared dimensions between variables.
>>>>>>> This sub-division between 1) and 2) is irrelevant for the problem and
>>>>>>> solution in question
>>>>>>> The solution consists of writing a different enumerator value on the
>>>>>>> "reserved for future use" space. For example
>>>>>>> Value decimal 0 (current value): This file was generated by the HDF5
>>>>>>> API (meaning the HDF5 only API)
>>>>>>> Value decimal 1: This file was generated by the netCDF API (using
>>>>>>> HDF5)
>>>>>>> Value decimal 2: This file was generated by <put here another HDF5
>>>>>>> based format>
>>>>>>> and so on
>>>>>>> The advantage of this solution is that this process involves 2
>>>>>>> parties: the HDF Group and the other format's organization.
>>>>>>> This allows the HDF Group to "keep track" of new HDF5 based formats .
>>>>>>> It allows to make the other format "HDF5 certified" .
>>>>>>> SOLUTION 2: Add some metadata to the other API on top of HDF5
>>>>>>> This is what Nexus uses.
>>>>>>> A Nexus file on creation writes several attributes on the root group,
>>>>>>> like "NeXus_version" and other numeric data.
>>>>>>> This is done using the public HDF5 API calls.
>>>>>>> The solution for netCDF consists of the same approach, just write
>>>>>>> some specific attributes, and a special netCDF API to write/read them.
>>>>>>> This solution just requires the work of one party (the netCDF group)
>>>>>>> END OF RFC
>>>>>>> In reply to people that commented in the thread
>>>>>>> @John Shalf
>>>>>>> >>Perhaps NetCDF (and other higher-level APIs that are built on top of
>>>>>>> HDF5) should include an attribute attached
>>>>>>> >>to the root group that identifies the name and version of the API
>>>>>>> that created the file?  (adopt this as a convention)
>>>>>>> yes, that's one way to do it, Solution 2 above
>>>>>>> @Mark Miller
>>>>>>> >>>Hmmm. Is there any big reason NOT to try to read a netCDF produced
>>>>>>> HDF5 file with the native HDF5 library if someone so chooses?
>>>>>>> It's possible to read a netCDF file using HDF5, yes.
>>>>>>> There are 2 things that you will miss doing this:
>>>>>>> 1) the ability to inquire about shared netCDF dimensions.
>>>>>>> 2) the ability to read remotely with openDAP.
>>>>>>> Reading with HDF5 also exposes metadata that is supposed to be
>>>>>>> private to netCDF. See below
>>>>>>> >>>> And, attempting  to read an HDF5 file produced by Silo using just
>>>>>>> the HDF5 library (e.g. w/o Silo) is a major pain.
>>>>>>> This I don't understand. Why not read the Silo file with the Silo API?
>>>>>>> That's the whole purpose of this issue, each higher level API on top of
>>>>>>> HDF5 should be able to detect "itself".
>>>>>>> I am not familiar with Silo, but if Silo cannot do this, then you
>>>>>>> have the same design flaw that netCDF has.
>>>>>>>
>>>>>>> >>> In a cursory look over the libsrc4 sources in netCDF distro, I see
>>>>>>> a few things that might give a hint a file was created with netCDF.  .
>>>>>>> .
>>>>>>> >>>> First, in NC_CLASSIC_MODEL, an attribute gets attached to the
>>>>>>> root group named "_nc3_strict". So, the existence of an attribute on
>>>>>>> the root group by that name would suggest the HDF5 file was generated by
>>>>>>> netCDF.
>>>>>>> I think this is done only by the "old" netCDF3 format.
>>>>>>> >>>>> Also, I tested a simple case of nc_open, nc_def_dim, etc.
>>>>>>> nc_close to see what it produced.
>>>>>>> >>>> It appears to produce datasets for each 'dimension' defined with
>>>>>>> two attributes named "CLASS" and "NAME".
>>>>>>> This is because netCDF uses the HDF5 Dimension Scales API internally
>>>>>>> to keep track of shared dimensions. These are internal attributes
>>>>>>> of Dimension Scales. This approach would not work because an HDF5
>>>>>>> only file with Dimension Scales would have the same attributes.
>>>>>>>
>>>>>>> >>>> I like John's suggestion here.
>>>>>>> >>>>>But, any code you add to any applications now will work *only*
>>>>>>> for files that were produced post-adoption of this convention.
>>>>>>> yes. there are 2 actions to take here.
>>>>>>> 1) fix the issue for the future
>>>>>>> 2) try to retroactively have some workaround that makes possible now
>>>>>>> to differentiate a HDF5/netCDF files made before the adopted convention
>>>>>>> see below
>>>>>>>
>>>>>>> >>>> In VisIt, we support >140 format readers. Over 20 of those are
>>>>>>> different variants of HDF5 files (H5part, Xdmf, Pixie, Silo, Samrai,
>>>>>>> netCDF, Flash, Enzo, Chombo, etc., etc.)
>>>>>>> >>>>When opening a file, how does VisIt figure out which plugin to
>>>>>>> use? In particular, how do we avoid one poorly written reader plugin
>>>>>>> (which may be the wrong one for a given file) from preventing the 
>>>>>>> correct
>>>>>>> one from being found. Its kinda a hard problem.
>>>>>>>
>>>>>>> Yes, that's the problem we are trying to solve. I have to say, that
>>>>>>> is quite a list of HDF5 based formats there.
>>>>>>> >>>> Some of our discussion is captured here. . .
>>>>>>> http://www.visitusers.org/index.php?title=Database_Format_Detection
>>>>>>> I'll check it out, thank you for the suggestions
>>>>>>> @Ed Hartnett
>>>>>>> >>>I must admit that when putting netCDF-4 together I never considered
>>>>>>> that someone might want to tell the difference between a "native"
>>>>>>> HDF5 file and a netCDF-4/HDF5 file.
>>>>>>> >>>>>Well, you can't think of everything.
>>>>>>> This is a major design flaw.
>>>>>>> If you are in the business of designing data file formats, one of the
>>>>>>> things you have to do is how to make it possible to identify it from the
>>>>>>> other formats.
>>>>>>>
>>>>>>> >>> I agree that it is not possible to canonically tell the
>>>>>>> difference. The netCDF-4 API does use some special attributes to
>>>>>>> track named dimensions,
>>>>>>> >>>>and to tell whether classic mode should be enforced. But it can
>>>>>>> easily produce files without any named dimensions, etc.
>>>>>>> >>>So I don't think there is any easy way to tell.
>>>>>>> I remember you wrote that code together with Kent Yang from the HDF
>>>>>>> Group.
>>>>>>> At the time I was with the HDF Group but unfortunately I did not follow
>>>>>>> closely what you were doing.
>>>>>>> I don't remember any design document being circulated that explains
>>>>>>> the internals of the "how to" make the netCDF (classic) model of shared
>>>>>>> dimensions
>>>>>>> use the hierarchical group model of HDF5.
>>>>>>> I know this was done using the HDF5 Dimension Scales (that I wrote),
>>>>>>> but is there any design document that explains it?
>>>>>>> Maybe just some internal email exchange between you and Kent Yang?
>>>>>>> Kent, how are you?
>>>>>>> Do you remember having any design document that explains this?
>>>>>>> Maybe something like a unique private attribute that is written
>>>>>>> somewhere in the netCDF file?
>>>>>>>
>>>>>>> @Mary Haley, NCL
>>>>>>> NCL is a widely used tool that handles both netCDF and HDF5
>>>>>>> Mary, how are you?
>>>>>>> How does NCL deal with the case of reading both pure HDF5 files and
>>>>>>> netCDF files that use HDF5?
>>>>>>> Would you be interested in joining a community based effort to deal
>>>>>>> with this, in case this is an issue for you?
>>>>>>>
>>>>>>> @David Pearah  , CEO HDF Group
>>>>>>> I volunteer to participate in the effort of this RFC together with
>>>>>>> the HDF Group (and netCDF Group).
>>>>>>> Maybe we could make a "task force" between HDF Group, netCDF Group
>>>>>>> and any volunteer (such as tools developers that happen to be in these 
>>>>>>> mail
>>>>>>> lists)?
>>>>>>> The "task force" would have 2 tasks:
>>>>>>> 1) make a HDF5 based convention for the future and
>>>>>>> 2) try to retroactively salvage the current design issue of netCDF
>>>>>>> My phone is 217-898-9356, you are welcome to call in anytime.
>>>>>>> ----------------------
>>>>>>> Pedro Vicente
>>>>>>> pedro.vicente@xxxxxxxxxxxxxxxxxx <mailto:
>>>>>>> pedro.vicente@xxxxxxxxxxxxxxxxxx>
>>>>>>> https://twitter.com/_pedro__vicente
>>>>>>> http://www.space-research.org/
>>>>>>>
>>>>>>>     ----- Original Message -----
>>>>>>>     *From:* Miller, Mark C. <mailto:miller86@xxxxxxxx>
>>>>>>>     *To:* HDF Users Discussion List <mailto:
>>>>>>> hdf-forum@xxxxxxxxxxxxxxxxxx>
>>>>>>>     *Cc:* netcdfgroup@xxxxxxxxxxxxxxxx
>>>>>>>     <mailto:netcdfgroup@xxxxxxxxxxxxxxxx> ; Ward Fisher
>>>>>>>     <mailto:wfisher@xxxxxxxx>
>>>>>>>     *Sent:* Wednesday, March 02, 2016 7:07 PM
>>>>>>>     *Subject:* Re: [Hdf-forum] Detecting netCDF versus HDF5
>>>>>>>
>>>>>>>     I like John's suggestion here.
>>>>>>>
>>>>>>>     But, any code you add to any applications now will work *only* for
>>>>>>>     files that were produced post-adoption of this convention.
>>>>>>>
>>>>>>>     There are probably a bazillion files out there at this point that
>>>>>>>     don't follow that convention and you probably still want your
>>>>>>>     applications to be able to read them.
>>>>>>>
>>>>>>>     In VisIt, we support >140 format readers. Over 20 of those are
>>>>>>>     different variants of HDF5 files (H5part, Xdmf, Pixie, Silo,
>>>>>>>     Samrai, netCDF, Flash, Enzo, Chombo, etc., etc.) When opening a
>>>>>>>     file, how does VisIt figure out which plugin to use? In
>>>>>>>     particular, how do we avoid one poorly written reader plugin
>>>>>>>     (which may be the wrong one for a given file) from preventing the
>>>>>>>     correct one from being found. Its kinda a hard problem.
>>>>>>>
>>>>>>>     Some of our discussion is captured here. . .
>>>>>>>
>>>>>>> http://www.visitusers.org/index.php?title=Database_Format_Detection
>>>>>>>
>>>>>>>     Mark
>>>>>>>
>>>>>>>
>>>>>>>     From: Hdf-forum <hdf-forum-bounces@xxxxxxxxxxxxxxxxxx
>>>>>>>     <mailto:hdf-forum-bounces@xxxxxxxxxxxxxxxxxx>> on behalf of John
>>>>>>>     Shalf <jshalf@xxxxxxx <mailto:jshalf@xxxxxxx>>
>>>>>>>     Reply-To: HDF Users Discussion List <hdf-forum@xxxxxxxxxxxxxxxxxx
>>>>>>>     <mailto:hdf-forum@xxxxxxxxxxxxxxxxxx>>
>>>>>>>     Date: Wednesday, March 2, 2016 1:02 PM
>>>>>>>     To: HDF Users Discussion List <hdf-forum@xxxxxxxxxxxxxxxxxx
>>>>>>>     <mailto:hdf-forum@xxxxxxxxxxxxxxxxxx>>
>>>>>>>     Cc: "netcdfgroup@xxxxxxxxxxxxxxxx
>>>>>>>     <mailto:netcdfgroup@xxxxxxxxxxxxxxxx>"
>>>>>>>     <netcdfgroup@xxxxxxxxxxxxxxxx
>>>>>>>     <mailto:netcdfgroup@xxxxxxxxxxxxxxxx>>, Ward Fisher
>>>>>>>     <wfisher@xxxxxxxx <mailto:wfisher@xxxxxxxx>>
>>>>>>>     Subject: Re: [Hdf-forum] Detecting netCDF versus HDF5
>>>>>>>
>>>>>>>         Perhaps NetCDF (and other higher-level APIs that are built on
>>>>>>>         top of HDF5) should include an attribute attached to the root
>>>>>>>         group that identifies the name and version of the API that
>>>>>>>         created the file?  (adopt this as a convention)
>>>>>>>
>>>>>>>         -john
>>>>>>>
>>>>>>>             On Mar 2, 2016, at 12:55 PM, Pedro Vicente
>>>>>>>             <pedro.vicente@xxxxxxxxxxxxxxxxxx
>>>>>>> <mailto:pedro.vicente@xxxxxxxxxxxxxxxxxx>> wrote:
>>>>>>>             Hi Ward
>>>>>>>             As you know, Data Explorer is going to be a general
>>>>>>>             purpose data reader for many formats, including HDF5 and
>>>>>>>             netCDF.
>>>>>>>             Here
>>>>>>>             http://www.space-research.org/
>>>>>>>             Regarding the handling of both HDF5 and netCDF, it seems
>>>>>>>             there is a potential issue, which is, how to tell if any
>>>>>>>             HDF5 file was saved by the HDF5 API or by the netCDF API?
>>>>>>>             It seems to me that this is not possible. Is this correct?
>>>>>>>             netCDF uses an internal function NC_check_file_type to
>>>>>>>             examine the first few bytes of a file, and for example for
>>>>>>>             any HDF5 file the test is
>>>>>>>             /* Look at the magic number */
>>>>>>>                /* Ignore the first byte for HDF */
>>>>>>>                if(magic[1] == 'H' && magic[2] == 'D' && magic[3] == 'F') {
>>>>>>>                  *filetype = FT_HDF;
>>>>>>>                  *version = 5;
>>>>>>>             The problem is that this test works for any HDF5 file and
>>>>>>>             for any netCDF file, which makes it impossible to tell
>>>>>>>             which is which.
>>>>>>>             Which makes it impossible for any general purpose data
>>>>>>>             reader to decide to use the netCDF API or the HDF5 API.
>>>>>>>             I have a possible solution for this , but before going any
>>>>>>>             further, I would just like to confirm that
>>>>>>>             1)      Is indeed not possible
>>>>>>>             2)      See if you have a solid workaround for this,
>>>>>>>             excluding the dumb ones, for example deciding on a
>>>>>>>             extension .nc or .h5, or traversing the HDF5 file to see
>>>>>>>             if it's non netCDF conforming one. Yes, to further
>>>>>>>             complicate things, it is possible that the above test says
>>>>>>>             OK for a HDF5 file, but then the read by the netCDF API
>>>>>>>             fails because the file is a HDF5 non netCDF conformant
>>>>>>>             Thanks
>>>>>>>             ----------------------
>>>>>>>             Pedro Vicente
>>>>>>>             pedro.vicente@xxxxxxxxxxxxxxxxxx
>>>>>>>             <mailto:pedro.vicente@xxxxxxxxxxxxxxxxxx>
>>>>>>>             http://www.space-research.org/
>>>>>>>             _______________________________________________
>>>>>>>             Hdf-forum is for HDF software users discussion.
>>>>>>>             Hdf-forum@xxxxxxxxxxxxxxxxxx
>>>>>>>             <mailto:Hdf-forum@xxxxxxxxxxxxxxxxxx>
>>>>>>>
>>>>>>>
>>>>>>> http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
>>>>>>>             Twitter: https://twitter.com/hdf5
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>         _______________________________________________
>>>>>>>         Hdf-forum is for HDF software users discussion.
>>>>>>>         Hdf-forum@xxxxxxxxxxxxxxxxxx <mailto:
>>>>>>> Hdf-forum@xxxxxxxxxxxxxxxxxx>
>>>>>>>
>>>>>>>
>>>>>>> http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
>>>>>>>         Twitter: https://twitter.com/hdf5
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> ------------------------------------------------------------------------
>>>>>>>
>>>>>>>     _______________________________________________
>>>>>>>     Hdf-forum is for HDF software users discussion.
>>>>>>>     Hdf-forum@xxxxxxxxxxxxxxxxxx
>>>>>>>
>>>>>>>
>>>>>>> http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
>>>>>>>     Twitter: https://twitter.com/hdf5
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> netcdfgroup mailing list
>>>>>>> netcdfgroup@xxxxxxxxxxxxxxxx
>>>>>>> For list information or to unsubscribe,  visit:
>>>>>>> http://www.unidata.ucar.edu/mailing_lists/
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>> _______________________________________________
>>> CF-metadata mailing list
>>> CF-metadata@xxxxxxxxxxxx
>>> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
>>>
>>
>>
>
> ------------------------------
>
> _______________________________________________
> netcdfgroup mailing list
> netcdfgroup@xxxxxxxxxxxxxxxx
> For list information or to unsubscribe,  visit: 
> http://www.unidata.ucar.edu/mailing_lists/
>
> End of netcdfgroup Digest, Vol 1126, Issue 2
> ********************************************


