>>my thought was to make a netcdfJSON, then add features to make an
hdfJSON. (and netcdfJSON would look a lot like CDL)
>>So a netcdfJSON file would be a valid hdfJSON file, but not the other
way around.
On second thought, this design has a problem: netCDF has things
that HDF5 does not (named dimensions),
and HDF5 has things that netCDF does not, so it's a bit of a catch-22;
so maybe just keep them separate.
My design method is usually a bit of specification, then a bit of code;
then, when something new comes up that was not planned, I go back to
step 1 and rewrite the spec, and sometimes rewrite the code.
-Pedro
----- Original Message -----
*From:* Pedro Vicente <mailto:pedro.vicente@xxxxxxxxxxxxxxxxxx>
*To:* Chris Barker <mailto:chris.barker@xxxxxxxx>
*Cc:* HDF Users Discussion List
<mailto:hdf-forum@xxxxxxxxxxxxxxxxxx> ; netCDF Mail List
<mailto:netcdfgroup@xxxxxxxxxxxxxxxx>
*Sent:* Thursday, October 20, 2016 7:33 PM
*Subject:* Re: [netcdfgroup] How to dump netCDF to JSON?
>>my thought was to make a netcdfJSON, then add features to make an
hdfJSON. (and netcdfJSON would look a lot like CDL)
>>So a netcdfJSON file would be a valid hdfJSON file, but not the
other way around.
yes, sounds like a good plan
I'll send you an email when I have things ready, thanks
-Pedro
----- Original Message -----
*From:* Chris Barker <mailto:chris.barker@xxxxxxxx>
*To:* Pedro Vicente <mailto:pedro.vicente@xxxxxxxxxxxxxxxxxx>
*Cc:* John Readey <mailto:jreadey@xxxxxxxxxxxx> ; netCDF Mail
List <mailto:netcdfgroup@xxxxxxxxxxxxxxxx> ; HDF Users
Discussion List <mailto:hdf-forum@xxxxxxxxxxxxxxxxxx>
*Sent:* Thursday, October 20, 2016 6:17 PM
*Subject:* Re: [netcdfgroup] How to dump netCDF to JSON?
On Thu, Oct 20, 2016 at 3:00 PM, Pedro Vicente
<pedro.vicente@xxxxxxxxxxxxxxxxxx
<mailto:pedro.vicente@xxxxxxxxxxxxxxxxxx>> wrote:
>>> This is making me think that we may want a spec for netcdf-json
that would be a subset of
the hdf-json spec.
that is one option;
the other option is to make a JSON form of netCDF CDL,
completely unaware of HDF5 (just like the netCDF API is)
http://www.unidata.ucar.edu/software/netcdf/workshops/2011/utilities/CDL.html
yup.
Are they mutually exclusive approaches? my thought was to make a
netcdfJSON, then add features to make an hdfJSON. (and
netcdfJSON would look a lot like CDL)
So a netcdfJSON file would be a valid hdfJSON file, but not the
other way around.
Like a netcdf4 file is a valid hdf5 file now.
-CHB
with the "data" part being optional, which was one of the
goals of my design, to transmit just metadata over the web,
for a quick remote inspection
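
A minimal sketch of the metadata-only idea, assuming the values live under a key called "data" as in the object-style layout proposed elsewhere in this thread. The function name strip_data and the data_key parameter are illustrative, not part of any existing spec or tool:

```python
import json

def strip_data(node, data_key="data"):
    """Return a copy of a JSON-like structure with every `data_key` entry removed."""
    if isinstance(node, dict):
        return {key: strip_data(value, data_key)
                for key, value in node.items() if key != data_key}
    if isinstance(node, list):
        return [strip_data(item, data_key) for item in node]
    return node

# Hypothetical document; the key names are assumptions taken from this thread.
doc = {
    "dset1": {
        "dtype": "INT32",
        "dimensions": [3, 4],
        "data": [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
    }
}

metadata_only = strip_data(doc)
print(json.dumps(metadata_only))
```

The original document is left untouched, so the same in-memory structure can serve both the full and the metadata-only response.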
-Pedro
----- Original Message -----
*From:* Chris Barker <mailto:chris.barker@xxxxxxxx>
*To:* John Readey <mailto:jreadey@xxxxxxxxxxxx>
*Cc:* Pedro Vicente
<mailto:pedro.vicente@xxxxxxxxxxxxxxxxxx> ; netCDF Mail
List <mailto:netcdfgroup@xxxxxxxxxxxxxxxx> ; HDF Users
Discussion List <mailto:hdf-forum@xxxxxxxxxxxxxxxxxx>
*Sent:* Thursday, October 20, 2016 4:48 PM
*Subject:* Re: [netcdfgroup] How to dump netCDF to JSON?
On Thu, Oct 20, 2016 at 12:02 PM, John Readey
<jreadey@xxxxxxxxxxxx <mailto:jreadey@xxxxxxxxxxxx>> wrote:
So we came up with a scheme of Group, Dataset, and
Datatype collections with a UUID to identify each
object. That way, if you have a reference to a specific
UUID, you can always access the object regardless of
what shenanigans may be happening with the links in
the file.

It’s true that this makes path lookups a bit more
cumbersome, but it’s a more general way of specifying a
directed graph (the HDF5 link structure) on a tree
(the JSON hierarchy).
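
The trade-off described above can be sketched as follows. This is a simplified illustration, not the actual hdf5-json schema: the key names "root", "groups", "datasets" and "links", and the helper resolve, are assumptions made for the example.

```python
import uuid

def new_id():
    """Generate a fresh UUID string to identify one object."""
    return str(uuid.uuid4())

root_id, a_id, c_id = new_id(), new_id(), new_id()

# Flat, UUID-keyed collections; links refer to objects by [collection, uuid].
doc = {
    "root": root_id,
    "groups": {
        root_id: {"links": {"A": ["groups", a_id]}},
        a_id: {"links": {"C": ["datasets", c_id]}},
    },
    "datasets": {
        c_id: {"shape": [3, 4]},
    },
}

def resolve(doc, path):
    """Walk a '/'-separated path from the root group; return (collection, uuid)."""
    collection, obj_id = "groups", doc["root"]
    for name in filter(None, path.split("/")):
        if collection != "groups":
            raise KeyError(f"{name!r}: parent object is not a group")
        collection, obj_id = doc[collection][obj_id]["links"][name]
    return collection, obj_id

collection, obj_id = resolve(doc, "/A/C")
print(collection, doc[collection][obj_id])
```

A path lookup costs one dictionary hop per path component, but an object reached by UUID stays reachable even if the links pointing at it are renamed or removed.
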
Hmm -- interesting. I hadn't realized that HDF was this
flexible. For my part, I've only really used netcdf.
This is making me think that we may want a spec for
netcdf-json that would be a subset of the hdf-json spec.
That way they can be as compatible as possible without
"cluttering up" the netcdf spec too much.
-CHB
John

*From: *Pedro Vicente <pedro.vicente@xxxxxxxxxxxxxxxxxx>
*Date: *Tuesday, October 18, 2016 at 9:37 PM
*To: *John Readey <jreadey@xxxxxxxxxxxx>, Chris Barker
<chris.barker@xxxxxxxx>
*Cc: *netCDF Mail List <netcdfgroup@xxxxxxxxxxxxxxxx>, HDF Users
Discussion List <hdf-forum@xxxxxxxxxxxxxxxxxx>
*Subject: *Re: [netcdfgroup] How to dump netCDF to JSON?

@John

>> 1. Complete fidelity to all HDF5 features
>> 2. Support graphs that are not acyclic.

ok, understood.

In my case I needed a simple schema for a particular
set of files.

But why didn't you start with the official HDF5 DDL

https://support.hdfgroup.org/HDF5/doc/ddl.html

and try to adapt to JSON?

Same thing for netCDF, there is already an official
CDL, so any JSON spec should be "identical".

@Chris

{
"dset1" : ["dataset", "STAR_INT32", 2, [3, 4], [1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]]
}

>> * Do you need "rank"?

sometimes a bit of redundancy is useful, to make it
visually clear

>> BTW, is a "dataset" in HDF the same thing as a
"variable" in netcdf?)

yes

>>It would be really great to have this become an
"official" spec -- if
you want to get it there, you're probably going to
need to develop it more out in the open with a wider
community. These lists are the way to get that
started, but I suggest
>>1) put it up somewhere that people can collaborate on it,
make
suggestions, capture the discussion, etc. gitHub is
one really nice way to do that. See, for example the
UGRID spec project:

ok, anyone interested send me an off list email

-Pedro

----- Original Message -----
*From:* John Readey <mailto:jreadey@xxxxxxxxxxxx>
*To:* Chris Barker <mailto:chris.barker@xxxxxxxx> ; Pedro Vicente
<mailto:pedro.vicente@xxxxxxxxxxxxxxxxxx>
*Cc:* netCDF Mail List <mailto:netcdfgroup@xxxxxxxxxxxxxxxx> ; Charlie
Zender <mailto:zender@xxxxxxx> ; HDF Users Discussion List
<mailto:hdf-forum@xxxxxxxxxxxxxxxxxx> ; David Pearah
<mailto:David.Pearah@xxxxxxxxxxxx>
*Sent:* Tuesday, October 18, 2016 11:15 PM
*Subject:* Re: [netcdfgroup] How to dump netCDF to JSON?

Hey,

The hdf5-json code is here:
https://github.com/HDFGroup/hdf5-json and docs
are here:
http://hdf5-json.readthedocs.io/en/latest/.

The package is both a library of HDF5 <-> JSON
conversion functions and some simple scripts for
converting HDF5 to JSON and vice versa. E.g.
$ python h5tojson.py -D <hdf5-file>
outputs JSON minus the dataset data values.

While it may not be the most elegant JSON
schema, it’s designed with the following goals
in mind:
1. Complete fidelity to all HDF5 features
(i.e. the goal is that you should be able to
take any HDF5 file, convert it to JSON, convert
back to HDF5 and wind up with a file that is
semantically equivalent to what you started
with).
2. Support graphs that are not acyclic,
i.e. a group structure like <root> linking to A
and B, and A and B each linking to C. The output
should only produce one representation of C.
Since NetCDF doesn’t use all these features,
it’s certainly possible to come up with
something simpler for just netCDF files.
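
The second goal can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the hdf5-json implementation: the Group class, the serialize function and the output layout ("root", "objects", "links") are all made up for the example. Each object gets an ID the first time it is visited, so a shared target such as C is written once, and a link back to an already visited group does not cause endless recursion.

```python
import uuid

class Group:
    """A toy in-memory group: a name plus links to child groups."""
    def __init__(self, name):
        self.name = name
        self.children = []

def serialize(root):
    """Emit every reachable group exactly once, keyed by a generated UUID."""
    ids = {}       # id(node) -> UUID string already assigned to that node
    objects = {}   # UUID string -> serialized object

    def visit(node):
        if id(node) in ids:            # already emitted: reuse its UUID
            return ids[id(node)]
        node_id = str(uuid.uuid4())
        ids[id(node)] = node_id        # register before recursing (handles cycles)
        objects[node_id] = {"name": node.name, "links": {}}
        for child in node.children:
            objects[node_id]["links"][child.name] = visit(child)
        return node_id

    return {"root": visit(root), "objects": objects}

# <root> links to A and B; A and B both link to C.
root, a, b, c = Group("root"), Group("A"), Group("B"), Group("C")
root.children = [a, b]
a.children = [c]
b.children = [c]

doc = serialize(root)
print(len(doc["objects"]))  # four objects: root, A, B and a single C
```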

Suggestions, feedback, and pull requests are
welcome!

Cheers,
John

*From: *Chris Barker <chris.barker@xxxxxxxx>
*Date: *Friday, October 14, 2016 at 12:32 PM
*To: *Pedro Vicente <pedro.vicente@xxxxxxxxxxxxxxxxxx>
*Cc: *netCDF Mail List <netcdfgroup@xxxxxxxxxxxxxxxx>, Charlie
Zender <zender@xxxxxxx>, John Readey <jreadey@xxxxxxxxxxxx>, HDF Users
Discussion List <hdf-forum@xxxxxxxxxxxxxxxxxx>, David
Pearah <David.Pearah@xxxxxxxxxxxx>
*Subject: *Re: [netcdfgroup] How to dump netCDF to JSON?

Pedro,

When I first started reading this thread, I
thought "there should be a spec for how to
represent netcdf in JSON"

and then I read:

>> 1) The specification to convert netCDF/HDF5
to "a" JSON format (note the "a" here)

Awesome -- that's exactly what we need -- as you
say there is not one way to represent netcdf
data in JSON, and probably far more than one
"obvious" way.

Without looking at your spec yet, I do think it
should probably look as much like CDL as
possible -- we are all familiar with that.

>> (why Python? HDF5 developer tools should be
all about writing in C/C++)

Because Python is an excellent language with
which to "drive" C/C++ libraries like HDF5 and
netcdf4. If I were to do this, I'd sure use
Python. Even if you want to get to a C++
implementation eventually, you'd probably
benefit from prototyping and working out the
kinks with a Python version first.

But whoever is writing the code....

>> The specification is here
http://www.space-research.org/

Just took a quick look -- nice start.

I've only used HDF through the netcdf4 spec, so
there may be richness needed that I'm missing,
but my first thought is to make greater use of
"objects" in JSON (key-value structures, hash
tables, dicts in python), rather than array
position for heterogeneous structures. For
instance, you have:

a dataset
{
"dset1" : ["dataset", "STAR_INT32", 2, [3,
4], [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]]
}

I would perhaps do that as something like:

{
  ...
  "dset1": {"object_type": "dataset",
            "dtype": "INT32",
            "rank": 2,
            "dimensions": [3,4],
            "data": [[1,2,3,4],
                     [5,6,7,8],
                     [9,10,11,12]]
           }
  ...
}
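
To make the comparison concrete, here is a rough Python sketch that converts the positional form into the keyed form. The function names nest and to_object_form are made up for illustration, the flat data is assumed to be in row-major order, and the type string is passed through unchanged rather than renamed:

```python
def nest(flat, dims):
    """Reshape a flat row-major list into nested lists with the given dimensions."""
    if len(dims) <= 1:
        return list(flat)
    step = len(flat) // dims[0] if dims[0] else 0
    return [nest(flat[i * step:(i + 1) * step], dims[1:]) for i in range(dims[0])]

def to_object_form(positional):
    """Turn [object_type, dtype, rank, dimensions, flat_data] into a keyed dict."""
    object_type, dtype, rank, dims, flat = positional
    expected = 1
    for size in dims:
        expected *= size
    if rank != len(dims) or expected != len(flat):
        raise ValueError("rank, dimensions and data length do not agree")
    return {
        "object_type": object_type,
        "dtype": dtype,
        "rank": rank,
        "dimensions": list(dims),
        "data": nest(flat, dims),
    }

positional = ["dataset", "STAR_INT32", 2, [3, 4],
              [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]]
keyed = to_object_form(positional)
print(keyed["data"])
```

With keys, a reader does not have to remember that position 3 holds the dimensions, and new fields can be added later without shifting the others.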

NOTES:

* I used nested arrays, rather than flattening
the 2-d array -- this maps nicely to things like
numpy arrays, for example -- not sure about the
C++ world. (you can flatten and un-flatten numpy
arrays easily, too, but this seems like a better
mapping to the structure) And HDF is storing
this all in chunks and who knows what -- so it's
not a direct mapping to the memory layout
anyway.

* Do you need "rank"? -- can't you check the
length of the dimensions array?

* Do you need "object_type" -- will it always
be a dataset? Or you could have something like:

{
  ...
  "datasets": {"dset1": {the actual dataset object},
               "dset2": {another dataset object},
               ....
              }
}

Then you don't need object_type or a name
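
The last two notes can be sketched together. The helper names group_by_type and rank are hypothetical, and the collection name is simply the object type plus "s", which is an assumption for this example only:

```python
def group_by_type(objects):
    """Move each object into a per-type collection keyed by its name."""
    grouped = {}
    for name, obj in objects.items():
        collection = obj["object_type"] + "s"   # e.g. "dataset" -> "datasets"
        grouped.setdefault(collection, {})[name] = {
            key: value for key, value in obj.items()
            if key not in ("object_type", "rank")  # both become redundant
        }
    return grouped

def rank(dataset):
    """The rank is implied by the length of the dimensions array."""
    return len(dataset["dimensions"])

flat_doc = {
    "dset1": {"object_type": "dataset", "dtype": "INT32", "rank": 2,
              "dimensions": [3, 4]},
    "dset2": {"object_type": "dataset", "dtype": "INT32", "rank": 1,
              "dimensions": [5]},
}

grouped = group_by_type(flat_doc)
print(sorted(grouped["datasets"]))
print(rank(grouped["datasets"]["dset1"]))
```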

(BTW, is a "dataset" in HDF the same thing as a
"variable" in netcdf?)

>> I would like to make this some kind of
"official" netCDF/HDF5 JSON format for the
community, so I encourage anyone to read the
specification

>> If you see any flaw in the design or
anything in the design that you would like
to have changed please let me know now

done :-)

It would be really great to have this become an
"official" spec -- if you want to get it there,
you're probably going to need to develop it more
out in the open with a wider community. These
lists are the way to get that started, but I
suggest:

1) put it up somewhere that people can
collaborate on it, make suggestions, capture the
discussion, etc. GitHub is one really nice way
to do that. See, for example the UGRID spec
project:

https://github.com/ugrid-conventions/ugrid-conventions

(NOTE that that one got put on GitHub after
there was a pretty complete draft spec, so there
isn't THAT much discussion captured. But also
note that that is too bad -- there is no good
record of the decision process that led to the
spec)

>> At the moment it only (intentionally) uses
common generic features of both netCDF and
HDF5, which are the numeric atomic types and
strings.

Good plan.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker@xxxxxxxx
------------------------------------------------------------------------
_______________________________________________
NOTE: All exchanges posted to Unidata maintained email lists are
recorded in the Unidata inquiry tracking system and made publicly
available through the web. Users who post to any of the lists we
maintain are reminded to remove any personal information that they
do not want to be made public.
netcdfgroup mailing list
netcdfgroup@xxxxxxxxxxxxxxxx
For list information or to unsubscribe, visit:
http://www.unidata.ucar.edu/mailing_lists/