Hello all,
In fact, there is already an accepted enhancement to the CF conventions
for this kind of feature in CMIP6 datasets:
"Subconvention for associated files, proposed for use in CMIP6"
http://cf-trac.llnl.gov/trac/ticket/145
Its purpose is to save storage space by storing otherwise redundant
information in separate files.
This new feature would be great.
Regards
Antonio
On 22/02/17 15:44, Ed Hartnett wrote:
I echo Eugen's call for a way to handle multi-file datasets in netCDF.
I understand and applaud the netCDF ideal that metadata and data
belong in the same file, but that ideal cannot always be accommodated,
and many of those who have the greatest need for separate coordinate
data files are in a netCDF core community of weather and climate modeling.
The features in this proposal look very attractive.
Keep on netCDFing!
Ed
On Wed, Feb 22, 2017 at 5:33 AM, Eugen Betke <betke@xxxxxxx> wrote:
Dear NetCDF-Group,
About half a year ago we discussed the integration of external
links in NetCDF.
Motivation:
In our institution, people are already working with multiple data
files (grid and data separated) to avoid replication of the grid
when a file only contains one timestep.
Here is a short summary of the last discussion:
1. Our implementation of external links is based on HDF5 Virtual
Datasets (VDS).
It allows a variable defined in another file to be used as one of
the dimensions.
2. Possible application fields are data deduplication and I/O
optimization.
- When data and grid are stored in separate files, the grid can be
reused. No duplication of the grid is necessary.
- I/O is optimized because less storage space and network
bandwidth are needed.
3. Until now, there was an implicit assumption that NetCDF files
must be self-contained, i.e., all data must be stored in one
single file.
4. This feature is optional and does not change anything
inside the regular NetCDF4 file format. It can be used when necessary.
5. Storage of data in multiple files has been discussed:
- What happens if one file is missing?
The conclusion was that the file is still valid, because in that
case the default values will be used, but the data file is useless
for the application, because the data cannot be interpreted.
- Are all files (data and grid files) valid NetCDF4 files?
The files using links are not backwards compatible.
6. We believe the single-file semantics must go away in the long
term; this approach is an intermediate step.
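The missing-file behaviour from point 5 can be illustrated with a toy
model. This is only a sketch in plain Python and does not use the
NetCDF or HDF5 API: the JSON files, the "lat_link" record, the
read_linked_variable() helper and the fill value -9999.0 are all
assumptions made up for this illustration. It shows that a read through
a link still succeeds when the grid file is missing, but returns only
default values.

```python
import json
import os
import tempfile

FILL_VALUE = -9999.0  # assumed default; real fill values depend on the variable


def read_linked_variable(data_path):
    """Return (values, resolved) for the coordinate linked from data_path."""
    with open(data_path) as fh:
        data_file = json.load(fh)
    # Hypothetical link record: target file, variable name, and length.
    link = data_file["lat_link"]
    target = os.path.join(os.path.dirname(data_path), link["file"])
    if not os.path.exists(target):
        # Missing grid file: the read still succeeds, but yields fill values only.
        return [FILL_VALUE] * link["size"], False
    with open(target) as fh:
        return json.load(fh)[link["variable"]], True


with tempfile.TemporaryDirectory() as tmp:
    grid_path = os.path.join(tmp, "grid.json")
    data_path = os.path.join(tmp, "timestep_0001.json")

    # The grid is stored once, in its own file.
    with open(grid_path, "w") as fh:
        json.dump({"lat": [-45.0, 0.0, 45.0]}, fh)

    # The data file holds one timestep plus a link to the grid.
    with open(data_path, "w") as fh:
        json.dump(
            {
                "temperature": [280.1, 299.5, 281.3],
                "lat_link": {"file": "grid.json", "variable": "lat", "size": 3},
            },
            fh,
        )

    with_grid = read_linked_variable(data_path)
    os.remove(grid_path)  # simulate a lost grid file
    without_grid = read_linked_variable(data_path)

print(with_grid)     # ([-45.0, 0.0, 45.0], True)
print(without_grid)  # ([-9999.0, -9999.0, -9999.0], False)
```

In the second case the data file can still be opened and read, but the
temperature values can no longer be placed on a grid, which is the
sense in which it is "valid but useless".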
We would like to see this feature added to the NetCDF standard.
We can provide a patch for configure to include support only when
the required HDF5 version is available.
Is there anything else necessary to help in integrating this
feature into NetCDF?
- Do we need a better understanding of saving data in multiple files?
- Shall we provide a well-tested and documented implementation?
- How large must the number of interested people be in order to
justify the integration of this feature?
You can find a patch on our website:
http://wr.informatik.uni-hamburg.de/research/projects/bullio/netcdf_external_links/start
We would like to reopen the discussion.
Please provide a clear rejection if, for some reason, this feature
can never be a part of NetCDF.
Regards,
Eugen
_______________________________________________
NOTE: All exchanges posted to Unidata maintained email lists are
recorded in the Unidata inquiry tracking system and made publicly
available through the web. Users who post to any of the lists we
maintain are reminded to remove any personal information that they
do not want to be made public.
netcdfgroup mailing list
netcdfgroup@xxxxxxxxxxxxxxxx
For list information or to unsubscribe, visit:
http://www.unidata.ucar.edu/mailing_lists/