Denis,
I think you have a very valid concern there. In practice, however, there
are already a number of very important climate data sets that keep their
data in multiple files, including the data that describes the
coordinate variables.
One valid reason for this is to optimize I/O performance in high-performance
computing applications, such as climate models. Due to the volume and
complexity of some of the coordinate data, storing it in every file may
have a significant storage and performance cost.
So I think that the use of external netCDF variables (a.k.a. HDF5 datasets)
is a worthwhile addition to the standard, as it may provide a standardized
way to accomplish what is already being done according to a variety of
local standards and practices.
I suspect that most users will understand that storing data in multiple
files carries additional risks, such as those you mention, so such a
capability should be used sparingly. But when it is needed, it would be
good to have a standard way of doing it.
Thanks,
Ed
On Fri, Jul 15, 2016 at 4:53 AM, Julian Kunkel <juliankunkel@xxxxxxxxxxxxxx>
wrote:
> Dear Tim,
> I think an extension to HDF5 is possible that includes a URI from which
> the file can be fetched automatically when it does not yet exist on the
> local system.
> Additionally, some information to ensure consistency (e.g. a checksum)
> when trying to open an external file should probably be included
> (optionally) in the attributes.
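
A minimal Python sketch of the consistency check suggested above, assuming
the expected SHA-256 digest of the external file has been stored alongside
the link (for example in an attribute). The function names and the choice of
SHA-256 are illustrative assumptions, not part of HDF5, netCDF, or the patch
discussed in this thread:

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large grid files need not fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def external_file_is_consistent(path, expected_sha256):
    """Return True if the file exists and matches the recorded digest.

    `expected_sha256` stands in for a value read from a (hypothetical)
    attribute stored next to the external link.
    """
    p = Path(path)
    return p.is_file() and sha256_of(p) == expected_sha256.lower()
```

A reader could run such a check before following the link and fall back to
fetching the file, or report an error, when it returns False.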
>
> I'm curious to understand the space (bandwidth) savings that you may
> achieve using such a feature.
> Could you quantify them (approximately)?
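
As a rough way to approach the question above, the following Python sketch
estimates what fraction of an uncompressed file is taken up by the
coordinate data. It assumes the lat/lon grid is stored as two full 2-D
arrays with the same shape as each data variable; the function name and all
sizes are hypothetical, not figures from any real product:

```python
def coordinate_fraction(ny, nx, n_data_vars, coord_bytes=4, data_bytes=4):
    """Fraction of an uncompressed file occupied by 2-D lat and lon arrays.

    Assumes two coordinate arrays (lat and lon) of shape (ny, nx) and
    n_data_vars data variables of the same shape; metadata is ignored.
    """
    coord = 2 * ny * nx * coord_bytes
    data = n_data_vars * ny * nx * data_bytes
    return coord / (coord + data)


# Hypothetical product: a 1000 x 1000 grid carrying two float32 variables.
saving = coordinate_fraction(1000, 1000, 2)
```

Under these assumptions a product carrying two data variables would be half
coordinates (`saving` is 0.5). Compression, and grids that can be described
by 1-D coordinate vectors, would change the picture considerably.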
>
> To another response:
> I believe the one-file semantics should go away (anyway) in the long
> term, to allow querying data that is scattered across multiple files, i.e.,
> you open multiple datasets at once by changing the file name, and the
> system then shows all the variables as if they belonged to this virtual
> file.
> I consider this to be an intermediate step that offers transparent
> access to such a collection when it has been defined a priori.
>
> Thanks for the feedback & regards,
> Julian
>
> https://www.hdfgroup.org/HDF5/Tutor/vds.html
>
> https://www.hdfgroup.org/HDF5/docNewFeatures/VDS/HDF5-VDS-requirements-use-cases-2014-12-10.pdf
>
> > On 07/14/2016 09:32 AM, Timothy Patterson wrote:
> >>
> >> We have a number of operational products based on fixed lat/lon grids
> >> that we disseminate in near real time.
> >>
> >> Being able to send the lat/lon grid once and link to it as a
> >> coordinate variable from within the file would be very useful, as it
> >> would save considerably on bandwidth costs while still keeping the
> >> products user-friendly.
> >>
> >> So this would be a welcome development for our purposes.
> >>
> >> Tim
> >>
> >>
> >> _______________________________________________
> >> Dr. Tim Patterson
> >> Instrument Data Simulation Expert
> >> Product Engineering/Test Data Coordination
> >> MTG Programme
> >> GEO Division
> >>
> >> EUMETSAT
> >> Eumetsat-Allee 1
> >> 64295 Darmstadt
> >> Germany
> >>
> >> Tel: +49 6151 807 487
> >> Fax: +49 6151 807 7
> >> E-mail: timothy.patterson@xxxxxxxxxxxx
> >> Web: www.eumetsat.int
> >>
> >> -----Original Message-----
> >> From: netcdfgroup-bounces@xxxxxxxxxxxxxxxx
> >> [mailto:netcdfgroup-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Eugen Betke
> >> Sent: Wednesday, July 13, 2016 1:09 PM
> >> To: netcdfgroup@xxxxxxxxxxxxxxxx
> >> Subject: [netcdfgroup] NetCDF external links
> >>
> >> Dear NetCDF-Group,
> >>
> >> we have been working on NetCDF external link functionality. This allows
> >> NetCDF applications to create dimension variables whose values are
> >> stored in an external file. To do this, it uses the HDF5 virtual dataset
> >> (VDS) functionality. This is useful for, e.g., climate applications that
> >> rely on a variable per file and timestep configuration. The idea is to
> >> store the grid in a separate file and link our data to this grid. We
> >> already have a first working version. You can find the patch and the
> >> examples on our page:
> >>
> >> http://wr.informatik.uni-hamburg.de/research/projects/bullio/netcdf_external_links/start
> >>
> >> Under the hood it uses HDF5 virtual datasets. VDS has the advantage of
> >> being compatible with the functions that are supported by ordinary
> >> datasets. Therefore, files containing VDS should be supported by most
> >> software.
> >>
> >> There is a minor issue related to HDF5: the call to the H5F_try_close
> >> function fails when ncdump tries to read data from an external dimension.
> >> So far we have found a workaround, but we will fix this issue.
> >>
> >> It would be great if external link functionality could be supported by
> >> netCDF at some point. We would like to improve our patch, and for that
> >> reason we need your feedback. If you have any ideas about the issue
> >> above, we would be grateful for any hint.
> >>
> >> Regards,
> >> Eugen
> >>
> >> _______________________________________________
> >> NOTE: All exchanges posted to Unidata maintained email lists are
> >> recorded in the Unidata inquiry tracking system and made publicly
> >> available through the web. Users who post to any of the lists we
> >> maintain are reminded to remove any personal information that they do
> >> not want to be made public.
> >>
> >>
> >> netcdfgroup mailing list
> >> netcdfgroup@xxxxxxxxxxxxxxxx
> >> For list information or to unsubscribe, visit:
> >> http://www.unidata.ucar.edu/mailing_lists/
> >>
> >
> >
> >
> > --
> > Eugen Betke
> > Abteilung Forschung
> > Deutsches Klimarechenzentrum GmbH (DKRZ)
> > Bundesstraße 45a • D-20146 Hamburg • Germany
> >
> > Phone: +49 40 460094-146
> > Fax: +49 40 460094-270
> > E-mail: betke@xxxxxxx
> > URL: http://www.dkrz.de
> >
> > Geschäftsführer: Prof. Dr. Thomas Ludwig
> > Sitz der Gesellschaft: Hamburg
> > Amtsgericht Hamburg HRB 39784
> >
>
>
>
> --
> http://wr.informatik.uni-hamburg.de/people/julian_kunkel
>
>