Re: [netcdfgroup] NetCDF external links

I see two levels of transparency that need discussion.
1. Transparency at the API level. Do we need to modify
   nc_open to specify all the relevant files?
2. Transparency at the HDF5 library level. If the HDF5 API
   is the same for files with links, then the existing
   netcdf-c library would be able to read such files with
   no changes whatsoever.
Without having looked in detail, I am guessing that the HDF5 API
is not transparent with respect to linked files. Correct?
=Dennis Heimbigner
 Unidata

P.S. I have elsewhere suggested providing a single-file-filesystem
model for netcdf-c files; this, or some equivalent such as .zip
files, would help mitigate the use of multiple files.
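
One way to probe the second level of transparency (the HDF5 library level) is a small experiment. The following is only a sketch under stated assumptions: it uses the h5py Python bindings rather than netcdf-c, the file names `grid.h5` and `data.h5` and the helper functions are hypothetical, and it is not the patch discussed in this thread. It builds a data file whose `lat` dataset is an HDF5 virtual dataset mapped onto a separate grid file, then reads it back with ordinary dataset access. It requires an HDF5 build with VDS support (1.10 or later).

```python
import os
import tempfile

import h5py
import numpy as np


def build_linked_files(directory):
    """Write a grid file and a data file whose 'lat' is a VDS into the grid file."""
    grid_path = os.path.join(directory, "grid.h5")  # hypothetical file name
    data_path = os.path.join(directory, "data.h5")  # hypothetical file name

    lat = np.linspace(-90.0, 90.0, 5)
    with h5py.File(grid_path, "w") as grid:
        grid.create_dataset("lat", data=lat)

    # Map the whole virtual dataset onto the 'lat' dataset in the grid file.
    layout = h5py.VirtualLayout(shape=lat.shape, dtype=lat.dtype)
    layout[:] = h5py.VirtualSource(grid_path, "lat", shape=lat.shape)

    with h5py.File(data_path, "w", libver="latest") as data:
        # The fill value is what readers see where no source data is mapped.
        data.create_virtual_dataset("lat", layout, fillvalue=-999.0)
        data.create_dataset("temperature", data=np.zeros(lat.shape))
    return grid_path, data_path


def read_lat(data_path):
    """Read 'lat' with ordinary dataset access; no VDS-specific calls are made."""
    with h5py.File(data_path, "r") as data:
        return data["lat"][:]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        _, data_path = build_linked_files(tmp)
        print(read_lat(data_path))
```

In this sketch the reading side never mentions the grid file, which is the kind of transparency being asked about. It says nothing about whether netcdf-c accepts such a file as valid netCDF-4 (dimension scales, attributes and so on); that would have to be tested separately against the netcdf-c library itself.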


On 2/22/2017 12:45 PM, Julian Kunkel wrote:
Dear Dave,
Exactly that transparency is the idea. Normal netCDF applications do
not notice any difference.

Regards
Julian

On 22.02.2017 at 8:31 PM, "Dave Allured - NOAA Affiliate"
<dave.allured@xxxxxxxx> wrote:

    Eugen,

    For reading only, how transparent are HDF5 virtual data sets as
    single Netcdf-4 files?  Is it now possible to have a VDS that can be
    fully and transparently accessed by the Netcdf-C API, with the
    appearance of a single Netcdf-4 file for all normal read-only purposes?

    --Dave


    On Wed, Feb 22, 2017 at 5:33 AM, Eugen Betke <betke@xxxxxxx> wrote:

        Dear NetCDF-Group,

        About half a year ago we discussed the integration of external
        links in NetCDF.
        Motivation:
        In our institution, people are already working with multiple
        data files (grid and data separated) to avoid replication of the
        grid when a file only contains one timestep.

        Here is a short summary of the last discussion:
        1. Our implementation of external links is based on HDF5 Virtual
        Datasets (VDS).
        It allows a variable defined in another file to be used as one
        of the dimensions.
        2. Possible application fields are data deduplication and I/O
        optimization.
        - When data and grid are stored in separate files, the grid can
        be reused. No duplication of the grid is necessary.
        - I/O optimization is achieved by saving storage space and
        network bandwidth.
        3. Until now, there has been an implicit assumption that NetCDF
        files must be self-contained, i.e., all data must be stored in
        one single file.
        4. This feature is not mandatory nor does it change anything
        inside the regular NetCDF4 file format. It can be used when
        necessary.
        5. Storage of data in multiple files has been discussed:
        - What happens if one file is missing?
        The conclusion was that the file is still valid, because in
        that case the default values will be used, but the data file is
        useless for the application, because the data cannot be
        interpreted.
        - Are all files (data and grid files) valid NetCDF4 files?
        The files using links are not backwards compatible.
        6. We believe the single-file semantics must go away in the long
        term; this approach is an intermediate step.

        We would like to see this feature added to the NetCDF standard.
        We can provide a patch for configure to include support only
        when the required HDF5 version is available.
        Is there anything else necessary to help in integrating this
        feature into NetCDF?
        - Do we need a better understanding of saving data in multiple files?
        - Shall we provide a well-tested and documented implementation?
        - How large must the number of interested people be in order to
        justify the integration of this feature?

        You can find a patch on our website:
        http://wr.informatik.uni-hamburg.de/research/projects/bullio/netcdf_external_links/start

        We would like to reopen the discussion.
        Please provide a clear rejection if, for some reason, this
        feature can never be a part of NetCDF.

        Regards,
        Eugen


    _______________________________________________
    NOTE: All exchanges posted to Unidata maintained email lists are
    recorded in the Unidata inquiry tracking system and made publicly
    available through the web.  Users who post to any of the lists we
    maintain are reminded to remove any personal information that they
    do not want to be made public.


    netcdfgroup mailing list
    netcdfgroup@xxxxxxxxxxxxxxxx
    For list information or to unsubscribe, visit:
    http://www.unidata.ucar.edu/mailing_lists/






