Re: [netcdfgroup] Problem with nc_inq_grps and ncdump reading HDF5file

Hi Andrew:

It's unlikely that having that many groups is a good design, although there
might be exceptions. What are you using them for?

John

On Mon, Nov 17, 2014 at 5:18 PM, Pedro Vicente <
pedro.vicente@xxxxxxxxxxxxxxxxxx> wrote:

>   Andrew
>
> If you are writing a new format, my guess is that you are also writing an
> API for it. Since I have been writing these high-level APIs for years
> (starting with HDF5's own High-Level API), and I am writing one now,
> here's my advice :-)
>
> 1) Do not use groups as a means to access your data (more on this later).
> 2) Encapsulate all HDF5 IDs.
>
>
> In my case, I am using a model where datasets are accessed for read/write
> by their full path name, e.g. "/path/to/dataset".
>
> So "/path/to/" is the group name and "dataset" is the dataset's relative
> name.
>
> API usage for read/write is
>
> write("/path/to/dataset", data_buffer);   // data_buffer is a const void*
>
> so the model for read/write is just the full dataset name as a string and
> a buffer with the data.
>
> The group part is implicit in the string: as you can see, no HDF5 group IDs
> or separate group paths are needed to read or write (that is what I mean by
> "do not use groups as a means to access your data").
>
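To make the naming convention above concrete, here is a minimal sketch of splitting a full dataset path into its group part and its relative dataset name. The `DatasetPath` struct and the `split_dataset_path` function are hypothetical names chosen for illustration (they are not part of HDF5 or of the API described in this message), and the sketch assumes the path contains at least one '/'.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper type, for illustration only.
struct DatasetPath {
    std::string group;    // e.g. "/path/to/"
    std::string dataset;  // e.g. "dataset"
};

// Split a full path such as "/path/to/dataset" at its last '/'.
// Assumption: the path contains at least one '/'.
inline DatasetPath split_dataset_path(const std::string& full_path) {
    const std::string::size_type pos = full_path.rfind('/');
    assert(pos != std::string::npos && "expected a path containing '/'");
    // Keep the trailing '/' on the group part, as in "/path/to/".
    return DatasetPath{full_path.substr(0, pos + 1), full_path.substr(pos + 1)};
}
```

For "/path/to/dataset" this yields the group "/path/to/" and the dataset name "dataset"; a dataset directly under the root, such as "/dataset", yields the group "/".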
>
> How can you accomplish this?
>
> Here are example member functions of a C++ class that cover this part:
>
> void create_group(const std::string& group_name);
>
> void create_dataset(const std::string& group_path,
>                     const std::string& dataset_name
>                     /* , other parameters here, like dataset size or chunk size */);
>
> void write(const std::string& path, const void* buf);
>
> Let's start with the create_group function.
>
> Here, just create the group using the HDF5 C API (I use the C API even in a
> C++ program; the HDF5 C++ API adds nothing of use here):
>
> hid_t gid = H5Gcreate2(this->m_fid, group_name.c_str() /* , other parameters */);
> H5Gclose(gid);
>
> As can be seen, the group is created and closed immediately.
>
> This means you will have no groups open at all during your read/write
> calls (typically the core part of a program's execution).
>
> Since groups are closed immediately, you can even create more groups than
> the maximum number of open groups Russ mentioned.
>
> For the dataset creation, inside
>
> create_dataset(const std::string& group_path, const std::string& dataset_name, ...)
>
> concatenate the group path with the dataset name and store the result
> somewhere in your C++ class:
>
> std::string absolute_dataset_name = group_path + "/" + dataset_name;
>
> In my case, I am using a map from the full dataset path to the dataset ID,
> because I want to keep the datasets open (in memory), but any other way of
> storing the path will do:
>
> // map from dataset name (full path) to HDF5 dataset ID
> std::map<std::string, hid_t> m_map_datasets;
>
>
> Here is the HDF5 create call:
>
> hid_t did = H5Dcreate2(this->m_fid, absolute_dataset_name.c_str() /* , remaining parameters */);
>
> The trick here is to use the file ID (which must be stored in the class)
> and the full path. This takes group IDs out of the equation.
>
> For the write/read call, simply use
>
> write(const std::string& path, const void* buf)
>
> Get the dataset ID from the full path:
>
> // get dataset ID from map <path, ID>; at() throws for an unknown path,
> // whereas operator[] would silently insert an entry with ID 0
> hid_t did = this->m_map_datasets.at(path);
>
> and use it in the write call:
>
> H5Dwrite(did, /* other parameters, ending with */ buf);
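The path-to-ID bookkeeping described above could look roughly like the sketch below. It is an illustration under stated assumptions, not the code from this message: `Handle` is a stand-in for HDF5's `hid_t` so the example compiles without the HDF5 headers, and `DatasetRegistry`, `absolute_name`, `add`, and `lookup` are hypothetical names. In real code, `add` would receive the ID returned by `H5Dcreate2` and the result of `lookup` would be passed to `H5Dwrite`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>

// Stand-in for HDF5's hid_t (assumption: lets the sketch build without HDF5).
using Handle = std::int64_t;

// Hypothetical registry: full dataset path -> open dataset handle.
class DatasetRegistry {
public:
    // Build the absolute name by joining group path and dataset name with "/".
    static std::string absolute_name(const std::string& group_path,
                                     const std::string& dataset_name) {
        return group_path + "/" + dataset_name;
    }

    // Remember the handle obtained when the dataset was created.
    void add(const std::string& path, Handle id) {
        assert(!path.empty());
        m_map_datasets[path] = id;
    }

    // Look up without inserting: operator[] would silently add an entry
    // with handle 0 for an unknown path, so use find() and report the error.
    Handle lookup(const std::string& path) const {
        const auto it = m_map_datasets.find(path);
        if (it == m_map_datasets.end()) {
            throw std::out_of_range("unknown dataset path: " + path);
        }
        return it->second;
    }

    std::size_t size() const { return m_map_datasets.size(); }

private:
    std::map<std::string, Handle> m_map_datasets;
};
```

The design choice worth noting is the lookup: a misspelled path raises an exception instead of quietly producing an invalid ID that only fails later inside the write call.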
>
>
>
> Hope this helps; feel free to ask any further questions.
>
>
>
> By the way, I downloaded your file, which contains only groups. It is
> 37 MB in size, so there is a lot of group metadata in it.
>
> -Pedro
>
> ----------------------
> Pedro Vicente
> pedro.vicente@xxxxxxxxxxxxxxxxxx
> http://www.space-research.org/
>
> ----- Original Message -----
> *From:* Russ Rew <russ@xxxxxxxxxxxxxxxx>
> *To:* Andrew Dowsey <andrew.dowsey@xxxxxxxxxxxxxxxx>
> *Cc:* netcdfgroup@xxxxxxxxxxxxxxxx
> *Sent:* Monday, November 17, 2014 2:11 PM
> *Subject:* Re: [netcdfgroup] Problem with nc_inq_grps and ncdump reading
> HDF5file
>
> Hi Andrew,
>
> You've run across a limitation in the number of simultaneously open groups
> permitted in netCDF-4.  The groups documentation
> <http://www.unidata.ucar.edu/netcdf/docs/group__groups.html> says
>
>  ... Encoding both the open file id and group id in a single integer
> currently limits the number of groups per netCDF-4 file to no more than
> 32767. Similarly, the number of simultaneously open netCDF-4 files in one
> program context is limited to 32767.
>
>
> I think both those limits should actually be 65535 (== 2**16 - 1), but in
> any case, your HDF5 file has 119254 groups, which is too many for netCDF-4
> to handle.
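The arithmetic behind these limits can be sketched as follows. The 16/16 bit split and the helper names (`make_ncid`, `file_part`, `group_part`, `group_fits`) are assumptions made for illustration; they are not taken from the netCDF-4 source and should not be read as its actual internals.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout, for illustration: upper 16 bits = file id, lower 16 bits = group id.
constexpr std::uint32_t kGroupBits = 16;
constexpr std::uint32_t kGroupMask = (1u << kGroupBits) - 1;  // 65535 == 2**16 - 1
constexpr std::uint32_t kSignedMax = (1u << 15) - 1;          // 32767 == 2**15 - 1

// Pack a file id and a group id into one integer; excess group bits are dropped.
constexpr std::uint32_t make_ncid(std::uint32_t file_id, std::uint32_t group_id) {
    return (file_id << kGroupBits) | (group_id & kGroupMask);
}

constexpr std::uint32_t file_part(std::uint32_t ncid) { return ncid >> kGroupBits; }
constexpr std::uint32_t group_part(std::uint32_t ncid) { return ncid & kGroupMask; }

// True if a group index fits in the 16-bit group field.
constexpr bool group_fits(std::uint32_t group_id) { return group_id <= kGroupMask; }
```

Under this assumed layout, a group index of 119254 does not fit in the group field: masked to 16 bits it becomes 119254 - 65536 = 53718, so it can no longer be told apart from group 53718.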
>
> The only workaround I can think of would be to close some groups if you
> don't need to have them all open simultaneously.
>
> --Russ
>
>
> On Mon, Nov 17, 2014 at 8:14 AM, Andrew Dowsey <
> andrew.dowsey@xxxxxxxxxxxxxxxx> wrote:
>
>>  Hi,
>>
>> I’m trying to create HDF5 files that can be read by NetCDF4 and I’ve run
>> into a problem in that nc_inq_grps seems to report some bad ids. ncdump
>> bails with this error too. h5dump works fine. The problem is deterministic
>> but I haven’t been able to figure out what causes it because slightly
>> different HDF5 files work fine. I have created a test file that has this
>> problem, which contains nothing but groups. It can be downloaded from
>> http://personalpages.manchester.ac.uk/staff/andrew.dowsey/test.h5
>>
>> I am creating a new format for a type of instrument data we use, and for
>> flexibility I would like it to be writeable/readable both by HDF5 and
>> netCDF4 libraries.
>>
>> Any insight would be greatly appreciated!
>>
>> Kind regards,
>> Andy
>>
>>
>>  *Andrew Dowsey PhD*
>>   *Lecturer and CADET Bioinformatics Research Lead*
>> Institute of Human Development, The University of Manchester
>>
>> t: +44 161 701 0244
>> f: +44 161 701 0242
>> http://www.manchester.ac.uk/research/andrew.dowsey
>>
>> Centre for Advanced Discovery and Experimental Therapeutics (CADET)
>> Central Manchester University Hospitals NHS Foundation Trust
>> Oxford Road
>> Manchester M13 9WL
>> UK
>>
>>
>> _______________________________________________
>> netcdfgroup mailing list
>> netcdfgroup@xxxxxxxxxxxxxxxx
>> For list information or to unsubscribe,  visit:
>> http://www.unidata.ucar.edu/mailing_lists/
>>