Re: [netcdfgroup] NetCDF4 for Fusion Data

Hi Ed

Further to my previous postings about 'unlimited' dimensions, I now understand
the semantics better, and it's apparent that there is a mismatch with the
needs of our application.

As previously described, we need to archive, say, 96 digitizer channels which
have the same sample times but potentially different sample counts. From a
logical point of view, the channel measurements share a single time
dimension - some simply extend further along it than others. They should
clearly all reference a single time coordinate variable. We may also want to
stick with our present compression strategy for time, storing it as a
(sequence of) triple: start time, time increment, and count. We might put
these values in an attribute of the time coordinate variable, leaving the
variable itself empty. Potentially all the variables might have different
initialized sizes.
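
To make the intended layout concrete, here is a rough sketch of what I have in
mind - the file name, variable names, sizes and the attribute convention are
illustrative, not our actual schema:

/* Sketch of the layout described above: one unlimited "time" dimension
 * shared by all channels, an empty time coordinate variable carrying the
 * (start, increment, count) triple as an attribute, and channel variables
 * written to different lengths along the same dimension. */
#include <stdio.h>
#include <stdlib.h>
#include <netcdf.h>

#define CHK(e) do { int _s = (e); if (_s != NC_NOERR) { \
    fprintf(stderr, "netCDF error: %s\n", nc_strerror(_s)); exit(1); } } while (0)

int main(void)
{
    int ncid, time_dim, time_var, ch0_var, ch1_var;
    double triple[3] = { 0.0, 1.0e-6, 1000.0 };   /* start, increment, count */
    float ch0[1000] = {0}, ch1[600] = {0};        /* different sample counts */
    size_t start = 0, count;

    CHK(nc_create("shot.nc", NC_CLOBBER | NC_NETCDF4, &ncid));
    CHK(nc_def_dim(ncid, "time", NC_UNLIMITED, &time_dim));

    /* Empty time coordinate variable; the compressed time description
       lives in an attribute rather than in the variable data. */
    CHK(nc_def_var(ncid, "time", NC_DOUBLE, 1, &time_dim, &time_var));
    CHK(nc_put_att_double(ncid, time_var, "time_triple", NC_DOUBLE, 3, triple));

    CHK(nc_def_var(ncid, "channel00", NC_FLOAT, 1, &time_dim, &ch0_var));
    CHK(nc_def_var(ncid, "channel01", NC_FLOAT, 1, &time_dim, &ch1_var));
    CHK(nc_enddef(ncid));

    /* Write different numbers of samples along the shared dimension. */
    count = 1000;
    CHK(nc_put_vara_float(ncid, ch0_var, &start, &count, ch0));
    count = 600;
    CHK(nc_put_vara_float(ncid, ch1_var, &start, &count, ch1));

    CHK(nc_close(ncid));
    return 0;
}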

The 'unlimited' semantics go only half way to meeting this requirement. At the
HDF5 storage level, all is well: h5dump shows that each variable is stored at
its own initialized size, not at the maximum initialized size across the
variables, which is evidently what the dimension is set to. So far so good,
but ncdump shows all the data padded to that maximum size, which reduces its
usefulness. This is presumably because the dimension length is the only size
exposed by the API, unless I'm overlooking something. HDF5 knows about the
initialized sizes, but NetCDF doesn't expose them, so we cannot easily read
the data and nothing but the data. Do you have an initialized size inquiry
function tucked away somewhere, or do we have to store the value as an
attribute with each variable?
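
To be explicit, the attribute workaround I have in mind would look something
like this - the 'sample_count' attribute name is just an example, not an
existing convention:

/* Writer side: after writing n samples to a variable, record n. */
#include <netcdf.h>

static int record_sample_count(int ncid, int varid, int n)
{
    return nc_put_att_int(ncid, varid, "sample_count", NC_INT, 1, &n);
}

/* Reader side: fetch only the initialized part of a channel, ignoring
   the padding that the unlimited dimension length would otherwise imply. */
static int read_channel(int ncid, const char *name, float *buf,
                        size_t bufsize, size_t *n_out)
{
    int varid, n, status;
    size_t start = 0, count;

    if ((status = nc_inq_varid(ncid, name, &varid)) != NC_NOERR)
        return status;
    if ((status = nc_get_att_int(ncid, varid, "sample_count", &n)) != NC_NOERR)
        return status;

    count = (size_t)n < bufsize ? (size_t)n : bufsize;
    if ((status = nc_get_vara_float(ncid, varid, &start, &count, buf)) != NC_NOERR)
        return status;

    *n_out = count;
    return NC_NOERR;
}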

I don't think I want to explore VLEN to crack this, because it's new and would
complicate things. It seems to me that this is a use case others will
encounter, and it needs a tidy solution. Any thoughts? I have to present a
strong case for NetCDF here next week, to counter an HDF5 proposal which
doesn't have this problem, though it has many others.

Another point: nc_inq_ncid returns NC_NOERR if the named group doesn't exist.
Is this the intended behaviour?
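
For reference, this is roughly the check that shows it - the file and group
names are made up for the example:

/* Ask for a group that was never defined and inspect the return status.
 * On this build the call comes back NC_NOERR rather than an error. */
#include <stdio.h>
#include <netcdf.h>

int main(void)
{
    int ncid, grpid, status;

    if (nc_open("shot.nc", NC_NOWRITE, &ncid) != NC_NOERR)
        return 1;

    status = nc_inq_ncid(ncid, "no_such_group", &grpid);
    printf("nc_inq_ncid(\"no_such_group\") -> %s\n", nc_strerror(status));

    nc_close(ncid);
    return 0;
}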

Regards
John

On Tuesday 20 January 2009, John Storrs wrote:
> I've uncovered a couple of problems:
>
> (1) Variables in an explicitly defined group, using the same 'unlimited'
> dimension but of different initialized sizes, result in an HDF error when
> ncdump is run (without flags) on the generated NetCDF4 file. No problems
> are reported when the file is generated (all netcdf call return values are
> checked in the usual way). The dimension is defined in the root group. Try
> writing data of size S to one variable, and size < S to the next. This
> error isn't seen if the variables are all in the root group. In that case,
> ncdump fills all variables to the maximum size, which I suppose is a feature
> and not a bug. An ncdump flag to disable this feature would be useful.
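
For completeness, a minimal sketch along the lines of that reproduction - the
group name, variable names and sizes are illustrative, not the exact test
program:

/* Reproduction sketch for (1): unlimited dimension defined in the root
 * group, two variables in a child group written to different lengths,
 * then run ncdump on the resulting file. */
#include <stdio.h>
#include <stdlib.h>
#include <netcdf.h>

#define CHK(e) do { int _s = (e); if (_s != NC_NOERR) { \
    fprintf(stderr, "netCDF error: %s\n", nc_strerror(_s)); exit(1); } } while (0)

int main(void)
{
    int ncid, grpid, dimid, v1, v2;
    float a[100] = {0}, b[60] = {0};
    size_t start = 0, count;

    CHK(nc_create("repro.nc", NC_CLOBBER | NC_NETCDF4, &ncid));
    CHK(nc_def_dim(ncid, "time", NC_UNLIMITED, &dimid));      /* root group  */
    CHK(nc_def_grp(ncid, "digitizer", &grpid));               /* child group */
    CHK(nc_def_var(grpid, "ch_a", NC_FLOAT, 1, &dimid, &v1));
    CHK(nc_def_var(grpid, "ch_b", NC_FLOAT, 1, &dimid, &v2));
    CHK(nc_enddef(ncid));

    count = 100;                                   /* size S   */
    CHK(nc_put_vara_float(grpid, v1, &start, &count, a));
    count = 60;                                    /* size < S */
    CHK(nc_put_vara_float(grpid, v2, &start, &count, b));

    CHK(nc_close(ncid));
    return 0;                                      /* then: ncdump repro.nc */
}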


--
John Storrs, Experiments Dept      e-mail: john.storrs@xxxxxxxxxxxx
Building D3, UKAEA Fusion                               tel: 01235 466338
Culham Science Centre                                    fax: 01235 466379
Abingdon, Oxfordshire OX14 3DB              http://www.fusion.org.uk



