> Organization: IRCAM, Centre Georges Pompidou
> Keywords: 199501051237.AA24460
Hi Gerhard,
> Thank you for the answers I got concerning my two questions. My problems are
> not solved yet, but I am now able to state the questions more clearly:
>
> 1) For our applications we would need to read and write netCDF files from
> stdin and to stdout. Since netCDF routines may do seeks, this will
> probably cause problems. Anyway, at the moment there is no way to open a
> netCDF file via stdin because the file has to have a name.
First, apologies for not providing faster answers to your questions. These
are good questions that should probably be added to our "Frequently Asked
Questions" list.
The inability to use a "pipes and filters" approach directly with netCDF was
intentional. It results from an engineering trade-off between direct access
and sequential access, made in favor of direct access. Direct access permits
efficient access to a small subset of a large dataset without reading
through all the preceding data. Sequential access permits simple connection
via pipes, but is inefficient for accessing small amounts of data from a
large file.
As Steve Emmerson has pointed out, a feature of the first release of the
netCDF operators we developed is that they can read an input netCDF data set
from standard input and can also write an output netCDF data set to standard
output. Thus, these operators can be used in UNIX pipelines (although this
is only recommended for small data sets). This was implemented on input by
copying standard input to a temporary file and opening it as a netCDF file,
and similarly on output by copying a temporary output file to standard
output. A similar approach can be used with any program that deals with
netCDF data, providing the advantages of direct access when file names are
provided for input or output, but using standard input or output via
transparent copying otherwise.
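The copying strategy above can be sketched as follows. This is a minimal
illustration in Python, not the code used in the netCDF operators. The helper
names (spool_to_tempfile, resolve_input, emit_file) are hypothetical, the
convention that a missing name or "-" means standard input is an assumption,
and the actual netCDF open call is left to the caller.

```python
import shutil
import tempfile


def spool_to_tempfile(stream, suffix=".nc"):
    """Copy a sequential byte stream to a named temporary file; return its path."""
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        shutil.copyfileobj(stream, tmp)
    return tmp.name


def resolve_input(name, stdin):
    """Return (path, is_temporary) for the input data set.

    Assumption: a name of None or "-" means "read from standard input".
    A real file name is returned unchanged, so direct access is kept.
    """
    if name in (None, "-"):
        return spool_to_tempfile(stdin), True
    return name, False


def emit_file(path, stream):
    """Copy a finished output file to a sequential byte stream."""
    with open(path, "rb") as src:
        shutil.copyfileobj(src, stream)
    stream.flush()
```

A program would typically pass sys.stdin.buffer and sys.stdout.buffer as the
streams, open the returned path with whatever netCDF interface it uses, and
remove any temporary file when it is done.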
> 2) In our applications we need more than one unlimited dimension. As far as
> I can see, netCDF allows only one.
That's right, and the reason for that limitation is again an engineering
trade-off, to provide efficient access to cross-sections of data. We know
of no implementation that permits multiple unlimited dimensions, efficient
access to orthogonal cross-sections of data, and the ability to later append
data along any of the unlimited dimensions.
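One way to see the trade-off is through row-major storage, where the size of
the slowest-varying dimension never enters the offset calculation. The sketch
below illustrates that general point in Python. It is not netCDF's actual
layout code, and the function name is hypothetical.

```python
def row_major_offset(index, shape):
    """Element offset of `index` in a row-major (C-order) array of `shape`."""
    offset = 0
    for i, n in zip(index, shape):
        # The first dimension's size is multiplied by zero here, so it
        # has no effect on where any element is stored.
        offset = offset * n + i
    return offset


# Growing the first dimension from 3 to 5 leaves existing data in place:
assert row_major_offset((1, 2), (3, 4)) == row_major_offset((1, 2), (5, 4))

# Growing the second dimension from 4 to 5 moves existing data:
assert row_major_offset((1, 2), (3, 4)) != row_major_offset((1, 2), (3, 5))
```

With this layout, only one dimension can grow by simple appending. Growth
along any other dimension changes the offsets of data already written.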
> Will there be changes in the future solving the two problems or could we
> solve them ourselves? Do others have these problems? Is anybody from Unidata
> listening?
The first limitation is one you can get around yourself by the strategy I
have described, if you are willing to accept the performance penalty of
an extra copy of the input or output files.
We have investigated ways to remove the restriction on a single unlimited
dimension, but these appear to require adding other new restrictions to
netCDF data access. For example, by restricting the writing of
multidimensional arrays to occur in the same order as they are stored,
rather than in arbitrary order as is now allowed, multiple unlimited
dimensions might be (weakly) supported. Another approach requires garbage
collection, to recover unused locations so that a file doesn't grow
quadratically in size as a single vector is extended linearly along an
unlimited dimension.
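The quadratic growth mentioned above can be illustrated with simple
arithmetic, under the assumption (made here only for illustration) that each
extension writes the enlarged vector to a fresh location and the old copy is
never reclaimed. The function name is hypothetical.

```python
def size_without_reclaim(n):
    """Elements occupied after growing a vector one element at a time to
    length n, when every extension relocates the whole vector and the
    abandoned copies stay in the file."""
    # Copies of length 1, 2, ..., n all remain: 1 + 2 + ... + n = n(n+1)/2.
    return sum(range(1, n + 1))


# A vector of 1000 elements would occupy space for 500500 elements.
assert size_without_reclaim(1000) == 1000 * 1001 // 2
```

Recovering the abandoned copies would keep the file size proportional to the
length of the vector.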
A workaround when you need multiple unlimited dimensions in a single file is
to use multiple files with separate unlimited dimensions in each, but this
only works if no single variable uses more than one of the desired unlimited
dimensions.
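The condition in this workaround can be checked mechanically. The following
Python sketch groups variables by the unlimited dimension they use and rejects
any variable that uses more than one. The function name and the dictionary
layout are hypothetical, and grouping the fixed-size variables under None is
an arbitrary choice made for the example.

```python
def split_by_unlimited(variables, unlimited):
    """Assign each variable to a file keyed by its unlimited dimension.

    variables: dict mapping variable name -> tuple of dimension names
    unlimited: set of dimension names that should be unlimited
    Returns a dict mapping each unlimited dimension (or None for
    variables with only fixed dimensions) to a list of variable names.
    """
    files = {}
    for name, dims in variables.items():
        used = [d for d in dims if d in unlimited]
        if len(used) > 1:
            # The workaround does not apply to this variable.
            raise ValueError(
                f"variable {name!r} uses several unlimited dimensions: {used}"
            )
        key = used[0] if used else None
        files.setdefault(key, []).append(name)
    return files
```

For example, a variable on ("time", "lat") and another on ("station",
"level") can go in two files with "time" and "station" unlimited,
respectively, while a variable on ("time", "station") cannot be handled this
way.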
--
Russ Rew UCAR Unidata Program
russ@xxxxxxxxxxxxxxxx P.O. Box 3000
http://www.unidata.ucar.edu/ Boulder, CO 80307-3000