[Jeremy Beal wrote that he has large quantities of both spatially and
temporally irregular/sparse data that he needs to store and retrieve
efficiently in a platform-independent manner, and wonders how best to
do this.]
One additional question comes to mind immediately:
Do you want fast selective random access?
Be sure of your answer: in many cases it can make an astounding
difference to the style of work you do. Fast selective random access
makes an enormous difference for analysis and visualization. (I've also
seen too many met and met-related models built around sequential
files that have become vast conspiracies to manipulate a complex
shared state centered around the positions of a multiplicity of
sequential file pointers.)
If you don't need/want fast selective random access, then the
XDR'ed binary file is an acceptable solution. Otherwise, for sparse
data you need files with built-in indexing. HDF VSets are a partial
solution to this, provided you don't have very many time steps: they
have a doubly-linked list of index blocks interspersed with data blocks.
Be aware, though, that the overhead of sequential access to those index
blocks can kill you if you do have lots of time steps. If you have a
year's worth of hourly met observations stored this way and you want
to look at the 0Z Dec 1 observations, be prepared to sit for five or
ten minutes while your disk drive grinds through the 8000 or so index
blocks for Jan 1-Dec 1 before it can even begin to think about data.
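To make the cost concrete, here is a rough Python sketch of the two access
patterns. It is not HDF's actual on-disk format: the block layout, the
names (INDEX_FMT, write_chained, read_by_walking, read_by_table) and the
forward-only chain are all assumptions made up for illustration. Each time
step is an index block followed by a variable-length data block, packed
big-endian so the bytes are the same on any platform. One reader follows
the chain from the start of the file; the other uses a separate table of
offsets and does a single seek.

```python
import io
import struct

# Hypothetical layout (not HDF's): [index block][data block] repeated.
# Index block = offset of the next index block (uint64) + value count (uint32),
# big-endian, no padding.
INDEX_FMT = ">QI"
INDEX_SIZE = struct.calcsize(INDEX_FMT)


def write_chained(f, steps):
    """Write one index block plus data block per time step.

    Returns the list of index-block offsets, which doubles as the
    direct-access table used by read_by_table().
    """
    offsets = []
    for values in steps:
        start = f.tell()
        offsets.append(start)
        next_offset = start + INDEX_SIZE + 8 * len(values)
        f.write(struct.pack(INDEX_FMT, next_offset, len(values)))
        f.write(struct.pack(">%dd" % len(values), *values))
    return offsets


def read_by_walking(f, step):
    """Follow the chain from offset 0; return (values, index blocks read)."""
    pos = 0
    blocks_read = 0
    while True:
        f.seek(pos)
        next_offset, n = struct.unpack(INDEX_FMT, f.read(INDEX_SIZE))
        blocks_read += 1
        if blocks_read - 1 == step:
            data = struct.unpack(">%dd" % n, f.read(8 * n))
            return list(data), blocks_read
        pos = next_offset


def read_by_table(f, offsets, step):
    """Seek straight to the requested step; return (values, index blocks read)."""
    f.seek(offsets[step])
    _, n = struct.unpack(INDEX_FMT, f.read(INDEX_SIZE))
    data = struct.unpack(">%dd" % n, f.read(8 * n))
    return list(data), 1


if __name__ == "__main__":
    # One non-leap year of hourly steps, each with 0 to 4 made-up observations.
    steps = [[float(t)] * (t % 5) for t in range(365 * 24)]
    f = io.BytesIO()
    offsets = write_chained(f, steps)

    target = 334 * 24  # 0Z Dec 1, counting hours from 0Z Jan 1
    print(read_by_walking(f, target)[1], "index blocks read by walking")
    print(read_by_table(f, offsets, target)[1], "index block read via table")
```

Jan 1 through Nov 30 is 334 days, so the walking reader has to touch
334 * 24 = 8016 index blocks before it reaches the one it wants. In memory
that is nothing; on a disk where each block is a separate seek, it is the
wait described above. The offset table is the simplest form of built-in
index, and it has to be stored somewhere the reader can find it (a header
or trailer, say), which this sketch leaves out.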
Something else worth checking is PDB, which is part of Livermore's
Portable Application Code Toolkit; see
http://www.llnl.gov/def_sci/pact/hact_homepage.html
It seems to be a lower-level interface than netCDF, but does have support
for building efficient index structures.
fwiw
xcc@xxxxxxxxxxxxxxxx
Carlie J. Coats, Jr.                         coats@xxxxxxxx
MCNC Environmental Programs                  phone: (919)248-9241
North Carolina Supercomputing Center         fax: (919)248-9245
3021 Cornwallis Road                         P. O. Box 12889
Research Triangle Park, N. C. 27709-2889 USA
"My opinions are my own, and I've got *lots* of them!"