
Re: Reading HDF5 data: memory issues



Hi Jon,

It looks like there is code to read just a subset of an HDF5 file, so I'm not sure what is going on here. John Caron will be able to give a better answer, but he is out of the office until Monday.

Ethan

Jon Blower wrote:

Hi all,

I have been trying to use the latest version of the Java NetCDF libraries
(2.2.10) to read data from some rather large HDF5 files.  I kept running out
of memory, and after further investigation I found the cause.  It seems
that, when using Variable.read() to get data from the file, *all* the data
from the variable is read into memory, no matter what subset is
specified.  So read("0,0,0") will read all of the variable's data into
memory, then wrap it as an Array object with a logical size of one data
point.
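
The difference between the two behaviours can be illustrated without the
NetCDF library at all. The following is only a minimal sketch using the
standard java.io and java.nio classes. It assumes a flat file of big-endian
doubles in row-major order, which is not the HDF5 layout, and the class and
method names (SubsetReadSketch, writeDemoFile, readAllThenPick, seekAndRead)
are hypothetical names made up for this illustration. It is not how the
NetCDF libraries are implemented.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration only: a flat binary file stands in for a variable.
class SubsetReadSketch {

    /** Writes rows*cols doubles (each value equals its flat index) to a temp file. */
    static Path writeDemoFile(int rows, int cols) {
        try {
            Path path = Files.createTempFile("subset-demo", ".bin");
            // ByteBuffer is big-endian by default, matching RandomAccessFile.readDouble().
            ByteBuffer buf = ByteBuffer.allocate(rows * cols * Double.BYTES);
            for (int i = 0; i < rows * cols; i++) {
                buf.putDouble(i);
            }
            Files.write(path, buf.array());
            return path;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Eager approach: load the whole file, then pick one value. Memory use grows with file size. */
    static double readAllThenPick(Path path, int cols, int row, int col) {
        try {
            byte[] all = Files.readAllBytes(path);
            return ByteBuffer.wrap(all).getDouble((row * cols + col) * Double.BYTES);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Subset approach: seek to the element's offset and read only its 8 bytes. */
    static double seekAndRead(Path path, int cols, int row, int col) {
        try (RandomAccessFile raf = new RandomAccessFile(path.toFile(), "r")) {
            long offset = ((long) row * cols + col) * Double.BYTES;
            raf.seek(offset);
            return raf.readDouble();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path path = writeDemoFile(4, 5);
        System.out.println(readAllThenPick(path, 5, 2, 3));
        System.out.println(seekAndRead(path, 5, 2, 3));
    }
}
```

Both methods return the same value, but only readAllThenPick holds the
entire file in memory, which is the pattern that exhausts the heap on large
files. Real HDF5 data may be chunked or compressed, so a library cannot
always seek to a single element this directly.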

If I remember correctly, this used to be the behaviour for NetCDF files too,
until the new version of the libraries.  It means that reading even small
subsets of data from large HDF5 files is very slow or impossible.  Is it
possible to read a subset of data from an HDF5 file using the NetCDF libs
without loading all the data into memory?

Thanks, Jon

--------------------------------------------------------------
Dr Jon Blower              Tel: +44 118 378 5213 (direct line)
Technical Director         Tel: +44 118 378 8741 (ESSC)
Reading e-Science Centre   Fax: +44 118 378 6413
ESSC                       Email: address@hidden
University of Reading
3 Earley Gate
Reading RG6 6AL, UK
--------------------------------------------------------------

--
Ethan R. Davis                                Telephone: (303) 497-8155
Software Engineer                             Fax:       (303) 497-8690
UCAR Unidata Program Center                   E-mail:    address@hidden
P.O. Box 3000
Boulder, CO  80307-3000                       http://www.unidata.ucar.edu/
---------------------------------------------------------------------------