Hi Jon:
The need to read the entire variable, I think, only happens when the data is
compressed: then you have no choice but to uncompress the whole thing and
subset it afterwards. Can you check whether that's what's happening in your
case? I think an h5dump will tell you if the variable is compressed.
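For example, assuming the standard h5dump tool from the HDF5 distribution and a hypothetical file name, the dataset properties (including any compression filters and chunking) can be listed like this:

```shell
# -H: print the header only (no data values)
# -p: also print dataset properties, including the storage
#     layout (chunking) and any filters such as DEFLATE compression
h5dump -H -p mydata.h5
```

A compressed dataset will show a FILTERS section (e.g. COMPRESSION DEFLATE) in the output.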
Thanks,
John
Jon Blower wrote:
Hi all,
I have been trying to use the latest version of the Java NetCDF libraries
(2.2.10) to read data from some rather large HDF5 files. I kept running out
of memory, and after further investigation I found the cause. It seems
that, when using Variable.read() to get data from the file, *all* the data
from the variable is read into memory, no matter what subset details are
specified. So read("0,0,0") will read all the variable's data into
memory, then wrap it as an Array object with a logical size of one data
point.
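[For reference, this is the kind of call involved. A minimal sketch against the NetCDF-Java API, with hypothetical file and variable names:

```java
import java.io.IOException;
import ucar.ma2.Array;
import ucar.ma2.InvalidRangeException;
import ucar.nc2.NetcdfFile;
import ucar.nc2.Variable;

public class ReadSubset {
    public static void main(String[] args)
            throws IOException, InvalidRangeException {
        // Hypothetical file and variable names
        NetcdfFile ncfile = NetcdfFile.open("data.h5");
        try {
            Variable v = ncfile.findVariable("temperature");
            // Section spec selects a single point at index (0,0,0);
            // the returned Array has a logical size of one element,
            // but the whole variable may still be read under the hood
            Array subset = v.read("0,0,0");
            System.out.println(subset.getSize());
        } finally {
            ncfile.close();
        }
    }
}
```
]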
If I remember correctly, this used to be the behaviour for NetCDF files too,
until the new version of the libraries. It means that reading even small
subsets of data from large HDF5 files is very slow or impossible. Is it
possible to read a subset of data from an HDF5 file using the NetCDF libs
without loading all the data into memory?
Thanks, Jon
--------------------------------------------------------------
Dr Jon Blower Tel: +44 118 378 5213 (direct line)
Technical Director Tel: +44 118 378 8741 (ESSC)
Reading e-Science Centre Fax: +44 118 378 6413
ESSC Email: jdb@xxxxxxxxxxxxxxxxxxxx
University of Reading
3 Earley Gate
Reading RG6 6AL, UK
--------------------------------------------------------------