Re: Performance problem with large files

Martin Dix wrote:
> 
> hinsen@xxxxxxxxxxxxxxxxxxxxx writes:
> 
>  > ... The data in the files is essentially
>  > one single-precision float array of dimensions 8000 x 3 x 16000, the
>  > last dimension being declared as "unlimited". I read and write
>  > subarrays of shape 1 x 3 x 16000. ...
> 
> For simplicity call the unlimited dimension t. A netcdf file stores
> all the data for t=1, then for t=2 etc. Your description of the
> array indices means that each subarray is scattered through the
> entire file and requires accessing almost every file block. Things
> should be a lot better if you write subarrays of 8000 x 3 x 1 or if
> you can't do this, rearrange the file so that the 8000 dimension is
> unlimited rather than the 16000 dimension.
> 

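To put rough numbers on Martin's point, here is a minimal sketch that counts how many file blocks each access pattern touches. It assumes the file holds only this one variable, that t varies slowest and the 8000 dimension varies fastest, that a float is 4 bytes, and that the file header can be ignored. The 4096-byte block size and the names offset and blocks_touched are made up for the illustration.

```python
FLOAT_BYTES = 4
NX, NY, NT = 8000, 3, 16000   # NT is the unlimited dimension t
BLOCK = 4096                  # hypothetical file-system block size in bytes


def offset(i, j, t):
    """Byte offset of element (i, j, t) from the start of the data.

    Assumes t varies slowest and i fastest, and ignores the file header.
    """
    return ((t * NY + j) * NX + i) * FLOAT_BYTES


def blocks_touched(offsets):
    """Number of distinct blocks that contain at least one requested element."""
    return len({off // BLOCK for off in offsets})


# Subarray of shape 1 x 3 x 16000: one index of the 8000 dimension, every t.
slab_across_t = [offset(0, j, t) for t in range(NT) for j in range(NY)]

# Subarray of shape 8000 x 3 x 1: everything for a single t.
slab_one_t = [offset(i, j, 0) for j in range(NY) for i in range(NX)]

print(blocks_touched(slab_across_t))  # 48000
print(blocks_touched(slab_one_t))     # 24
```

Under these assumptions the 1 x 3 x 16000 read needs a separate block for each of its 48000 values, spread over the whole file, while the 8000 x 3 x 1 read is one contiguous run of 24 blocks.
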
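Every time you write data with an unlimited dimension, the data is not
written as one contiguous block per variable. Instead, the values of all
such variables are interleaved record by record.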

e.g.
DATA1 1,2,3,4,5
DATA2 10,20,30,40,50

results in the following layout in the file:

1,10,2,20,3,30,4,40,5,50


If you now read DATA1, the whole file must be read.

In some cases this makes reading much slower than it would be with
limited (fixed-size) dimensions.

If you use limited dimensions, the layout in the file is
1,2,3,4,5,10,20,30,40,50


Then reading DATA1 touches only a small part of the file.
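
The two layouts can be reproduced with a small sketch in plain Python. It does not use the netCDF library; the function names record_layout, fixed_layout and span are invented for the illustration, and each list element stands for one stored value.

```python
data1 = [1, 2, 3, 4, 5]
data2 = [10, 20, 30, 40, 50]


def record_layout(*variables):
    """Unlimited dimension: values are interleaved record by record."""
    return [value for record in zip(*variables) for value in record]


def fixed_layout(*variables):
    """Limited dimensions: each variable is stored as one contiguous block."""
    return [value for var in variables for value in var]


def span(layout, wanted):
    """Length of the stretch of the file from the first to the last wanted value."""
    positions = [layout.index(value) for value in wanted]
    return max(positions) - min(positions) + 1


interleaved = record_layout(data1, data2)
contiguous = fixed_layout(data1, data2)

print(interleaved)               # [1, 10, 2, 20, 3, 30, 4, 40, 5, 50]
print(contiguous)                # [1, 2, 3, 4, 5, 10, 20, 30, 40, 50]
print(span(interleaved, data1))  # 9
print(span(contiguous, data1))   # 5
```

With the interleaved layout, reading DATA1 has to pass over 9 of the 10 stored values; with the contiguous layout it only needs the first 5.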

regards
Reimar






> Martin Dix
> 
> CSIRO Atmospheric Research                Phone: +61 3 9239 4533
> Private Bag No. 1, Aspendale                Fax: +61 3 9239 4444
> Victoria 3195  Australia                  Email: martin.dix@xxxxxxxxxxxx

-- 
Reimar Bauer 

Institut fuer Stratosphaerische Chemie (ICG-1)
Forschungszentrum Juelich
email: R.Bauer@xxxxxxxxxxxxx
http://www.fz-juelich.de/icg/icg1/
=================================================================
an IDL library at ForschungsZentrum Juelich
http://www.fz-juelich.de/icg/icg1/idl_icglib/idl_lib_intro.html

http://www.fz-juelich.de/zb/text/publikation/juel3786.html
=================================================================

read something about linux / windows
http://www.suse.de/de/news/hotnews/MS.html
