Re: Performance problem with large files

Konrad Hinsen wrote:
> 
> Martin Dix <martin.dix@xxxxxxxxxxxx> wrote:
> 
> > For simplicity call the unlimited dimension t. A netcdf file stores
> > all the data for t=1, then for t=2 etc. Your description of the
> > array indices means that each subarray is scattered through the
> > entire file and requires accessing almost every file block. Things
> > should be a lot better if you write subarrays of 8000 x 3 x 1 or if
> > you can't do this, rearrange the file so that the 8000 dimension is
> > unlimited rather than the 16000 dimension.
> 
> But I must read along both major dimensions, depending on the type of
> analysis I am doing. From your explanation it seems that one of the two
> access types will always be very slow. Shouldn't it be possible for
> the netCDF library to organize the data in such a way that a scan
> along any dimension is doable with acceptable efficiency? For example,
> each contiguous file block could correspond to a subarray of
> approximately equal extent along each dimension.
> 
> Could I gain anything from not using an unlimited dimension? In some
> cases I know the final size before creating the file, and in others
> it might be worth making a fixed-size copy before some lengthy analysis.

Yes. Without an unlimited dimension the data is stored block-oriented:
each variable occupies one contiguous block in the file, so access to
the data should be faster. We always use fixed-size dimensions because
we read a file many times once it has been created.

If you store multidimensional data, you should organize it so that
the counting dimension is the last one. The field itself is then
best organized for reading parts of it. In your case you should store
the data as 16000 x 3 x 8000.
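
To make the layout argument concrete, here is a small sketch in plain
Python. It is not netCDF API code: the function name contiguous_runs is
made up for illustration, and it assumes a fixed-size variable stored as
one contiguous block in row-major order (last dimension varying fastest).
Under that assumption it counts how many separate contiguous pieces of
the file a hyperslab read has to touch.

```python
def contiguous_runs(shape, count):
    """Return (number_of_runs, values_per_run) for a hyperslab read.

    Assumes the variable is one contiguous row-major block, i.e. the
    last dimension varies fastest. `shape` is the full variable shape,
    `count` is the extent of the requested subarray in each dimension.
    """
    run = 1
    d = len(shape) - 1
    # Walk in from the fastest-varying dimension. As long as a dimension
    # is read in full, the contiguous run extends into the next one.
    while d >= 0:
        run *= count[d]
        full = count[d] == shape[d]
        d -= 1
        if not full:
            break
    # Every remaining (slower) index combination starts a new run.
    n_runs = 1
    for c in count[:d + 1]:
        n_runs *= c
    return n_runs, run


shape = (16000, 3, 8000)

# One index of the first dimension, everything else in full:
print(contiguous_runs(shape, (1, 3, 8000)))

# One index of the last dimension, everything else in full:
print(contiguous_runs(shape, (16000, 3, 1)))
```

With this shape the first read is a single run of 24000 values, while the
second consists of 48000 runs of one value each, spread over the whole
block. So with a plain contiguous layout one of the two scan directions
stays expensive, whichever dimension order you choose; the order only
decides which one.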

regards
Reimar

> --
> -------------------------------------------------------------------------------
> Konrad Hinsen                            | E-Mail: hinsen@xxxxxxxxxxxxxxx
> Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69
> Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
> 45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
> France                                   | Nederlands/Francais
> -------------------------------------------------------------------------------

-- 
Reimar Bauer 

Institut fuer Stratosphaerische Chemie (ICG-1)
Forschungszentrum Juelich
email: R.Bauer@xxxxxxxxxxxxx
http://www.fz-juelich.de/icg/icg1/
=================================================================
an IDL library at ForschungsZentrum Juelich
http://www.fz-juelich.de/icg/icg1/idl_icglib/idl_lib_intro.html

http://www.fz-juelich.de/zb/text/publikation/juel3786.html
=================================================================

read something about linux / windows
http://www.suse.de/de/news/hotnews/MS.html
