Re: [netcdfgroup] Whether or not unlimited dimensions affect access performance?

  • To: chris wen <chriswen11@xxxxxxxxx>
  • Subject: Re: [netcdfgroup] Whether or not unlimited dimensions affect access performance?
  • From: Russ Rew <russ@xxxxxxxxxxxxxxxx>
  • Date: Tue, 19 Jul 2011 10:04:09 -0600
Hi Chris,

> Will the performance of access to NetCDF degrades if I set every dimension
> unlimited?

Short answer: yes.

Explanation: when you make a dimension unlimited, variables that use
that dimension are stored in chunks rather than contiguously.  Chunked
storage incurs some overhead, both for the B-tree that indexes the
chunks and for chunks that are only partially written.

Also, by default, variables that use unlimited dimensions get a chunk
length of 1 along each unlimited dimension.  That's a reasonable
default for multidimensional variables that have only one unlimited
dimension, but if all dimensions are unlimited, the default would make
every chunk just big enough to hold a single value, which would be a
very inefficient use of chunking.
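
To make the difference concrete, here is a rough back-of-the-envelope
sketch in plain Python.  The variable shape (1000 x 180 x 360), the
4-byte value size, and the helper name chunk_stats are all made up for
illustration.  The sketch simply takes the "chunk length 1 along each
unlimited dimension" default described above at face value; it does not
query the library.

```python
from math import ceil, prod


def chunk_stats(shape, chunk_shape, itemsize):
    """Return (values per chunk, bytes per chunk, number of chunks).

    Hypothetical helper for illustration only, not part of any netCDF API.
    """
    values_per_chunk = prod(chunk_shape)
    # Chunks needed along each axis, rounding up for a partial last chunk.
    n_chunks = prod(ceil(n / c) for n, c in zip(shape, chunk_shape))
    return values_per_chunk, values_per_chunk * itemsize, n_chunks


# Assumed current sizes of a (time, lat, lon) variable of 4-byte floats.
shape = (1000, 180, 360)

# Only "time" unlimited: chunk length 1 along time, full extent elsewhere.
one_unlimited = chunk_stats(shape, (1, 180, 360), itemsize=4)

# Every dimension unlimited: chunk length 1 along every axis.
all_unlimited = chunk_stats(shape, (1, 1, 1), itemsize=4)

print("one unlimited dimension:", one_unlimited)
print("all dimensions unlimited:", all_unlimited)
```

Under these assumptions the first layout needs 1000 chunks of 259200
bytes each, while the second needs 64.8 million chunks of 4 bytes each,
every one of which has to be tracked in the chunk index.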

If you specify chunk lengths explicitly for each variable, and if you
intend to append an unknown amount of data along every dimension, it may
make sense to make every dimension unlimited.  Otherwise, it would be
better to declare only those dimensions unlimited along which data will
actually be appended.

--Russ


