Re: [netcdfgroup] abysmal performance

The nccopy utility has a -u option that copies a file while converting
all unlimited dimensions to fixed-size dimensions.
As an experiment, you might run nccopy -u on your file and then rerun
your performance tests to see if they improve.
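For example (the file names here are only placeholders for your data):

    nccopy -u scattered_time.nc fixed_time.nc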
=Dennis Heimbigner
 Unidata

On 6/2/2016 2:35 PM, Burlen Loring wrote:
That sounds like it. ncdump on one file shows "time = UNLIMITED ; // (8
currently)". It's somewhat unexpected that these 8 values are not stored
in a contiguous array, but so be it. Thanks for clarifying. This is
simulation output, so our options may be limited. I will be sure to
mention this to the scientists; hopefully they can write time as a
fixed dimension.

On 06/02/2016 01:24 PM, Bowman, Kenneth P wrote:
Hi Burlen,

If time is your unlimited (record) dimension, then the time values are
scattered through the 433 MB file.  The same is true for any variable
that has a time dimension.  To read the time variable, the netCDF
library has to seek around the file and collect the values.

The longitude variable is stored contiguously in the file and can be read quickly.

If you know the number of time steps before you write the file, you can
change the unlimited time dimension to a fixed dimension.  Then anything
dimensioned by (only) time will be contiguous in the file.
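
As a minimal sketch of that approach with the netCDF C API (the file
name, dimension length, and values below are illustrative placeholders,
not taken from the actual dataset):

    #include <stdio.h>
    #include <stdlib.h>
    #include <netcdf.h>

    /* Abort with the netCDF error message if a call fails. */
    #define CHECK(e) do { int _s = (e); if (_s != NC_NOERR) { \
        fprintf(stderr, "%s\n", nc_strerror(_s)); exit(1); } } while (0)

    int main(void)
    {
        int ncid, time_dim, time_var;
        size_t ntime = 8;                      /* number of steps, known up front */
        double times[8] = {0, 1, 2, 3, 4, 5, 6, 7};

        CHECK(nc_create("fixed_time.nc", NC_CLOBBER, &ncid));

        /* Fixed length instead of NC_UNLIMITED, so data dimensioned only
           by time is laid out contiguously in the file. */
        CHECK(nc_def_dim(ncid, "time", ntime, &time_dim));
        CHECK(nc_def_var(ncid, "time", NC_DOUBLE, 1, &time_dim, &time_var));
        CHECK(nc_enddef(ncid));

        CHECK(nc_put_var_double(ncid, time_var, times));
        CHECK(nc_close(ncid));
        return 0;
    }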

Or you can rewrite the files with fixed dimensions.  That read
performance penalty is one of the tradeoffs of having the flexibility
of an unlimited dimension.

Good luck!

Ken


Date: Thu, 2 Jun 2016 12:41:53 -0700
From: Burlen Loring <bloring@xxxxxxx>
To: Tom Fogal <tfogal@xxxxxxxxxxxx>, netcdfgroup@xxxxxxxxxxxxxxxx
Subject: Re: [netcdfgroup] abysmal performance
Message-ID: <961631fd-2aad-d348-ce1d-8a70a9e67287@xxxxxxx>
Content-Type: text/plain; charset=windows-1252; format=flowed

Hi Tom,

That's not an option, and it has its own issues; for example, if the file
size exceeds the size of a tape drive we can't archive it. Besides, it
doesn't seem like a Lustre metadata issue: open is relatively fast, about
0.096 sec, and that wouldn't explain why reading the time dimension with
only 8 values takes on the order of 1 sec while reading the lon dimension
with 1152 values takes on the order of 1e-4 sec.
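
For reference, a minimal sketch of how that comparison could be measured
with the netCDF C API (the file name "example.nc" and the variable names
are assumptions, and error handling is kept minimal):

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <netcdf.h>

    /* Time a full read of a 1-D double variable; returns seconds, or -1 on error. */
    static double timed_read(int ncid, const char *name)
    {
        int varid, dimid;
        size_t len;
        struct timespec t0, t1;

        if (nc_inq_varid(ncid, name, &varid) != NC_NOERR) return -1.0;
        if (nc_inq_vardimid(ncid, varid, &dimid) != NC_NOERR) return -1.0;
        if (nc_inq_dimlen(ncid, dimid, &len) != NC_NOERR) return -1.0;

        double *buf = malloc(len * sizeof *buf);
        clock_gettime(CLOCK_MONOTONIC, &t0);
        nc_get_var_double(ncid, varid, buf);   /* read the whole coordinate variable */
        clock_gettime(CLOCK_MONOTONIC, &t1);
        free(buf);

        return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    }

    int main(void)
    {
        int ncid;
        if (nc_open("example.nc", NC_NOWRITE, &ncid) != NC_NOERR) return 1;
        printf("time: %.6f s\n", timed_read(ncid, "time"));
        printf("lon:  %.6f s\n", timed_read(ncid, "lon"));
        nc_close(ncid);
        return 0;
    }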

Burlen



-----------------------------------------------------------------------------
Dr. Kenneth P. Bowman                                    1014A Eller Building
David Bullock Harris Professor of Geosciences            979-862-4060
Department of Atmospheric Sciences                       979-862-4466 fax
Texas A&M University
3150 TAMU
College Station, TX   77843-3150

http://atmo.tamu.edu/people/faculty/bowmankenneth.html






