Re: performance degrades with filesize

>>>>> "Russ" == Russ Rew <russ@xxxxxxxxxxxxxxxx> writes:

    Russ> There still may be a bug under some other combination of
    Russ> circumstances that is causing an anomalous performance problem,
    Russ> but we'll need to be able to duplicate it here to determine what
    Russ> the problem is.

I have worked up a Python script that clearly shows the problem on my
system.  In doing so, I discovered that the slowdown only occurs if the
file contains both scalar variables and higher-dimensional variables.  In
Konrad Hinsen's netCDF module, scalar values are treated slightly
differently: they are written with ncvarput1(), whereas all other array
shapes are written with ncvarputg().

Russ, if you don't use Python, I can try to port this to C.

This script creates one array-valued variable and one scalar-valued
variable, both along the unlimited time dimension.  It then writes 1000
records of random values to the file while timing the loop.  On my system,
the printed times start near zero and grow linearly to about 0.7 seconds
per ten iterations.

John


from Scientific.IO.NetCDF import NetCDFFile
from Numeric import *
from RandomArray import uniform
from time import clock

cdf = NetCDFFile('garbage.nc', 'w')

dims = [10, 50, 23, 15, 125]
for i in range(len(dims)):
    cdf.createDimension('x%d' % i, dims[i])
cdf.createDimension('time', None)
vardims = [
        ('time', 'x1', 'x2'),
        ('time',)
        ]

vars = []
for i in range(len(vardims)):
    vars.append(cdf.createVariable('y%d' % i, Float, vardims[i]))
    
time = 0
c = clock()
for time in range(1000):
    for v in vars:
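        try:
            # Write one full record of random values for this time step.
            v[time,:] = uniform(-1, 1, tuple(list(v.shape[1:])))
        except IndexError:
            # Fallback for the variable that has only the time dimension.
            v.assignValue(uniform(-1, 1))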
    
    if time % 10 == 0 :
        new_c = clock()
        print new_c - c
        c = new_c
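
For anyone who wants to repeat the measurement without the netCDF module, the timing part of the script above can be sketched in present-day Python on its own. This is only an illustration under stated assumptions: the names time_batches and slope are hypothetical helpers, not part of any library, and the sketch uses wall-clock time from time.perf_counter() rather than clock(). A slope near zero would mean each batch costs about the same; a clearly positive slope would match the linear growth described above.

```python
from time import perf_counter


def time_batches(step, n_steps=1000, batch=10):
    """Call step(i) for i in range(n_steps).

    Returns a list with the elapsed wall-clock seconds of each
    complete batch of `batch` calls.
    """
    timings = []
    start = perf_counter()
    for i in range(n_steps):
        step(i)
        if (i + 1) % batch == 0:
            now = perf_counter()
            timings.append(now - start)
            start = now
    return timings


def slope(values):
    """Least-squares slope of values against their index 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = (n - 1) / 2.0
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den
```

To use it, pass a function that writes one record, for example a hypothetical write_record(i) wrapping the body of the loop above, and look at slope(time_batches(write_record)).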


-- 
John Galbraith                    email: john@xxxxxxxxxxxxxxx
Los Alamos National Laboratory,   home phone: (505) 662-3849
                                  work phone: (505) 665-6301
