Re: performance degrades with filesize

Russ, your program runs in constant time for me, too.  Konrad Hinsen found
that the problem (in my case) was that I was calling the Python module
incorrectly, in a way that rewrote the whole scalar variable (the one with
only the unlimited dimension) on every time step.  The reason I needed both
kinds of variables to see the problem is undoubtedly that the scalar
variable caused the slowdown, while the vector variable made the file big
enough to notice it.
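A back-of-the-envelope model of the slowdown (my own sketch, based on the
explanation above, not measured numbers): if step t rewrites records 0..t
instead of just record t, total writes grow quadratically with the number
of steps, and a large vector variable makes each rewritten record expensive
enough to notice.

```python
# Records written over n steps under the buggy and the fixed write patterns.

def total_records_rewritten(n_steps):
    """Buggy pattern: step t rewrites records 0..t (t + 1 records)."""
    return sum(t + 1 for t in range(n_steps))

def total_records_appended(n_steps):
    """Fixed pattern: step t writes only record t."""
    return n_steps

assert total_records_rewritten(1000) == 500500   # grows as O(n^2)
assert total_records_appended(1000) == 1000      # grows as O(n)
```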

Basically, the fix is to change

        try:
            v[time,:] = uniform(-1, 1, tuple(list(v.shape[1:])))
        except IndexError:
            v.assignValue(uniform(-1, 1))    # rewrites the whole variable

to 

        try:
            v[time,:] = uniform(-1, 1, tuple(list(v.shape[1:])))
        except IndexError:
            v[time] = uniform(-1, 1)         # writes only record 'time'

and then the Python module writes only the newest scalar value (which is
the correct behaviour).
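To see why indexed assignment is the right call, here is a minimal stand-in
for a record variable that just counts writes (the class, the loop bounds,
and the `uniform` helper are my illustrations, not the netCDF API):

```python
import random

class FakeRecordVariable:
    """Stand-in for a netCDF record variable; counts writes per index."""
    def __init__(self):
        self.records = {}
        self.writes = 0
    def __setitem__(self, index, value):
        self.records[index] = value
        self.writes += 1

def uniform(lo, hi):
    return random.uniform(lo, hi)

v = FakeRecordVariable()
for time in range(100):
    v[time] = uniform(-1, 1)   # one record per step, never a full rewrite

assert v.writes == 100         # exactly one write per time step
assert len(v.records) == 100
```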

I could not duplicate Konrad's result by removing the nc_inq_dimlen()
call, and I do not know what was happening there.  Also, this problem was
pretty obscure and probably would not bite somebody more experienced with
NetCDF and the Python interface.

One poster reminded us to set the fill value (as the netCDF documentation
recommends somewhere) for good performance, which is undoubtedly a good
idea.
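A rough sketch of why that helps (an assumed accounting model, not a
measurement): with fill mode on, the library writes the fill value into
each new record before your data arrives, so every record hits the disk
twice; in the C library this is toggled with nc_set_fill(ncid, NC_NOFILL,
&old_mode).

```python
# Toy accounting of prefill cost; counts modeled disk writes, not timings.

def bytes_written(n_records, record_bytes, prefill):
    """With prefill, each new record is written once with the fill value
    and once with real data; with NOFILL, only the data pass happens."""
    passes = 2 if prefill else 1
    return passes * n_records * record_bytes

# Disabling prefill halves the modeled write traffic.
assert bytes_written(1000, 8, prefill=True) == 2 * bytes_written(1000, 8, prefill=False)
```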

By the way, this is my first interaction with netcdfgroup and it was great
to get such relevant feedback so quickly.  And, I am relieved that I can just
write better code instead of hacking up the netcdf library... 8-)

John

-- 
John Galbraith                    email: john@xxxxxxxxxxxxxxx
Los Alamos National Laboratory,   home phone: (505) 662-3849
                                  work phone: (505) 665-6301
