Re: performance degrades with filesize

>>>>> "Ethan" == Ethan Alpert <ethan@xxxxxxxxxxxx> writes:

    Ethan> I don't know anything about the underlying format and its
    Ethan> implementation but I have experienced the performance
    Ethan> degradation you are describing. Growing the unlimited dimension
    Ethan> is the cause. I can't be certain but it seems like the entire
    Ethan> file is rewritten when the unlimited dimension increases.  Also

I think the slowdown comes from writing at a high index along the unlimited
dimension, not necessarily from growing it.  If growing were the slow part,
then my test of writing the same high frame over and over (timestep 5000, in
my case) would only be slow on the first write.
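
For anyone who wants to repeat this kind of test, here is a minimal timing
sketch.  It assumes a hypothetical callable write_frame(timestep) that writes
one record at the given index; that function, the repeat count and the
slowdown factor are placeholders of mine, not part of any netCDF API.

```python
import time


def time_repeated_writes(write_frame, timestep, repeats=5):
    """Return the wall-clock seconds taken by each of `repeats` writes."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        write_frame(timestep)  # hypothetical: writes one record at `timestep`
        timings.append(time.perf_counter() - start)
    return timings


def only_first_is_slow(timings, factor=5.0):
    """True if the first write is at least `factor` times slower than every
    later write, which is what "growing is the cost" would predict."""
    first, rest = timings[0], timings[1:]
    return bool(rest) and all(first >= factor * t for t in rest)
```

If only_first_is_slow() comes back False for a high timestep, the cost is
being paid on every write and not just on the one that grows the dimension.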

    Ethan> if I have 5 variables all using the unlimited dimension and I
    Ethan> increase the unlimited dimension the file size increases 5
    Ethan> times. This means that something is going through the *entire*
    Ethan> file every time to make more space for each of the variables.

My own experience tends to agree with this, except that I think it happens
not only when you need more space, but anytime you write to the file.
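
As far as I understand the classic file layout, the variables that use the
unlimited dimension are stored interleaved, one record at a time, with each
variable's slice padded to a multiple of 4 bytes.  The sketch below is a
simplified model of that layout, not code from the library; the starting
offset and the variable sizes are made-up numbers, and it ignores the special
case of a file with a single record variable.

```python
def padded(nbytes):
    """Round a per-record size up to the next multiple of 4 bytes."""
    return (nbytes + 3) // 4 * 4


def record_size(var_sizes):
    """Bytes added to the file when the unlimited dimension grows by one."""
    return sum(padded(n) for n in var_sizes)


def record_offset(begin, var_sizes, var_index, rec):
    """Byte offset of record `rec` of variable `var_index`.

    `begin` is the (assumed) offset where the record section starts, and
    `var_sizes` lists the bytes each record variable needs per record.
    """
    within_record = sum(padded(n) for n in var_sizes[:var_index])
    return begin + rec * record_size(var_sizes) + within_record


# Five hypothetical variables, each holding 100 doubles per timestep.
sizes = [800] * 5
growth_per_step = record_size(sizes)                # 4000 bytes
offset_5000 = record_offset(1024, sizes, 0, 5000)   # 20001024
```

In this model one new timestep adds space for all five variables at once,
which would match the "5 times" growth, and the position of any record is a
direct calculation.  Nothing in the layout itself seems to require going
through the whole file.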

    Ethan> My suggestion is to not use the unlimited dimension to create your
    Ethan> files. If at all possible predefine all variables and attributes
    Ethan> in one define mode. If you do this you won't incur this
    Ethan> penalty of rewriting the file every time you grow a dimension.

Well, I could avoid growing the dimension by allocating a huge file at
first and then chopping off the unused portion when my simulation completes.
I could also grow the time dimension only occasionally rather than at every
timestep.  Unfortunately, I don't think that this is the whole issue.  I
suppose that not using an unlimited dimension might be a workaround, but
it is basically a stab in the dark.  It is probably just a bug somewhere
(it could be in my own code), because I would think that the underlying C
library could append to the file quickly.
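
To put a number on "only occasionally", here is a small bookkeeping sketch.
It does not touch a netCDF file at all; the chunk size and both function
names are my own placeholders.  It only counts how often the time dimension
would have to be extended if space were reserved a chunk of timesteps at a
time.

```python
def capacity_for(step, chunk):
    """Smallest multiple of `chunk` that can hold records 0..step."""
    return (step // chunk + 1) * chunk


def count_growths(n_steps, chunk):
    """Number of times the dimension is extended while writing n_steps."""
    grows = 0
    capacity = 0
    for step in range(n_steps):
        if step >= capacity:
            # Out of reserved records: extend by a whole chunk.
            capacity = capacity_for(step, chunk)
            grows += 1
    return grows


every_step = count_growths(5000, 1)     # 5000 extensions
in_chunks = count_growths(5000, 500)    # 10 extensions
```

With a chunk of 500 the dimension is extended 10 times instead of 5000 for
the same run, at the cost of up to 499 unused records to trim at the end.
That only helps if growing really is the expensive step.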

Thanks,  
         John

-- 
John Galbraith                    email: john@xxxxxxxxxxxxxxx
Los Alamos National Laboratory,   home phone: (505) 662-3849
                                  work phone: (505) 665-6301
