Re: [netcdfgroup] File with large number of variables

Dani <pressec@xxxxxxxxx> writes:

> Hi,
> I have to write and read data to/from a netcdf file that has 750
> variables, all of them using unlimited dimensions (only one per
> variable, some dimensions shared) and 10 fixed dimensions.
>
> I have used netcdf-4 (because of the multiple unlimited dimensions
> requirement) and C API.
>
> I'm doing some prototyping on my development machine (Linux 2GB RAM)
> and found several performance issues that I hope someone can help me
> fix/understand:
>
> (1) When I create a file and try to define 1000 variables (all int)
> and a single shared unlimited dimension, the process takes all
> available RAM (swap included) and fails with "Error (data:def closed)
> -- HDF error" after a (long) while.
>
> If I do the same closing and opening the file again every 10 or 100
> new definitions, it works fine.  I can bypass this by creating the
> file once (ncgen) and using a copy of it on every new file, but I
> would prefer not to. Why does creating the variables take that much
> memory?

When you create a netCDF variable, HDF5 allocates a chunk cache buffer
for that variable. The default size of this buffer is 1 MB.
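
As a rough back-of-envelope check of why this adds up, the sketch below
multiplies the number of variables by the per-variable cache size. It
assumes every variable's cache is actually allocated at the full default
size and ignores any other per-variable overhead inside HDF5, so treat
the result as a ballpark figure only. The function and macro names are
made up for illustration and are not part of netCDF or HDF5.

```c
/* 1 MB default per-variable chunk cache, as stated in the text above. */
#define DEFAULT_CHUNK_CACHE_BYTES (1024ULL * 1024ULL)

/* Rough total cache memory, in bytes, if each of num_vars variables
 * holds a cache of cache_bytes_per_var bytes. unsigned long long is
 * used so that large variable counts do not overflow on 32-bit
 * systems. */
unsigned long long total_chunk_cache_bytes(unsigned long long num_vars,
                                           unsigned long long cache_bytes_per_var)
{
    return num_vars * cache_bytes_per_var;
}
```

Under these assumptions, total_chunk_cache_bytes(1000,
DEFAULT_CHUNK_CACHE_BYTES) comes to about 1000 MB for the chunk caches
alone, which is already a large share of a 2 GB machine. With a
per-variable cache size of 0, the same calculation gives 0.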

I have reproduced your problem, but it can be solved by explicitly
setting the buffer size for each variable to a lower value. I have
checked in my tests in libsrc4/tst_vars3.c, but here's the part with the
cache setting:

      for (v = 0; v < NUM_VARS; v++)
      {
         sprintf(var_name, "var_%d", v);
         if (nc_def_var(ncid, var_name, NC_INT, 1, &dimid, &varid)) ERR_RET;
         /* Arguments: cache size 0 bytes, 0 slots, preemption 0.75. */
         if (nc_set_var_chunk_cache(ncid, varid, 0, 0, 0.75)) ERR_RET;
      }

Note the call to nc_set_var_chunk_cache(), right after the call to
nc_def_var.

When I take this line out, I get a serious slowdown around 4000
variables. (I have more memory available than you do.)

But when I add the call to nc_set_var_chunk_cache(), setting the chunk
cache to zero, then there is no slowdown, even for 10,000 variables.

Thanks,

Ed
-- 
Ed Hartnett  -- ed@xxxxxxxxxxxxxxxx


