Quincey Koziol <koziol@xxxxxxxxxxxx> writes:

> I do think it's better to force the user to give you a chunk
> size. Definitely _don't_ use a chunk size of one, the B-tree to
> locate the chunks will be insanely huge. :-(

The user may specify a chunksize in netCDF-4.

With a 1 MB chunksize, wow, it's sure a whole lot faster! Now it takes
less than a second. Also the output file is only 4 MB. Is that
expected? I presume this is because it does not write more than 1 MB
for each of the 4 variables. Neat!

Here's the netCDF code to do chunking. (Note the nc_def_var_chunking
call after the nc_def_var call.)

   chunksize[0] = MEGABYTE/DOUBLE_SIZE;
   for (i = 0; i < NUMVARS; i++)
   {
      if (nc_def_var(ncid, var_name[i], NC_DOUBLE, NUMDIMS, dimids,
                     &varid[i])) ERR;
      if (nc_def_var_chunking(ncid, i, NULL, chunksize, NULL)) ERR;
   }
   if (nc_enddef(ncid)) ERR;
   for (i = 0; i < NUMVARS; i++)
      if (nc_put_var1_double(ncid, i, index, &pi)) ERR;

   bash-3.2$ time ./tst_large
   *** Testing really large files in netCDF-4/HDF5 format, quickly.
   *** Testing create of simple, but large, file...ok.
   *** Tests successful!

   real    0m0.042s
   user    0m0.014s
   sys     0m0.028s
   bash-3.2$ ls -l tst_large.nc
   -rw-r--r-- 1 ed ustaff 4208887 2007-08-21 13:52 tst_large.nc

> However, if you are going to attempt to create a heuristic for
> picking a chunk size, here's my best current thoughts on it: try to
> get a chunk of a reasonable size (1MB, say) (but make certain that it
> will contain at least one element, in the case of _really_ big
> compound datatypes :-), then try to make the chunk as "square" as
> possible (i.e. try to get the chunk size in all dimensions to be
> equal). That should give you something reasonable, at least... ;-)

Thanks!

Ed
--
Ed Hartnett  -- ed@xxxxxxxxxxxxxxxx
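[Editor's note: the following is a minimal sketch of the chunk-size
heuristic Quincey describes above, not the code netCDF actually
adopted. The function name, arguments, and 1 MB target are
illustrative assumptions: pick roughly 1 MB per chunk, keep the chunk
as "square" as possible, and never drop below one element.]

   /* Sketch only: suggest per-dimension chunk lengths for a variable
    * with ndims fixed dimensions of length dimlen[] and elements of
    * elem_size bytes.  All names here are hypothetical. */
   #include <math.h>
   #include <stddef.h>

   #define TARGET_CHUNK_BYTES (1024 * 1024)   /* aim for ~1 MB chunks */

   static void
   suggest_chunksizes(int ndims, const size_t *dimlen, size_t elem_size,
                      size_t *chunksize)
   {
      int i;

      /* How many elements fit in the 1 MB target?  Keep at least one,
       * even for very large (e.g. compound) element types. */
      size_t target_elems = TARGET_CHUNK_BYTES / elem_size;
      if (target_elems < 1)
         target_elems = 1;

      /* A "square" chunk has the same edge length in every dimension,
       * i.e. the ndims-th root of the target element count. */
      double edge = pow((double)target_elems, 1.0 / ndims);

      for (i = 0; i < ndims; i++)
      {
         size_t len = (size_t)edge;
         if (len < 1)
            len = 1;
         /* Don't exceed a fixed dimension's length; an unlimited
          * dimension (length 0 here) would need separate handling. */
         if (dimlen[i] > 0 && len > dimlen[i])
            len = dimlen[i];
         chunksize[i] = len;
      }
   }

The resulting chunksize[] array could then be passed to
nc_def_var_chunking in place of the fixed 1 MB chunksize used in the
test code above.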