NOTE: The netcdf-hdf mailing list is no longer active. The list archives are made available for historical reasons.
Hi Ed,

I don't think HDF5 will write only the last value, since you are asking HDF5 to create datasets of that size. It will write 17179869152 bytes plus overhead to disk, so depending on your system it may take minutes. Quincey can give you a more technical explanation. I don't know whether chunking by itself will help you much. However, I think this is a good case for a compression filter, since data like this will compress very well and should overcome the I/O time. (See the two sketches after the quoted message below.)

Kent

Quoting Ed Hartnett <ed@xxxxxxxxxxxxxxxx>:

> Howdy all!
>
> I am writing a test program which writes large files (well over 2
> GB), and I have some questions about HDF5 and very large files. I
> need to check whether netCDF-4 has been correctly implemented for
> best performance.
>
> In the program below, I create 4 datasets of type double. They are
> one-dimensional, with length 2147483644/4 = 536870911. (That is
> 4 * 536870911 * 8 = 17179869152 bytes of data in total.)
>
> Then I write only the last value in each dataset.
>
> That took a really long time: minutes. Is this expected? What is
> HDF5 doing in the background here? Is there something I can do with
> chunking to improve the speed of this program?
>
> I am not setting a fill value, so what is being written here? I
> naively expected that HDF5 would not write all the data I am
> skipping, but would find a way to write data only around the value
> that I am actually writing...
>
> The file that this program creates is 17179883735 bytes, which is
> 14583 bytes of HDF5 overhead. Is that about what is expected?
>
> Any comments welcome...
>
> Thanks,
>
> Ed
>
> /*
>    Copyright 2007, UCAR/Unidata
>    See COPYRIGHT file for copying and redistribution conditions.
>
>    This program (quickly, but not thoroughly) tests the large file
>    features of netCDF-4.
>
>    $Id: tst_large.c,v 1.3 2007/08/18 12:26:38 ed Exp $
> */
> #include <config.h>
> #include <nc_tests.h>
> #include <netcdf.h>
> #include <stdio.h>
> #include <string.h>
>
> /* This is the magic number for classic format limits: 2 GiB - 4
>    bytes. */
> #define MAX_CLASSIC_BYTES 2147483644
>
> /* This is the magic number for 64-bit offset format limits: 4 GiB -
>    4 bytes. */
> #define MAX_64OFFSET_BYTES 4294967292
>
> /* Handy for constructing tests. */
> #define QTR_CLASSIC_MAX (MAX_CLASSIC_BYTES/4)
>
> /* We will create this file. */
> #define FILE_NAME "tst_large.nc"
>
> int
> main(int argc, char **argv)
> {
>    printf("\n*** Testing really large files in netCDF-4/HDF5 format, quickly.\n");
>
>    printf("\n*** Testing create of simple, but large, file...");
>    {
> #define DIM_NAME "Time_in_nanoseconds"
> #define NUMDIMS 1
> #define NUMVARS 4
>
>       int ncid, dimids[NUMDIMS], varid[NUMVARS];
>       char var_name[NUMVARS][NC_MAX_NAME + 1] = {"England", "Scotland",
>                                                  "Ireland", "Wales"};
>       size_t index[NUMDIMS] = {QTR_CLASSIC_MAX - 1};
>       int ndims, nvars, natts, unlimdimid;
>       nc_type xtype;
>       char name_in[NC_MAX_NAME + 1];
>       size_t len;
>       double pi = 3.1459, pi_in;
>       int i;
>
>       /* Create a netCDF-4/HDF5 format file, with 4 vars. */
>       if (nc_create(FILE_NAME, NC_NETCDF4, &ncid)) ERR;
>       if (nc_set_fill(ncid, NC_NOFILL, NULL)) ERR;
>       if (nc_def_dim(ncid, DIM_NAME, QTR_CLASSIC_MAX, dimids)) ERR;
>       for (i = 0; i < NUMVARS; i++)
>       {
>          if (nc_def_var(ncid, var_name[i], NC_DOUBLE, NUMDIMS,
>                         dimids, &varid[i])) ERR;
>       }
>       if (nc_enddef(ncid)) ERR;
>       for (i = 0; i < NUMVARS; i++)
>          if (nc_put_var1_double(ncid, i, index, &pi)) ERR;
>       if (nc_close(ncid)) ERR;
>
>       /* Reopen and check the file. */
>       if (nc_open(FILE_NAME, 0, &ncid)) ERR;
>       if (nc_inq(ncid, &ndims, &nvars, &natts, &unlimdimid)) ERR;
>       if (ndims != NUMDIMS || nvars != NUMVARS || natts != 0 ||
>           unlimdimid != -1) ERR;
>       if (nc_inq_dimids(ncid, &ndims, dimids, 1)) ERR;
>       if (ndims != 1 || dimids[0] != 0) ERR;
>       if (nc_inq_dim(ncid, 0, name_in, &len)) ERR;
>       if (strcmp(name_in, DIM_NAME) || len != QTR_CLASSIC_MAX) ERR;
>       for (i = 0; i < NUMVARS; i++)
>       {
>          if (nc_inq_var(ncid, i, name_in, &xtype, &ndims, dimids,
>                         &natts)) ERR;
>          if (strcmp(name_in, var_name[i]) || xtype != NC_DOUBLE ||
>              ndims != 1 || dimids[0] != 0 || natts != 0) ERR;
>          if (nc_get_var1_double(ncid, i, index, &pi_in)) ERR;
>          if (pi_in != pi) ERR;
>       }
>       if (nc_close(ncid)) ERR;
>    }
>
>    SUMMARIZE_ERR;
>    FINAL_RESULTS;
> }
>
> --
> Ed Hartnett -- ed@xxxxxxxxxxxxxxxx
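On the "what is HDF5 doing in the background" question: unless told otherwise, netCDF-4 stores a fixed-size variable in HDF5's default contiguous layout, and a contiguous dataset is allocated as a single block, so writing even one element forces HDF5 to reserve the variable's whole extent in the file. That matches the observed file size above. Here is a minimal sketch, at the HDF5 level, of the dataset creation properties involved; it is an illustration under our own names (make_chunked_dcpl, chunk_len), not code from netCDF's internals:

   #include <hdf5.h>

   /* Sketch: a dataset creation property list for a no-fill, chunked
      1-D dataset.  (Names here are ours, for illustration only.) */
   static hid_t
   make_chunked_dcpl(void)
   {
      hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);

      /* NC_NOFILL: never write fill values into the dataset. */
      H5Pset_fill_time(dcpl, H5D_FILL_TIME_NEVER);

      /* Even with no fill, the default contiguous layout reserves the
         dataset's full extent at the first write.  Chunked layout
         allocates space chunk by chunk instead, so untouched ranges
         never occupy file space. */
      hsize_t chunk_len = 1024 * 1024;   /* untuned guess: 1M doubles */
      H5Pset_chunk(dcpl, 1, &chunk_len);

      return dcpl;
   }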
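At the netCDF-4 level, Kent's compression suggestion could look like the following in the define step of the program above, using nc_def_var_chunking() and nc_def_var_deflate(). This is a sketch only: CHUNK_LEN is an untuned guess, and the deflate filter requires chunked storage.

   #define CHUNK_LEN (1024 * 1024)   /* 1M doubles = 8 MiB per chunk */

   size_t chunksizes[NUMDIMS] = {CHUNK_LEN};

   for (i = 0; i < NUMVARS; i++)
   {
      /* Chunked storage: only chunks actually written get allocated. */
      if (nc_def_var_chunking(ncid, varid[i], NC_CHUNKED, chunksizes)) ERR;

      /* Deflate (zlib) at level 1, no shuffle; data that is almost
         entirely unwritten should compress extremely well. */
      if (nc_def_var_deflate(ncid, varid[i], 0, 1, 1)) ERR;
   }

Both calls must come after nc_def_var() and before nc_enddef().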