> I got from Wei-Keng answer that this could be a bug of the version of netcdf
> which I'm using and that I should upgrade to 4.7.4.
> Is that right ?
Yes. I suggest using the latest releases of NetCDF-C and NetCDF-Fortran.
> 1) Now at first we thought: it would be great if, after the interrupted run,
> the call nf90_get_var could check which values are filled and which are not.
> Let's say my netcdf variable is
> 1.23423, 4.3452 , 5.3453, 7.34534, _, _, _, ...
> i.e. only the first 4 values were computed and we need to restart from the
> 5th,
> But we did not figure out a way. So first question is: is there a way to
> check that ?
If your program periodically writes data to the same file, consider
adding a “time” dimension to your variable to make it a “record” variable.
For example,
/* define dimensions time, Y, and X and variable of size [time][Y][X] */
err = nc_def_dim(ncid, "time", NC_UNLIMITED, &dims[0]); ERR
err = nc_def_dim(ncid, "Y", global_ny, &dims[1]); ERR
err = nc_def_dim(ncid, "X", global_nx, &dims[2]); ERR
err = nc_def_var(ncid, "rec_var", NC_FLOAT, 3, dims, &rec_var); ERR
/* set subarray sizes and offsets */
start[1] = NY * rank_y;
start[2] = NX * rank_x;
count[0] = 1;
count[1] = NY;
count[2] = NX;
Checkpoint 0 writes to the 0th record:
start[0] = 0;
err = nc_put_vara_float(ncid, rec_var, start, count, buf); ERR
Checkpoint 1 writes to the 1st record:
start[0] = 1;
err = nc_put_vara_float(ncid, rec_var, start, count, buf); ERR
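The start and count values above assume each process already knows its
position (rank_y, rank_x) in the process grid. A minimal sketch of one way to
derive them, assuming a row-major grid with px processes along X; the helper
name set_subarray and the parameter px are hypothetical and not part of NetCDF:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a NetCDF call): map a linear rank onto a
 * row-major process grid with px processes along X, then fill start/count
 * for one record of a [time][Y][X] variable. ny and nx are the local
 * subarray sizes (NY and NX in the example above). */
void set_subarray(int rank, int px, size_t ny, size_t nx, size_t record,
                  size_t start[3], size_t count[3])
{
    assert(rank >= 0 && px > 0);

    size_t rank_y = (size_t)(rank / px);  /* row of this process    */
    size_t rank_x = (size_t)(rank % px);  /* column of this process */

    start[0] = record;        /* which checkpoint (record) to write */
    start[1] = ny * rank_y;   /* offset along Y */
    start[2] = nx * rank_x;   /* offset along X */

    count[0] = 1;             /* one record per write call */
    count[1] = ny;
    count[2] = nx;
}
```

The resulting start and count arrays can be passed unchanged to
nc_put_vara_float as in the checkpoint writes above.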
The number of “records” is shown by the command “ncdump -h output.nc”, e.g.
time = UNLIMITED ; // (2 currently)
This value is updated in the file when a write call returns successfully,
but it is shared among all variables that are defined using this dimension.
So if the value of the time dimension is j, you can consider time steps
0, 1, ..., j-2 safe in the file, because record j-1 may have been written
for only some of the variables. If there is only one variable in the file,
then 0, 1, ..., j-1 are safe.
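On restart, the record count j can be read from the reopened file with
nc_inq_dimid (to look up "time") followed by nc_inq_dimlen. The rule above
then tells you where to resume. A minimal sketch of that rule as a small
helper; the name first_record_to_redo is hypothetical, and nvars is assumed
to be the number of record variables sharing the time dimension:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a NetCDF call): given the current length of the
 * unlimited "time" dimension (nrecords, e.g. from nc_inq_dimlen) and the
 * number of record variables sharing it (nvars), return the index of the
 * first record that should be recomputed after an interrupted run. */
size_t first_record_to_redo(size_t nrecords, int nvars)
{
    assert(nvars >= 1);

    if (nrecords == 0)
        return 0;            /* nothing was written: start from scratch */
    if (nvars == 1)
        return nrecords;     /* records 0..j-1 are safe, resume at j */
    return nrecords - 1;     /* records 0..j-2 are safe, redo j-1 */
}
```

With the ncdump output shown earlier (2 records), this gives 2 for a file
with a single record variable and 1 when several variables share the
dimension.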
> Here another question. Our code started with netcdf.
> Then we evolved to parallel I/O and the only way we found was via HDF5.
> 1) is there any alternative ?
NetCDF can now be built on top of PnetCDF, using the configure option
--enable-pnetcdf.
It allows you to perform parallel I/O on classic files, i.e. CDF-1, CDF-2, and
CDF-5.
Information about PnetCDF can be found at https://parallel-netcdf.github.io
Wei-keng