I'm having issues getting NetCDF to work with the HDF5 backend
("NetCDF4" files). 64-bit offset files work fine, even in
parallel [0].
I get the following error at nc_create_par:
** ERROR **: could not create mask-data output file: -105, Can't add
HDF5 file metadata
Occasionally, some of my processes get a bit further, completing a few
nc_def_dim calls and an nc_def_var call; then I get:
** ERROR **: finishing definitions failed: HDF error (-101)
from those processes' nc_enddef calls. Note that the leading messages
come from my software, not from nc_strerror.
I have tried with both the NC_MPIPOSIX and NC_MPIIO flags. I've tried
with an info object created from an MPI_Info_create call, as well as
MPI_INFO_NULL.
My nc_create_par call is pretty simple; basically something like:
nc_create_par(maskfile, NC_NETCDF4 | NC_MPIIO, MPI_COMM_WORLD,
MPI_INFO_NULL, &nc);
Of course, as mentioned above, the flags and info object vary a bit
depending on what I'm testing.
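For reference, a stripped-down version of the failing pattern looks
roughly like this (the filename, dimension name, and sizes are
placeholders, not my real values; newer NetCDF releases declare
nc_create_par in netcdf_par.h, so that include may or may not be
needed depending on version):

```c
/* Minimal sketch of the failing pattern. Built with something like:
 *   mpicc repro.c -o repro -lnetcdf -lhdf5_hl -lhdf5
 */
#include <stdio.h>
#include <mpi.h>
#include <netcdf.h>

/* Abort on any NetCDF error, printing nc_strerror's message. */
#define CHECK(e) do { int s_ = (e); if (s_ != NC_NOERR) { \
    fprintf(stderr, "line %d: %s\n", __LINE__, nc_strerror(s_)); \
    MPI_Abort(MPI_COMM_WORLD, 1); } } while (0)

int main(int argc, char **argv)
{
    int nc, dimid, varid;
    MPI_Init(&argc, &argv);

    /* This is where I usually see -105, "Can't add HDF5 file metadata". */
    CHECK(nc_create_par("mask.nc", NC_NETCDF4 | NC_MPIIO,
                        MPI_COMM_WORLD, MPI_INFO_NULL, &nc));

    CHECK(nc_def_dim(nc, "x", 1024, &dimid));
    CHECK(nc_def_var(nc, "mask", NC_INT, 1, &dimid, &varid));

    /* ...and occasionally the failure moves here instead (-101). */
    CHECK(nc_enddef(nc));
    CHECK(nc_close(nc));

    MPI_Finalize();
    return 0;
}
```

Every rank calls all of the above collectively, which I believe is what
the parallel API requires for create/define-mode operations.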
The NetCDF installation on the cluster I'm using lacks the parallel
functions, so I've built HDF5 and NetCDF in my $HOME. These seemed to
go through fine. The configure lines I used for both are at the end of
this email [1,2]. I'm using OpenMPI 1.3.3 for MPI. The stack is built
with the v11.1 Intel compiler. HDF5 is version 1.8.5-patch1 and NetCDF
is v4.1.1.
What could be going wrong?
-tom
[0] Though I'm seeing incredibly poor scalability, which is part
of what made me want to look at the NetCDF4 backend. I observe
*increasing* runtime in a strong-scaling study. Is this
consistent with others' experiences?
[1] HDF5 configure line:
./configure CC=mpicc --prefix=${HOME}/sw --enable-parallel --disable-fortran
[2] NetCDF configure line:
./configure CC=mpicc --prefix=${HOME}/sw --with-hdf5=${HOME}/sw \
--enable-c-only --enable-netcdf4 --disable-dap --enable-shared --with-pic
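In case it helps, here is roughly how I've been sanity-checking the two
builds (paths assume the --prefix above; I'm not certain these checks
are exhaustive):

```shell
# HDF5 1.8.x installs a settings file recording the build configuration;
# it should report "Parallel HDF5: yes" for an --enable-parallel build.
grep -i "parallel" ${HOME}/sw/lib/libhdf5.settings

# Confirm the NetCDF build enabled the netCDF-4/HDF5 format.
${HOME}/sw/bin/nc-config --has-nc4

# Make sure my binary resolves against my $HOME libraries at run time,
# not the cluster's system copies (my_program is a placeholder name).
ldd ./my_program | grep -E "hdf5|netcdf"
```

Both checks come back looking sane as far as I can tell.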