[netcdfgroup] Problem with parallel netcdf

I'm trying to build parallel-enabled netCDF 4.1.3 on Fedora 16 against HDF5 1.8.7, with both MPICH2 1.4.1p1 and Open MPI 1.5.4. When running "make check" with the Open MPI build I get:

$ mpiexec -n 4 ./f90tst_parallel
[orca.cora.nwra.com:32630] *** An error occurred in MPI_Comm_dup
[orca.cora.nwra.com:32630] *** on communicator MPI_COMM_WORLD
[orca.cora.nwra.com:32630] *** MPI_ERR_COMM: invalid communicator
[orca.cora.nwra.com:32630] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
HDF5: infinite loop closing library

D,T,AC,FD,P,FD,P,FD,P,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FDFD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,D,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,F,FD,FD,FD,FD,FD,FD,FD,FD,FD

 *** Testing netCDF-4 parallel I/O from Fortran 90.
HDF5: infinite loop closing library

D,T,AC,FD,P,FD,P,FD,P,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FDFD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,D,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,F,FD,FD,FD,FD,FD,FD,FD,FD,FD
HDF5: infinite loop closing library

D,T,AC,FD,P,FD,P,FD,P,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FDFD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,D,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,F,FD,FD,FD,FD,FD,FD,FD,FD,FD
------------------------------------------------------------------------
mpiexec has exited due to process rank 2 with PID 32631 on
node orca.cora.nwra.com exiting improperly. There are two reasons this could
occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpiexec (as reported here).
------------------------------------------------------------------------
[orca.cora.nwra.com:32628] 3 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal
[orca.cora.nwra.com:32628] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages


The same tests appear to work fine with the MPICH2 build.  Has anyone else come across this?
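One sanity check I can think of (just a sketch on my part, not a confirmed cause; the binary path is assumed) is to confirm that the test binary and the mpiexec launching it come from the same MPI stack, since a mismatch (e.g. a binary linked against Open MPI but launched by the MPICH2 mpiexec) can show up as MPI_ERR_COMM on MPI_COMM_WORLD:

```shell
# Sketch: check that f90tst_parallel and mpiexec belong to the same MPI
# stack; a mix of Open MPI and MPICH2 can fail with MPI_ERR_COMM.
BIN=./f90tst_parallel                  # path assumed; adjust as needed
command -v mpiexec || echo "mpiexec not on PATH"
if [ -x "$BIN" ]; then
    ldd "$BIN" | grep -i mpi           # which libmpi is the binary linked against?
else
    echo "binary not found: $BIN"
fi
```

On Fedora the mpi-selector/module setup makes it easy to end up with one implementation's mpiexec on PATH while the binary was built with the other, so this is worth ruling out first.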

Thanks,

  Orion

--
Orion Poplawski
Technical Manager                     303-415-9701 x222
NWRA, Boulder Office                  FAX: 303-415-9702
3380 Mitchell Lane                  orion@xxxxxxxxxxxxx
Boulder, CO 80301              http://www.cora.nwra.com


