Hi Charlie,
On May 22, 2008, at 11:11 AM, Charlie Zender wrote:
Hi Quincey,
Ideas/suggestions as to what is going wrong or how to debug this
problem would be much appreciated.
Are you using a threadsafe version of HDF5? (i.e. one that is
configured with the "--enable-threadsafe" option) It's possible
that netCDF-4 needs some locking mechanisms as well, but that's a
question for Ed or Russ.
Happy to report that your suggestion appears to solve our problem.
ncbo performs as expected if we build HDF5 with --enable-threadsafe,
and on SMP systems only if we do. We are still testing to
make sure there are no regressions on our other netCDF operators.
But things look much better, and no re-coding necessary!
This raises some questions:
1. Are there benefits to building HDF5 without --enable-threadsafe?
As Orion mentioned, the C++ and FORTRAN wrappers aren't currently
compatible with the threadsafe option. I don't think it would be too
hard to address this, but it's probably enough work that we should
look for some funding to do it.
The only other downside really is the increased overhead for each
HDF5 API call (to perform the semaphore locking that allows
threadsafety).
If not, can you make it the HDF5/netCDF-4 default, at least
on multi-core systems?
That's probably difficult for us to detect at configure time. :-/
We have learned, I think, to disable our programs' (NCO's)
threading unless they are linked to a threadsafe HDF5.
Otherwise users will experience unpredictable NCO failures.
2. How should we test, at NCO compile time, whether the
underlying netCDF4/HDF install is threadsafe?
The "H5_HAVE_THREADSAFE" macro will be defined when the "hdf5.h"
header is included and threadsafety is enabled.
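A minimal sketch of how such a compile-time check might look, assuming the only signal used is the H5_HAVE_THREADSAFE macro described above. The names NCO_HDF5_THREADSAFE and nco_thread_limit() are hypothetical and are not part of NCO or HDF5. The __has_include guard is there only so the sketch also compiles on a machine without HDF5 installed; a real build would include "hdf5.h" unconditionally.

```c
#include <assert.h>

/* Include hdf5.h when the compiler can find it. In a real build this
 * would be a plain #include "hdf5.h". */
#if defined(__has_include)
#  if __has_include(<hdf5.h>)
#    include <hdf5.h>
#  endif
#endif

/* H5_HAVE_THREADSAFE is defined by hdf5.h when the library was
 * configured with --enable-threadsafe. NCO_HDF5_THREADSAFE is a
 * hypothetical convenience macro introduced for this sketch. */
#ifdef H5_HAVE_THREADSAFE
#  define NCO_HDF5_THREADSAFE 1
#else
#  define NCO_HDF5_THREADSAFE 0
#endif

/* Hypothetical helper: return the number of threads the program may
 * use. Without a threadsafe HDF5, fall back to a single thread. */
static int nco_thread_limit(int requested_threads)
{
    assert(requested_threads >= 1);
    return NCO_HDF5_THREADSAFE ? requested_threads : 1;
}
```

With this in place, a caller could pass its desired thread count through nco_thread_limit() before starting any threaded region, so that a build against a non-threadsafe HDF5 quietly runs single-threaded instead of failing unpredictably.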
Quincey
Thanks,
Charlie
--
Charlie Zender, Department of Earth System Science, UC Irvine
Sab. at CNRS/LGGE-Grenoble until 20080815 :) 011+33+476+824236
Laboratoire de Glaciologie et Géophysique de l'Environnement
54 rue Molière BP 96, 38402 Saint Martin d'Hères Cedex, France