NOTE: The netcdf-hdf mailing list is no longer active. The list archives are made available for historical reasons.
NetCDF gurus:

After successfully prototyping our parallel netcdf code, we have rolled it into a large community app (MFIX) and are now getting sporadic "NetCDF: HDF error" errors during runs. These, unsurprisingly, coincide with failures to write portions of the related variable fields. The errors happen during put_vars() and occur across all PEs at that random time, and also in the subsequent close() on only one of the associated PEs.

In one of the smallest cases, we are writing ~100 files of about 600K each. The problem strikes every 15 or 20 files, and varies in both the file and the fields that are affected. With larger files it occurs more frequently: almost every other file with the 300MB files we need for production. Again, it occurs in different fields and files within a run and from run to run.

We are using netcdf 4.1.3 and hdf 1.8.7.

My question is, how can I possibly drill further into this problem? I am at a loss as to how to proceed. It would be nice to force HDF to be more specific, of course, but all debugging suggestions are most welcome.

Thanks,
John Urbanic