[netcdfgroup] Concurrent writes to netcdf3, what goes wrong?

Hi,

I'm trying to convert about 90 GB of NWP data from GRIB to netCDF, 4
times daily. The GRIB files arrive as fast as the data can be
downloaded from the HPC machines, 10 files per forecast timestep.

Currently I manage to convert 1 file per forecast timestep, and I would
like to parallelize the conversion into independent jobs (i.e. neither
MPI nor OpenMP), for a theoretical tenfold speedup. The underlying I/O
system is fast enough to handle 10 jobs and I have enough CPUs, but the
concurrently written netCDF files contain data that is only half
written to disk, or mixed with data from other slices.

What I do is create a '_FillValue' template file containing all
definitions before the NWP job runs. When a new set of files arrives,
each job writes its data into its own slice; the slices never overlap,
and there is never a redefine, only calls like nc_put_vara_* on
different slices.
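
To make this concrete, here is a simplified C sketch of the two phases.
The variable name "temperature", the dimension sizes, and the helper
names are illustrative only, not my real code:

/* Simplified sketch of the two phases (names and sizes illustrative). */
#include <netcdf.h>
#include <stdio.h>
#include <stdlib.h>

#define NT   48
#define NLAT 361
#define NLON 720

static void check(int status, const char *msg)
{
    if (status != NC_NOERR) {
        fprintf(stderr, "%s: %s\n", msg, nc_strerror(status));
        exit(1);
    }
}

/* Phase 1 (done once, before the NWP jobs run): create a template
 * file holding all definitions; the data stays at _FillValue. */
void create_template(const char *path)
{
    int ncid, dimids[3], varid;

    check(nc_create(path, NC_CLOBBER, &ncid), "nc_create");
    check(nc_def_dim(ncid, "time", NT, &dimids[0]), "def time");
    check(nc_def_dim(ncid, "lat", NLAT, &dimids[1]), "def lat");
    check(nc_def_dim(ncid, "lon", NLON, &dimids[2]), "def lon");
    check(nc_def_var(ncid, "temperature", NC_FLOAT, 3, dimids, &varid),
          "def var");
    check(nc_enddef(ncid), "nc_enddef"); /* define mode never re-entered */
    check(nc_close(ncid), "nc_close");
}

/* Phase 2 (one call per job): write one disjoint hyperslab,
 * here a single timestep covering the full grid. */
void write_slice(const char *path, size_t timestep, const float *data)
{
    int ncid, varid;
    size_t start[3] = { timestep, 0, 0 };
    size_t count[3] = { 1, NLAT, NLON };

    check(nc_open(path, NC_WRITE, &ncid), "nc_open");
    check(nc_inq_varid(ncid, "temperature", &varid), "inq varid");
    check(nc_put_vara_float(ncid, varid, start, count, data),
          "nc_put_vara_float");
    check(nc_close(ncid), "nc_close");
}

Ten such write_slice jobs run concurrently against the same file, each
with its own timestep.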

Since the nc_put_vara_* calls are non-overlapping, I hoped that this
type of concurrent write would work, but it doesn't. Is writing the
data in parallel like this really such a bad idea (e.g. are there
internal buffers that get rewritten)? Any ideas on how to improve the
conversion process?

Best regards,

Heiko


