The issue of multi-threaded reading and writing of NetCDF files has repeatedly arisen on the netcdf mailing list (netcdfgroup@unidata.ucar.edu).
To date, with respect to netcdf, there have been two notions of thread safety:
- Case 1: Allow multiple threads to operate as long as they are operating on different files.
- Case 2: Allow multiple threads to operate on the same file.
Case 1 is doable, just time-consuming to implement. In fact, a netcdf-c branch exists that should allow this for netcdf-3 (classic) files. The approach is to isolate all mutable global state used by the library and surround operations on that state (both read and write) with a mutex lock. Since none of the state accesses hold the lock for long, this should not affect performance very much. Note that an implicit assumption is that all C library calls (esp. malloc) are or can be made thread-safe.
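The mutex approach described for case 1 can be sketched as follows. This is only an illustration and not the code in the netcdf-c branch: the open-file table and the names table_add, table_lookup, and table_remove are hypothetical stand-ins for whatever global state the library actually keeps. The point is that every access to the shared table, reads included, happens between a lock and an unlock of one global POSIX mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the library's mutable global state:
   a table of open files indexed by a small integer id. */
#define MAX_OPEN_FILES 64
#define MAX_PATH_LEN   256

typedef struct {
    int  in_use;
    char path[MAX_PATH_LEN];
} file_entry;

static file_entry open_files[MAX_OPEN_FILES];
static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;

/* Register a path; returns its id, or -1 if the table is full. */
int table_add(const char *path)
{
    int id = -1;
    assert(path != NULL);
    pthread_mutex_lock(&global_lock);
    for (int i = 0; i < MAX_OPEN_FILES; i++) {
        if (!open_files[i].in_use) {
            open_files[i].in_use = 1;
            strncpy(open_files[i].path, path, MAX_PATH_LEN - 1);
            open_files[i].path[MAX_PATH_LEN - 1] = '\0';
            id = i;
            break;
        }
    }
    pthread_mutex_unlock(&global_lock);
    return id;
}

/* Copy the path for id into out. Reads take the lock as well.
   Returns 0 on success, -1 if id is not in use. */
int table_lookup(int id, char *out, size_t outlen)
{
    int rc = -1;
    assert(out != NULL && outlen > 0);
    if (id < 0 || id >= MAX_OPEN_FILES) return -1;
    pthread_mutex_lock(&global_lock);
    if (open_files[id].in_use) {
        strncpy(out, open_files[id].path, outlen - 1);
        out[outlen - 1] = '\0';
        rc = 0;
    }
    pthread_mutex_unlock(&global_lock);
    return rc;
}

/* Release an id; returns 0 on success, -1 if it was not in use. */
int table_remove(int id)
{
    int rc = -1;
    if (id < 0 || id >= MAX_OPEN_FILES) return -1;
    pthread_mutex_lock(&global_lock);
    if (open_files[id].in_use) {
        open_files[id].in_use = 0;
        rc = 0;
    }
    pthread_mutex_unlock(&global_lock);
    return rc;
}
```

Each critical section is a short table scan or a single slot update, which is why a single coarse lock of this kind should cost little in practice.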
Case 2 is more interesting and significantly harder to implement because it requires finer-grained locking.
This post explores a possible approach to allowing such threaded IO. Note that this is mostly (but not entirely) independent of MPIO-style parallel IO; providing this capability might allow MPIO to work faster.
Multi-Threaded Read/Write For NetCDF-3 Files
The approach proposed here is to allow multi-threaded IO in a significantly restricted way. The basic idea is to separate out meta-data management from IO.
In this proposal, we assume that a single thread is responsible for creating a file (or reading the metadata of an existing file). However, the reading and/or writing of data into variables is allowed to be done simultaneously with multiple threads. This is a form of the master-slave parallelism model[1].
This is implemented by applying case 1 locking to any operation that can affect the metadata. Further, it must be enforced that data reads and writes are carried out only while the meta-data is fixed. Note that meta-data changes include changes to the size of unlimited dimensions, which in turn can cause the on-disk layout to change.
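One way to enforce "data IO only while the meta-data is fixed" is a reader-writer lock: data operations hold it in shared mode, and anything that changes meta-data holds it exclusively. The sketch below assumes POSIX pthread_rwlock. The file_meta structure and the functions meta_init, meta_grow, data_begin, and data_end are hypothetical names invented for illustration, and the record-offset arithmetic is a simplification rather than the actual netcdf-3 layout computation.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, simplified view of the meta-data that data IO depends on. */
typedef struct {
    pthread_rwlock_t meta_lock;
    size_t begin_rec;  /* file offset of the first record */
    size_t recsize;    /* bytes per record */
    size_t numrecs;    /* current length of the unlimited dimension */
} file_meta;

int meta_init(file_meta *m, size_t begin_rec, size_t recsize)
{
    assert(m != NULL && recsize > 0);
    m->begin_rec = begin_rec;
    m->recsize = recsize;
    m->numrecs = 0;
    return pthread_rwlock_init(&m->meta_lock, NULL);
}

/* Meta-data change: takes the lock exclusively, so it waits until no
   data operation is in flight. Returns -1 on an attempt to shrink. */
int meta_grow(file_meta *m, size_t numrecs)
{
    int rc = -1;
    pthread_rwlock_wrlock(&m->meta_lock);
    if (numrecs >= m->numrecs) {
        m->numrecs = numrecs;
        rc = 0;
    }
    pthread_rwlock_unlock(&m->meta_lock);
    return rc;
}

/* Start a data operation on record rec. On success the shared lock is
   held until data_end, so the computed offset stays valid. On failure
   (record out of range) the lock is released before returning -1. */
int data_begin(file_meta *m, size_t rec, size_t *offset)
{
    pthread_rwlock_rdlock(&m->meta_lock);
    if (rec >= m->numrecs) {
        pthread_rwlock_unlock(&m->meta_lock);
        return -1;
    }
    *offset = m->begin_rec + rec * m->recsize;
    return 0;
}

/* Finish a data operation started by a successful data_begin. */
void data_end(file_meta *m)
{
    pthread_rwlock_unlock(&m->meta_lock);
}
```

Many threads can sit between data_begin and data_end at once, while a call such as meta_grow blocks until they have all finished. This lock only protects the meta-data; it says nothing about two threads touching the same bytes.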
Range Locking
Assuming the above, the approach to IO is to use what is called range locking[2]. The idea is that a thread "locks" a specific contiguous range of bytes in the file. A lock manager for range locking allows disjoint ranges to be read or written simultaneously, but accesses to overlapping ranges must be serialized. One extension is to have the lock manager report the specific overlap between two (or more) ranges. It is then possible, but tricky, for a thread to be told what part of its requested range it can write without blocking.
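A minimal range lock manager along these lines might look like the sketch below. It assumes POSIX threads and keeps held ranges in a small fixed-size array; the names range_lock_mgr, rl_init, rl_trylock, rl_lock, and rl_unlock are hypothetical. Ranges are half-open, [start, end), and every lock is exclusive, so concurrent readers of the same bytes are serialized too. The overlap-reporting extension is not attempted.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_RANGES 128

typedef struct {
    int    in_use;
    size_t start;  /* first byte covered */
    size_t end;    /* one past the last byte covered */
} byte_range;

typedef struct {
    pthread_mutex_t mu;        /* protects held[] */
    pthread_cond_t  released;  /* signalled whenever a range is released */
    byte_range      held[MAX_RANGES];
} range_lock_mgr;

void rl_init(range_lock_mgr *m)
{
    pthread_mutex_init(&m->mu, NULL);
    pthread_cond_init(&m->released, NULL);
    for (int i = 0; i < MAX_RANGES; i++)
        m->held[i].in_use = 0;
}

/* Caller must hold m->mu. Records [start, end) and returns its slot,
   or returns -1 if it overlaps a held range or the table is full. */
static int claim(range_lock_mgr *m, size_t start, size_t end)
{
    int free_slot = -1;
    for (int i = 0; i < MAX_RANGES; i++) {
        const byte_range *r = &m->held[i];
        if (!r->in_use) {
            if (free_slot < 0) free_slot = i;
            continue;
        }
        if (start < r->end && r->start < end)
            return -1;  /* overlap: must be serialized */
    }
    if (free_slot >= 0) {
        m->held[free_slot].in_use = 1;
        m->held[free_slot].start = start;
        m->held[free_slot].end = end;
    }
    return free_slot;
}

/* Non-blocking: returns a slot handle (>= 0) or -1 on conflict. */
int rl_trylock(range_lock_mgr *m, size_t start, size_t end)
{
    assert(start < end);
    pthread_mutex_lock(&m->mu);
    int slot = claim(m, start, end);
    pthread_mutex_unlock(&m->mu);
    return slot;
}

/* Blocking: waits until [start, end) overlaps no held range. */
int rl_lock(range_lock_mgr *m, size_t start, size_t end)
{
    int slot;
    assert(start < end);
    pthread_mutex_lock(&m->mu);
    while ((slot = claim(m, start, end)) < 0)
        pthread_cond_wait(&m->released, &m->mu);
    pthread_mutex_unlock(&m->mu);
    return slot;
}

/* Release the range held in slot and wake any waiters. */
void rl_unlock(range_lock_mgr *m, int slot)
{
    assert(slot >= 0 && slot < MAX_RANGES);
    pthread_mutex_lock(&m->mu);
    m->held[slot].in_use = 0;
    pthread_cond_broadcast(&m->released);
    pthread_mutex_unlock(&m->mu);
}
```

A thread would call rl_lock with the byte range of the data it is about to read or write, do the IO, then call rl_unlock with the returned slot. The linear scan is only for brevity; a tree-structured index over the held ranges would scale better.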
As an aside, I should note that range locking and btree locking are closely related (one can use a btree-like structure to manage range locking, for example). I would speculate that modifying HDF5 to allow fine-grained read-write would be doable using btree locking.
[1] How to Write Parallel Programs: A First Course, Nicholas Carriero and David Gelernter, pub. October 29, 1990.
[2] Transaction Processing: Concepts and Techniques, Jim Gray and Andreas Reuter, Morgan Kaufmann Publishers, 1993.