Re: [netcdfgroup] simultaneous NetCDF writes to same file

I assume this is a threaded application, right (POSIX, MPI, etc.)? Otherwise 
even parallel I/O likely wouldn't work. I use an MPI method to run a kind of 
token ring, where a message gets sent by each thread in turn. That way all the 
threads know which one is supposed to be writing to the file. When a thread is 
done, it sends out a message with the number of the next thread that should 
write, until they are all done.
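
A minimal sketch of the token-passing idea described above, not the code actually used. The names `run_token_ring` and `write_segment` are hypothetical. Python threads and queues stand in for MPI ranks and messages, and the token is sent only to the next writer rather than to every thread. In a real multi-machine job the queue operations would presumably be replaced by MPI sends and receives, and `write_segment` would open the netCDF file, write that rank's chunks, and close it.

```python
import queue
import threading


def run_token_ring(n_workers, write_segment):
    """Let n_workers take turns calling write_segment(rank), one at a time.

    Each worker blocks on its own inbox until it receives the token, does its
    write, and then forwards the token to the next rank. Threads and queues
    are stand-ins (an assumption of this sketch) for MPI ranks and messages.
    """
    if n_workers <= 0:
        return

    inboxes = [queue.Queue() for _ in range(n_workers)]

    def worker(rank):
        inboxes[rank].get()              # block until it is this rank's turn
        write_segment(rank)              # e.g. open file, write own chunks, close
        next_rank = rank + 1
        if next_rank < n_workers:
            # The token carries the number of the rank that should write next.
            inboxes[next_rank].put(next_rank)

    threads = [
        threading.Thread(target=worker, args=(rank,)) for rank in range(n_workers)
    ]
    for t in threads:
        t.start()
    inboxes[0].put(0)                    # rank 0 writes first
    for t in threads:
        t.join()


if __name__ == "__main__":
    order = []
    run_token_ring(4, order.append)      # record the order in which ranks wrote
    print(order)
```

Because each rank waits for the token before writing, the writes never overlap and always happen in rank order, which is what makes a non-parallel, compressed file safe to share.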

-- Ted


On Jun 5, 2012, at 11:27 AM, Kristopher Bedka wrote:

> When I say processor, I actually mean different machines.  How would the 
> different machines know when another is operating on the file and hence, wait 
> to write the output?
> 
> On Jun 5, 2012, at 12:06 PM, Ted Mansell wrote:
> 
>> If you are only using 15 processors, I would suggest using a 'round robin' 
>> approach with non-parallel, chunked and compressed output. Essentially, each 
>> processor writes in succession to the file (open, write, close, next 
>> processor). This works really well for me for smallish numbers of processors 
>> (less than 60, say), provided the chunking is set up such that each processor 
>> writes just its own chunks.
>> 
>> good luck!
>> 
>> -- Ted
>> 
>> 
>> On Jun 5, 2012, at 10:23 AM, Kristopher Bedka wrote:
>> 
>>> I'm not quite sure what you mean by the "application layer".  My goal was 
>>> to have 15 different processors process 15 segments of a satellite orbit, 
>>> where each processor would write to the same NetCDF file in the most disk 
>>> space efficient manner possible without any problems with simultaneous 
>>> NetCDF writes.  I had previously done the compression with the 
>>> "nf_def_var_chunking" function call in non-parallel NetCDF.  As this 
>>> function does not seem to be available in parallel NetCDF, I'd be 
>>> interested in alternative suggestions to accomplish my goal.  Sorry, I am 
>>> more of the scientist type and am not a software engineer, so I may absorb 
>>> some of these concepts a little more slowly than others.
>>> 
>>> Thanks for the help,
>>> Kris
>>> 
>>> On Jun 5, 2012, at 11:13 AM, Rob Latham wrote:
>>> 
>>>> On Tue, May 29, 2012 at 02:29:12PM -0600, Russ Rew wrote:
>>>>> Hi Kristopher,
>>>>> 
>>>>>> I am processing a large volume of satellite data where multiple
>>>>>> processes could be simultaneously writing data to the same netcdf
>>>>>> file.   This has not been supported in previous NetCDF versions and
>>>>>> I've gotten fatal errors when two simultaneous writes conflicted.  I
>>>>>> now understand that recent NetCDF versions do support this
>>>>>> functionality.  Could someone tell me or provide an example of what I
>>>>>> need to do (e.g. new function calls, options in netcdf open, etc.) to
>>>>>> make this work for me?  I've tried the pnetcdf package, but it does not
>>>>>> support chunking, which I need to internally compress these files.
>>>>> 
>>>>> No, sorry, it's not supported in current netCDF versions either.
>>>>> NetCDF-4 uses HDF5 as its storage layer, and HDF5 does not support
>>>>> compression with parallel access, as explained here:
>>>> 
>>>> Is there any chance you can compress at the application layer?  Each
>>>> processor takes its local hunk of data, compresses it, then writes to
>>>> the file.
>>>> 
>>>> I admit, you will quickly find out why parallel writes with
>>>> compression is not already implemented in these parallel I/O
>>>> libraries!
>>>> 
>>>> However, it's possible that at your application level, there may be
>>>> ways to simplify the parallel, compressed writes problem that a
>>>> general purpose library cannot use.
>>>> 
>>>> ==rob
>>>> 
>>>> -- 
>>>> Rob Latham
>>>> Mathematics and Computer Science Division
>>>> Argonne National Lab, IL USA
>>> 
>>> =========================================================
>>> Kristopher Bedka
>>> Science Systems & Applications, Inc. @ NASA Langley Research Center
>>> Climate Science Branch
>>> 1 Enterprise Parkway, Suite 200
>>> Hampton, VA 23666
>>> Phone:  (757) 951-1920
>>> Fax: (757) 951-1902
>>> Kristopher.m.bedka@xxxxxxxx
>>> =========================================================
>>> 
>>> _______________________________________________
>>> netcdfgroup mailing list
>>> netcdfgroup@xxxxxxxxxxxxxxxx
>>> For list information or to unsubscribe,  visit: 
>>> http://www.unidata.ucar.edu/mailing_lists/
>> 
> 
> 
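
As a footnote to Rob Latham's suggestion quoted above (compress each processor's hunk at the application layer, then write it), here is a rough sketch of what that might look like. It uses only Python's standard `zlib` module and an in-memory byte string rather than netCDF. The function names and the (offset, length) index layout are hypothetical, chosen for illustration only.

```python
import zlib


def pack_segments(segments, level=6):
    """Compress each segment independently and build an offset index.

    Returns (index, payload), where index[i] is the (offset, length) of
    segment i's compressed bytes inside payload. This layout is an assumption
    of the sketch, not a netCDF or HDF5 format.
    """
    index = []
    blobs = []
    offset = 0
    for raw in segments:
        blob = zlib.compress(raw, level)
        index.append((offset, len(blob)))
        blobs.append(blob)
        offset += len(blob)              # next offset depends on this blob's size
    return index, b"".join(blobs)


def read_segment(index, payload, i):
    """Decompress segment i using its recorded offset and length."""
    offset, length = index[i]
    return zlib.decompress(payload[offset:offset + length])


if __name__ == "__main__":
    # Three fake "orbit segments" of highly compressible data.
    segments = [bytes([i]) * 1000 for i in range(3)]
    index, payload = pack_segments(segments)
    assert read_segment(index, payload, 1) == segments[1]
    print(index, len(payload))
```

The sketch also hints at why this is awkward to do in parallel: a segment's offset is not known until every earlier segment has been compressed, so writers either have to exchange sizes first or take turns.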


