I assume this is a parallel application, right (POSIX threads, MPI, etc.)?
Otherwise even parallel I/O likely wouldn't work. I use MPI to run a kind of
token ring, in which each process sends a message in turn. That way all the
processes know which one is supposed to be writing to the file. When a process
is done, it sends out a message with the number of the next process that
should write, and so on until they are all done.
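
To make the turn-taking concrete, here is a minimal sketch of the idea. It is
not my actual MPI code: Python threads and queue.Queue mailboxes stand in for
MPI ranks and messages, and a plain text file stands in for the netCDF file.
The names worker, run_round_robin and the "segment from rank" text are made up
for illustration. In a real run, the put/get calls would be MPI sends and
receives, and the write would be a netCDF open, write, close.

```python
import os
import queue
import tempfile
import threading


def worker(rank, n_ranks, mailboxes, path):
    """Wait for our turn, append our segment, then announce the next writer."""
    while True:
        next_writer = mailboxes[rank].get()  # blocks until a turn message arrives
        if next_writer == n_ranks:           # every rank has written, so stop
            return
        if next_writer == rank:
            # Our turn: open, write, close, so only one rank holds the file.
            with open(path, "a") as f:
                f.write(f"segment from rank {rank}\n")
            # Tell every rank (ourselves included) who writes next.
            for box in mailboxes:
                box.put(rank + 1)


def run_round_robin(n_ranks, path):
    """Start n_ranks workers and let them write to path one after another."""
    mailboxes = [queue.Queue() for _ in range(n_ranks)]
    threads = [
        threading.Thread(target=worker, args=(r, n_ranks, mailboxes, path))
        for r in range(n_ranks)
    ]
    for t in threads:
        t.start()
    for box in mailboxes:  # kick things off: rank 0 writes first
        box.put(0)
    for t in threads:
        t.join()


if __name__ == "__main__":
    out = os.path.join(tempfile.mkdtemp(), "orbit_segments.txt")
    run_round_robin(4, out)
    with open(out) as f:
        print(f.read(), end="")
```

Because a rank only writes after it sees its own number, the segments land in
the file in rank order and no two ranks ever have the file open for writing at
the same time.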
-- Ted
On Jun 5, 2012, at 11:27 AM, Kristopher Bedka wrote:
> When I say processor, I actually mean different machines. How would the
> different machines know when another is operating on the file and hence, wait
> to write the output?
>
> On Jun 5, 2012, at 12:06 PM, Ted Mansell wrote:
>
>> If you are only using 15 processors, I would suggest using a 'round robin'
>> approach with non-parallel, chunked and compressed output. Essentially, each
>> processor writes in succession to the file (open, write, close, next
>> processor). This works really well for me for smallish numbers of processors
>> (less than 60, say). If the chunking is set up such that each processor
>> writes just its own chunks, this method works well.
>>
>> good luck!
>>
>> -- Ted
>>
>>
>> On Jun 5, 2012, at 10:23 AM, Kristopher Bedka wrote:
>>
>>> I'm not quite sure what you mean by the "application layer"? My goal was
>>> to have 15 different processors process 15 segments of a satellite orbit,
>>> where each processor would write to the same NetCDF file in the most disk
>>> space efficient manner possible without any problems with simultaneous
>>> NetCDF writes. I had previously done the compression with the
>>> "nf_def_var_chunking" function call in non-parallel NetCDF. As this
>>> function does not seem to be available in parallel NetCDF, I'd be
>>> interested in alternative suggestions to accomplish my goal. Sorry I am
>>> more of the scientist type and am not a software engineer, so I may absorb
>>> some of these concepts a little slower than others.
>>>
>>> Thanks for the help,
>>> Kris
>>>
>>> On Jun 5, 2012, at 11:13 AM, Rob Latham wrote:
>>>
>>>> On Tue, May 29, 2012 at 02:29:12PM -0600, Russ Rew wrote:
>>>>> Hi Kristopher,
>>>>>
>>>>>> I am processing a large volume of satellite data where multiple
>>>>>> processes could be simultaneously writing data to the same netcdf
>>>>>> file. This has not been supported in previous NetCDF versions and
>>>>>> I've gotten fatal errors when two simultaneous writes conflicted. I
>>>>>> now understand that recent NetCDF versions do support this
>>>>>> functionality. Could someone tell me or provide an example of what I
>>>>>> need to do (i.e. new function calls, options in netcdf open, etc.) to
>>>>>> make this work for me? I've tried the pnetcdf package, but it does not
>>>>>> support chunking, which I need to internally compress these files.
>>>>>
>>>>> No, sorry, it's not supported in current netCDF versions either.
>>>>> NetCDF-4 uses HDF5 as its storage layer, and HDF5 does not support
>>>>> compression with parallel access, as explained here:
>>>>
>>>> Is there any chance you can compress at the application layer? Each
>>>> processor takes its local hunk of data, compresses it, then writes to
>>>> the file.
>>>>
>>>> I admit, you will quickly find out why parallel writes with
>>>> compression are not already implemented in these parallel I/O
>>>> libraries!
>>>>
>>>> However, it's possible that at your application level, there may be
>>>> ways to simplify the parallel, compressed writes problem that a
>>>> general purpose library cannot use.
>>>>
>>>> ==rob
>>>>
>>>> --
>>>> Rob Latham
>>>> Mathematics and Computer Science Division
>>>> Argonne National Lab, IL USA
>>>
>>> =========================================================
>>> Kristopher Bedka
>>> Science Systems & Applications, Inc. @ NASA Langley Research Center
>>> Climate Science Branch
>>> 1 Enterprise Parkway, Suite 200
>>> Hampton, VA 23666
>>> Phone: (757) 951-1920
>>> Fax: (757) 951-1902
>>> Kristopher.m.bedka@xxxxxxxx
>>> =========================================================
>>>
>>> _______________________________________________
>>> netcdfgroup mailing list
>>> netcdfgroup@xxxxxxxxxxxxxxxx
>>> For list information or to unsubscribe, visit:
>>> http://www.unidata.ucar.edu/mailing_lists/
>>