Thanks Rob.
I have another question. How much of a performance gain would we get from
using parallel-netcdf-c (or, say, MPI-IO) on a mainstream desktop PC (one
multi-core CPU with a single HDD or SSD)? Currently we don't have a
parallel file system environment (software/hardware) at our lab.
Thanks
Drew
On Fri, Feb 24, 2017 at 11:50 AM, Rob Latham <robl@xxxxxxxxxxx> wrote:
>
>
> On 02/22/2017 10:46 AM, Zhiyu (Drew) Li wrote:
>
>> Hi there,
>>
>> I am playing with the parallel-netcdf-c examples to learn whether I could
>> apply this technology to improve netCDF I/O in my project. I have some
>> questions about the example tst_parallel4.c, found at
>> https://github.com/Unidata/netcdf-c/blob/master/nc_test4/tst_parallel4.c
>>
>> I saw the statements "nc_var_par_access(ncid, varid, NC_COLLECTIVE)" and
>> "nc_var_par_access(ncid, varid, NC_INDEPENDENT)" are commented out on
>> lines 133 and 134:
>> https://github.com/Unidata/netcdf-c/blob/master/nc_test4/tst_parallel4.c#L133
>>
>> Q1: Is this nc_var_par_access() statement optional?
>>
>
> It's optional. I like to add it to make it explicit whether I am requesting
> independent I/O or collective I/O. Long ago the docs and the implementation
> differed on what the default was. If I make it explicit, I don't have to
> worry.
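>
> Something like this, for example (just a sketch; error checking is
> omitted and the names are placeholders):
>
>     /* the file has to be created or opened for parallel access */
>     nc_create_par("example.nc", NC_NETCDF4 | NC_MPIIO,
>                   MPI_COMM_WORLD, MPI_INFO_NULL, &ncid);
>     nc_def_var(ncid, "data", NC_INT, ndims, dimids, &varid);
>     nc_enddef(ncid);
>
>     /* say explicitly which mode you want rather than relying on the
>        default */
>     nc_var_par_access(ncid, varid, NC_COLLECTIVE);  /* or NC_INDEPENDENT */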
>
>> Q2: I enabled each of the two lines one at a time to test NC_COLLECTIVE
>> mode and NC_INDEPENDENT mode separately. Each test was run with 4
>> processes (mpiexec -np 4 ./tst_parallel4). Then I used jumpshot to
>> visualize the clog2 files they produced. The snapshots are attached
>> below. The green bars represent "Write to netcdf file" events (I turned
>> off the other bars, i.e. the other MPI events, in the visualization).
>>
>> [Inline image 1: jumpshot trace]
>> NC_INDEPENDENT mode
>> In NC_INDEPENDENT mode, the Write events occurred at different time
>> steps in the 4 processes (the x-axis is time step). If I understood it
>> correctly, although we had 4 processes running in parallel, the Write
>> events still happened in sequence, not in parallel, because p0 wrote
>> first, then p1 wrote, and then p2, and then p3 wrote last. Is it
>> supposed to be like this???
>>
>
> It is. Look a few lines above, where the test inserts some sleep calls if
> USE_MPE is defined (I guess to make it more visually interesting?):
>
> https://github.com/Unidata/netcdf-c/blob/master/nc_test4/tst_parallel4.c#L130
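>
> From memory it is roughly the equivalent of the following (check the
> actual source at the link above; the MPE logging calls are trimmed and
> mpi_rank is just the rank of the calling process):
>
>     #ifdef USE_MPE
>        /* stagger the ranks so their writes show up at different
>           times in the trace */
>        sleep(mpi_rank);
>     #endif /* USE_MPE */
>
> That staggering is why the independent writes show up as a staircase in
> your first trace.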
>
>> NC_COLLECTIVE mode
>>
>> In NC_COLLECTIVE mode, p0 started writing first but its Write event
>> lasted until the fourth process, p3, finished writing. I thought all
>> four processes should start and stop writing at the same time in
>> NC_COLLECTIVE mode???
>>
>
> If there are sleep calls in the test, then some processes will reach the
> collective call later. The test does demonstrate the one big drawback of
> collective calls: if there is skew, then a "pseudo-synchronization" occurs
> as the first process cannot make progress until the last process enters the
> collective.
>
> (Note: in this case all processes leave the collective at about the same
> time. That's not necessarily guaranteed by a collective operation, not
> even MPI_BARRIER.)
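>
> If you want to reproduce the effect outside of the test, a minimal
> sketch would be something like the following (the file name, dimension
> size and per-rank delay are made up for illustration, and error
> checking is omitted):
>
>     #include <mpi.h>
>     #include <netcdf.h>
>     #include <netcdf_par.h>
>     #include <unistd.h>
>
>     int main(int argc, char **argv)
>     {
>         int rank, nprocs, ncid, dimid, varid;
>
>         MPI_Init(&argc, &argv);
>         MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>         MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
>
>         nc_create_par("skew.nc", NC_NETCDF4 | NC_MPIIO,
>                       MPI_COMM_WORLD, MPI_INFO_NULL, &ncid);
>         nc_def_dim(ncid, "x", (size_t)nprocs, &dimid);
>         nc_def_var(ncid, "v", NC_INT, 1, &dimid, &varid);
>         nc_enddef(ncid);
>         nc_var_par_access(ncid, varid, NC_COLLECTIVE);
>
>         sleep(rank);   /* artificial skew, like the sleeps in the test */
>
>         size_t start = rank, count = 1;
>         int value = rank;
>         /* collective write: rank 0 gets here first, but the call
>            cannot complete until the slowest rank has joined it */
>         nc_put_vara_int(ncid, varid, &start, &count, &value);
>
>         nc_close(ncid);
>         MPI_Finalize();
>         return 0;
>     }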
>
> The MPE traces you have shown are consistent with the test.
>
> I'm so pleased you are using MPE. We haven't had funding to work on it
> for a few years, but it still comes in handy!
>
> ==rob