Ed,
I tried running tst_h_par a few times; the timings are below.
Wei Huang
huangwei@xxxxxxxx
VETS/CISL
National Center for Atmospheric Research
P.O. Box 3000 (1850 Table Mesa Dr.)
Boulder, CO 80307-3000 USA
(303) 497-8924
On Sep 20, 2011, at 3:57 PM, Ed Hartnett wrote:
> Wei Huang <huangwei@xxxxxxxx> writes:
>
>> I have run nc_test4/tst_nc4perf for 1, 2, 4, and 8 processors, results
>> attached.
>>
>> To me, the performance decreases as the number of processors increases.
>> Someone may have a better interpretation.
>
> I have looked at your nc4perf output and agree that there is some
> performance problem here. This seems to indicate that parallel I/O is not
> working well on your system for some reason.
>
>> I also ran tst_parallel4, with these results:
>> num_proc   time(s)   write_rate(B/s)
>>        1    9.2015       1.16692e+08
>>        2   12.4557       8.62048e+07
>>        4   6.30644       1.70261e+08
>>        8   5.53761         1.939e+08
>>       16   2.25639       4.75866e+08
>>       32   2.28383        4.7015e+08
>>       64   2.19041       4.90202e+08
>
> Yet this test clearly is working: the time decreases at every processor
> count past 2 until about 16, at which point the I/O system is saturated
> and performance levels off. This is what I would expect.
>
> But these test results are not consistent with your nc4perf results.
>
>> We could modify this program to mimic our data size, but we do not know
>> whether that would help us.
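
(As a rough starting point, something along the lines of the sketch below
could be used to size the write to match the real data volume. It uses the
standard netCDF-4 parallel API, but the file name, variable and dimension
names, and sizes are placeholders only, and it assumes netCDF-4 and HDF5
were built with parallel I/O enabled; it is not the actual test code.)

    /* Sketch: each rank writes one slab of a 3D float variable.
     * Dimension sizes are placeholders; set them to match the real
     * data volume. Error checking omitted for brevity. */
    #include <stdlib.h>
    #include <mpi.h>
    #include <netcdf.h>
    #include <netcdf_par.h>

    #define NLVL 64      /* placeholder sizes */
    #define NLAT 512
    #define NLON 1024

    int main(int argc, char **argv)
    {
        int rank, nprocs, ncid, varid, dimids[3];
        size_t start[3], count[3], i, n;
        float *buf;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        nc_create_par("tst_size.nc", NC_NETCDF4 | NC_MPIIO,
                      MPI_COMM_WORLD, MPI_INFO_NULL, &ncid);
        nc_def_dim(ncid, "lvl", NLVL, &dimids[0]);
        nc_def_dim(ncid, "lat", NLAT, &dimids[1]);
        nc_def_dim(ncid, "lon", NLON, &dimids[2]);
        nc_def_var(ncid, "data", NC_FLOAT, 3, dimids, &varid);
        nc_enddef(ncid);

        /* Decompose along the first dimension (assumes nprocs evenly
         * divides NLVL) and use collective access. */
        nc_var_par_access(ncid, varid, NC_COLLECTIVE);
        start[0] = rank * (NLVL / nprocs); start[1] = 0; start[2] = 0;
        count[0] = NLVL / nprocs; count[1] = NLAT; count[2] = NLON;

        n = count[0] * count[1] * count[2];
        buf = malloc(n * sizeof(float));
        for (i = 0; i < n; i++) buf[i] = (float)rank;

        nc_put_vara_float(ncid, varid, start, count, buf);
        nc_close(ncid);

        free(buf);
        MPI_Finalize();
        return 0;
    }

Build with mpicc against netCDF/HDF5 and run with mpiexec -n <nprocs>.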
>
>>>
>>> If the program shows that parallel I/O is not working, take a look at
>>> the netCDF test program h5_test/tst_h_par.c. This is an HDF5-only program
>>> (no netcdf code at all) that does parallel I/O. If this program does not
>>> show that parallel I/O is working, then your problem is not with the
>>> netCDF layer, but somewhere in HDF5 or even lower in the stack.
>
> Try timing tst_h_par for several different numbers of processors, and
> see if you get a performance improvement there.
*** Creating file for parallel I/O read, and rereading it...
p= 1, write_rate=113.568, read_rate=51.5761
p= 2, write_rate=142.687, read_rate=239.493
p= 4, write_rate=543.575, read_rate=1280.54
p= 8, write_rate=167.021, read_rate=1398.42
p=16, write_rate=204.08, read_rate=1555.1
p=32, write_rate=72.7069, read_rate=720.396
p=64, write_rate=40.2151, read_rate=358.09
*** Creating file for parallel I/O read, and rereading it...
p= 1, write_rate=117.562, read_rate=733.768
p= 2, write_rate=358.092, read_rate=1457.53
p= 4, write_rate=528.873, read_rate=1439.01
p= 8, write_rate=230.93, read_rate=1282.31
p=16, write_rate=174.401, read_rate=468.23
p=32, write_rate=98.98, read_rate=2057.22
p=64, write_rate=103.817, read_rate=794.755
*** Creating file for parallel I/O read, and rereading it...
p= 1, write_rate=114.031, read_rate=770.388
p= 2, write_rate=425.982, read_rate=1429.43
p= 4, write_rate=428.331, read_rate=1393.3
p= 8, write_rate=344.846, read_rate=1397.72
p=16, write_rate=288.448, read_rate=1239.88
p=32, write_rate=102.718, read_rate=2751.15
p=64, write_rate=62.3665, read_rate=879.375
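
(For context, the timed portion of a test like tst_h_par.c amounts to a
collective HDF5 parallel write bracketed with MPI_Wtime. A minimal sketch
of that pattern follows; the file name, dataset shape, and exact timing
points are illustrative assumptions, not the actual test source.)

    /* Sketch of a timed collective HDF5 parallel write.
     * Names and shapes are placeholders; error checking omitted. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <mpi.h>
    #include <hdf5.h>

    #define DIM0 (1024 * 1024)   /* elements per process (placeholder) */

    int main(int argc, char **argv)
    {
        int rank, nprocs, *buf;
        hid_t fapl, file, space, memspace, dset, dxpl;
        hsize_t dims[1], start[1], count[1], i;
        double t0, t1;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        /* File access property list: MPI-IO driver. */
        fapl = H5Pcreate(H5P_FILE_ACCESS);
        H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
        file = H5Fcreate("tst_time.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

        /* One 1-D dataset; each rank owns a contiguous slab. */
        dims[0] = (hsize_t)DIM0 * nprocs;
        space = H5Screate_simple(1, dims, NULL);
        dset = H5Dcreate2(file, "data", H5T_NATIVE_INT, space,
                          H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

        start[0] = (hsize_t)rank * DIM0;
        count[0] = DIM0;
        H5Sselect_hyperslab(space, H5S_SELECT_SET, start, NULL, count, NULL);
        memspace = H5Screate_simple(1, count, NULL);

        /* Collective transfer mode. */
        dxpl = H5Pcreate(H5P_DATASET_XFER);
        H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);

        buf = malloc(DIM0 * sizeof(int));
        for (i = 0; i < DIM0; i++) buf[i] = rank;

        MPI_Barrier(MPI_COMM_WORLD);
        t0 = MPI_Wtime();
        H5Dwrite(dset, H5T_NATIVE_INT, memspace, space, dxpl, buf);
        H5Dclose(dset);
        H5Fclose(file);    /* include close so data is flushed to disk */
        MPI_Barrier(MPI_COMM_WORLD);
        t1 = MPI_Wtime();

        if (rank == 0)
            printf("p=%d, time=%g s, write_rate=%g MB/s\n", nprocs, t1 - t0,
                   (double)dims[0] * sizeof(int) / (t1 - t0) / 1e6);

        H5Pclose(dxpl);
        H5Pclose(fapl);
        H5Sclose(memspace);
        H5Sclose(space);
        free(buf);
        MPI_Finalize();
        return 0;
    }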
>
> Thanks,
>
> Ed
>
> --
> Ed Hartnett -- ed@xxxxxxxxxxxxxxxx