Wei Huang <huangwei@xxxxxxxx> writes:
> I have run nc_test4/tst_nc4perf for 1, 2, 4, and 8 processors, results
> attached.
>
> To me, performance decreases as the number of processors increases.
> Someone may have a better interpretation.
I have looked at your nc4perf output and agree that there is some
performance problem here. This seems to indicate that parallel I/O is not
working well on your system for some reason.
> I also ran tst_parallel4, with these results:
> num_proc   time(s)    write_rate(B/s)
>        1    9.2015    1.16692e+08
>        2   12.4557    8.62048e+07
>        4    6.30644   1.70261e+08
>        8    5.53761   1.939e+08
>       16    2.25639   4.75866e+08
>       32    2.28383   4.7015e+08
>       64    2.19041   4.90202e+08
Yet this test clearly is working: the time decreases as processors are
added beyond 2, until about 16, at which point the I/O system is
saturated and performance levels off. This is what I would expect.
But these results are inconsistent with your nc4perf results.
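
To make the scaling easier to see, here is a small sketch that turns the
tst_parallel4 timings quoted above into speedup and parallel efficiency
relative to the 1-processor run. The timings are copied from your table;
the function name speedup_table is only for illustration and is not part
of netCDF.

```python
# Timings (num_proc, seconds) copied from the tst_parallel4 results quoted above.
results = [
    (1, 9.2015),
    (2, 12.4557),
    (4, 6.30644),
    (8, 5.53761),
    (16, 2.25639),
    (32, 2.28383),
    (64, 2.19041),
]


def speedup_table(results):
    """Return (num_proc, speedup, efficiency) relative to the first row."""
    base_procs, base_time = results[0]
    rows = []
    for procs, seconds in results:
        speedup = base_time / seconds               # > 1 means faster than baseline
        efficiency = speedup / (procs / base_procs)  # 1.0 would be ideal scaling
        rows.append((procs, speedup, efficiency))
    return rows


for procs, speedup, efficiency in speedup_table(results):
    print(f"{procs:>3} procs: speedup {speedup:5.2f}, efficiency {efficiency:6.1%}")
```

With these numbers the speedup is below 1 at 2 processors, reaches roughly
4x at 16, and stays near 4x at 32 and 64, which is the plateau described
above.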
> We can modify this program to mimic our data size, but we do not know
> whether this will help us.
>>
>> If the program shows that parallel I/O is not working, take a look at
>> the netCDF test program h5_test/tst_h_par.c. This is an HDF5-only program
>> (no netcdf code at all) that does parallel I/O. If this program does not
>> show that parallel I/O is working, then your problem is not with the
>> netCDF layer, but somewhere in HDF5 or even lower in the stack.
Try timing tst_h_par for several different numbers of processors, and
see if you get a performance improvement there.
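
If it helps, something like the following could be used to collect those
timings. This is only a sketch: the helper names time_command and
time_runs are made up for this example, and the MPI launcher name and the
path to the tst_h_par binary are assumptions that will differ from system
to system. It measures wall-clock time for the whole run, so MPI startup
cost is included.

```python
import subprocess
import time


def time_command(cmd):
    """Run cmd (a list of arguments) once and return wall-clock seconds."""
    start = time.perf_counter()
    # check=True raises CalledProcessError if the program exits non-zero.
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


def time_runs(template, proc_counts):
    """Time the command once per process count.

    Every "{n}" in the template arguments is replaced by the process count.
    Returns a dict mapping process count to elapsed seconds.
    """
    timings = {}
    for n in proc_counts:
        cmd = [arg.replace("{n}", str(n)) for arg in template]
        timings[n] = time_command(cmd)
    return timings
```

For example, assuming your launcher is mpiexec and you run from the
netCDF build directory, you could call
time_runs(["mpiexec", "-n", "{n}", "./h5_test/tst_h_par"], [1, 2, 4, 8])
and compare the resulting times against each other.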
Thanks,
Ed
--
Ed Hartnett -- ed@xxxxxxxxxxxxxxxx