Hi, netcdfgroup,
We are currently trying out parallel-enabled NetCDF-4. We started with a test
that reads and writes a 5 GB file and does some computation. We got the
following wall-clock timings on an IBM Power machine:
Processors  Total (s)  Read (s)  Write (s)  Computation (s)
seq            89.137    28.206     48.327           11.717
1             178.953    44.837     121.17           11.644
2              167.25    46.571    113.343            5.648
4             168.138    44.043    118.968            2.729
8              137.74    25.161    108.986            1.064
16            113.354    16.359     93.253            0.494
32            439.481   122.201    311.215            0.274
64            831.896   277.363    588.653            0.203
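To make the scaling easier to read, here is a minimal Python sketch that computes the speedup of each parallel run relative to the 1-processor parallel run. The numbers are copied from the timings above; the `speedups` helper and the choice of baseline are only illustrative assumptions, not part of our test code.

```python
# Wall-clock timings from the table above, in seconds:
# processors -> (total, read, write, computation)
timings = {
    1: (178.953, 44.837, 121.17, 11.644),
    2: (167.25, 46.571, 113.343, 5.648),
    4: (168.138, 44.043, 118.968, 2.729),
    8: (137.74, 25.161, 108.986, 1.064),
    16: (113.354, 16.359, 93.253, 0.494),
    32: (439.481, 122.201, 311.215, 0.274),
    64: (831.896, 277.363, 588.653, 0.203),
}


def speedups(table, base_procs=1):
    """Return speedup per phase relative to the run on base_procs processors.

    A value above 1 means the run was faster than the baseline,
    a value below 1 means it was slower.
    """
    base = table[base_procs]
    return {
        procs: tuple(b / t for b, t in zip(base, row))
        for procs, row in table.items()
    }


if __name__ == "__main__":
    print("procs  total   read  write   comp")
    for procs, (total, read, write, comp) in sorted(speedups(timings).items()):
        print(f"{procs:>5} {total:6.2f} {read:6.2f} {write:6.2f} {comp:6.2f}")
```

With these numbers the computation phase is roughly 57 times faster on 64 processors than on 1, while the total speedup drops below 1 at 32 and 64 processors because read and write dominate the run time.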
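The first thing we see is that when the parallel-enabled code runs on one
processor, the total wall-clock time doubles compared with the sequential
run (178.953 s vs. 89.137 s).
Second, we do not see the expected scaling as more processors are added:
the computation time drops steadily, but the total time improves only
modestly up to 16 processors and then gets much worse at 32 and 64, where
read and write times increase sharply.
Would anyone like to share their experience?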
Thanks,
Wei Huang
huangwei@xxxxxxxx
VETS/CISL
National Center for Atmospheric Research
P.O. Box 3000 (1850 Table Mesa Dr.)
Boulder, CO 80307-3000 USA
(303) 497-8924