NOTE: The netcdf-hdf mailing list is no longer active. The list archives are made available for historical reasons.
Hi,

The timings I sent out last Friday were flawed because some of the libraries had been compiled with -g and others with -O. Appended to this note is a complete set of timings with everything compiled with -O optimization. These timings are the results of a benchmarking program which will probably be included in the nctest directory of the next distribution, to give developers on other platforms something to use for tuning.

Chris, I've put a copy of the benchmark program, which is standalone, in ~ftp/pub/netcdf/nctime.c, in case you want to use it to explore the HDF prototype performance.

As is apparent from the table, there are still a few instances where the HDF implementation is significantly faster than the current netCDF implementation, but in most cases the netCDF implementation is significantly faster, and in many cases the new UNIX-specific optimization is much faster than the current netCDF implementation.

The performance comparisons below are for netcdfx (netCDF 2.02 with the unreleased UNIX-specific optimization for netCDF), netcdf (the current release, 2.02), and niche (netCDF interface covering HDF encoding). Runs were made on an unloaded SPARCstation 2 (buddy.unidata.ucar.edu). All tests were compiled with -O using Sun's unbundled compiler in /usr/lang/cc. Timings are the sum of user and system times as returned by getrusage(2), taken over enough repetitions of each test to exceed one second of elapsed time.

The first column describes the test, where "ncvarget 10x1x30x1" means ncvarget was called to retrieve a 10 by 1 by 30 by 1 hyperslab of the 10 by 20 by 30 by 40 variable. The first dimension (10) is the record dimension, and varies most slowly. The benchmark program permits any shape of variable to be used, but 10x20x30x40 was deemed typical. The value in the netcdfx column is the time in milliseconds for the described test using the netcdfx library.
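The timing method described above (user plus system time from getrusage(2), over enough repetitions to exceed a set amount of elapsed time) can be sketched roughly as follows. This is not the code from nctime.c. The names time_per_call_msec and sample_work are hypothetical, and the batch-doubling loop is just one assumed way of reaching the elapsed-time threshold without timing every single call.

```c
#include <stddef.h>
#include <sys/time.h>
#include <sys/resource.h>

/* CPU time (user + system) used so far by this process, in milliseconds.
 * Returns -1.0 if getrusage fails. */
static double cpu_msec(void)
{
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1.0;
    return (ru.ru_utime.tv_sec + ru.ru_stime.tv_sec) * 1000.0
         + (ru.ru_utime.tv_usec + ru.ru_stime.tv_usec) / 1000.0;
}

/* Wall-clock time in seconds. */
static double wall_sec(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec + tv.tv_usec / 1.0e6;
}

/*
 * Hypothetical helper: call op(arg) in batches, doubling the batch size
 * until one batch takes at least min_elapsed_sec of wall-clock time.
 * Returns the mean CPU time per call, in milliseconds, for that batch,
 * and stores the batch size in *reps_out if reps_out is not NULL.
 */
double time_per_call_msec(void (*op)(void *), void *arg,
                          double min_elapsed_sec, long *reps_out)
{
    long reps = 1;
    for (;;) {
        long i;
        double cpu0 = cpu_msec();
        double wall0 = wall_sec();
        for (i = 0; i < reps; i++)
            op(arg);
        double wall1 = wall_sec();
        double cpu1 = cpu_msec();
        if (wall1 - wall0 >= min_elapsed_sec) {
            if (reps_out != NULL)
                *reps_out = reps;
            return (cpu1 - cpu0) / (double)reps;
        }
        reps *= 2;
    }
}

/* Stand-in workload (an assumption for illustration); a real benchmark
 * would call ncvarget or ncvarput here instead. */
void sample_work(void *arg)
{
    volatile double sum = 0.0;
    int i;
    (void)arg;
    for (i = 0; i < 1000; i++)
        sum += i * 0.5;
}
```

With a threshold of 1.0 second, time_per_call_msec(sample_work, NULL, 1.0, &reps) would give a per-call figure comparable in form to the netcdfx column, though the numbers themselves depend entirely on the workload and machine.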
The netcdfx/netcdf column contains the ratio of times for the netcdfx library over the times for the released netcdf library. The last column, netcdfx/niche, is the ratio of times for the netcdfx library over the times for the niche library. Hence values less than 1.0 in the last two columns are expected where the new library performs better than the previous version or the niche library. Ratios significantly greater than 1.0 in either of the last two columns indicate a possible performance problem with the netcdfx library that may bear further study.

The same accesses were made for each of the six netCDF types: byte, char, short, long, float, and double. All the timings for the byte variable appear first, followed by the other types. The very first call of ncvarput for the byte variable also includes the time needed to write fill values for the variables of the five other types. I don't know whether the niche library wrote all these fill values in this case, since nctest doesn't test that.

                                 netcdfx         netcdfx/netcdf  netcdfx/niche

----- byte_var(10,20,30,40)
time for ncvarput 10x20x30x40   2936.667 msec   .70593      5.68387
time for ncvarget 1x1x1x1           .065 msec   .0715859    .0374208
time for ncvarget 10x1x1x1        13.256 msec   1.14771     .769311
time for ncvarget 1x20x1x1         4.163 msec   .20345      .134686
time for ncvarget 1x1x30x1          .649 msec   .021417     .0144711
time for ncvarget 1x1x1x40          .104 msec   .0881356    .0579387
time for ncvarget 10x20x1x1       53.030 msec   .279105     .197873
time for ncvarget 10x1x30x1       18.615 msec   .0732874    .0470076
time for ncvarget 10x1x1x40       13.488 msec   1.16779     .775841
time for ncvarget 1x20x30x1       15.846 msec   .0389655    .02502
time for ncvarget 1x20x1x40        4.747 msec   .23375      .1551
time for ncvarget 1x1x30x40        1.268 msec   1.02341     .698623
time for ncvarget 10x20x30x1     148.889 msec   .0353936    .0225476
time for ncvarget 10x20x1x40      57.879 msec   .302855     .206711
time for ncvarget 10x1x30x40      24.615 msec   1.9846      .517383
time for ncvarget 1x20x30x40      27.385 msec   6.06668     3.18245
time for ncvarget 10x20x30x40    232.000 msec   5.07015     .54375

----- char_var(10,20,30,40)
time for ncvarput 10x20x30x40    270.000 msec   7.68093     .52258
time for ncvarget 1x1x1x1           .065 msec   .0715859    .0413749
time for ncvarget 10x1x1x1        13.333 msec   1.14663     .833313
time for ncvarget 1x20x1x1         5.525 msec   .270013     .170398
time for ncvarget 1x1x30x1         3.099 msec   .106018     .0635197
time for ncvarget 1x1x1x40          .101 msec   .0885188    .0550709
time for ncvarget 10x20x1x1       52.424 msec   .274312     .213106
time for ncvarget 10x1x30x1       20.769 msec   .0817677    .060375
time for ncvarget 10x1x1x40       13.566 msec   1.16667     .881768
time for ncvarget 1x20x30x1       16.923 msec   .0419579    .0270048
time for ncvarget 1x20x1x40        6.381 msec   .314211     .206445
time for ncvarget 1x1x30x40        3.723 msec   3.07686     1.97716
time for ncvarget 10x20x30x1     153.333 msec   .0363348    .0267753
time for ncvarget 10x20x1x40      57.576 msec   .299529     .237917
time for ncvarget 10x1x30x40      26.923 msec   2.19815     .710763
time for ncvarget 1x20x30x40      28.308 msec   6.49564     3.44505
time for ncvarget 10x20x30x40    232.000 msec   5.0368      .66923

----- short_var(10,20,30,40)
time for ncvarput 10x20x30x40    580.000 msec   .865672     .604167
time for ncvarget 1x1x1x1           .067 msec   .0618652    .0361381
time for ncvarget 10x1x1x1        13.023 msec   1.12753     .829913
time for ncvarget 1x20x1x1         9.457 msec   .448688     .302992
time for ncvarget 1x1x30x1         3.099 msec   .100262     .0691001
time for ncvarget 1x1x1x40          .160 msec   .130187     .0970285
time for ncvarget 10x20x1x1       85.882 msec   .44168      .338118
time for ncvarget 10x1x30x1       21.385 msec   .0835352    .0600702
time for ncvarget 10x1x1x40       14.031 msec   1.13833     .88546
time for ncvarget 1x20x30x1       21.538 msec   .0525317    .0338293
time for ncvarget 1x20x1x40       11.163 msec   .486978     .354212
time for ncvarget 1x1x30x40        5.525 msec   1.37849     2.6247
time for ncvarget 10x20x30x1     192.222 msec   .0451579    .0316675
time for ncvarget 10x20x1x40     102.941 msec   .465563     .389928
time for ncvarget 10x1x30x40      44.848 msec   1.08028     .747467
time for ncvarget 1x20x30x40      65.294 msec   1.03738     4.16097
time for ncvarget 10x20x30x40    460.000 msec   1.05343     .663462

----- long_var(10,20,30,40)
time for ncvarput 10x20x30x40   1183.333 msec   .698819     .606837
time for ncvarget 1x1x1x1           .063 msec   .0492958    .0319959
time for ncvarget 10x1x1x1        12.403 msec   1.03221     .650155
time for ncvarget 1x20x1x1        15.385 msec   .662262     .588247
time for ncvarget 1x1x30x1         2.924 msec   .0918977    .0810848
time for ncvarget 1x1x1x40          .273 msec   .192933     .197112
time for ncvarget 10x20x1x1      141.111 msec   .671957     .443745
time for ncvarget 10x1x30x1       24.923 msec   .0929963    .0685955
time for ncvarget 10x1x1x40       14.186 msec   1.02812     .743618
time for ncvarget 1x20x30x1       27.077 msec   .0634617    .054154
time for ncvarget 1x20x1x40       19.231 msec   .698344     .722562
time for ncvarget 1x1x30x40        8.915 msec   1.15719     3.97636
time for ncvarget 10x20x30x1     218.000 msec   .0506584    .0285964
time for ncvarget 10x20x1x40     176.667 msec   .795797     .474911
time for ncvarget 10x1x30x40      80.588 msec   1.0458      .636219
time for ncvarget 1x20x30x40     131.111 msec   1.00855     4.46183
time for ncvarget 10x20x30x40    976.667 msec   .996599     .57115

----- float_var(10,20,30,40)
time for ncvarput 10x20x30x40   1140.000 msec   .675889     .552504
time for ncvarget 1x1x1x1           .065 msec   .0555081    .0292529
time for ncvarget 10x1x1x1        12.558 msec   1.03845     .618377
time for ncvarget 1x20x1x1        15.194 msec   .658404     .460006
time for ncvarget 1x1x30x1         3.041 msec   .0974305    .0664583
time for ncvarget 1x1x1x40          .271 msec   .195668     .16247
time for ncvarget 10x20x1x1      138.889 msec   .657896     .439522
time for ncvarget 10x1x30x1       25.077 msec   .0921949    .0671706
time for ncvarget 10x1x1x40       14.651 msec   1.03278     .71601
time for ncvarget 1x20x30x1       26.615 msec   .0618953    .0405304
time for ncvarget 1x20x1x40       19.231 msec   .710234     .576936
time for ncvarget 1x1x30x40        8.837 msec   1.1588      3.51372
time for ncvarget 10x20x30x1     216.000 msec   .0497314    .0283217
time for ncvarget 10x20x1x40     176.667 msec   .788692     .48803
time for ncvarget 10x1x30x40      79.412 msec   1.04652     .655696
time for ncvarget 1x20x30x40     128.889 msec   1           4.16995
time for ncvarget 10x20x30x40    966.667 msec   1.00694     .587045

----- double_var(10,20,30,40)
time for ncvarput 10x20x30x40   2256.667 msec   .674975     .485305
time for ncvarget 1x1x1x1           .070 msec   .0623886    .033557
time for ncvarget 10x1x1x1        13.333 msec   1.13909     .637241
time for ncvarget 1x20x1x1        26.769 msec   1.12258     .817948
time for ncvarget 1x1x30x1         3.528 msec   .114142     .0771013
time for ncvarget 1x1x1x40          .481 msec   .306174     .266482
time for ncvarget 10x20x1x1      216.000 msec   1.0125      .624277
time for ncvarget 10x1x30x1       35.152 msec   .136248     .0886184
time for ncvarget 10x1x1x40       18.462 msec   1.2         .827595
time for ncvarget 1x20x30x1       44.545 msec   .105224     .0707063
time for ncvarget 1x20x1x40       33.636 msec   1.0673      .880939
time for ncvarget 1x1x30x40       15.116 msec   1.08936     1.91172
time for ncvarget 10x20x30x1     370.000 msec   .0854503    .0448666
time for ncvarget 10x20x1x40     290.000 msec   1.05072     .763158
time for ncvarget 10x1x30x40     137.778 msec   1.0248      .656086
time for ncvarget 1x20x30x40     230.000 msec   1.03604     1.95283
time for ncvarget 10x20x30x40   1890.000 msec   1.03091     .579162
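
Since only the netcdfx times are listed in absolute terms, the times for the other two libraries have to be recovered by dividing the netcdfx time by the corresponding ratio. The small sketch below shows that arithmetic, along with a per-value cost derived from the hyperslab shape. The function names are hypothetical and are not part of nctime.c or of any netCDF library.

```c
#include <assert.h>
#include <stddef.h>

/* Recover the other library's time (msec) from the netcdfx time and the
 * listed ratio netcdfx/other. */
double other_library_msec(double netcdfx_msec, double ratio)
{
    assert(ratio > 0.0);
    return netcdfx_msec / ratio;
}

/* Number of values in a hyperslab with the given edge lengths,
 * e.g. {10, 1, 30, 1} for the "10x1x30x1" test. */
long hyperslab_values(const long *count, size_t ndims)
{
    long n = 1;
    size_t i;
    for (i = 0; i < ndims; i++) {
        assert(count[i] > 0);
        n *= count[i];
    }
    return n;
}

/* Mean time per value, in microseconds, for a test that took msec
 * milliseconds to transfer the given hyperslab. */
double usec_per_value(double msec, const long *count, size_t ndims)
{
    return msec * 1000.0 / (double)hyperslab_values(count, ndims);
}
```

For the byte ncvarput row, 2936.667 msec with ratios .70593 and 5.68387 works out to roughly 4160 msec for netcdf and roughly 517 msec for niche.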