Even More Performance Data for NetCDF-4 On AR-4 File
30 December 2009
Here's a set of benchmarking results to ponder:
cs[0] cs[1] cs[2] cache(MB) deflate shuffle read_hor(us) read_time_ser(us)
0 0 0 0 0 0 246 3908
1 16 32 4 0 0 678 158842
1 16 32 32 0 0 703 60649
1 16 32 128 0 0 704 60073
1 16 128 4 0 0 233 50186
1 16 128 32 0 0 253 121418
1 16 128 128 0 0 251 23324
1 16 256 4 0 0 151 40377
1 16 256 32 0 0 167 106875
1 16 256 128 0 0 184 22115
1 64 32 4 0 0 269 49471
1 64 32 32 0 0 286 86265
1 64 32 128 0 0 285 22933
1 64 128 4 0 0 144 39626
1 64 128 32 0 0 158 93995
1 64 128 128 0 0 176 19252
1 64 256 4 0 0 125 59562
1 64 256 32 0 0 127 89096
1 64 256 128 0 0 141 38744
1 128 32 4 0 0 217 45392
1 128 32 32 0 0 224 110270
1 128 32 128 0 0 239 24357
1 128 128 4 0 0 130 55016
1 128 128 32 0 0 137 84879
1 128 128 128 0 0 150 37176
1 128 256 4 0 0 102 82356
1 128 256 32 0 0 108 101119
1 128 256 128 0 0 120 143347
10 16 32 4 0 0 560 3862
10 16 32 32 0 0 574 3023
10 16 32 128 0 0 596 2165
10 16 128 4 0 0 211 6783
10 16 128 32 0 0 222 7965
10 16 128 128 0 0 235 3805
10 16 256 4 0 0 135 9760
10 16 256 32 0 0 148 11471
10 16 256 128 0 0 157 6161
10 64 32 4 0 0 241 6165
10 64 32 32 0 0 250 7304
10 64 32 128 0 0 263 3517
10 64 128 4 0 0 135 18767
10 64 128 32 0 0 145 21202
10 64 128 128 0 0 155 5397
10 64 256 4 0 0 102 31856
10 64 256 32 0 0 110 34256
10 64 256 128 0 0 120 11744
10 128 32 4 0 0 186 9587
10 128 32 32 0 0 198 11277
10 128 32 128 0 0 208 5963
10 128 128 4 0 0 118 31822
10 128 128 32 0 0 127 34339
10 128 128 128 0 0 137 11778
10 128 256 4 0 0 95 66583
10 128 256 32 0 0 101 65303
10 128 256 128 0 0 116 61578
256 16 32 4 0 0 10728 1370
256 16 32 32 0 0 555 1350
256 16 32 128 0 0 566 765
256 16 128 4 0 0 10924 4929
256 16 128 32 0 0 228 5459
256 16 128 128 0 0 247 3526
256 16 256 4 0 0 11189 9961
256 16 256 32 0 0 154 9276
256 16 256 128 0 0 159 5912
256 64 32 4 0 0 10965 4973
256 64 32 32 0 0 263 4827
256 64 32 128 0 0 267 2684
256 64 128 4 0 0 225 1637
256 64 128 32 0 0 163 19207
256 64 128 128 0 0 160 4680
256 64 256 4 0 0 83 1602
256 64 256 32 0 0 139 35278
256 64 256 128 0 0 134 12680
256 128 32 4 0 0 11319 9963
256 128 32 32 0 0 208 9266
256 128 32 128 0 0 212 5892
256 128 128 4 0 0 216 1693
256 128 128 32 0 0 155 35481
256 128 128 128 0 0 148 12731
256 128 256 4 0 0 69 1598
256 128 256 32 0 0 164 160629
256 128 256 128 0 0 155 154782
1024 16 32 4 0 0 42966 1616
1024 16 32 32 0 0 40867 1431
1024 16 32 128 0 0 565 973
1024 16 128 4 0 0 302 1622
1024 16 128 32 0 0 49018 6434
1024 16 128 128 0 0 256 4686
1024 16 256 4 0 0 120 1452
1024 16 256 32 0 0 38800 10148
1024 16 256 128 0 0 169 9740
1024 64 32 4 0 0 681 1463
1024 64 32 32 0 0 40196 5385
1024 64 32 128 0 0 274 4271
1024 64 128 4 0 0 231 1599
1024 64 128 32 0 0 92258 46865
1024 64 128 128 0 0 204 13802
1024 64 256 4 0 0 81 1541
1024 64 256 32 0 0 79 1531
1024 64 256 128 0 0 178 35534
1024 128 32 4 0 0 641 1580
1024 128 32 32 0 0 39225 10224
1024 128 32 128 0 0 223 9788
1024 128 128 4 0 0 221 1622
1024 128 128 32 0 0 221 1613
1024 128 128 128 0 0 194 35413
1024 128 256 4 0 0 68 1574
1024 128 256 32 0 0 68 1562
1024 128 256 128 0 0 169 178748
1560 16 32 4 0 0 64666 1225
1560 16 32 32 0 0 62167 995
1560 16 32 128 0 0 59672 610
1560 16 128 4 0 0 300 1491
1560 16 128 32 0 0 59760 4155
1560 16 128 128 0 0 58138 2532
1560 16 256 4 0 0 128 1631
1560 16 256 32 0 0 74452 9566
1560 16 256 128 0 0 77289 8045
1560 64 32 4 0 0 695 1570
1560 64 32 32 0 0 74121 4917
1560 64 32 128 0 0 78406 3328
1560 64 128 4 0 0 228 1657
1560 64 128 32 0 0 227 1655
1560 64 128 128 0 0 158447 8447
1560 64 256 4 0 0 96 1721
1560 64 256 32 0 0 96 1726
1560 64 256 128 0 0 157660 31986
1560 128 32 4 0 0 637 1621
1560 128 32 32 0 0 75336 9685
1560 128 32 128 0 0 77909 8133
1560 128 128 4 0 0 221 1827
1560 128 128 32 0 0 219 1814
1560 128 128 128 0 0 157832 32289
1560 128 256 4 0 0 76 1754
1560 128 256 32 0 0 75 1751
1560 128 256 128 0 0 76 1750
Does (Cache) Size Matter, Continued...
30 December 2009
Now for this file, I get results that make sense:
bash-3.2$ ./tst_ar4 -h pr_A1_50_16_64.nc
cs[0] cs[1] cs[2] cache(MB) deflate shuffle read_hor(us) read_time_ser(us)
50 16 64 4 1 0 157932 155803
50 16 64 16 1 0 3473 155237
50 16 64 32 1 0 3479 146510
50 16 64 64 1 0 3487 120306
50 16 64 128 1 0 3499 64149
Now the best time-series performance comes from the largest cache, while
horizontal reads level off once the cache reaches 16 MB.
Does (Cache) Size Matter?
30 December 2009
Some cache size tests for netcdf-4 and ar4 data.
Oddly, increasing the cache here seems to hurt:
./tst_ar4 -h pr_A1_256_128_128.nc
cs[0] cs[1] cs[2] cache(MB) deflate shuffle read_hor(us) read_time_ser(us)
256 128 128 4 0 0 218 1611
256 128 128 16 0 0 9352 34872
256 128 128 32 0 0 134 32464
256 128 128 64 0 0 133 32303
256 128 128 128 0 0 146 12202
The best time-series read time comes with the 4 MB chunk cache. Why?
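One thing worth checking before guessing is how big a single chunk is compared with the cache. The sketch below is only arithmetic, not a diagnosis: it assumes 4-byte floats (as in the CDL for pr) and that "MB" in the table means 2**20 bytes. The function names are mine, not part of tst_ar4.

```python
FLOAT_BYTES = 4  # pr is declared as float in the CDL


def chunk_bytes(chunks, elem_size=FLOAT_BYTES):
    """Uncompressed size of one chunk in bytes."""
    n = elem_size
    for c in chunks:
        n *= c
    return n


def whole_chunks_in_cache(chunks, cache_mb):
    # Assumption: "MB" in the table means 2**20 bytes.
    return (cache_mb * 2**20) // chunk_bytes(chunks)


chunks = (256, 128, 128)
print("chunk size:", chunk_bytes(chunks) / 2**20, "MiB")
for cache_mb in (4, 16, 32, 64, 128):
    n = whole_chunks_in_cache(chunks, cache_mb)
    print(cache_mb, "MB cache holds", n, "whole chunks")
```

Under those assumptions one chunk is 16 MiB, so the 4 MB cache cannot hold even one. Whether that changes how the library reads the data is not something these numbers establish, but it does mark the 4 MB row as a different regime from the others.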
Netcdf-4 Chunking Performance Results on AR-4 3D Data File
30 December 2009
Some results from AR-5 performance evaluation
As part of analyzing netcdf-4 performance for the upcoming AR-5 climate
data archive, I have been running benchmarks on some AR-4 (3D precip
flux) data that I got from Gary Strand (thanks Gary!). The file is
pr_A1.20C3M_8.CCSM.atmm.1870-01_cat_1999-12.nc.
Here's what's in the file:
netcdf pr_A1.20C3M_8.CCSM.atmm.1870-01_cat_1999-12 {
dimensions:
        lon = 256 ;
        lat = 128 ;
        bnds = 2 ;
        time = UNLIMITED ; // (1560 currently)
variables:
        double lon_bnds(lon, bnds) ;
        double lat_bnds(lat, bnds) ;
        double time_bnds(time, bnds) ;
        double time(time) ;
                time:calendar = "noleap" ;
                time:standard_name = "time" ;
                time:axis = "T" ;
                time:units = "days since 0000-1-1" ;
                time:bounds = "time_bnds" ;
                time:long_name = "time" ;
        double lat(lat) ;
                lat:axis = "Y" ;
                lat:standard_name = "latitude" ;
                lat:bounds = "lat_bnds" ;
                lat:long_name = "latitude" ;
                lat:units = "degrees_north" ;
        double lon(lon) ;
                lon:axis = "X" ;
                lon:standard_name = "longitude" ;
                lon:bounds = "lon_bnds" ;
                lon:long_name = "longitude" ;
                lon:units = "degrees_east" ;
        float pr(time, lat, lon) ;
                pr:comment = "Created using NCL code CCSM_atmm_2cf.ncl on\n",
                        " machine mineral" ;
                pr:missing_value = 1.e+20f ;
                pr:_FillValue = 1.e+20f ;
                pr:cell_methods = "time: mean (interval: 1 month)" ;
                pr:history = "(PRECC+PRECL)*r[h2o]" ;
                pr:original_units = "m-1 s-1" ;
                pr:original_name = "PRECC, PRECL" ;
                pr:standard_name = "precipitation_flux" ;
                pr:units = "kg m-2 s-1" ;
                pr:long_name = "precipitation_flux" ;
                pr:cell_method = "time: mean" ;
And here are the first results of writing this data with different sets
of chunk sizes, with no compression. First I read all horizontal slabs
in the file, then 5 time series. The table shows the time to read each
slab and the time to read each time series, in microseconds.
cs[0] cs[1] cs[2] cache(MB) deflate shuffle read_hor(us) read_time_ser(us)
0 0 0 0 0 0 240 3822
1 16 32 1 0 0 667 57087
1 16 128 1 0 0 245 23929
1 16 256 1 0 0 160 26913
1 64 32 1 0 0 277 22840
1 64 128 1 0 0 147 41359
1 64 256 1 0 0 110 47856
1 128 32 1 0 0 205 25052
1 128 128 1 0 0 123 47417
1 128 256 1 0 0 97 68877
10 16 32 1 0 0 552 3284
10 16 128 1 0 0 204 5834
10 16 256 1 0 0 138 8465
10 64 32 1 0 0 233 5268
10 64 128 1 0 0 132 16690
10 64 256 1 0 0 99 28037
10 128 32 1 0 0 180 8414
10 128 128 1 0 0 113 28064
10 128 256 1 0 0 90 54715
256 16 32 1 0 0 8853 1167
256 16 128 1 0 0 8012 3677
256 16 256 1 0 0 118 1581
256 64 32 1 0 0 8170 3737
256 64 128 1 0 0 227 1640
256 64 256 1 0 0 80 1627
256 128 32 1 0 0 645 1624
256 128 128 1 0 0 211 1650
256 128 256 1 0 0 68 1667
1024 16 32 1 0 0 32337 1192
1024 16 128 1 0 0 296 1489
1024 16 256 1 0 0 114 1564
1024 64 32 1 0 0 679 1415
1024 64 128 1 0 0 221 1503
1024 64 256 1 0 0 79 1669
1024 128 32 1 0 0 646 1558
1024 128 128 1 0 0 208 1568
1024 128 256 1 0 0 68 1646
1560 16 32 1 0 0 55064 1055
1560 16 128 1 0 0 298 1438
1560 16 256 1 0 0 115 1477
1560 64 32 1 0 0 685 1425
1560 64 128 1 0 0 225 1545
1560 64 256 1 0 0 79 1589
1560 128 32 1 0 0 658 1535
1560 128 128 1 0 0 208 1567
1560 128 256 1 0 0 68 1544
The first line (all chunk sizes 0) shows the read times for the classic
netcdf file.
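To get a feel for why the two read patterns pull in opposite directions, it helps to count how many chunks each read overlaps. This is a rough sketch with hypothetical helper names; it assumes cs[0], cs[1], cs[2] follow the variable's dimension order (time, lat, lon) and that reads start on a chunk boundary. It says nothing about caching or disk layout.

```python
import math

# Dimensions of pr(time, lat, lon), taken from the CDL above.
SHAPE = (1560, 128, 256)


def chunks_touched(chunks, count):
    """Chunks overlapped by a read of `count` elements per dimension
    that starts at index 0 (and is therefore chunk-aligned)."""
    return math.prod(math.ceil(c / k) for c, k in zip(count, chunks))


def summarize(chunks):
    # One horizontal slab: a single time step, all of lat and lon.
    hor = chunks_touched(chunks, (1, SHAPE[1], SHAPE[2]))
    # One time series: every time step at a single grid point.
    ser = chunks_touched(chunks, (SHAPE[0], 1, 1))
    return hor, ser


for cs in [(1, 16, 32), (10, 64, 128), (256, 64, 256), (1560, 128, 256)]:
    hor, ser = summarize(cs)
    print(cs, "slab:", hor, "chunks; series:", ser, "chunks")
```

With chunk sizes 1 x 16 x 32, a slab overlaps 64 chunks and a time series overlaps 1560; with 256 x 64 x 256 those counts drop to 2 and 7. Fewer chunks touched does not automatically mean faster, since bigger chunks also mean more unwanted data read per chunk.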
I am happy to see there are a number of cases that clearly outperform
classic netcdf. The trick is to find an algorithm that picks good chunk
sizes without the user being involved.
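As a starting point, here is a small sketch that picks out settings which beat classic netcdf on both reads, using a handful of rows copied from the 1 MB cache table above. This is selection after the fact from measured timings, not the predictive algorithm I am after; the ranking rule (smallest worst-case ratio to classic) is just one assumption about what "best" might mean.

```python
# Classic-format baseline from the first table row (microseconds).
CLASSIC_HOR, CLASSIC_SER = 240, 3822

# (chunk sizes, read_hor, read_time_ser), copied from the 1 MB cache table.
ROWS = [
    ((1, 16, 32), 667, 57087),
    ((10, 128, 256), 90, 54715),
    ((256, 16, 32), 8853, 1167),
    ((256, 64, 256), 80, 1627),
    ((1024, 128, 256), 68, 1646),
    ((1560, 16, 32), 55064, 1055),
    ((1560, 128, 256), 68, 1544),
]


def beats_classic(row):
    """True if both read times are below the classic baseline."""
    _, hor, ser = row
    return hor < CLASSIC_HOR and ser < CLASSIC_SER


def worst_ratio(row):
    """Larger of the two time ratios relative to classic (lower is better)."""
    _, hor, ser = row
    return max(hor / CLASSIC_HOR, ser / CLASSIC_SER)


winners = [r for r in ROWS if beats_classic(r)]
best = min(winners, key=worst_ratio)
for cs, hor, ser in winners:
    print(cs, hor, ser)
print("smallest worst-case ratio:", best[0])
```

Of the rows sampled here, three beat classic on both reads, and the single whole-variable chunk (1560 x 128 x 256) has the smallest worst-case ratio. That holds only for this file, this cache size, and these two access patterns.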