Some results from AR-5 performance evaluation
As part of analyzing netCDF-4 performance for the upcoming AR-5 climate
data archive, I have been running benchmarks on some AR-4 (3D precip
flux) data that I got from Gary Strand (thanks Gary!). The file is
pr_A1.20C3M_8.CCSM.atmm.1870-01_cat_1999-12.nc.
Here's what's in the file:
netcdf pr_A1.20C3M_8.CCSM.atmm.1870-01_cat_1999-12
{
dimensions:
    lon = 256 ;
    lat = 128 ;
    bnds = 2 ;
    time = UNLIMITED ; // (1560 currently)
variables:
    double lon_bnds(lon, bnds) ;
    double lat_bnds(lat, bnds) ;
    double time_bnds(time, bnds) ;
    double time(time) ;
        time:calendar = "noleap" ;
        time:standard_name = "time" ;
        time:axis = "T" ;
        time:units = "days since 0000-1-1" ;
        time:bounds = "time_bnds" ;
        time:long_name = "time" ;
    double lat(lat) ;
        lat:axis = "Y" ;
        lat:standard_name = "latitude" ;
        lat:bounds = "lat_bnds" ;
        lat:long_name = "latitude" ;
        lat:units = "degrees_north" ;
    double lon(lon) ;
        lon:axis = "X" ;
        lon:standard_name = "longitude" ;
        lon:bounds = "lon_bnds" ;
        lon:long_name = "longitude" ;
        lon:units = "degrees_east" ;
    float pr(time, lat, lon) ;
        pr:comment = "Created using NCL code CCSM_atmm_2cf.ncl on\n",
            " machine mineral" ;
        pr:missing_value = 1.e+20f ;
        pr:_FillValue = 1.e+20f ;
        pr:cell_methods = "time: mean (interval: 1 month)" ;
        pr:history = "(PRECC+PRECL)*r[h2o]" ;
        pr:original_units = "m-1 s-1" ;
        pr:original_name = "PRECC, PRECL" ;
        pr:standard_name = "precipitation_flux" ;
        pr:units = "kg m-2 s-1" ;
        pr:long_name = "precipitation_flux" ;
        pr:cell_method = "time: mean" ;
And here are the first results of writing this data with different sets of chunk sizes, with no compression. First I read all horizontal slabs in the file, then 5 time series. The times show the time to read each slab, and the time to read each time series, in microseconds. The columns cs[0], cs[1], and cs[2] are the chunk sizes along the time, lat, and lon dimensions of pr.
cs[0]  cs[1]  cs[2]  cache(MB)  deflate  shuffle  read_hor(us)  read_time_ser(us)
    0      0      0          0        0        0           240               3822
    1     16     32          1        0        0           667              57087
    1     16    128          1        0        0           245              23929
    1     16    256          1        0        0           160              26913
    1     64     32          1        0        0           277              22840
    1     64    128          1        0        0           147              41359
    1     64    256          1        0        0           110              47856
    1    128     32          1        0        0           205              25052
    1    128    128          1        0        0           123              47417
    1    128    256          1        0        0            97              68877
   10     16     32          1        0        0           552               3284
   10     16    128          1        0        0           204               5834
   10     16    256          1        0        0           138               8465
   10     64     32          1        0        0           233               5268
   10     64    128          1        0        0           132              16690
   10     64    256          1        0        0            99              28037
   10    128     32          1        0        0           180               8414
   10    128    128          1        0        0           113              28064
   10    128    256          1        0        0            90              54715
  256     16     32          1        0        0          8853               1167
  256     16    128          1        0        0          8012               3677
  256     16    256          1        0        0           118               1581
  256     64     32          1        0        0          8170               3737
  256     64    128          1        0        0           227               1640
  256     64    256          1        0        0            80               1627
  256    128     32          1        0        0           645               1624
  256    128    128          1        0        0           211               1650
  256    128    256          1        0        0            68               1667
 1024     16     32          1        0        0         32337               1192
 1024     16    128          1        0        0           296               1489
 1024     16    256          1        0        0           114               1564
 1024     64     32          1        0        0           679               1415
 1024     64    128          1        0        0           221               1503
 1024     64    256          1        0        0            79               1669
 1024    128     32          1        0        0           646               1558
 1024    128    128          1        0        0           208               1568
 1024    128    256          1        0        0            68               1646
 1560     16     32          1        0        0         55064               1055
 1560     16    128          1        0        0           298               1438
 1560     16    256          1        0        0           115               1477
 1560     64     32          1        0        0           685               1425
 1560     64    128          1        0        0           225               1545
 1560     64    256          1        0        0            79               1589
 1560    128     32          1        0        0           658               1535
 1560    128    128          1        0        0           208               1567
 1560    128    256          1        0        0            68               1544
The first line (all zeros) shows the read times for the classic netCDF file, which is not chunked.
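To get a feel for why the two read patterns pull in opposite directions, here is a rough sketch in Python that counts how many chunks each kind of read has to touch for a given chunk shape. It is only a back-of-the-envelope model: the function names are made up for illustration, reads are assumed to start at index 0, and it ignores the chunk cache, the OS cache, and disk layout, so it does not predict the measured times.

```python
from math import prod

# Shape of pr(time, lat, lon) from the dump above; values are 4-byte floats.
SHAPE = (1560, 128, 256)
BYTES_PER_VALUE = 4


def chunks_touched(start, count, chunk):
    """Count the chunks that intersect a hyperslab (start, count)."""
    n = 1
    for s, c, cs in zip(start, count, chunk):
        first = s // cs            # first chunk index hit along this dimension
        last = (s + c - 1) // cs   # last chunk index hit along this dimension
        n *= last - first + 1
    return n


def read_cost(chunk):
    """Chunk size in bytes and chunks touched by each access pattern."""
    horizontal = chunks_touched((0, 0, 0), (1, SHAPE[1], SHAPE[2]), chunk)
    time_series = chunks_touched((0, 0, 0), (SHAPE[0], 1, 1), chunk)
    return {
        "chunk_bytes": prod(chunk) * BYTES_PER_VALUE,
        "horizontal": horizontal,
        "time_series": time_series,
    }


if __name__ == "__main__":
    for chunk in [(1, 16, 32), (1560, 16, 32), (256, 128, 256)]:
        print(chunk, read_cost(chunk))
```

With chunks of (1, 16, 32) a single time series touches 1560 chunks, while with (1560, 16, 32) a horizontal slab touches 64 chunks of about 3 MB each, more than the 1 MB cache used here. Both are consistent with the slowest rows in the table, though the model says nothing about the rest.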
I am happy to see that a number of cases clearly outperform classic netCDF. The trick is to come up with an algorithm that picks good chunk sizes without the user being involved.
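As a first step toward that, the table can be filtered mechanically. The sketch below copies the rows from the table and keeps the chunkings that beat the classic file on both reads. Treating "outperform" as strictly faster in both columns is my assumption here, and the helper names are only illustrative.

```python
# Rows copied from the results table. Columns:
# cs[0] cs[1] cs[2] cache(MB) deflate shuffle read_hor(us) read_time_ser(us)
RESULTS = """\
0 0 0 0 0 0 240 3822
1 16 32 1 0 0 667 57087
1 16 128 1 0 0 245 23929
1 16 256 1 0 0 160 26913
1 64 32 1 0 0 277 22840
1 64 128 1 0 0 147 41359
1 64 256 1 0 0 110 47856
1 128 32 1 0 0 205 25052
1 128 128 1 0 0 123 47417
1 128 256 1 0 0 97 68877
10 16 32 1 0 0 552 3284
10 16 128 1 0 0 204 5834
10 16 256 1 0 0 138 8465
10 64 32 1 0 0 233 5268
10 64 128 1 0 0 132 16690
10 64 256 1 0 0 99 28037
10 128 32 1 0 0 180 8414
10 128 128 1 0 0 113 28064
10 128 256 1 0 0 90 54715
256 16 32 1 0 0 8853 1167
256 16 128 1 0 0 8012 3677
256 16 256 1 0 0 118 1581
256 64 32 1 0 0 8170 3737
256 64 128 1 0 0 227 1640
256 64 256 1 0 0 80 1627
256 128 32 1 0 0 645 1624
256 128 128 1 0 0 211 1650
256 128 256 1 0 0 68 1667
1024 16 32 1 0 0 32337 1192
1024 16 128 1 0 0 296 1489
1024 16 256 1 0 0 114 1564
1024 64 32 1 0 0 679 1415
1024 64 128 1 0 0 221 1503
1024 64 256 1 0 0 79 1669
1024 128 32 1 0 0 646 1558
1024 128 128 1 0 0 208 1568
1024 128 256 1 0 0 68 1646
1560 16 32 1 0 0 55064 1055
1560 16 128 1 0 0 298 1438
1560 16 256 1 0 0 115 1477
1560 64 32 1 0 0 685 1425
1560 64 128 1 0 0 225 1545
1560 64 256 1 0 0 79 1589
1560 128 32 1 0 0 658 1535
1560 128 128 1 0 0 208 1567
1560 128 256 1 0 0 68 1544
"""


def parse(text):
    """Turn the whitespace-separated table into a list of dicts."""
    rows = []
    for line in text.splitlines():
        f = [int(x) for x in line.split()]
        rows.append({"chunks": tuple(f[:3]),
                     "read_hor": f[6],
                     "read_time_ser": f[7]})
    return rows


rows = parse(RESULTS)
classic, chunked = rows[0], rows[1:]

# Keep only the chunkings that are faster than classic on BOTH reads.
winners = [r for r in chunked
           if r["read_hor"] < classic["read_hor"]
           and r["read_time_ser"] < classic["read_time_ser"]]

if __name__ == "__main__":
    for r in sorted(winners, key=lambda r: r["read_hor"] + r["read_time_ser"]):
        print(r["chunks"], r["read_hor"], r["read_time_ser"])
```

On these numbers 15 of the 45 chunked cases pass, and all of them have cs[0] of 256 or more. That is an observation about this one file and cache size, not a general rule.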