Netcdf-4 Chunking Performance Results on AR-4 3D Data File

Some results from AR-5 performance evaluation

As part of analyzing netcdf-4 performance for the upcoming AR-5 climate data archive, I have been running benchmarks on some AR-4 (3D precipitation flux) data that I got from Gary Strand (thanks, Gary!). The file is pr_A1.20C3M_8.CCSM.atmm.1870-01_cat_1999-12.nc.

Here's what's in the file:

netcdf pr_A1.20C3M_8.CCSM.atmm.1870-01_cat_1999-12 {
dimensions:
        lon = 256 ;
        lat = 128 ;
        bnds = 2 ;
        time = UNLIMITED ; // (1560 currently)
variables:
        double lon_bnds(lon, bnds) ;
        double lat_bnds(lat, bnds) ;
        double time_bnds(time, bnds) ;
        double time(time) ;
                time:calendar = "noleap" ;
                time:standard_name = "time" ;
                time:axis = "T" ;
                time:units = "days since 0000-1-1" ;
                time:bounds = "time_bnds" ;
                time:long_name = "time" ;
        double lat(lat) ;
                lat:axis = "Y" ;
                lat:standard_name = "latitude" ;
                lat:bounds = "lat_bnds" ;
                lat:long_name = "latitude" ;
                lat:units = "degrees_north" ;
        double lon(lon) ;
                lon:axis = "X" ;
                lon:standard_name = "longitude" ;
                lon:bounds = "lon_bnds" ;
                lon:long_name = "longitude" ;
                lon:units = "degrees_east" ;
        float pr(time, lat, lon) ;
                pr:comment = "Created using NCL code CCSM_atmm_2cf.ncl on ",
                        " machine mineral" ;
                pr:missing_value = 1.e+20f ;
                pr:_FillValue = 1.e+20f ;
                pr:cell_methods = "time: mean (interval: 1 month)" ;
                pr:history = "(PRECC+PRECL)*r[h2o]" ;
                pr:original_units = "m-1 s-1" ;
                pr:original_name = "PRECC, PRECL" ;
                pr:standard_name = "precipitation_flux" ;
                pr:units = "kg m-2 s-1" ;
                pr:long_name = "precipitation_flux" ;
                pr:cell_method = "time: mean" ;

And here are the first results of storing this data with different sets of chunk sizes, with no compression. The columns cs[0], cs[1], and cs[2] are the chunk sizes along the time, lat, and lon dimensions of pr, in that order. First I read all horizontal slabs in the file, then 5 time series. The times are the time to read each slab and the time to read each time series, in microseconds.

cs[0]   cs[1]   cs[2]   cache(MB) deflate shuffle read_hor(us) read_time_ser(us)
0       0       0       0         0       0       240          3822
1       16      32      1         0       0       667          57087
1       16      128     1         0       0       245          23929
1       16      256     1         0       0       160          26913
1       64      32      1         0       0       277          22840
1       64      128     1         0       0       147          41359
1       64      256     1         0       0       110          47856
1       128     32      1         0       0       205          25052
1       128     128     1         0       0       123          47417
1       128     256     1         0       0       97           68877
10      16      32      1         0       0       552          3284
10      16      128     1         0       0       204          5834
10      16      256     1         0       0       138          8465
10      64      32      1         0       0       233          5268
10      64      128     1         0       0       132          16690
10      64      256     1         0       0       99           28037
10      128     32      1         0       0       180          8414
10      128     128     1         0       0       113          28064
10      128     256     1         0       0       90           54715
256     16      32      1         0       0       8853         1167
256     16      128     1         0       0       8012         3677
256     16      256     1         0       0       118          1581
256     64      32      1         0       0       8170         3737
256     64      128     1         0       0       227          1640
256     64      256     1         0       0       80           1627
256     128     32      1         0       0       645          1624
256     128     128     1         0       0       211          1650
256     128     256     1         0       0       68           1667
1024    16      32      1         0       0       32337        1192
1024    16      128     1         0       0       296          1489
1024    16      256     1         0       0       114          1564
1024    64      32      1         0       0       679          1415
1024    64      128     1         0       0       221          1503
1024    64      256     1         0       0       79           1669
1024    128     32      1         0       0       646          1558
1024    128     128     1         0       0       208          1568
1024    128     256     1         0       0       68           1646
1560    16      32      1         0       0       55064        1055
1560    16      128     1         0       0       298          1438
1560    16      256     1         0       0       115          1477
1560    64      32      1         0       0       685          1425
1560    64      128     1         0       0       225          1545
1560    64      256     1         0       0       79           1589
1560    128     32      1         0       0       658          1535
1560    128     128     1         0       0       208          1567
1560    128     256     1         0       0       68           1544

The first row, with all chunk sizes 0, shows the read times for the classic netcdf file.
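To make the trade-off in the table easier to see, here is a rough sketch that counts how many chunks each kind of read overlaps for a given chunk shape. It assumes that every chunk a request overlaps is read in full, that requests start on chunk boundaries, and that the cache plays no role, so it is only a simplified model and not a description of what the benchmark actually measured. The function names are made up for illustration.

```python
from math import ceil, prod

# Dimension lengths of pr(time, lat, lon), taken from the header above.
SHAPE = (1560, 128, 256)
ITEM_BYTES = 4  # pr is a 4-byte float


def chunks_touched(request, chunk):
    """Chunks overlapped by a request that starts on a chunk boundary."""
    return prod(ceil(r / c) for r, c in zip(request, chunk))


def bytes_fetched(request, chunk):
    """Bytes in all overlapped chunks, assuming whole chunks are read."""
    return chunks_touched(request, chunk) * prod(chunk) * ITEM_BYTES


horizontal_slab = (1, SHAPE[1], SHAPE[2])  # one time step, all lat/lon
time_series = (SHAPE[0], 1, 1)             # all time steps at one grid point

# A few chunk shapes from the table (cs[0], cs[1], cs[2]).
for chunk in [(1, 128, 256), (10, 64, 128), (256, 64, 256), (1560, 16, 32)]:
    print(
        chunk,
        "slab chunks:", chunks_touched(horizontal_slab, chunk),
        "slab bytes:", bytes_fetched(horizontal_slab, chunk),
        "series chunks:", chunks_touched(time_series, chunk),
        "series bytes:", bytes_fetched(time_series, chunk),
    )
```

Under this model, a 1 x 128 x 256 chunk serves a horizontal slab from a single chunk, but a time series overlaps all 1560 chunks. That is consistent with that row having a fast read_hor and the slowest read_time_ser in the table. The model does not account for the cache, so it should not be expected to explain every row.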

I am happy to see that a number of cases clearly outperform classic netcdf. The trick is to find an algorithm that picks good chunk sizes without the user being involved.
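As a quick check on which cases those are, the sketch below filters the measured rows for chunk shapes that beat the classic file on both read patterns. It is not the automatic chunk-size algorithm mentioned above, only a filter over the numbers already in the table. The data is copied from the table with the constant cache, deflate, and shuffle columns dropped, and the variable names are made up for illustration.

```python
# Classic netcdf timings from the first row: (read_hor_us, read_time_ser_us).
CLASSIC = (240, 3822)

# Rows from the table: cs[0] cs[1] cs[2] read_hor(us) read_time_ser(us)
RESULTS = """
1 16 32 667 57087
1 16 128 245 23929
1 16 256 160 26913
1 64 32 277 22840
1 64 128 147 41359
1 64 256 110 47856
1 128 32 205 25052
1 128 128 123 47417
1 128 256 97 68877
10 16 32 552 3284
10 16 128 204 5834
10 16 256 138 8465
10 64 32 233 5268
10 64 128 132 16690
10 64 256 99 28037
10 128 32 180 8414
10 128 128 113 28064
10 128 256 90 54715
256 16 32 8853 1167
256 16 128 8012 3677
256 16 256 118 1581
256 64 32 8170 3737
256 64 128 227 1640
256 64 256 80 1627
256 128 32 645 1624
256 128 128 211 1650
256 128 256 68 1667
1024 16 32 32337 1192
1024 16 128 296 1489
1024 16 256 114 1564
1024 64 32 679 1415
1024 64 128 221 1503
1024 64 256 79 1669
1024 128 32 646 1558
1024 128 128 208 1568
1024 128 256 68 1646
1560 16 32 55064 1055
1560 16 128 298 1438
1560 16 256 115 1477
1560 64 32 685 1425
1560 64 128 225 1545
1560 64 256 79 1589
1560 128 32 658 1535
1560 128 128 208 1567
1560 128 256 68 1544
"""

# Parse each non-empty line into a tuple of five integers.
rows = [tuple(int(x) for x in line.split()) for line in RESULTS.split("\n") if line.strip()]

# Keep only chunk shapes that are faster than classic on both read patterns.
winners = [r for r in rows if r[3] < CLASSIC[0] and r[4] < CLASSIC[1]]

for cs0, cs1, cs2, hor, ser in winners:
    print(f"{cs0:5d} {cs1:4d} {cs2:4d}  read_hor={hor:4d}us  read_time_ser={ser:5d}us")
```

In this data, every chunk shape that passes the filter has a time chunk size of 256 or more and a lon chunk size of 128 or 256.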

Posted by: mhermida

Dec 30, 2009

Article Category: NetCDF

Article type: Developer Blog