Russ, Dennis and I discussed some of the chunking results yesterday. We
were concerned that the horizontal reads were causing all the data to be
preloaded into the chunk cache for the subsequent time series read. So I swapped
the order; the time series read is now done first.
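To make the ordering concrete, here is a minimal sketch of the read pattern, assuming a 3D variable laid out as time x lat x lon. The file name, variable name, dimension lengths, and cache settings are placeholders for illustration, not the actual benchmark code.

    /* Sketch: time the time-series read before the horizontal read, so the
     * horizontal read cannot warm the chunk cache for it. File name, variable
     * name, and dimension lengths below are assumptions for illustration. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/time.h>
    #include <netcdf.h>

    #define FILE_NAME "bm_chunking.nc"   /* hypothetical benchmark file */
    #define VAR_NAME  "tas"              /* hypothetical time x lat x lon variable */
    #define NTIME 1560                   /* assumed dimension lengths */
    #define NLAT  128
    #define NLON  256

    static long long usec_now(void)
    {
       struct timeval tv;
       gettimeofday(&tv, NULL);
       return (long long)tv.tv_sec * 1000000LL + tv.tv_usec;
    }

    int main(void)
    {
       int ncid, varid;
       size_t start[3] = {0, 0, 0}, count[3];
       float *buf;
       long long t0;

       if (nc_open(FILE_NAME, NC_NOWRITE, &ncid)) return 1;
       if (nc_inq_varid(ncid, VAR_NAME, &varid)) return 1;

       /* Per-variable chunk cache: 32 MB, 1009 slots, default preemption. */
       if (nc_set_var_chunk_cache(ncid, varid, 32 * 1024 * 1024, 1009, 0.75f)) return 1;

       /* Time-series read first: every timestep at one horizontal point. */
       if (!(buf = malloc(NTIME * sizeof(float)))) return 1;
       count[0] = NTIME; count[1] = 1; count[2] = 1;
       t0 = usec_now();
       if (nc_get_vara_float(ncid, varid, start, count, buf)) return 1;
       printf("read_time_ser(us): %lld\n", usec_now() - t0);
       free(buf);

       /* Horizontal read second: one full timestep. */
       if (!(buf = malloc(NLAT * NLON * sizeof(float)))) return 1;
       count[0] = 1; count[1] = NLAT; count[2] = NLON;
       t0 = usec_now();
       if (nc_get_vara_float(ncid, varid, start, count, buf)) return 1;
       printf("read_hor(us): %lld\n", usec_now() - t0);
       free(buf);

       return nc_close(ncid);
    }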
Here are some results. The first line shows the results for reading a classic netCDF file. The cs columns give the chunk sizes along each dimension, cache is the chunk cache size in MB, and the two read times are in microseconds.
cs[0] cs[1] cs[2] cache(MB) deflate shuffle read_hor(us) read_time_ser(us)
0 0 0 0 0 0 247 5908
256 64 128 4 0 0 241 2039
256 64 128 32 0 0 168 31384
256 64 128 128 0 0 140 17096
256 64 256 4 0 0 93 2548
256 64 256 32 0 0 136 55722
256 64 256 128 0 0 106 26892
256 128 128 4 0 0 216 2035
256 128 128 32 0 0 152 55488
256 128 128 128 0 0 121 26698
256 128 256 4 0 0 79 2392
256 128 256 32 0 0 188 191120
256 128 256 128 0 0 136 186396
1024 64 128 4 0 0 236 1945
1024 64 128 32 0 0 108356 53812
1024 64 128 128 0 0 220 19551
1024 64 256 4 0 0 89 1930
1024 64 256 32 0 0 89 1864
1024 64 256 128 0 0 209 40942
1024 128 128 4 0 0 222 2065
1024 128 128 32 0 0 220 1833
1024 128 128 128 0 0 227 41183
1024 128 256 4 0 0 77 1894
1024 128 256 32 0 0 76 1839
1024 128 256 128 0 0 199 207533
1560 64 128 4 0 0 234 1885
1560 64 128 32 0 0 233 1850
1560 64 128 128 0 0 161596 14921
1560 64 256 4 0 0 88 1969
1560 64 256 32 0 0 87 1929
1560 64 256 128 0 0 160939 30848
1560 128 128 4 0 0 218 1924
1560 128 128 32 0 0 218 1875
1560 128 128 128 0 0 161316 30876
1560 128 256 4 0 0 77 1857
1560 128 256 32 0 0 76 1797
1560 128 256 128 0 0 76 1796
Again, there are many chunk size selections that beat the performance of the classic netCDF file.
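For reference, a chunked file like the ones in the table might be created along these lines; the dimension and variable names here are assumptions, and the chunk sizes are simply taken from one of the faster rows above (1024 x 128 x 256, deflate and shuffle off).

    /* Sketch: create a netCDF-4 file with one of the chunk shapes from the
     * table. Dimension and variable names are assumptions for illustration. */
    #include <netcdf.h>

    #define NDIMS 3

    int create_chunked(const char *path, size_t ntime, size_t nlat, size_t nlon)
    {
       int ncid, dimids[NDIMS], varid, ret;
       size_t chunks[NDIMS] = {1024, 128, 256};   /* one of the faster rows */

       if ((ret = nc_create(path, NC_NETCDF4 | NC_CLOBBER, &ncid))) return ret;
       if ((ret = nc_def_dim(ncid, "time", ntime, &dimids[0]))) return ret;
       if ((ret = nc_def_dim(ncid, "lat", nlat, &dimids[1]))) return ret;
       if ((ret = nc_def_dim(ncid, "lon", nlon, &dimids[2]))) return ret;
       if ((ret = nc_def_var(ncid, "tas", NC_FLOAT, NDIMS, dimids, &varid))) return ret;

       /* Chunked storage; deflate = 0 and shuffle = 0, as in the runs above. */
       if ((ret = nc_def_var_chunking(ncid, varid, NC_CHUNKED, chunks))) return ret;
       if ((ret = nc_def_var_deflate(ncid, varid, 0, 0, 0))) return ret;

       return nc_close(ncid);
    }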