Narrowing In On Correct Chunksizes For the 3D AR-4 Data
04 January 2010
We're getting there...
It seems clear that Quincey's original advice is good: use large, squarish chunks.
My former scheme of default chunk sizes worked tolerably for the
innermost dimensions (it used the full length of the dimension), but
using a chunksize of 1 for unlimited dimensions turned out to be bad for
read performance.
Here are some read timings for chunksizes in what I believe is the correct range:
cs[0] cs[1] cs[2] cache(MB) deflate shuffle 1st_read_hor(us) avg_read_hor(us)
0 0 0 0.0 0 0 7087 1670
64 128 256 1.0 0 0 510 1549
128 128 256 1.0 0 0 401 1688
256 128 256 1.0 0 0 384 1679
64 128 256 1.0 1 0 330548 211382
128 128 256 1.0 1 0 618035 420617
Note that the last two rows are deflated versions of the data, and are roughly 1000 times slower to read as a result.
The first line is the netCDF classic file. The non-deflated HDF5 files
easily beat the read performance of the classic file, probably because
the HDF5 files are in native endianness and the netCDF classic file has
to be converted from big-endian to little-endian for this platform.
What is odd is that the HDF5 files have a higher average read time than
their first read time. I don't get that: I expected the first read
would always be the slowest, and that subsequent reads would be
faster. But not for these uncompressed HDF5 files - and I am clearing the Unix file cache between each read.
Here's my timing code:
/* Read the data variable in horizontal slices. */
start[0] = 0;
start[1] = 0;
start[2] = 0;
count[0] = 1;
count[1] = LAT_LEN;
count[2] = LON_LEN;

/* Read (and time) the first one. */
if (gettimeofday(&start_time, NULL)) ERR;
if (nc_get_vara_float(ncid, varid, start, count, hor_data)) ERR_RET;
if (gettimeofday(&end_time, NULL)) ERR;
if (timeval_subtract(&diff_time, &end_time, &start_time)) ERR;
read_1_us = (int)diff_time.tv_sec * MILLION + (int)diff_time.tv_usec;

/* Read (and time) all the rest. */
if (gettimeofday(&start_time, NULL)) ERR;
for (start[0] = 1; start[0] < TIME_LEN; start[0]++)
   if (nc_get_vara_float(ncid, varid, start, count, hor_data)) ERR_RET;
if (gettimeofday(&end_time, NULL)) ERR;
if (timeval_subtract(&diff_time, &end_time, &start_time)) ERR;
avg_read_us = ((int)diff_time.tv_sec * MILLION + (int)diff_time.tv_usec +
               read_1_us) / TIME_LEN;
File Size and Chunking in NetCDF-4 on AR-4 Data File
04 January 2010
Trying to pick chunksizes can be hard!
The table below shows, for each set of chunksizes (time_lat_lon), how much larger the resulting netCDF-4 file is than the classic-format version of the same data.
chunk sizes Size Difference (MB)
1_128_128 0.33
1_128_256 0.25
1_128_32 0.86
1_16_128 1.56
1_16_256 0.86
1_16_32 5.75
1_64_128 0.51
1_64_256 0.33
1_64_32 1.56
10_128_128 0.18
10_128_256 0.17
10_128_32 0.23
10_16_128 0.3
10_16_256 0.23
10_16_32 0.72
10_64_128 0.2
10_64_256 0.18
10_64_32 0.3
1024_128_128 64.12
1024_128_256 64.12
1024_128_32 64.12
1024_16_128 64.12
1024_16_256 64.12
1024_16_32 64.13
1024_64_128 64.12
1024_64_256 64.12
1024_64_32 64.12
1560_128_128 0.16
1560_128_256 0.16
1560_128_32 0.16
1560_16_128 0.16
1560_16_256 0.16
1560_16_32 0.16
1560_64_128 0.16
1560_64_256 0.16
1560_64_32 0.16
256_128_128 30.57
256_128_256 30.57
256_128_32 30.57
256_16_128 30.58
256_16_256 30.57
256_16_32 30.59
256_64_128 30.57
256_64_256 30.57
256_64_32 30.58
classic 0
NetCDF-4 AR-4 Timeseries Reads and Cache Sizes
04 January 2010
Faster time series for the people!
What HDF5 chunk cache sizes are good for reading timeseries data
in netCDF-4? I'm sure you have wondered - I know I have. Now we know:
.5 to 4 MB. Bigger caches just slow this down. Now that came as a
surprise!
The first three columns are the chunk sizes of the 3 dimensions of the
main data variable, followed by the chunk cache size, and the deflate
and shuffle filter settings (0 = none). The chunk sizes and filter
settings are the same for every run, because the same input file is
used for all these runs - only the chunk cache size is changed when
(re-)opening the file. The Unix file cache is cleared between each run.
The two times shown are the number of microseconds to read the first
time series of the data, and the average time per time series after
all time series are read.
*** Benchmarking pr_A1 file pr_A1_256_128_128.nc with various HDF5 chunk caches...
cs[0] cs[1] cs[2] cache(MB) deflate shuffle 1st_read_ser(us) avg_read_ser(us)
256 128 128 0.5 0 0 1279615 2589
256 128 128 1.0 0 0 1279613 2641
256 128 128 4.0 0 0 1298543 2789
256 128 128 16.0 0 0 1470297 34603
256 128 128 32.0 0 0 1470360 34541
Note that for cache sizes of < 4 MB, the first time series read took
1.2 - 1.3 s, and the average time was .0025 - .0028 s. But when I
increased the chunk cache to 16 MB and 32 MB, the time for the first read
went to 1.5 s, and the avg time for all reads went to .035 s - an order
of magnitude jump!
I have repeated these tests a number of times, always with this result for chunk cache buffers 16 MB and above.
I am planning on changing the netCDF-4.1 default to 1 MB, which is the
HDF5 default. (I guess we should have listened to the HDF5 team in the
first place.)