Unidata is in the process of developing a Zarr-based variant
of netcdf. As part of this effort, it was necessary to
implement some support for chunking. Specifically, the problem
to be solved was that of extracting a hyperslab of data from an
n-dimensional variable (array in Zarr parlance) that has been divided
into chunks (in the HDF5 sense). Each chunk is stored independently
in the data storage -- Amazon S3, for example.
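In Zarr, for example, a two-dimensional variable split into 10x10 chunks is stored as one object per chunk, with each object keyed by that chunk's coordinates within the variable.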
The algorithm takes a series of R slices of the form (first, stop, stride),
where R is the rank of the variable. Note that a slice of the form
(first, count, stride), as used by netcdf, is equivalent, since
stop = first + count*stride. Together these slices define a hyperslab.
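
As a rough sketch of that conversion (the struct and function names below are illustrative, not the actual nczarr API), one slice per dimension might be represented as follows:

    #include <stddef.h>

    /* One slice along a single dimension, in (first, stop, stride) form. */
    typedef struct Slice {
        size_t first;   /* index of the first element taken */
        size_t stop;    /* exclusive upper bound */
        size_t stride;  /* distance between successive elements */
    } Slice;

    /* Build a (first, stop, stride) slice from a netcdf-style
       (start, count, stride) triple. */
    static Slice
    slice_from_start_count_stride(size_t start, size_t count, size_t stride)
    {
        Slice s;
        s.first = start;
        s.stop = start + count * stride;  /* stop = first + count*stride */
        s.stride = stride;
        return s;
    }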
The goal is to compute the set of chunks that intersect the hyperslab
and then to extract the relevant data from those chunks to
produce the requested hyperslab.
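
To give a flavor of the first step, here is a minimal sketch (again with hypothetical names, not the actual nczarr code, and using the Slice struct from the sketch above) that computes, for a single dimension, the inclusive range of chunk indices a slice touches; taking the cross product of these per-dimension ranges yields the set of intersecting chunks:

    /* Hypothetical helper: given a non-empty slice (stop > first) and the
       chunk length along one dimension, compute the inclusive range of
       chunk indices that the slice touches. */
    static void
    chunk_range_for_slice(Slice s, size_t chunklen,
                          size_t* firstchunk, size_t* lastchunk)
    {
        /* Index of the last element actually read by the slice. */
        size_t lastindex = s.first
            + ((s.stop - s.first - 1) / s.stride) * s.stride;
        *firstchunk = s.first / chunklen;    /* chunk holding the first element */
        *lastchunk  = lastindex / chunklen;  /* chunk holding the last element */
    }

For example, a slice (2, 10, 3) reads indices 2, 5, and 8; with a chunk length of 4 along that dimension, it touches chunks 0 through 2.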