Introduction
Unidata is in the process of developing a Zarr [] based variant of netcdf. As part of this effort, it was necessary to implement some support for chunking. Specifically, the problem to be solved was that of extracting a hyperslab of data from an n-dimensional variable (array in Zarr parlance) that has been divided into chunks (in the HDF5 sense). Each chunk is stored independently in the data storage -- Amazon S3, for example.
The algorithm takes a series of R slices of the form (first,stop,stride), where R is the rank of the variable. Note that a slice of the form (first, count, stride), as used by netcdf, is equivalent because stop = first + count*stride. These slices form a hyperslab.
The goal is to compute the set of chunks that intersect the hyperslab and to then extract the relevant data from that set of chunks to produce the hyperslab.
It appears from web searches that this algorithm is nowhere documented in the form of a high level pseudo code algorithm. It appears to only exist in the form of code in HDF5, Zarr, and probably TileDB, and maybe elsewhere.
What follows is an attempt to reverse engineer the algorithm used by the Zarr code to compute this intersection. This is intended to be the bare-bones algorithm with no optimizations included. Thus, its performance is probably not the best, but it should work in all cases. Some understanding of how HDF5 chunking is used is probably essential for understanding this algorithm.
The original python code relies heavily on the use of Python iterators, which was might be considered a mistake. Iterators are generally most useful in two situations: (1) it makes the code clearer -- arguably false in this case, and (2) there is a reasonable probability that the iterators will be terminated before they end, thus improving efficiency. This is demonstrably false for the Zarr code.
This code instead uses the concept of an odometer, which is a way to iterate over all the elements of an n-dimensional object. Odometer code already exists in several places in the existing netcdf-c library, among them ncdump/nciter.c, ncgen/odom.c, and libdap4/d4odom.c. It is also equivalent to the Python itertools.product iterator function.
This algorithm assumes the following items of input: 1. variable (aka array) - multidimensional array of data values 2. dimensions - the vector of dimensions defining the rank of the array; this set comes from the dimensions associated with the array; so given v(d1,d2) where (in netcdf terms d1 and d2 are the dimension names and where for example, d1=10, d2=20 the dimensions given to this algorithm are the ordered vector (d1,d2), or equivalently (10,20). 3. Chunk sizes - for each dimension in the dimension set, there is defined a chunk size along that dimension. Thus, there is also an ordered vector of chunk sizes, (c1,c2) corresponding to the dimension vector (d1,d2). 4. slices - a vector of slice instances defining the subset of data to be extracted from the array. A single slice as used here is of the form (start,stop,stride) where start is the starting position with respect to a dimension, stop is the last position + 1 to extract, and stride is the number of positions to skip when extracting data. Note that this is different than the netcdf-c ncgetvars slices, which are of the form (start,count,stride). The two are equivalent since stop = start+(count*stride). When extracting data from a variable of rank R, one needs to specify R slices where each slice corresponds to a dimension of the variable.
At a high-level, the algorithm works by separately analyzing each dimension of the array using the corresponding slice and corresponding chunk size. The result is a set of projections specific to each dimension. By taking the cross-product of these projections, one gets a vector of projection-vectors that can be evaluated to extract a subset of the desired data for storage in an output array.
It is important to note that this algorithm operates in two phases. In the first phase, it constructs the projections for each dimension. In the second phase, these projections are combined as a cross-product to provide subsetting for the true chunks, which are R-dimensional rectangles.
What follows is the algorithm is written as a set of pseudo-code procedures to produce the final output given the above inputs.
Notations:
- floordiv(x,y) = floor((x / y))
ceildiv(x,y) = ceil((x / y))
Notes:
The ith slice is matched to the ith dimension for the variable
- The ith slice is matched to the ith chunksize for the variable
- The zarr code uses iterators, but this code converts to using vectors and odometers for (one hopes) some clarity and consistency with existing netcdf-c code.
Global Type Declarations
class Slice {
int start
int stop
int stride
}
class SliceIndex { // taken from zarr code see SliceDimIndexer
int chunk0 // index of the first chunk touched by the slice
int nchunks // number of chunks touched by this slice index
int count //total number of output items defined by this slice index
Projection projections[nchunks]; // There are multiple projections
// derived from the original slice:
// one for each chunk touched by
// the original slice
}
class Projection {
int chunkindex;
Slice slice; // slice specific to this chunk
int outpos; // start point in the output to store the extracted data
}
Global variables
In order to keep argument lists short, certain values are assumed to be globally defined and accessible. * R - the rank of the variable * dimlen[R] - the length of the dimensions associated with the variable * chunklen[R] - the length of the chunk sizes associated with the variable * int zeros[R] - a vector of zeros * int ones[R] - a vector of ones
Procedure EvaluateSlices
// Goal: Given the projections for each slice being applied to the
// variable, create and walk all possible combinations of projection
// vectors that can be evaluated to provide the output data
void EvaluateSlices(
Slice slices[R], // the complete set of slices
T output // the target storage for the extracted data: its type is T
)
{
int i;
SliceIndex allsliceindices[R];
Odometer odometer;
nchunks[R]; // the vector of nchunks from projections
int indices[R];
// Compute the slice index vector
allsliceindices = compute_all_slice_indices(slices);
// Extract the chunk0 and nchunks vectors
for(i=0;i<R;i++) {
nchunks[i] = allsliceindices[i].nchunks;
}
// Create an odometer to walk nchunk combinations
odometer = Odometer.new(R,zeros,nchunks); // iterate each "wheel[i]" over 0..nchunk[i] with R wheels
// iterate over the odometer: all combination of chunk indices in the projections
for(;odometer.more();odometer.next()) {
chunkindices = odometer.indices();
ApplyChunkIndices(chunkindices,output,allsliceindices);
}
}
Procedure ApplyChunkIndices
// Goal: given a vector of chunk indices from projections,
// extract the corresponding data and store it into the
// output target
void ApplyChunkIndices(
int chunkindices[R], // indices chosen by the parent odometer
T output, // the target storage for the extracted data
SliceIndex allsliceindices[R]
)
{
int i;
SliceIndex subsliceindices[R];
chunk0[R]; // the vector of chunk0 values from projections
Projection projections[R];
int[R] start, stop,stride; // decomposed set of slices
int[R] outpos; // capture the outpos values across the projections
int ouputstart;
// This is complicated. We need to construct a vector of slices
// of size R where the ith slice is determined from a projection
// for the ith chunk index of chunkindices. We then iterate over
// that odometer to extract values and store them in the output.
for(i=0;i<R;i++) {
int chunkindex = chunkindices[i];
Slice slices[R];
projections[i] = allsliceindices[i].projections[chunkindex];
slices[i] = projections[i].slice;
outpos[i] = projections[i].outpos;
}
// Compute where the extracted data will go in the output vector
outputstart = computelinearoffset(R,outpos,dimlen);
GetData(slices,outputstart,output);
}
Procedure GetData
// Goal: given a set of indices pointing to projections,
// extract the corresponding data and store it into the
// output target.
void GetData(
Slice slices[R],
int chunksize, // total # T instances in chunk
T chunk[chunksize],
int outputstart,
T output
)
{
int i;
Odometer sliceodom,
sliceodom = Odometer.new(R, slices);
// iterate over the odometer to get a point in the chunk space
for(;odom.more();odom.next()) {
int chunkpos = odometer.indices(); // index
}
}
Procedure compute_all_slice_projections
// Goal:create a vector of SliceIndex instances: one for each slice in the top-level input
Projection[R]
compute_all_slice_projections(
Slice slice[R], // the complete set of slices
{
int i;
SliceIndex projections[R];
* for i in 0..R-1
* projections[i] = compute_perslice_projections(dimlen[i],chunklen[i],slice[i])
* return projections
Procedure compute_perslice_projections
Goal:
For each slice, compute a set of projections from it wrt a dimension and a chunk size associated with that dimension.
Inputs:
dimlen -- dimension length
- chunklen -- chunk length associated with the input dimension
slice=(start,stop,stride) -- associated with the dimension
Outputs:
Instance of SliceIndexs
Data Structure:
````
Computations:
- count = max(0, ceildiv((stop - start), stride))
- total number of output items defined by this slice (equivalent to count as used by ncgetvars)
- nchunks = ceildiv(dimlen, dimchunk_len)
- number of chunks touched by this slice
- chunk0 = floordiv(start,chunklen)
- index (in 0..nchunks-1) of the first chunk touched by the slice
- chunkn = ceildiv(stop,chunklen)
- index (in 0..nchunks-1) of the last chunk touched by the slice
- n = ((chunkn - chunk0) + 1)
- total number of touched chunks
- the index i will range over 0..(n-1)
- is this value the same as nchunks?
- For each touched chunk index we compute a projection specific to that chunk, hence there are n of them.
- projections.index[i] = i
- projections.offset[i] = chunk0 * i
- remember: offset is only WRT this dimension, not global
- projections.limit[i] = min(dimlen, (i + 1) * chunklen)
- end of this chunk but no greater than dimlen
- projections.len[i] = projections.limit[i] - projections.offset[i]
- actual limit of the ith touched chunk; should be same as chunklen except for last length because of the min function in computing limit[i]
- projections.start[i]:
- This is somewhat complex because for the first projection, the start is the slice start, but after that, we have to take into account that for a non-one stride, the start point in a projection may be offset by some value in the range of 0..(stride-1)
- i == 0 => projections.start[i] = start - projections.offset[i]
- initial case the original slice start is within the first projection
- i > 0 => projections.start[i] = start - projections.offset[i]
- prevunused[i] = (projections.offset[i] - start) % stride
- prevunused[i] is an intermediate computation and need not be saved
- amount unused in previous chunk => we need to skip (stride-prevunused[i]) in this chunk
- prevunused[i] > 0 => projections.start[i] = stride - prevunused[i]
- prevunused[i] = (projections.offset[i] - start) % stride
- projections.stop[i]:
- stop > projections.limit[i] => projections.stop[i] = projections.len[i]
- stop <= projections.limit[i] => projections.stop[i] = stop - projections.offset[i]
- selection ends within current chunk
- projections.outpos[i] = ceildiv(offset - start, stride)
- "location" in the output array to start storing items; again, per slice, not global
Procedure computelinearoffset(outpos,dimlen);
```` // Goal: Given a set of per-dimension indices, compute the corresponding linear position.
int
computelinearoffset(int R,
int outpos[R],
int dimlen[R]
)
{
int offset;
int i;
offset = 0;
for(i=0;i<R;i++) {
offset *= dimlen[i];
offset += outpos[i];
}
return offset;
}
Appendix: Odometer Code
class Odometer
{
int R; // rank
int start[R];
int stop[R]
int stride[R];
int index[R]; // current value of the odometer
procedure new(int R, int start[R], int stop[R]) { return new(R, start,stop,ones);}
procedure new(int rank, Slice slices[R])
{
int i;
int start0[R];
int stop0[R];
int stride0[R];
R = rank;
for(i=0;i<R;i++) {
start = slices[i].start;
stop = slices[i].stop;
stride = slices[i].stride;
}
for(i=0;i<R;i++) {index[i] = start[i];}
}
boolean
procedure more(void)
{
return (index[0] < stop[0]);
}
procedure next(void)
{
int i;
for(i=R-1;i>=0;i--) {
index[i] += stride[i];
if(index[i] < stop[i]) break;
if(i == 0) break; // leave the 0th entry if it overflows
index[i] = start[i]; // reset this position
}
}
// Get the value of the odometer
int[R]
procedure indices(void)
{
return indices;
}
}