Dennis,
Are you saying that the original function nc_def_var_chunking will be kept
intact, and there will be a new function that will simplify chunk setting
for some data scenarios? You are not proposing any changes in the netcdf-4
file format?
--Dave
On Mon, May 15, 2017 at 1:29 PM, dmh@xxxxxxxx <dmh@xxxxxxxx> wrote:
> I am soliciting opinions about an alternate way to specify chunking
> for netcdf files. If you are not familiar with chunking, then
> you probably can ignore this message.
>
> Currently, one specifies a per-dimension decomposition; together these
> determine how the data for a variable is decomposed
> into chunks. So, e.g., if I have a variable (pardon the shorthand notation)
> x[d1=8,d2=12]
> and I say d1 is chunked 4 and d2 is chunked 4, then x will be decomposed
> into 6 chunks (8/4 * 12/4).
>
> I am proposing this alternate. Suppose we have
> x[d1,d2,...,dm]
> And we specify a position 1<=c<m
> Then the idea is that we create chunks of size
> d(c+1) * d(c+2) * ... * dm
> There will be d1 * d2 * ... * dc such chunks.
> In other words, we split the set of dimensions at some point (c)
> and create the chunks based on that split.
>
> The claim is that for many situations, the leftmost dimensions
> are what we want to iterate over: e.g. time; and we then want
> to read all of the rest of the data associated with that time.
>
> So, my question is: is such a style of chunking useful?
>
> If this is not clear, let me know and I will try to clarify.
> =Dennis Heimbigner
> Unidata
>
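
For concreteness, here is a minimal sketch of how the split-point scheme quoted above might map onto the per-dimension chunk sizes that nc_def_var_chunking already accepts. It assumes that "chunks of size d(c+1) * ... * dm" means a chunk length of 1 along dimensions 1..c and the full dimension length along dimensions c+1..m. The helper names split_chunksizes and chunk_count are hypothetical and are not part of the netCDF library.

```c
#include <stddef.h>

/* Hypothetical helper (not a netCDF function): fill chunksizes[] for a
 * variable whose dimension lengths are dimlens[0..ndims-1], splitting
 * after the c-th dimension (1-based, 1 <= c < ndims).
 * Assumption: dimensions 1..c get chunk length 1, and dimensions
 * c+1..m are kept whole.
 * Returns 0 on success, -1 if c is out of range. */
int split_chunksizes(size_t ndims, const size_t *dimlens, size_t c,
                     size_t *chunksizes)
{
    if (c < 1 || c >= ndims)
        return -1;
    for (size_t i = 0; i < ndims; i++)
        chunksizes[i] = (i < c) ? 1 : dimlens[i];
    return 0;
}

/* Number of chunks implied by a set of per-dimension chunk sizes,
 * rounding up when a chunk size does not divide a dimension evenly.
 * All chunk sizes must be nonzero. */
size_t chunk_count(size_t ndims, const size_t *dimlens,
                   const size_t *chunksizes)
{
    size_t n = 1;
    for (size_t i = 0; i < ndims; i++)
        n *= (dimlens[i] + chunksizes[i] - 1) / chunksizes[i];
    return n;
}
```

With x[d1=8,d2=12], chunk sizes {4, 4} give 6 chunks, as in the quoted example, while a split at c=1 gives chunk sizes {1, 12} and 8 chunks. Under the assumption above, the resulting array could be passed unchanged to the existing call, nc_def_var_chunking(ncid, varid, NC_CHUNKED, chunksizes), so a scheme like this would not appear to require a file format change.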