[netcdfgroup] Alternate chunking specification

To: netcdfgroup@xxxxxxxxxxxxxxxx
Subject: [netcdfgroup] Alternate chunking specification
From: "dmh@xxxxxxxx" <dmh@xxxxxxxx>
Date: Mon, 15 May 2017 13:29:49 -0600

I am soliciting opinions about an alternate way to specify chunking
for netcdf files. If you are not familiar with chunking, then
you probably can ignore this message.

Currently, one specifies a per-dimension decomposition; together these
determine how the data for a variable is decomposed
into chunks. So, e.g., if I have a variable (pardon the shorthand notation)
  x[d1=8,d2=12]
and I say d1 is chunked 4 and d2 is chunked 4, then x will be decomposed
into 6 chunks (8/4 * 12/4).
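
For concreteness, here is a small Python sketch of that arithmetic. The
function name count_chunks is made up for illustration and is not part of
any netCDF API; rounding up for a partial chunk at the edge of a dimension
is an assumption of mine, since the example above divides evenly.

```python
import math

def count_chunks(dim_sizes, chunk_sizes):
    """Count chunks when every dimension is split independently.

    Hypothetical helper for illustration only, not a netCDF call.
    """
    if len(dim_sizes) != len(chunk_sizes):
        raise ValueError("need one chunk size per dimension")
    # -(-d // c) is integer ceiling division: a partial chunk at the
    # edge of a dimension still counts as one chunk (assumption).
    return math.prod(-(-d // c) for d, c in zip(dim_sizes, chunk_sizes))

# x[d1=8,d2=12] with d1 chunked 4 and d2 chunked 4
print(count_chunks([8, 12], [4, 4]))  # 6
```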

I am proposing this alternative. Suppose we have
    x[d1,d2,...,dm]
and we specify a position c, where 1<=c<m.
Then the idea is that we create chunks, each containing
   d(c+1) * d(c+2) * ... * dm
elements. There will be d1 * d2 * ... * dc such chunks.
In other words, we split the set of dimensions at some point (c)
and create the chunks based on that split.
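
To make the split concrete, here is a minimal Python sketch. The name
split_chunking and the example dimensions are hypothetical. It also
assumes the proposal is equivalent to a per-dimension chunk size of 1 for
d1..dc and the full dimension length for d(c+1)..dm, which is my reading
of the description above rather than a settled design.

```python
import math

def split_chunking(dim_sizes, c):
    """Derive chunking from a split position c, with 1 <= c < m.

    Hypothetical helper for illustration only, not a netCDF call.
    Returns (chunk_shape, number_of_chunks, elements_per_chunk).
    """
    m = len(dim_sizes)
    if not 1 <= c < m:
        raise ValueError("split position must satisfy 1 <= c < m")
    # Leading c dimensions get chunk size 1; the trailing dimensions
    # are kept whole inside each chunk (assumed equivalence).
    chunk_shape = [1] * c + list(dim_sizes[c:])
    number_of_chunks = math.prod(dim_sizes[:c])    # d1 * ... * dc
    elements_per_chunk = math.prod(dim_sizes[c:])  # d(c+1) * ... * dm
    return chunk_shape, number_of_chunks, elements_per_chunk

# Made-up example: x[time=10, lev=4, lat=8, lon=12], split after time
print(split_chunking([10, 4, 8, 12], 1))  # ([1, 4, 8, 12], 10, 384)
```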

The claim is that for many situations, the leftmost dimensions
are what we want to iterate over: e.g. time; and we then want
to read all of the rest of the data associated with that time.

So, my question is: is such a style of chunking useful?

If this is not clear, let me know and I will try to clarify.
=Dennis Heimbigner
 Unidata

Follow-Ups:
- Re: [netcdfgroup] Alternate chunking specification
  - From: Dave Allured - NOAA Affiliate
- Re: [netcdfgroup] Alternate chunking specification
  - From: Heiko Klein
