Re: [netcdfgroup] Alternate chunking specification

Note that I am proposing a second way to specify chunking on a variable. I am not proposing to remove any existing functionality.

But let me restate my question.
Question: what are some good use cases for having a chunking spec
that is different than
    1,1,...1,di,dj,...dm
where di is the full size of the ith dimension of the variable.
Heiko Klein has given a couple of good use cases, and I am looking for
more.
=Dennis



On 5/16/2017 1:30 PM, Dave Allured - NOAA Affiliate wrote:
Dennis,

Are you saying that the original function nc_def_var_chunking will be kept intact, and there will be a new function that will simplify chunk setting for some data scenarios? You are not proposing any changes in the netcdf-4 file format?

--Dave


On Mon, May 15, 2017 at 1:29 PM, dmh@xxxxxxxx wrote:

    I am soliciting opinions about an alternate way to specify chunking
    for netcdf files. If you are not familiar with chunking, then
    you probably can ignore this message.

    Currently, one specifies a per-dimension decomposition that
    determines how the data for a variable is decomposed
    into chunks. So e.g. if I have a variable (pardon the shorthand notation)
       x[d1=8,d2=12]
    and I say d1 is chunked 4 and d2 is chunked 4, then x will be decomposed
    into 6 chunks (8/4 * 12/4).
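
The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration of the counting, not netCDF library code; the function name chunk_count is hypothetical. It rounds up per dimension on the assumption that a chunk size which does not divide a dimension evenly leaves a partial chunk at the edge.

```python
import math

def chunk_count(dim_sizes, chunk_sizes):
    """Count the chunks produced by a per-dimension chunking spec.

    chunk_count is a hypothetical helper for illustration only.
    """
    if len(dim_sizes) != len(chunk_sizes):
        raise ValueError("need one chunk size per dimension")
    # Ceiling division: a dimension that is not an exact multiple of its
    # chunk size still needs one extra (partial) chunk at the edge.
    return math.prod(-(-d // c) for d, c in zip(dim_sizes, chunk_sizes))

# x[d1=8,d2=12] with both dimensions chunked by 4
print(chunk_count([8, 12], [4, 4]))  # 6
```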

    I am proposing this alternate. Suppose we have
         x[d1,d2,...dm]
    And we specify a position 1<=c<m
    Then the idea is that we create chunks of size
        d(c+1) * d(c+2) *...dm
    There will be d1*d2*...dc such chunks.
    In other words, we split the set of dimensions at some point (c)
    and create the chunks based on that split.
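
As a rough sketch of the split described above, the following Python derives the per-dimension chunk sizes and the chunk count from a split position c. The name split_chunking and the sample dimension sizes are made up for illustration; nothing here is an existing or proposed netCDF API. The resulting list has the same form as the per-dimension chunk sizes used today, with 1 for the first c dimensions and the full size for the rest.

```python
import math

def split_chunking(dim_sizes, c):
    """Return (chunk_shape, n_chunks) for a split after the c-th dimension.

    split_chunking is a hypothetical helper; c is 1-based with 1 <= c < m,
    where m is the number of dimensions.
    """
    m = len(dim_sizes)
    if not 1 <= c < m:
        raise ValueError("c must satisfy 1 <= c < m")
    # The leftmost c dimensions are iterated over one index at a time;
    # the remaining dimensions are kept whole inside each chunk.
    chunk_shape = [1] * c + list(dim_sizes[c:])
    n_chunks = math.prod(dim_sizes[:c])
    return chunk_shape, n_chunks

# Illustrative sizes only: x[time=10, level=4, lat=180, lon=360], split at c=2
shape, count = split_chunking([10, 4, 180, 360], 2)
print(shape, count)  # [1, 1, 180, 360] 40
```

Each chunk then holds d(c+1) * d(c+2) * ... * dm values, which is 180 * 360 = 64800 in this example.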

    The claim is that for many situations, the leftmost dimensions
    are what we want to iterate over: e.g. time; and we then want
    to read all of the rest of the data associated with that time.

    So, my question is: is such a style of chunking useful?

    If this is not clear, let me know and I will try to clarify.
    =Dennis Heimbigner
      Unidata






