I don't think you can chunk an unlimited dimension by more than 1. What are
the variable dimensions? Your formula makes it sound like they are 1-D and
sized only by the unlimited dimension. If that is the case, compression won't
help. You might be better off with a netcdf-3 file.
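If you do go the netcdf-3 route, the C++4 API can still write a
classic-format file. Something like this (an untested sketch; the file
and variable names are made up, and note that classic format has no
groups, chunking, or compression):

  #include <netcdf>
  using namespace netCDF;

  int main() {
    // NcFile::classic writes a netcdf-3 file; record variables are
    // interleaved along the unlimited dimension with no per-variable
    // chunking overhead.
    NcFile file("data.nc", NcFile::replace, NcFile::classic);
    NcDim time = file.addDim("time");            // unlimited dimension
    NcVar temp = file.addVar("temperature", ncDouble, time);
    // ... write one record per second as before ...
    return 0;
  }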
-- Ted
On Jan 9, 2012, at 8:15 AM, Ross Williamson wrote:
> I'm trying to get my head around the filesize of my netcdf-4 file -
> Some background.
>
> 1) I'm using the netcdf_c++4 API
> 2) I have an unlimited dimension along which I write data about every second
> 3) There are a set of nested groups
> 4) I'm using compression on each variable
> 5) I'm using the default chunk sizes, which I think are 1 for the
> unlimited dimension and sizeof(type) for the other dimensions (see
> the sketch just after this list for how I'm checking this)
> 6) I take data for 900 samples - There are about 100 variables, so I
> would expect (given doubles) a file size of about 900x100x8 = 720K. I
> fully expect some level of overhead, but my file sizes are 5 MB, which
> is incredibly large.
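>
> Here is roughly how I'm checking the chunk sizes with the C++4 API (a
> sketch; "temperature" stands in for one of my real variables, which
> actually live in nested groups):
>
>   #include <netcdf>
>   #include <iostream>
>   #include <vector>
>   using namespace netCDF;
>
>   int main() {
>     NcFile file("data.nc", NcFile::read);
>     NcVar var = file.getVar("temperature");    // lookup is per group
>     NcVar::ChunkMode mode;
>     std::vector<size_t> sizes;                 // one entry per dimension
>     var.getChunkingParameters(mode, sizes);
>     for (size_t i = 0; i < sizes.size(); i++)
>       std::cout << sizes[i] << " ";
>     std::cout << std::endl;
>     return 0;
>   }
>
> (ncdump -hs on the file shows the same information as _ChunkSizes
> attributes.)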
>
> Now compression doesn't make much difference (5 MB vs 5.3 MB). I'm
> assuming that the thing that is screwing me over here is that I
> haven't got my chunking set right. The issue is that I'm rather
> confused: it appears that you set the chunk size for each variable
> rather than for the whole file, which doesn't make sense to me. Would
> I just multiply each chunk size by, say, 100, so 100 for the unlimited
> dimension and sizeof(type)*100 for the other dimensions?
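>
> Concretely, something like this is what I mean (a sketch; the names
> are stand-ins for my real dimensions and variables):
>
>   #include <netcdf>
>   #include <vector>
>   using namespace netCDF;
>
>   int main() {
>     NcFile file("out.nc", NcFile::replace);
>     NcDim time = file.addDim("time");          // unlimited dimension
>     NcVar temp = file.addVar("temperature", ncDouble, time);
>
>     // One entry per dimension, in dimension order; chunk sizes are
>     // counts of elements, not bytes.
>     std::vector<size_t> chunks;
>     chunks.push_back(100);                     // 100 records per chunk
>     temp.setChunking(NcVar::nc_CHUNKED, chunks);
>     temp.setCompression(true, true, 2);        // shuffle + deflate
>     return 0;
>   }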
>
> I'd really like to fix this as netcdf-4 seems ideal for my project but
> I can't deal with a size overhead of an order of magnitude.
>
> I can attach the header of the netcdf file if it helps.
>
> Ross
>
> --
> Ross Williamson
> Associate Research Scientist
> Columbia Astrophysics Laboratory
> 212-851-9379 (office)
> 212-854-4653 (Lab)
> 312-504-3051 (Cell)