I don't think you can chunk an unlimited dimension by more than 1. What are
the variable dimensions? Your formula makes it sound like they are 1-D and
sized only by the unlimited dimension. If that is the case, compression won't
help. You might be better off with a netcdf-3 file.
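If you do go the netcdf-3 route, the C++4 API can still write a
classic-format file. Something like this (an untested sketch; the file
and variable names are made up, and note that classic format has no
groups, chunking, or compression):

  #include <netcdf>
  using namespace netCDF;

  int main() {
    // NcFile::classic writes a netcdf-3 file; record variables are
    // interleaved along the unlimited dimension with no per-variable
    // chunking overhead.
    NcFile file("data.nc", NcFile::replace, NcFile::classic);
    NcDim time = file.addDim("time");            // unlimited dimension
    NcVar temp = file.addVar("temperature", ncDouble, time);
    // ... write one record per second as before ...
    return 0;
  }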
-- Ted
On Jan 9, 2012, at 8:15 AM, Ross Williamson wrote:
> I'm trying to get my head around the filesize of my netcdf-4 file -
> Some background.
>
> 1) I'm using the netcdf_c++4 API
> 2) I have an unlimited dimension along which I write data about every second
> 3) There are a set of nested groups
> 4) I'm using compression on each variable
> 5) I'm using the default chunk sizes, which I think are 1 for the
> unlimited dimension and sizeof(type) for the other dimensions (see
> the sketch just after this list for how I'm checking this)
> 6) I take data for 900 samples - There are about 100 variables, so I
> would expect (given doubles) a file size of about 900x100x8 = 720K. I
> fully expect some level of overhead, but my file sizes are 5 MB, which
> is incredibly large.
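>
> Here is roughly how I'm checking the chunk sizes with the C++4 API (a
> sketch; "temperature" stands in for one of my real variables, which
> actually live in nested groups):
>
>   #include <netcdf>
>   #include <iostream>
>   #include <vector>
>   using namespace netCDF;
>
>   int main() {
>     NcFile file("data.nc", NcFile::read);
>     NcVar var = file.getVar("temperature");    // lookup is per group
>     NcVar::ChunkMode mode;
>     std::vector<size_t> sizes;                 // one entry per dimension
>     var.getChunkingParameters(mode, sizes);
>     for (size_t i = 0; i < sizes.size(); i++)
>       std::cout << sizes[i] << " ";
>     std::cout << std::endl;
>     return 0;
>   }
>
> (ncdump -hs on the file shows the same information as _ChunkSizes
> attributes.)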
>
> Now compression doesn't make much difference (5 MB vs 5.3 MB). I'm
> assuming that the thing that is screwing me over here is that I
> haven't got my chunking set right. The issue is that I'm rather
> confused: it appears that you set the chunk size for each variable
> rather than for the whole file, which doesn't make sense to me. Would
> I just multiply each chunk size by, say, 100, so 100 for the unlimited
> dimension and sizeof(type)*100 for the other dimensions?
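>
> Concretely, something like this is what I mean (a sketch; the names
> are stand-ins for my real dimensions and variables):
>
>   #include <netcdf>
>   #include <vector>
>   using namespace netCDF;
>
>   int main() {
>     NcFile file("out.nc", NcFile::replace);
>     NcDim time = file.addDim("time");          // unlimited dimension
>     NcVar temp = file.addVar("temperature", ncDouble, time);
>
>     // One entry per dimension, in dimension order; chunk sizes are
>     // counts of elements, not bytes.
>     std::vector<size_t> chunks;
>     chunks.push_back(100);                     // 100 records per chunk
>     temp.setChunking(NcVar::nc_CHUNKED, chunks);
>     temp.setCompression(true, true, 2);        // shuffle + deflate
>     return 0;
>   }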
>
> I'd really like to fix this as netcdf-4 seems ideal for my project but
> I can't deal with a size overhead of an order of magnitude.
>
> I can attach the header of the netcdf file if it helps.
>
> Ross
>
> --
> Ross Williamson
> Associate Research Scientist
> Columbia Astrophysics Laboratory
> 212-851-9379 (office)
> 212-854-4653 (Lab)
> 312-504-3051 (Cell)