Re: [netcdfgroup] compression without effect

What are the actual dimensions of the variable? The rightmost (fastest-varying) 
chunk dimension is 1, which might be a problem. If that is an unlimited 
dimension, then reordering the array would be a good idea, if possible. You 
might try turning the Shuffle filter on, but I don't know how much effect that 
would have. Are you setting the chunk dimensions yourself or relying on the 
default chunking scheme?
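
If you do want to set the chunking yourself, nccopy can rechunk, shuffle, and
deflate in one pass. A minimal sketch, assuming hypothetical dimension names
x, y, and t (substitute the real names and sensible sizes from ncdump -h);
the -s flag turns on shuffle:

    $ nccopy -c x/512,y/86,t/10 -s -d2 orig.nc rechunked-d2.nc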

On a side note, my experience is that deflatelevel=2 gives a good compromise 
between speed and compression. Higher values tend to yield only modestly better 
compression for the increased computation. Your mileage may vary!
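
If you want to verify that on your own data, one quick check is to recompress
the same file at a few deflate levels and compare the output sizes (file names
here are just placeholders):

    $ for lvl in 1 2 5 9; do nccopy -d$lvl orig.nc orig-d$lvl.nc; done
    $ ls -l orig-d?.nc   # compare resulting file sizes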

Cheers,

-- Ted

On Dec 18, 2013, at 11:37 AM, Frank Fuchs wrote:

> I tried ncdump -s and got the following info:
> 
> variables:
>   ...
>   data_complex_set_0:_Storage = "chunked" ;
>   data_complex_set_0:_ChunkSizes = 3752, 86, 1 ;
>   data_complex_set_0:_DeflateLevel = 9 ;
> 
> // global attributes:
>    ....     
>    :_Format = "netCDF-4" ;
> 
> Are those chunk sizes meaningful? 
> 
> On a different note: does netCDF use zlib directly, or via the HDF5 library?
> Something could go wrong there as well, no?
> 
> Thank you! Best,
> Frank
> 
> 
> 
> 
> 2013/12/18 Russ Rew <russ@xxxxxxxxxxxxxxxx>
> Hi Frank,
> 
> > Now I wanted to test compression using the cxx4 interface, enabling it by 
> > ncvar_data.setCompression(true,true,1) for the heaviest of my variables. 
> >
> > However, even for a file filled with constants the files remain as big as 
> > before. 
> > Further tests using nccopy -d9 old.nca new.nca did not result in a 
> > modification of the file size.
> 
> If you use an unlimited dimension, that may prevent compression,
> because it means that each variable is divided into chunks for
> compression, with one record per chunk.  There is significant HDF5
> space overhead for storing lots of tiny chunks, even if they can be
> compressed.
> 
> Two solutions include:
> 
>     1.  If you don't need the unlimited dimension any more, perhaps
>         because no more data will be appended to the files, then convert
>         the unlimited dimension into a fixed-size dimension, resulting in
>         all the values of each variable being stored contiguously, which
>         should be more compressible.
> 
>     2.  If you still need the unlimited dimension, then rechunk the data
>         before compressing it, so the compression can work on larger
>         chunks.
> 
> The nccopy utility can be used for both of these approaches.
> 
> For approach 1:
> 
>     $ nccopy -u orig.nc orig-u.nc        # makes unlimited dimension fixed size
>     $ nccopy -d9 orig-u.nc orig-u-d9.nc  # compresses result
> 
> For approach 2, assuming you have a record dimension "t" with each chunk
> a slice of only one t value:
> 
>     $ nccopy -c t/10 orig.nc orig-c.nc   # chunks t dimension using 10 instead of 1
>     $ nccopy -d9 orig-c.nc orig-c-d9.nc  # compresses result
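> 
> If it's more convenient, both steps of either approach can also be combined
> into a single invocation:
> 
>     $ nccopy -u -d9 orig.nc orig-u-d9.nc        # approach 1 in one pass
>     $ nccopy -c t/10 -d9 orig.nc orig-c-d9.nc   # approach 2 in one pass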
> 
> --Russ
> 
> 
> > Hi,
> >
> > I managed to compile netcdf-4.3.0 using mingw-w64 gcc 4.8.1.
> > All I had to disable was DAP (which I have no use for anyway).
> >
> > I tested that I can read and write netcdf files using the newly built .dll.
> > Now I wanted to test compression using the cxx4 interface, enabling it by
> > ncvar_data.setCompression(true,true,1) for the heaviest of my variables.
> >
> > However, even for a file filled with constants the files remain as big as
> > before.
> > Further tests using nccopy -d9 old.nca new.nca did not result in a
> > modification of the file size.
> >
> > Any advice?
> >
> > Best,
> > Frank
> >
> 


