Re: netCDF library

Subject: Re: netCDF library
From: John Caron <caron@xxxxxxxxxxxxxxxx>
Date: Wed, 02 Aug 2006 09:17:00 -0600

More options for compression.

Dave Allured wrote:

Nilesh,
Since Netcdf format is a simple matrix of fixed width cells, there is no simple way to save space by not storing zero values.
I think you are saying that a standard scientific file format is important to you. Since you have had such good luck with gridded data in Netcdf, I suggest that you stay with it. Consider these options to reduce archival file size:
1. Keep your current Netcdf format, but store your files gzip'ed. Make uncompressing a standard part of opening the file. Many application languages will allow you to call the shell to gunzip and delete a temporary file, so you can automate this. gunzip is rather fast, as I recall. As you stated, your file size is reduced by 99%.


The Netcdf-Java 2.2 library looks for ".Z", ".zip", ".gzip", ".gz", or ".bz2" 
file extensions, and if one is found, it uncompresses/unzips the file and then reads from the uncompressed copy. It caches the unzipped file, and 
can clean up the cache area automatically, deleting older files to keep the cache size within a specified limit. The next time the 
file is opened, it first checks whether the uncompressed version already exists in the cache.

This works in read-only applications like servers. Writing is usually done once, 
and we haven't tried to optimize that.
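
As a rough illustration of the uncompress-then-cache idea described above, here is a minimal sketch that uses only the Java standard library. It is not the Netcdf-Java implementation: the class name GzipCache, the method uncompressed, and the cache layout are all hypothetical. It handles only ".gz" files and does not prune old cache entries.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of Netcdf-Java.
final class GzipCache {

    /**
     * Returns a path to an uncompressed copy of gzFile inside cacheDir.
     * If the copy already exists it is reused; otherwise it is created.
     * Files without a ".gz" extension are returned unchanged.
     */
    static Path uncompressed(Path gzFile, Path cacheDir) throws IOException {
        String name = gzFile.getFileName().toString();
        if (!name.endsWith(".gz")) {
            return gzFile; // not gzip'ed: read it in place
        }
        Files.createDirectories(cacheDir);
        Path cached = cacheDir.resolve(name.substring(0, name.length() - 3));
        if (Files.exists(cached)) {
            return cached; // cache hit: skip the gunzip step
        }
        // Gunzip into a temporary file first, so a failed or partial
        // decompression never leaves a truncated file under the final name.
        Path tmp = Files.createTempFile(cacheDir, "unzip", ".tmp");
        try (InputStream in = new GZIPInputStream(Files.newInputStream(gzFile))) {
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        }
        Files.move(tmp, cached, StandardCopyOption.REPLACE_EXISTING);
        return cached;
    }

    public static void main(String[] args) throws IOException {
        // Build a small gzip'ed file to stand in for a compressed Netcdf file.
        Path dir = Files.createTempDirectory("gzcache-demo");
        Path gz = dir.resolve("example.nc.gz");
        try (OutputStream out = new GZIPOutputStream(Files.newOutputStream(gz))) {
            out.write("placeholder bytes".getBytes("UTF-8"));
        }
        Path cache = dir.resolve("cache");
        System.out.println(uncompressed(gz, cache)); // first open: gunzips
        System.out.println(uncompressed(gz, cache)); // second open: reuses the copy
    }
}
```

A real cache would also need to notice when the compressed file is newer than the cached copy, and to delete old entries once the cache grows past its size limit.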

2. Netcdf 16-bit packed format. Reduce file size by 50%. You get 16 bits for your combined precision and dynamic range.
3. Netcdf 8-bit packed format. Reduce file size by 75%. You get 8 bits for your combined precision and dynamic range.



If you use the standard attributes "scale_factor" and "add_offset", the 
Netcdf-Java 2.2 library will optionally handle the packing transparently, i.e., promote the 
variable from byte or short to float or double, and apply the scale and offset. Again, this is only 
on the reading side.
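
For reference, the arithmetic behind "scale_factor" and "add_offset" can be sketched in plain Java as below. It follows the usual convention unpacked = packed * scale_factor + add_offset. The class and method names (Packing, pack, unpack) are made up for illustration and are not Netcdf-Java API. The clamping of out-of-range values is also an assumption of this sketch, not something the convention requires.

```java
// Hypothetical helper, not part of Netcdf-Java.
final class Packing {

    /** Reading side: promote a packed short to double and apply scale and offset. */
    static double unpack(short packed, double scaleFactor, double addOffset) {
        return packed * scaleFactor + addOffset;
    }

    /**
     * Writing side: quantize a value into 16 bits. Values outside the
     * representable range are clamped (a choice made for this sketch).
     */
    static short pack(double value, double scaleFactor, double addOffset) {
        long q = Math.round((value - addOffset) / scaleFactor);
        q = Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, q));
        return (short) q;
    }

    public static void main(String[] args) {
        double scaleFactor = 0.01; // example values only
        double addOffset = 273.15;
        short packed = pack(287.654, scaleFactor, addOffset);
        System.out.println(packed);
        System.out.println(unpack(packed, scaleFactor, addOffset));
    }
}
```

With this scheme the round-trip error for an in-range value is at most half of scale_factor, which is the precision being traded for the smaller file.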


These features are available only to Java applications.
