Last time I described the experiment of writing NCEP model GRIB files to netCDF-4. Here are the raw results of that experiment, using deflate level 3 and no shuffle for netCDF-4 compression:
There are a total of 51 NCEP model runs in this plot; each is one complete forecast run. Let's split the files out by GRIB-1 and GRIB-2:
As you can see, GRIB-2 has significantly better compression than GRIB-1, probably due to the JPEG-2000 wavelet compression. In case you are wondering about the file where netCDF-4 is smaller than GRIB-2 (the point between .5 and .75 ratio), that is RTMA_GUAM_2p5km_20140803_0600.grib2. It has only 17 records in it, and the netCDF-4 file is .561 times the size of the GRIB-2 file (440K vs 781K). This file does not use JPEG-2000 compression, and it is a small grid (193 by 193) with most of its points over water, so it has more uniform data values.
There are 15 GRIB-1 files, and 36 GRIB-2 files, and the number of records in each file varies widely. If we use the number of records to find the weighted average, we get these results:
- Total over all files: weighted average ratio = 2.18, total # GRIB records = 400,403
- Total over GRIB-1 files: weighted average ratio = 1.32, total # GRIB records = 24,933
- Total over GRIB-2 files: weighted average ratio = 2.24, total # GRIB records = 375,470
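As a quick consistency check on these numbers, here is a minimal Python sketch of a record-weighted average. It assumes each ratio is weighted by its record count, and since the per-file ratios are not listed here, it combines the two group totals above instead. The function and variable names are made up for illustration.

```python
def weighted_average(ratios, record_counts):
    """Average of ratios, each weighted by its number of GRIB records."""
    total_records = sum(record_counts)
    weighted_sum = sum(r * n for r, n in zip(ratios, record_counts))
    return weighted_sum / total_records

# Group totals from the list above: (weighted average ratio, record count).
# These are the rounded published values, not the per-file data.
groups = {
    "GRIB-1": (1.32, 24933),
    "GRIB-2": (2.24, 375470),
}

ratios = [ratio for ratio, _ in groups.values()]
counts = [count for _, count in groups.values()]

total_records = sum(counts)
overall = weighted_average(ratios, counts)

print(f"Total # GRIB records = {total_records:,}")
print(f"Weighted average ratio = {overall:.2f}")
```

Combining the two groups this way reproduces the 400,403 total records and the 2.18 overall ratio. It also shows why the overall figure sits so close to the GRIB-2 figure: GRIB-2 accounts for the large majority of the records.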
Next time: results broken out by the number of bits stored for the variable.