« Compression by Scali... | Main | GRIB bitmaps are not... »

05 September 2014

Untitled Document

Another way of limiting the precision of floating point numbers is to set the lower order bits of the mantissa to zero, called bit-shaving for some reason. Its quite simple to do, we just convert the float to the identical bits in an integer, then AND it with a mask that sets the lower bits to 0:

  static private int allOnes = 0xffffffff;

  static public int getBitMask(int bitN) {
    if (bitN >= 23) return allOnes;
    return allOnes << 23-bitN;        // rightmost bits get set to zero
  }

  /**
   * Shave n bits off the float
   * @param value    original floating point
   * @param bitMask  bitMask from getBitMask()
   * @return modified float
   */
  static public float bitShave(float value, int bitMask) {
    if (Float.isNaN(value)) return value;  

    int bits = Float.floatToRawIntBits(value);
    int shave = bits & bitMask;
    if (debug) System.out.printf("0x%s -> 0x%s : ", Integer.toBinaryString(bits), 
       Integer.toBinaryString(shave));
    float result = Float.intBitsToFloat(shave);
    if (debug) System.out.printf("%f -> %f %n", value, result);
    return result;
  }

If you want N bits of precision, the mask has the lower (23 - N) bits set to zero. The reletive precision of the shaved data will be 1 / 2^N. Try it out:

   public static void checkShavedPrecision(float[] org, int bits) {
    double expectedPrecision = Math.pow(2.0, -bits);
    float max_pdiff = - Float.MAX_VALUE;
    int bitMask = Grib2NetcdfWriter.getBitMask(bits);
    for (int i=0; i< org.length; i++) {
      float d = org[i];
      float shaved = Grib2NetcdfWriter.bitShave(org[i], bitMask);
      float diff = Math.abs(d - shaved);
      float pdiff = (d != 0.0) ? diff / d : 0.0f;
      assert pdiff < expectedPrecision;
      max_pdiff = Math.max(max_pdiff, pdiff);
    }

    if (max_pdiff != 0.0) {   // usually max precision lies between 1/2^N and 1/2^(N+1)
      if (max_pdiff < expectedPrecision / 2 || max_pdiff > expectedPrecision)
        System.out.printf("nbits=%d max_pdiff=%g expect=%g%n", bits, max_pdiff, 
           expectedPrecision);
    }
  }

Compression with scale/offsets allows you to set the absolute precision of floating point data to an arbitrary value, and bit-shaving allows you to set relative precision to a factor of 2:

  
  Math.abs((UF - F)/F) < 1/2^N

where:
   F = original number
  UF = shaved value
   N = number of bits

Posted by $entry.creator.screenName [ Comments [3] ]

Comments:

Hi John,

Nice post. A few years ago, as part of a short study looking into a variety of netcdf data compression options, I wrote a handful of equivalent bit-shaving functions in python.

I've posted a couple of variants for bit-shaving 32-bit floats on my gist site - see https://gist.github.com/rockdoc/ecac8b1c5f3f99079aac. The 64-bit versions are pretty similar (I can provide them if anyone's interested.)

One question: When you speak of the bit-shaving method giving a relative precision of 1 / 2^N, what is this relative to?

Regards,

Phil

Posted by Phil Bentley on September 07, 2014 at 09:23 PM MDT #

Hi Phil:

Sorry I didnt make that clear. Its relative to the original floating point number:

Math.abs((UF - F)/F) < 1/2^N

where:
F = original number
UF = shaved value

Ill add that to the text. Thanks for your comment.

John

Posted by John Caron on September 08, 2014 at 06:17 AM MDT #

I downloaded significant wave height data from 'ftp://polar.ncep.noaa.gov/pub/history/waves/",which I need to convert GRIB2 files in Matlab environment,to a format such as Excel, but I do not know how to go about it. Since I am not good at using Matlab. Can someone please help out with what to do? Bearing in mind that I am not good at using Matlab. You can help out by composing what I need to write to Matlab to retrieve the data. Thanks!

Posted by Phil on September 11, 2014 at 12:51 PM MDT #