Re: UNIT8 and floating point data (fwd)

Eric Pepke <pepke@xxxxxxxxxxxx> writes:

> I'd like to use UNIT8 variables to represent arbitrary floating point
> fields in cases where total storage is more important than precision.
> Looking through the documentation, I couldn't find any obvious,
> standard, or semi-standard way of associating floating point values with
> bytes.  Ideally, I want a table with 256 FLOAT32 elements that I can
> just do a lookup with.  I can do that automatically and quickly
> internally to my visualization program, and I've used it for radar data
> in the NEXRAD format already.

One way to do this is to pack floating-point numbers into ncbyte or ncshort
values and use the conventional netCDF attributes `scale_factor' and
`add_offset' to store the packing parameters, as described in the User's
Guide:


    `scale_factor'
         If present for a variable, the data are to be multiplied by this
         factor after the data are read by the application that accesses
         the data.

    `add_offset'
         If present for a variable, this number is to be added to the data
         after it is read by the application that accesses the data.  If
         both `scale_factor' and `add_offset' attributes are present, the
         data are first scaled before the offset is added.  The attributes
         `scale_factor' and `add_offset' can be used together to provide
         simple data compression to store low-resolution floating-point
         data as small integers in a netCDF file.  When scaled data are
         written, the application should first subtract the offset and then
         divide by the scale factor.

         When `scale_factor' and `add_offset' are used for packing, the
         associated variable (containing the packed data) is typically of
         type byte or short, whereas the unpacked values are intended to
         be of type float or double.  The attributes `scale_factor' and
         `add_offset' should both be of the type intended for the unpacked
         data, e.g. float or double.

The netCDF library doesn't treat these attributes in any special way, so you
have to use their values for packing before you write values and unpacking
after you read values.  As an example, if you want to pack floating-point
values between 950 and 1050 into 8-bit bytes for a program variable named
`x' that is to be stored into a netCDF variable named x_packed, the
structure of the netCDF file might include a data specification like the
following:

    variables:
        ...
        byte x_packed(n);
                x_packed:scale_factor = 0.3937;
                x_packed:add_offset = 950;
                x_packed:_FillValue = 255;
        ...

where we just use the minimum value, 950, for the offset to keep all packed
values positive, and we compute the scale factor by using

        scale_factor = (Max - Min)/(2^Nbits - 2) 
                     = (1050 - 950) / (256-2)
                     = 0.39370079
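
As a rough sketch of the computation above in C (the struct and function
names here are my own, hypothetical helpers, not part of the netCDF
library), the packing parameters for any range and bit width could be
derived like this:

```c
/* Hypothetical helper type: the two packing parameters from the text. */
typedef struct {
    double scale_factor;
    double add_offset;
} pack_params;

/* Compute packing parameters for values in [min, max] stored as
 * nbits-bit unsigned integers.  One of the 2^nbits codes is reserved
 * for missing data, so the remaining 2^nbits - 1 codes span
 * 2^nbits - 2 intervals.  Assumes 1 <= nbits <= 16 and max > min. */
pack_params compute_pack_params(double min, double max, int nbits)
{
    pack_params p;
    double nintervals = (double)(1UL << nbits) - 2.0;

    p.scale_factor = (max - min) / nintervals;
    p.add_offset = min;   /* keeps all packed values non-negative */
    return p;
}
```

Calling compute_pack_params(950.0, 1050.0, 8) should reproduce the scale
factor and offset worked out by hand above.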

Now before you store the value x, you pack it with the formula:

        x_packed = (x - add_offset) / scale_factor

and you store the byte value x_packed, rounded to the nearest integer (which
will be between 0 and 254), instead.  You can use the byte value 255 for a
missing value.
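
A minimal C sketch of this packing step follows.  The function name is
hypothetical, and mapping out-of-range values and NaNs to the missing
value 255 is my own assumption; you may prefer to clamp or to report an
error instead.

```c
#define FILL_BYTE 255   /* code reserved for missing data, as above */

/* Pack one value into an unsigned byte using
 *     x_packed = (x - add_offset) / scale_factor
 * rounded to the nearest integer.  Anything that would not land in
 * 0..254 (including NaN) is stored as FILL_BYTE; that policy is an
 * assumption of this sketch. */
unsigned char pack_byte(double x, double scale_factor, double add_offset)
{
    double v = (x - add_offset) / scale_factor;

    /* v != v is true only for NaN. */
    if (v != v || v < -0.5 || v >= 254.5)
        return FILL_BYTE;
    return (unsigned char)(v + 0.5);   /* round to nearest */
}
```

Without the rounding, the conversion to an integer type would truncate,
which biases the stored values downward by up to one step.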

Similarly, when you read the data back in, you can unpack it using the
formula:

        x = x_packed*scale_factor + add_offset
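
Since a byte has only 256 possible values, the unpacking can be done once
into the kind of 256-element float table you describe and then applied by
lookup.  Here is a sketch in C, assuming the unpacking is the exact inverse
of the packing formula (x = x_packed*scale_factor + add_offset, as in the
User's Guide description).  The function names and the `missing' sentinel
are hypothetical.

```c
#include <stddef.h>

#define TABLE_SIZE 256
#define FILL_BYTE  255   /* code reserved for missing data */

/* Fill table[i] with the unpacked value for byte code i.  The entry for
 * the fill code gets a caller-chosen sentinel (an assumption). */
void build_unpack_table(float table[TABLE_SIZE], float scale_factor,
                        float add_offset, float missing)
{
    int i;

    for (i = 0; i < TABLE_SIZE; i++)
        table[i] = (float)i * scale_factor + add_offset;
    table[FILL_BYTE] = missing;
}

/* Unpack n bytes by table lookup. */
void unpack_bytes(const unsigned char *packed, float *out, size_t n,
                  const float table[TABLE_SIZE])
{
    size_t i;

    for (i = 0; i < n; i++)
        out[i] = table[packed[i]];
}
```

The table is built once per variable from its attributes, so the per-value
cost of unpacking is a single array index.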

If you need more than 8 bits of precision but you still want to store each
value as one netCDF value, you will have to use 16-bit shorts, and then the
scale_factor formula above will use Nbits = 16 instead of Nbits = 8.
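
For the same 950 to 1050 range, the 16-bit step works out to
(1050 - 950) / (65536 - 2), or about 0.0015, instead of about 0.39.  A
sketch of the 16-bit variant in C follows; the names are hypothetical, and
reserving 65535 as the missing value is my assumption, by analogy with 255
in the byte case.

```c
#define FILL_SHORT 65535u   /* assumed code for missing data */

/* Step between adjacent packed values for a given range and bit width.
 * Assumes 1 <= nbits <= 16 and max > min. */
double quantization_step(double min, double max, int nbits)
{
    return (max - min) / ((double)(1UL << nbits) - 2.0);
}

/* Pack into 0..65534, rounding to nearest; out-of-range values and NaN
 * become FILL_SHORT (an assumption of this sketch). */
unsigned short pack_short(double x, double scale_factor, double add_offset)
{
    double v = (x - add_offset) / scale_factor;

    if (v != v || v < -0.5 || v >= 65534.5)
        return FILL_SHORT;
    return (unsigned short)(v + 0.5);
}

/* Inverse of pack_short for non-fill codes. */
double unpack_short(unsigned short p, double scale_factor, double add_offset)
{
    return p * scale_factor + add_offset;
}
```

With rounding, the error after a pack and unpack round trip is at most half
a step.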

If you are using C, you may have to declare x_packed to be an `unsigned
char' to get these formulas to work out, or change the formulas to assume
signed values.  In Fortran there are no unsigned integers, so change the
formulas to use signed integers instead.
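
One possible signed variant, sketched in C: keep the same scale factor, use
the midpoint (Max + Min)/2 as the offset so that packed values run from
-127 to 127, and reserve -128 for missing data.  The choice of midpoint
offset, the fill code -128, and the function names are all my assumptions,
not a netCDF convention.

```c
#define FILL_SBYTE (-128)   /* assumed code for missing data */

/* Pack into -127..127 with add_offset = (max + min) / 2 and
 * scale_factor = (max - min) / (2^8 - 2).  Out-of-range values and NaN
 * become FILL_SBYTE (an assumption of this sketch). */
signed char pack_sbyte(double x, double scale_factor, double add_offset)
{
    double v = (x - add_offset) / scale_factor;

    if (v != v || v <= -127.5 || v >= 127.5)
        return FILL_SBYTE;
    /* Round half away from zero; a plain cast would truncate. */
    return (signed char)(v < 0.0 ? v - 0.5 : v + 0.5);
}

/* Inverse of pack_sbyte for non-fill codes. */
double unpack_sbyte(signed char p, double scale_factor, double add_offset)
{
    return p * scale_factor + add_offset;
}
```

For the 950 to 1050 example this gives add_offset = 1000, with 950 packed
as -127 and 1050 packed as 127.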

There are other techniques for accessing packed netCDF data (using the units
attribute to encode packing information, packing values into a plain array
of bytes with some other packing technique and storing the technique name as
a variable attribute, etc.) but the one I've outlined above is probably the
simplest.

----------------------------------------------------------------------------
Russell K. Rew                                          UCAR Unidata Program
russ@xxxxxxxxxxxxxxxx                                          P.O. Box 3000
                                                      Boulder, CO 80307-3000
----------------------------------------------------------------------------

