John J Bates wrote:
HDF group, please pass to Mike Folk...thanks
John, we are continuing to look at how we might implement the common
data model i/o service provider layer on existing satellite data set
formats. When one already has tens of thousands of files, it seems that
you might want to write the format to the API using the i/o service
provider, what we are calling bringing the API to the data (as opposed
to converting all the data to a new format). The old so-called 1b
format satellite data sets are, unfortunately, in bit packed formats. I
seem to recall that although HDF5 advertises bit manipulation, you
indicated that it was not perhaps really there yet. If this is the
case, then we could not bring the API to the data and we would only look
at bringing the data to the API (i.e., rewrite all the data into a new
format...doable but...).
So, does HDF5 support bit manipulation within the framework of the
CDM? Perhaps Mike knows the answer.
Thanks, John
Hi John:
So there are two parts to this:
1) The IOSP that you have to write to "bring the API to the data" (I like it,
can I steal it?) has to be able to read the file format, bit packing and all, and make it
available for reading through the API. This is sufficient for giving access to the data.
2) If you want to rewrite the data into NetCDF-4/HDF5, then you want the
bit-packing abilities of HDF5 in order to save space. However, even if those
aren't ready, you can still write to NetCDF-4, but your files will be bigger.
So it's an optimization that hopefully we will get soon. As I understand the
situation, HDF5 has not yet implemented real bit-packing (e.g., where 100
11-bit values should take about 1100 bits); however, they are working on it.
It is possible that current compression schemes will work for your case, but
this needs testing.
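
To make the 11-bit arithmetic concrete, here is a minimal Python sketch of the kind of unpacking an IOSP would have to do. The names pack_bits and unpack_bits are hypothetical helpers for illustration only; they are not part of HDF5, NetCDF-4, or the CDM API. The sketch also assumes unsigned values packed most-significant-bit first with no record headers, which may not match the layout of a real 1b file.

```python
def pack_bits(values, width):
    """Pack unsigned integers into bytes, `width` bits each, MSB first.

    Hypothetical helper: the bit order and the absence of headers are assumptions.
    """
    acc = 0
    for v in values:
        if not 0 <= v < (1 << width):
            raise ValueError(f"value {v} does not fit in {width} bits")
        acc = (acc << width) | v
    total_bits = len(values) * width
    pad = (-total_bits) % 8          # zero bits needed to reach a byte boundary
    acc <<= pad
    return acc.to_bytes((total_bits + pad) // 8, "big")


def unpack_bits(data, width, count):
    """Recover `count` unsigned integers of `width` bits each from `data`."""
    total_bits = len(data) * 8
    if count * width > total_bits:
        raise ValueError("not enough data for the requested number of values")
    acc = int.from_bytes(data, "big")
    mask = (1 << width) - 1
    out = []
    for i in range(count):
        # Shift the i-th field down to the low bits, then mask it off.
        shift = total_bits - (i + 1) * width
        out.append((acc >> shift) & mask)
    return out


if __name__ == "__main__":
    values = [(i * 37) % 2048 for i in range(100)]   # 100 values that fit in 11 bits
    packed = pack_bits(values, 11)
    # 100 * 11 = 1100 bits, padded to 1104 bits = 138 bytes,
    # versus 200 bytes if each value were stored as a 16-bit integer.
    print(len(packed), "bytes packed;", 2 * len(values), "bytes as 16-bit integers")
    assert unpack_bits(packed, 11, 100) == values
```

Without real bit-packing on the write side, the same 100 values would most likely be stored in 16-bit integers, which is where the "bigger files" in part 2 come from.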
If you want more details, I can ask Ed Hartnett, our NetCDF-4 expert, to
provide them.