Let me comment on the more important issue of "base arithmetic" and a
more generalized datatype, particularly for dimension variables and
particularly for time.
>>But, the fatal flaw in all this is the terribly limited singly dimensioned
>>variable allowed for time. Those of us who wish to store milliseconds
>>from an epoch are limited to about a fifty day dataset. From a strictly
>>mathematical standpoint we will never need the millisecond resolution for
>>long datasets. From a dataset standpoint, we need to store the bits that
>>come in to do absolute time comparisons.
>Do you need millisecond resolution for more than fifty days, or were
>milliseconds chosen because they were convenient for some other reason?
Milliseconds were imposed upon us as the units for the Upper Atmosphere
Satellite project. We wish to store these exact values so we can exactly
reproduce the data given us and make exact time comparisons. The project
should produce data for some years so yes, 50 days is not enough.
We are not the only project that has decided to use two nc_long
quantities for time. We do need more than 32 bits for time.
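
The fifty day figure follows directly from the width of a 32 bit integer. A quick check, using Python only as a calculator (the variable names are illustrative, and the signed case assumes nc_long is a signed 32 bit integer):

```python
# Range of a millisecond counter held in a single 32 bit integer.
MSEC_PER_DAY = 24 * 60 * 60 * 1000     # 86400000

unsigned_days = 2**32 / MSEC_PER_DAY   # all 32 bits used as a count
signed_days = 2**31 / MSEC_PER_DAY     # non-negative half of a signed value

print(f"unsigned 32 bit: {unsigned_days:.1f} days")
print(f"signed 32 bit:   {signed_days:.1f} days")
```

Either way, a single nc_long of milliseconds falls far short of a multi-year mission.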
>>With the current set of primitives, we could store our time variable as
>>a double. If we strictly interpreted it as a double, then we still could
>>not guarantee exact millisecond accuracy when converting to/from double
>>from/to integers [it would *probably* work]. If we had an explicit 64
>>bit integer type, we could use this as straight milliseconds from an
>>offset but manipulations in the context of 32 bit machines would be awkward.
>>Using two nc_long values fits the bill for us and at least one other site
>>which is using the same exact convention as ours (Julian Day minus 12 hours,
>>millisecond of day). This line of reasoning leads directly to the idea
>>of base arithmetic and a series of integer variables.
>Unfortunately, without a sample implementation of a netCDF library
>using base arithmetic and base vectors, so that all the ramifications
>can be seen, the proposal is unlikely to gain wide acceptance.
My suggestion for base arithmetic does not necessarily depend upon *any*
modification to the netCDF definition. One could define a new data type
but instead, it might be easier to simply use a naming convention within
the present netCDF scheme. The scheme would simply be that in any situation
where now we can use a variable (e.g., a normal variable or a dimension
variable), we could use a based variable. This "based variable" consists
of the old variable with one additional dimension and a corresponding
attribute having entries for all its bases. Perhaps a few examples will
be helpful (they also show the value of this generalization to variables
other than time):
-------------------------
Example 1. Latitude stored as degrees, minutes, and seconds:
lat = 41 ; //just an example, don't get excited
#three = 3; // the naming convention identifies base dimension with a # sign
short lat(lat,#three)
lat:units="degrees\nminutes\nseconds" //give the units of each component
lat:base=0,60,60 //gives the base for each digit
Latitude is stored as three shorts. The first is degrees, then minutes,
and then seconds. Standard base arithmetic applies. If we have 60 or
more seconds, we can subtract 60 and add one to the minutes digit.
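
The carrying rule can be sketched in a few lines of Python. This is only an illustration of the convention, not part of any netCDF library: the function name `normalize` is hypothetical, and I assume that a base of 0 in the leading position means "no upper limit" for that digit.

```python
def normalize(digits, bases):
    """Carry between digits, working from least to most significant.

    digits and bases are listed most significant first, in the same
    order as the proposed "base" attribute. The leading base is not
    used here, so a value of 0 leaves the leading digit unbounded.
    """
    digits = list(digits)
    for i in range(len(digits) - 1, 0, -1):
        # divmod floors, so a negative digit borrows from its neighbour.
        carry, digits[i] = divmod(digits[i], bases[i])
        digits[i - 1] += carry
    return digits

# 41 degrees, 59 minutes, 75 seconds
print(normalize([41, 59, 75], [0, 60, 60]))   # [42, 0, 15]
```

How southern latitudes should carry a sign is not settled by this convention; the sketch simply lets the degrees digit go negative.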
-------------------------
Example 2. Longitude stored as degrees, minutes, and seconds:
lon = 12; //another example of a fixed dimension
#three = 3; // the naming convention identifies base dimension with a # sign
short lon(lon,#three)
lon:units="degrees\nminutes\nseconds" //give the units of each component
lon:base=360,60,60 //gives the base for each digit
This is very similar to the previous example of three shorts and the
base 60 for minutes and seconds. In this case, however, we see an additional
use for this notation. We can allow the most significant digit to be governed
by "clock arithmetic". In this case, the 360 means that degrees are calculated
mod 360 and can always be brought within the range 0-359.
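
One way to see the clock arithmetic at work is to add two longitudes by passing through total seconds. This is a sketch under my own assumptions: the helper names are hypothetical, and the bases 360,60,60 are hard-coded rather than read from an attribute.

```python
SEC_PER_DEGREE = 60 * 60            # 3600
SEC_PER_CIRCLE = 360 * SEC_PER_DEGREE

def to_seconds(deg, minutes, sec):
    """Collapse a (degrees, minutes, seconds) triple to whole seconds."""
    return (deg * 60 + minutes) * 60 + sec

def from_seconds(total):
    """Expand seconds to a triple, wrapping degrees mod 360."""
    total %= SEC_PER_CIRCLE
    deg, rest = divmod(total, SEC_PER_DEGREE)
    minutes, sec = divmod(rest, 60)
    return (deg, minutes, sec)

def add_lon(a, b):
    return from_seconds(to_seconds(*a) + to_seconds(*b))

# 350d 30m 45s plus 20d 45m 30s wraps past the full circle
print(add_lon((350, 30, 45), (20, 45, 30)))   # (11, 16, 15)
```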
-------------------------
Example 3. Time stored as day and millisecond of day:
time = UNLIMITED ;//often time is the unlimited dimension
#two = 2; // we need two nc_long variables for time
long time(time,#two)
time:units="day\nmsec"
time:base=0,86400000
Again, standard base arithmetic applies. When the less significant
digit reaches 86400000, we subtract 86400000 from it and increment the
day digit.
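
For the time case, the point is that comparisons and differences stay exact because nothing ever leaves integer arithmetic. A minimal sketch, with hypothetical function names and made-up sample values:

```python
MSEC_PER_DAY = 86400000

def add_msec(t, delta):
    """Add delta milliseconds to a (day, msec of day) pair, with carry."""
    day, msec = t
    carry, msec = divmod(msec + delta, MSEC_PER_DAY)
    return (day + carry, msec)

def elapsed_msec(t0, t1):
    """Exact number of milliseconds from t0 to t1."""
    return (t1[0] - t0[0]) * MSEC_PER_DAY + (t1[1] - t0[1])

t0 = (2448500, 86399500)     # made-up sample, 500 ms before the day rolls over
t1 = add_msec(t0, 1500)      # (2448501, 1000)

print(t1)
print(t0 < t1)               # normalized pairs compare digit by digit
print(elapsed_msec(t0, t1))  # 1500
```

Python integers do not overflow, so the sketch hides the 32 bit issue; in C the product in `elapsed_msec` would need a wider type or a digit-wise subtraction.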
-------------------------
If the attribute "base" were missing, one could use as a default
the number of distinct values of that datatype (e.g., 2**32 for nc_long).
This would give one a direct way to store *and process* arbitrary
precision integers. Further thought needs to be given to the utility of
base arithmetic for floats and doubles.
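
To illustrate the default base, here is a sketch that splits a large non-negative integer into base 2**32 digits and joins them again. The function names are hypothetical, and the digits are treated as unsigned, which glosses over the fact that nc_long is signed.

```python
WORD = 2**32   # assumed default base for a 32 bit digit

def split_words(value, count):
    """Split a non-negative integer into count digits, most significant first."""
    words = []
    for _ in range(count):
        value, low = divmod(value, WORD)
        words.append(low)
    if value:
        raise OverflowError("value does not fit in the given number of digits")
    return words[::-1]

def join_words(words):
    """Rebuild the integer from its digits."""
    total = 0
    for w in words:
        total = total * WORD + w
    return total

ten_years_msec = 10 * 365 * 86400000
words = split_words(ten_years_msec, 2)
print(words, join_words(words) == ten_years_msec)
```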
This convention allows n netCDF primitives (nc_long, nc_short, etc.)
to be used together as a single arithmetic quantity. It is a convention
only, and requires no changes to the netCDF package. It only requires
additions to routines manipulating netCDF files if they are to work with
these generalized conventions.