Hi,
A follow-up question to the time thread of a couple of weeks ago:
Is anyone considering a "convention" for implicitly saving time for
variables that are stored at regular time intervals?
As an example, consider N salinity measurements, salt(time):
time(1) salt(1)
time(2) salt(2)
.
.
time(I-1) salt(I-1)
time(I) salt(I)
time(I+1) salt(I+1)
.
.
time(N) salt(N)
where
time(I+1)-time(I) = deltat, 1 <= I <= N-1
and deltat is a constant.
The time array can be reproduced by storing only three numbers: time(1),
N and deltat. It would be nice to save space by storing these numbers
instead of all N times.
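
To make the idea concrete, here is a minimal Python sketch of rebuilding the time array from the three stored numbers. The function name reconstruct_times and the sample values are hypothetical, chosen only for illustration.

```python
def reconstruct_times(time_base, deltat, n):
    """Rebuild n regularly spaced times from time(1), deltat and N."""
    # The text uses 1-based indices: time(I) = time(1) + (I - 1) * deltat.
    # Python is 0-based, so i below plays the role of I - 1.
    return [time_base + i * deltat for i in range(n)]


# Illustrative values only (not from any real data set).
times = reconstruct_times(time_base=100.0, deltat=0.5, n=5)
print(times)
```

Computing each time as base plus a multiple of deltat, rather than adding deltat repeatedly, avoids accumulating round-off error over a long series.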
One possible solution is to store attributes of the array salt that
look something like:
salt:time_base = time(1);
salt:time_increment = deltat;
This works, but imposes two new attribute "conventions". Note: N is
the dimension of salt and is therefore already stored using
standard netCDF means.
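
A reader of such a file might recover the times roughly as follows. This is only a sketch: the dictionary is a hypothetical in-memory stand-in for a netCDF variable (it does not use any netCDF library), and the attribute names time_base and time_increment are the proposed ones from above, not an established convention.

```python
# Hypothetical stand-in for the variable "salt": its data plus its attributes.
salt = {
    "data": [35.1, 35.0, 34.9, 34.8],
    "attrs": {"time_base": 0.0, "time_increment": 600.0},
}


def implicit_times(var):
    """Rebuild the times of a variable from its two proposed attributes."""
    n = len(var["data"])  # N is the length of the variable's dimension
    base = var["attrs"]["time_base"]
    deltat = var["attrs"]["time_increment"]
    return [base + i * deltat for i in range(n)]


salt_times = implicit_times(salt)
print(salt_times)
```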
Another possibility is to retain time as a variable, but dimension it
to have only 1 element and assign NaN or FloatInf or whatever to it.
This would permit an attribute definition more in line with the time
conventions that seem to be emerging from the discussions a couple of
weeks ago. An example would be:
time:units = deltat @ time(1);
No new attribute conventions are imposed, but this requires the
implicit assumption that, if time is a 1-element empty array, the
times are to be reproduced using the units attribute.
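
The implicit assumption could be sketched in Python like this. Everything specific here is an assumption made for illustration: the units string is taken to be two plain numbers in the form "deltat @ time(1)", NaN is used as the "empty" marker, and the function name times_from_units is hypothetical. A real units string would presumably also carry a unit name and a date, which this sketch does not try to parse.

```python
import math


def times_from_units(time_values, units, n):
    """Return explicit times, or rebuild them when time is a 1-element NaN array."""
    # Assumed rule: a one-element time array holding NaN means "times are implicit".
    is_implicit = len(time_values) == 1 and math.isnan(time_values[0])
    if not is_implicit:
        return list(time_values)
    # Assumed units layout: "<deltat> @ <time(1)>", both plain numbers.
    increment_text, base_text = units.split("@")
    deltat = float(increment_text)
    base = float(base_text)
    return [base + i * deltat for i in range(n)]


implicit = times_from_units([float("nan")], "600 @ 0", n=4)
explicit = times_from_units([0.0, 600.0, 1300.0], "600 @ 0", n=3)
print(implicit)
print(explicit)
```

Times that are stored explicitly pass through unchanged, so software written this way would still read existing files.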
Any suggestions, comments?
Note: The problem of saving space by storing only a base and increment
for a coordinate variable can be extended to regularly-spaced (gridded)
data in multiple dimensions. However, the stored grid becomes a much
smaller part of the file as the number of dimensions increases. As an
example, a 4-d salt array might be regularly-spaced in time, x, y and
z, and stored in a KxLxMxN array where time is Kx1, x is Lx1, y is Mx1
and z is Nx1. For large K, L, M and N, the sizes of time, x, y and z
become a non-issue for reducing file size since salt will be
proportionally larger. Therefore, the 1-d example case is likely to be
the most important one, since storing time as time(1) and deltat
reduces the file to just over half its original size as N gets large.
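
The size argument can be checked with a little arithmetic. The sketch below counts stored values only, assuming every value takes the same number of bytes and ignoring header overhead; the function name size_ratio and the example shapes are hypothetical.

```python
def size_ratio(shape):
    """Stored values with implicit coordinates divided by stored values with explicit ones."""
    data = 1
    for n in shape:
        data *= n  # number of data values, e.g. K*L*M*N
    explicit = data + sum(shape)      # one full coordinate array per dimension
    implicit = data + 2 * len(shape)  # just a base and an increment per dimension
    return implicit / explicit


ratio_1d = size_ratio((1000,))               # salt(time) with N = 1000
ratio_4d = size_ratio((100, 100, 100, 100))  # salt(time, x, y, z)
print(ratio_1d, ratio_4d)
```

In the 1-d case the ratio is (N + 2) / (2N), which approaches one half as N grows, while in the 4-d case it stays very close to 1, so there is little to gain.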
--
Harry L. Jenter hjenter@xxxxxxxxxxxxxxxxxx
U.S. Geological Survey COM: (703) 648-5916 FTS: 959-5916
Mailstop 430, National Center "Sometimes you're the bug.
Reston, Virginia 22092 Sometimes you're the windshield."