Continued from CF Reduced Grids.
I was looking at the CF reduced horizontal grid feature because it's really a way to store "ragged arrays" rather than the somewhat more general "compression by gathering".
Example 5.3. Reduced horizontal grid
dimensions:
londim = 128 ;
latdim = 64 ;
rgrid = 6144 ;
variables:
float PS(rgrid) ;
PS:long_name = "surface pressure" ;
PS:units = "Pa" ;
PS:coordinates = "lon lat" ;
float lon(rgrid) ;
lon:long_name = "longitude" ;
lon:units = "degrees_east" ;
float lat(rgrid) ;
lat:long_name = "latitude" ;
lat:units = "degrees_north" ;
int rgrid(rgrid);
rgrid:compress = "latdim londim" ;
If you examine the rgrid values (decoded into lat,lon index pairs), you see sequences like
0,0 0,1 0,2 ... 0,row0size-1
1,0 1,1 1,2 ... 1,row1size-1
2,0 2,1 2,2 ... 2,row2size-1
3,0 3,1 3,2 ... 3,row3size-1
...
that is, it could be completely specified by the set of rowSizes, everything else just being an enumeration of the indices of the ragged array. Note that latdim and londim are not actually used as the dimensions of any variable, and that rgrid(rgrid) has the same signature as a coordinate variable, although it's really part of a more complicated coordinate mapping function.
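To make the "completely specified by the rowSizes" point concrete, here is a minimal Python sketch. It assumes the gathered index values are zero-based, row-major offsets into a latdim x londim array, and that row i occupies the first rowSize[i] longitude slots, as in the enumeration above. The function names and the tiny 3-row grid are made up for illustration.

```python
def gather_indices(row_sizes, londim):
    """Rebuild the rgrid index list from the row sizes alone.

    Assumes zero-based, row-major flat indices and that row i uses
    longitude slots 0 .. row_sizes[i]-1.
    """
    return [i * londim + j
            for i, n in enumerate(row_sizes)
            for j in range(n)]


def decode(flat_index, londim):
    """Split a flat index back into a (lat index, lon index) pair."""
    return divmod(flat_index, londim)


# Hypothetical grid: 3 latitude rows, at most 4 longitudes per row.
row_sizes = [2, 4, 3]
londim = 4

rgrid = gather_indices(row_sizes, londim)
pairs = [decode(k, londim) for k in rgrid]

print(rgrid)   # [0, 1, 4, 5, 6, 7, 8, 9, 10]
print(pairs)   # [(0, 0), (0, 1), (1, 0), (1, 1), (1, 2), (1, 3), (2, 0), (2, 1), (2, 2)]
```

Nothing but row_sizes and londim goes in, which is the sense in which the stored index list is redundant for this kind of grid.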
A more explicit data structure for ragged arrays might look like:
dimensions:
londim = 128 ;
latdim = 64 ;
rgrid = 6144 ;
variables:
float PS(rgrid) ;
PS:long_name = "surface pressure" ;
PS:units = "Pa" ;
PS:coordinates = "lon lat" ;
float lon(rgrid) ;
lon:long_name = "longitude" ;
lon:units = "degrees_east" ;
float lat(latdim) ;
lat:long_name = "latitude" ;
lat:units = "degrees_north" ;
lat:raggedRowsize = "rowSize";
int rowSize(latdim);
rgrid:ragged= "latdim londim";
rgrid:desc= "number of longitudes for each latitude row";
so that: rgrid size = sum(rowSize)
In this example, the lon coordinate would just be lon(rgrid). To figure
out the lat coordinate, you have to form rowStart(latdim):
rowStart(i) = 0 if i = 0
rowStart(i) = rowStart(i-1) + rowSize(i-1) if i > 0
then find i such that rowStart(i) <= rgrid index < rowStart(i+1), where the upper bound for the last row is taken to be the rgrid size.
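The lookup above can be sketched in a few lines of Python. This is only an illustration of the rowStart recurrence and the search step, using the standard library; the function names and the sample row sizes are assumptions, not part of any CF convention.

```python
from bisect import bisect_right
from itertools import accumulate


def row_starts(row_sizes):
    """rowStart(0) = 0; rowStart(i) = rowStart(i-1) + rowSize(i-1)."""
    return [0] + list(accumulate(row_sizes))[:-1]


def lat_index(k, starts, total):
    """Return i such that starts[i] <= k < starts[i+1].

    The upper bound for the last row is `total`, the rgrid size.
    """
    if not 0 <= k < total:
        raise IndexError(f"rgrid index {k} out of range 0..{total - 1}")
    # bisect_right gives the first position whose start is > k,
    # so the row containing k is the one just before it.
    return bisect_right(starts, k) - 1


# Hypothetical ragged grid with 3 latitude rows.
row_sizes = [2, 4, 3]
starts = row_starts(row_sizes)   # [0, 2, 6]
total = sum(row_sizes)           # rgrid size = 9

print([lat_index(k, starts, total) for k in range(total)])
# [0, 0, 1, 1, 1, 1, 2, 2, 2]
```

The binary search keeps each lookup at O(log latdim), and it also copes with empty rows, since repeated start values resolve to the last row that starts at that offset.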
Note that there is no "coordinate variable" rgrid(rgrid) as in the "compression by gathering" example. So how to associate? The best I could come up with for now is
lat:raggedRowsize = "rowSize";
which doesn't seem quite right. Also, note that londim is not used,
but you kind of need it for rgrid:ragged. I guess it is the maximum row size,
but you may not know that ahead of time. Hmmm...
There are a number of variants that might be easier to understand:
- store rowStart directly
- store the latitude index of each point in rgrid(rgrid); this would allow random ordering of the points.
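Both variants carry the same information as rowSize when the points are stored row by row, which a short Python sketch can show. The function names and sample values here are hypothetical, chosen only for illustration.

```python
def row_sizes_from_starts(starts, total):
    """Variant 1: rowStart is stored; row sizes are successive differences.

    `total` is the rgrid size, which closes off the last row.
    """
    ends = starts[1:] + [total]
    return [end - start for start, end in zip(starts, ends)]


def expand_lat_index(row_sizes):
    """Variant 2: one latitude index per point, i.e. the values of rgrid(rgrid)."""
    return [i for i, n in enumerate(row_sizes) for _ in range(n)]


row_sizes = [2, 4, 3]

print(row_sizes_from_starts([0, 2, 6], 9))   # [2, 4, 3]
print(expand_lat_index(row_sizes))           # [0, 0, 1, 1, 1, 1, 2, 2, 2]
```

The per-point list costs one integer per grid point instead of one per row, but it stays valid under any permutation of the points, which is what makes random ordering possible.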