After getting Ken Prada's recent post to this group mentioning the OarS and
BoaT (cute) system, I ftp'd and printed out the two PS files mentioned that
provide information and overview of the system(s). I liked what I saw but
have a question pertaining to netCDF usage. I'm sending it to this group
to keep the discussion in the public domain, rather than e-mailing
Ken directly.
QUESTION: The Compliance list contains the statement: "All data sets
should be configured in four dimensions." My question is this: do you
expect the data from each instrument to have 4 dimensions (I assume X,
Y, Z, and Time), or can some be globals? I'll pose the following
example; tell me if it's what you have in mind. Assume the following
instruments...
- a GPS engine (giving X and Y)
- a clock (giving Time)
- an air temperature gauge (at sea level)
- a sea temperature gauge (5 meters down)
- an ADCP system (putting out 5 meter bins down to 200 meters)
Assume that everything samples at the same rate.
Now, X, Y, and Time can be global to the data set for all instruments,
so we start our CDL file as...
dimensions:
        time = unlimited;
variables:
        long  time(time);      // seconds since my cat's birthday
        float gps_lat(time);   // Y
        float gps_lon(time);   // X
(I've left out the long_name, source, etc. to simplify things)
Now we stick in our air and sea temp. They have fixed Z values, so how
do we put them in? Like this?
        float air_temp(time);
                air_temp:z = "0.0";
        float sea_temp(time);
                sea_temp:z = "5.0";
or this?
        float air_temp(time, 2);
        float sea_temp(time, 2);
where temp(time, 1) holds the temperature, and temp(time, 2) holds the Z
value. This wastes a lot of space to store the same number over and over.
Or do we make global variables air_temp_z and sea_temp_z and set them to
0.0 and 5.0 respectively? I assume we don't do this...
        float air_temp(time, 5);
        float sea_temp(time, 5);
where the dimensions are X, Y, Z, Time, and Temperature. That would
create a GREAT amount of data redundancy.
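To make the global-variable option concrete, here is a rough sketch of
what I mean. The names air_temp_z and sea_temp_z are my own invention,
not anything taken from the OarS/BoaT documents:
variables:
        float air_temp(time);
        float sea_temp(time);
        float air_temp_z;      // scalar variable, no dimensions
        float sea_temp_z;      // scalar variable, no dimensions
data:
        air_temp_z = 0.0;      // meters, at sea level
        sea_temp_z = 5.0;      // meters down
Each Z value is then stored exactly once, no matter how many samples
get written.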
And what about the ADCP? Each bin has an associated depth, so do you do
something like this?
        float n_s_vector(time, 3);
        float e_w_vector(time, 3);
where the 3 slots hold speed, heading, and Z for each sample? Or
do you make the Z values fixed in a global variable, since they probably
don't change?
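For the fixed-Z option, one possible layout would be something like the
following. This is only a sketch: the dimension and variable names are
mine, and 40 bins assumes 5 meter bins covering the full 200 meters.
dimensions:
        time = unlimited;
        depth = 40;
variables:
        float depth(depth);              // bin depths in meters, stored once
        float n_s_vector(time, depth);   // one value per bin per sample
        float e_w_vector(time, depth);
Here the bin depths live in a single variable instead of being repeated
with every sample.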
I hope I've not been too confusing, but my question is, when I have many
variables with a fixed X, Y, and Z, and some that have an independent Z,
how does the OarS/BoaT system intend to deal with this? I could guess
outright that if you only put one type of data in a file, then life would
be simple, except then you would be managing a huge number of data files
in a large underway system -- not to mention the great duplication of
data in X, Y, and Time.
The reason I ask about this is that the OarS/BoaT netCDF specifications are
very, very similar to what we use at OSU, and I would like to see us
all working in a somewhat unified scheme so that not only can we
exchange files simply via netCDF, but we can also deal with format
standards, at least for base data like X, Y, Z, and Time. Some kinds of
"conversion" are easy, like adding EPIC code attributes to an
existing file, but others become much more complex without some
kind of good foundation.
Comments? Let's keep this in the netcdfgroup domain. How about calling
this discussion DINGY (Data Integrity via NetCDF Global Yore), CANOE
(Careful Analysis of NetCDF Organizational Effort), or PUNT (People
United for NetCDF Translation). More names, please!
| || Tim Holt / Marine Technician / RV Wecoma
+--==o_____+-/|--+|| College of Oceanography / Oregon State
_____| R/V WECOMA ~-----/ Corvallis, OR USA, 97331-5503 (503)737-4447
+------------------------' holtt@xxxxxxxxxxxx