Hi John:
Since the UAF meeting in Seattle I have been giving some thought to how to
serve some large, important datasets, such as the raw ICOADS observations or
the WODB observations. I have been reading over the PointObservation Conventions
proposal on the CF site. While the proposal makes it clear how I might put data
into a netcdf file, it doesn't make clear what the interplay might be with a
service in TDS, or how such a service might be affected by a very large
dataset without further structure.
So it seems pretty clear that the ICOADS would be points. From the example:
dimensions:
    obs = 1234 ;
variables:
    double time(obs) ;
        time:long_name = "time of measurement" ;
        time:units = "days since 1970-01-01 00:00:00" ;
    float lon(obs) ;
        lon:long_name = "longitude of the observation" ;
        lon:units = "degrees_east" ;
    float lat(obs) ;
        lat:long_name = "latitude of the observation" ;
        lat:units = "degrees_north" ;
    float alt(obs) ;
        alt:long_name = "vertical distance above the surface" ;
        alt:standard_name = "height" ;
        alt:units = "m" ;
        alt:positive = "up" ;
        alt:axis = "Z" ;
    float humidity(obs) ;
        humidity:long_name = "specific humidity" ;
        humidity:coordinates = "time lat lon alt" ;
    float temp(obs) ;
        temp:long_name = "temperature" ;
        temp:units = "Celsius" ;
        temp:coordinates = "time lat lon alt" ;
attributes:
    :CF\:featureType = "point" ;
Now I am assuming that in a TDS implementation of a service, I will be able to
select on the coordinate variables. Is that correct? Even so, for something
like ICOADS, obs is quite large and that extraction could be quite slow unless
either there is additional structure or the TDS pre-fetches the coordinate
variables, much as the present Dapper server does.
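
To make the concern concrete, here is a minimal sketch of what selecting on
the coordinate variables amounts to when there is no further structure. It is
plain Python, not TDS or Dapper code: the function name select_obs and the
sample values are hypothetical, and the coordinate arrays are assumed to have
already been read into memory (the "pre-fetch" case). Every request still has
to scan all obs entries.

```python
def select_obs(time, lat, lon, t_min, t_max,
               lat_min, lat_max, lon_min, lon_max):
    """Return the indices of observations inside a time range and lat/lon box.

    time, lat and lon are equal-length sequences, one entry per observation,
    mirroring the time(obs), lat(obs) and lon(obs) variables in the example.
    The scan is linear in the number of observations.
    """
    return [
        i
        for i, (t, y, x) in enumerate(zip(time, lat, lon))
        if t_min <= t <= t_max
        and lat_min <= y <= lat_max
        and lon_min <= x <= lon_max
    ]


# Hypothetical sample values, only to show the call.
time = [0.0, 10.0, 20.0, 30.0]
lat = [36.6, 35.0, -12.3, 36.0]
lon = [-121.9, -125.0, 45.7, -122.5]

hits = select_obs(time, lat, lon, 5, 35, 30, 40, -130, -120)
print(hits)  # [1, 3]
```

With obs in the hundreds of millions, this scan (or the reads needed to feed
it) is the cost I am worried about.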
Another option would be to have, say, a file for each 10-degree block, and then
have TDS aggregate over the files. Would this be possible? Then the search
would be a lot faster when people want time series in a region as opposed to more
synoptic extractions. Would the TDS service support such an option? Or,
as netcdf-4 supports groups, I could have 10-degree groups with 2-degree subgroups.
That would work as far as netcdf-4 is concerned, but that is not the same as
TDS knowing what to do with the hierarchy or taking advantage of the structure.
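
As a rough illustration of the blocking idea, the sketch below assigns each
observation to a 10-degree block (or a 2-degree sub-block) by the south-west
corner of the box that contains it. The names block_key and group_by_block
are hypothetical, as is the choice to label a block by its south-west corner;
nothing here is part of netcdf-4 or TDS. It only shows how the files or
groups could be keyed.

```python
import math
from collections import defaultdict


def block_key(lat, lon, size=10):
    """Return the (lat, lon) south-west corner of the size-degree block.

    Longitude is first wrapped into [-180, 180). A latitude of exactly 90
    is placed in the northernmost block rather than in a block of its own.
    """
    lon = ((lon + 180.0) % 360.0) - 180.0
    lat0 = min(math.floor(lat / size) * size, 90 - size)
    lon0 = math.floor(lon / size) * size
    return lat0, lon0


def group_by_block(lats, lons, size=10):
    """Map each block key to the list of observation indices it contains."""
    blocks = defaultdict(list)
    for i, (y, x) in enumerate(zip(lats, lons)):
        blocks[block_key(y, x, size)].append(i)
    return dict(blocks)


# Hypothetical sample positions, only to show the call.
lats = [36.6, 35.0, -12.3]
lons = [-121.9, -125.0, 45.7]

print(group_by_block(lats, lons))   # {(30, -130): [0, 1], (-20, 40): [2]}
print(block_key(36.6, -121.9, 2))   # (36, -122), a 2-degree sub-block
```

A regional time-series request would then only need to open the blocks that
overlap the requested box, which is where the speed-up would come from, but
only if the server knows about the layout.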
My questions for Profiles (that is, for the WODB) are pretty much the same. I
assume that the TDS service will be able to search on the coordinate variables.
Is that correct? I have the same issue here: the profile dimension
will get quite large, and without further structure the search could be
very slow. Adding the same types of structures mentioned above would provide
possible solutions, but only if TDS, as opposed to netcdf-4, supported them.
As you may have guessed, these are not theoretical questions - I would really
like to see ICOADS and WODB served as part of the year 2 UAF effort. So now is
a good time to start thinking about how to do it correctly and what the service
will be able to do.
Thoughts?
Thanks,
-Roy
**********************
"The contents of this message do not reflect any position of the U.S.
Government or NOAA."
**********************
Roy Mendelssohn
Supervisory Operations Research Analyst
NOAA/NMFS
Environmental Research Division
Southwest Fisheries Science Center
1352 Lighthouse Avenue
Pacific Grove, CA 93950-2097
e-mail: Roy.Mendelssohn@xxxxxxxx (Note new e-mail address)
voice: (831)-648-9029
fax: (831)-648-8440
www: http://www.pfeg.noaa.gov/
"Old age and treachery will overcome youth and skill."
"From those who have been given much, much will be expected"