Interactions with the Unidata proposal review panel helped clarify the
philosophy and strategy for pursuing interoperability for our data systems via
standard interfaces. For the core community, we support components for
an end-to-end system, namely, IDD/LDM for real-time "push" data delivery, the
decoder suite for format transformations, TDS for "pull" data access from remote
servers, and client applications for analysis and display. For other
communities who use different client applications, we provide access to data
via standard interfaces, e.g., the netCDF API, OPeNDAP and WCS. Thus
they can use any client they wish as long as it can access the data via these
interfaces. There are many such clients in use: IDV, PMEL Ferret, ITT
Vis IDL, Matlab, ESRI arcGIS, to name a few. Providing data via these
standard interfaces thus makes the data available to a wide variety of
communities using a broad array of analysis and display tools, but does not
impose on the UPC the burden of supporting those tools.
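As a concrete illustration (a minimal sketch, not part of any Unidata package), the Python fragment below opens a remote dataset over OPeNDAP through the netCDF API; the server URL and variable name are hypothetical placeholders, and any client with DAP support could do the equivalent:

    # Minimal sketch: reading a remote dataset over OPeNDAP via the netCDF API.
    # The TDS URL and variable name below are hypothetical placeholders.
    from netCDF4 import Dataset

    url = "http://example.edu/thredds/dodsC/model/NAM_CONUS/latest.nc"
    ds = Dataset(url)                     # opened over the network, no local copy
    temp = ds.variables["Temperature"]    # assumed variable name and 4D layout
    first_slice = temp[0, 0, :, :]        # only this 2D subset crosses the wire
    print(first_slice.shape)
    ds.close()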
Work on the OGC Oceans Interoperability Experiment is winding down. The
final report has been drafted and is under review by the Oceans IE team.
Possible NSF Proposals
CoAHP Proposal led by NCAR GIS Initiative
Olga Wilhelmi continues to revise the proposal according to the guidance of
NSF's Doug James. As you may recall, the original proposal was for a
project that would bring the data systems of Unidata and the CUAHSI HIS
together to serve interdisciplinary research. NSF encouraged us to continue
with an expanded proposal that would bring in the ecology community, but
indicated that the effort should start with a workshop that would
include representatives of all three disciplines. Olga has now 1)
strengthened the ecology component (and workshop agenda) and 2) provided a
list of potential participants from each community. But the proposal
for the workshop has yet to be reviewed.
Datanet Proposal led by SDSC
Initial steps have been taken on a pre-proposal for the NSF Datanet
opportunity, an effort being led by SDSC. There would be a thematic focus
on transforming scientific climate data into forms usable by a broader
audience, e.g., decision makers and the general public. Unidata's role
would relate mainly to providing data and tools for real-time access to
datasets related to extreme weather events. Our role would be similar
to one of our roles in LEAD, namely, specialized support for the use of our
tools in the context of the project, informing our community about the
project, and disseminating components of the system to our community
where appropriate. For example, some of our sites may be interested in
implementing the systems that transform the research data into forms useful
in an educational setting.
Data System Standards
OGC Technical Committee Meeting Highlights
This is a brief summary of topics of interest to Unidata and the GALEON
project from the June OGC Technical Committee meetings.
Relating to CF-netCDF, there was discussion of the new encoding format
document for which a draft is nearly ready to be submitted as a WCS
standard extension specification. There is still some concern that
this encoding format specification will be closely coupled to the actual
WCS protocol specification and a suggestion was made that coverage
encoding format documents be submitted as "best practices" documents
rather than standard extensions. However, subsequent discussions via
email and at the meeting led to the conclusion that a "best practices"
approach would lead to a coupling with the standard that was too
loose. So the operative plan now is to continue on the previous path
and submit the CF-netCDF encoding specification as a WCS standard
extension as soon as possible. A much expanded draft of that extension
standard is under review by the GALEON participants.
There was also interest in the fact that we are attempting to map a
variety of scientific data types (e.g., the Unidata CDM scientific data
types, the BADC Climate Science Modelling Language scientific feature
types) to coverages as understood by ISO. This would include
collections of point, station, sounding, trajectory, radar scan, swath,
etc. that are not "gridded" and hence have not traditionally been thought
of as coverages. The ISO 19123 coverage definition does, however,
include collections of discrete points. For the netCDF community, the
first order of business is to extend and adapt the CF conventions to
encompass these data types fully.
In terms of catalogs, there were discussions with the ESRI reps who
indicated that there may actually be some facilities for tying CSW catalog
information with WCS access in the new arcGIS 9.3. But none of the
people at the meeting had enough experience with these OGC interfaces
in that release to say exactly how that can be done, so it is something
I have to follow up on. It was confirmed that the 9.3 WCS
client cannot access CF-netCDF encoded information. So our
initial experiments with the beta release are in fact accessing geoTIFFs
from THREDDS Data Servers. In discussing it, we concluded that
this is not as much of a mystery as it seemed initially. The native
netCDF read/write (from local disk) capability in arcGIS 9.2
actually only brings in 2D "slices" at a time, but imposing that restriction
is not possible in the current WCS implementations.
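For reference, the kind of request used in those geoTIFF experiments looks roughly like the following. This is a sketch only: the server name, coverage name, bounding box, and time value are hypothetical, while the key-value parameters are standard WCS 1.0 ones accepted by the TDS WCS implementation:

    http://example.edu/thredds/wcs/model/NAM_CONUS.nc?service=WCS&version=1.0.0
        &request=GetCoverage&coverage=Temperature&format=GeoTIFF
        &bbox=-110,30,-90,45&time=2008-07-01T00:00:00Z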
Lorenzo Bigagli gave an excellent summary of the ESSI (Earth and Space
Science Informatics) sessions at the recent EGU conference. He also
indicated that a new release of the Gi-GO catalog and data access client
is in the works.
The discussions of Google's KML were focused mainly on mass market display
of data in Google Earth and Google Maps. Much of the
interaction was dominated by commercial applications interested in wide
exposure to large segments of the general public. The emphasis is on
the display of data and not so much on the analysis tools needed by the
research community. Obviously there is considerable interest in the
academic community, but, in terms of supporting infrastructure, what might
be of most use is some sort of service that would automate the conversion
of netCDF data slices into KML for display.
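To make the idea concrete, here is a rough sketch (in Python) of what such a conversion might involve. The file name, variable name, and geographic extent are assumptions for illustration; a real service would take them from the dataset's own metadata:

    # Sketch of turning one 2D netCDF slice into a KML GroundOverlay.
    # File name, variable name, and lat/lon extent are hypothetical.
    import matplotlib
    matplotlib.use("Agg")
    import matplotlib.pyplot as plt
    from netCDF4 import Dataset

    ds = Dataset("sst_slice.nc")                 # assumed local 2D slice
    field = ds.variables["sst"][0, :, :]         # assumed variable layout
    north, south, east, west = 50.0, 20.0, -60.0, -100.0   # assumed extent

    # Render the slice to an image that Google Earth can drape on the globe.
    plt.imsave("overlay.png", field, origin="lower")

    kml = """<?xml version="1.0" encoding="UTF-8"?>
    <kml xmlns="http://www.opengis.net/kml/2.2">
      <GroundOverlay>
        <name>sst slice</name>
        <Icon><href>overlay.png</href></Icon>
        <LatLonBox>
          <north>%s</north><south>%s</south>
          <east>%s</east><west>%s</west>
        </LatLonBox>
      </GroundOverlay>
    </kml>""" % (north, south, east, west)

    open("slice.kml", "w").write(kml)
    ds.close()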
The meeting presentations are available at:
http://portal.opengeospatial.org/index.php?m=projects&a=view&project_id=82&tab=2&artifact_id=27720
Recent GALEON Activity
The GALEON Interoperability Experiment has been very active
recently -- mainly via a set of email exchanges. The main topic areas
under active discussion are:
1. WCS-netCDF extension standard.
The main issue here is how to incorporate an OPeNDAP option. I sense
agreement that there should be an OPeNDAP option in WCS, but it could be
done either as a small addition to the proposed CF-netCDF extension
standard or as a separate extension spec that leverages and builds on the
CF-netCDF proposal.
To me, this is still the top priority issue, and it would be good if it
could be resolved so the resulting WCS extension standard(s?) can be
formally submitted to the WCS 1.2 SWG before the next OGC TC meeting, which
is the first week in December.
2. Non-gridded coverages.
The issue here is what we do in both the CF community AND the OGC
community about the types of data that do not fit the current WCS
definition of regular grids. This includes the types of datasets
(and collections thereof) that have been listed and discussed as CDM
(Common Data Model) scientific data types and as CSML scientific feature
types. I put this as the second highest priority because it
encompasses important work for both communities:
-- the CF community must extend the conventions to include these data
types -- with special care to explicitly include Coordinate Reference System
(CRS) information -- whereas
-- the OGC community has to come to grips with the fact that these data types
can be seen as features and/or coverages and that some harmonization -- or
at least better understanding -- is needed among alternative delivery
protocols (WFS, WCS, SOS, WMS?).
I will attempt to spawn separate discussions of these
issues because the CF work should be undertaken sooner rather than
later, and the OGC issues are on the agenda for the December TC meeting in
a special joint session the afternoon of Monday, Dec. 1.
3. WCS Core and Extensions.
A couple of topics have come up here. One is the more general question
of whether there is anything in the current WCS 1.2 draft that prevents
the GALEON community from addressing its standardization needs in the
extension standard(s) for CF-netCDF (with an OPeNDAP option either as part
of the CF-netCDF extension or as a separate extension standard built on
the CF-netCDF foundation). The second issue that came up in the
email discussion is more specific: namely whether having the WCS 1.2 core
include only 2D coverages is an obstacle to the service of 4D FES
(metocean) data to the traditional GIS community. In this regard, it
has been noted that, while the current draft WCS 1.2 core spec allows
clients to be compliant even though they cannot work with 3D or 4D
datasets, forcing those clients to deal with a 4D WCS would not mean that
they would be able to analyze and display those datasets. On the
other hand, any client developer whose user community is interested
in GALEON 4D datasets would implement the CF-netCDF extension. I plan to start a
separate email discussion on this topic to give people a chance to respond
to my obvious prejudice.
4. The Need for Catalog Services.
This was really only touched on in the discussion thread, but I think it
warrants an item in this list because the discussion of a wider variety of
data types along with the possibility of multiple access protocols (WMS,
WCS, SOS) places additional emphasis on the importance of
standards-based catalog services (CS-W), which was one of the issues that
GALEON phase 1 brought out. Getting CS-W discovery systems working
together with WCS (or other OGC) access systems is a substantial
interoperability challenge when the clients and servers are developed
independently. This item is mainly just a reminder that more
thought and experimentation is needed in this area.
5. GALEON use cases.
A set of GALEON use cases is being collected on the GALEON wiki to guide
the evolution of the relevant OGC standards.
ESRI User Meeting Highlights
There was
a major emphasis on "climate change" workshops, training sessions, general
presentations, etc. this year. But nearly all of this turned out
to have a different focus than what I had anticipated. While there
were a few sessions on the climate research and education topics, the vast
majority of the discussion was about approaches to reducing and mitigating
the effects of CO2 generation. Some topics were at least vaguely
related to data systems of the sort we work with, e.g., climate data for
wind farm siting or solar radiation data for solar energy collection.
But the vast majority of the discussion was about demographic,
infrastructure, land use, biological, governmental zoning, transportation
systems and other such geographic data systems. There were
some surprising and fascinating ideas, like planning transportation systems to
encourage fewer left turns and more right turns.
There were several talks related to the NWS NDFD (National Digital Forecast
Database), which apparently is heavily used in the GIS world. One
interesting item is that the NDFD is moving to a 2.5 km resolution for its
data. It was not clear (even after a few questions) exactly how
they get to forecasts with that fine a resolution, but it seems to depend a
lot on human input. Another significant change is that watches and warnings
are now issued for polygon-delimited areas rather than by county, which was
always very inaccurate. The most directly relevant presentation for us was
by Eoin Howlett, who reported on his work with Roy Mendohlsson of Pacific
Fisheries Environmental Labs. Their system enables access to
THREDDS/OPeNDAP services directly from within the ESRI arcGIS
applications. In essence, this EDC (Environmental Data Connector) makes
ESRI products OPeNDAP clients, using the Python scripting language supplied
in version 9 of arcGIS, together with the Pydap library.
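For a sense of what that looks like under the hood, a minimal Pydap session is sketched below; the OPeNDAP URL and variable name are hypothetical, and the EDC wraps this kind of access inside the arcGIS environment:

    # Minimal sketch of OPeNDAP access through Pydap from plain Python;
    # the dataset URL and variable name are hypothetical.
    from pydap.client import open_url

    ds = open_url("http://example.edu/thredds/dodsC/ocean/sst.nc")
    sst = ds["sst"]               # lazy handle; no data transferred yet
    print(sst.shape)              # dimensions come from the DAP metadata
    block = sst[0, 0:100, 0:100]  # slicing issues the actual data request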
There were also quite a few interactions with the ESRI staff
involved in implementing OGC/ISO interfaces. In particular, arcGIS 9.3
has a plug-in for access to OGC CS-W (Catalog Services for the Web) that
should be able to connect to the CS-W service that GMU has created for
enabling access via standard protocols to THREDDS catalogs. Teddy Matinde
of ESRI and I had some real success accessing the GMU CS-W server for
THREDDS datasets.
The search system turned up a number of North American Model datasets
on motherlode when we looked for "NAM". However, the search "hits" we got
did not include a direct pointer to the WCS access point, so we were
not able to access the data after the successful searches. To access
a dataset via WCS, the ESRI client needs a URL pointing directly to the
WCS service.
The key to making this happen at all is the installation of the GIS Portal
Toolkit plugin into arcGIS. In a follow-up after the meetings, I
finally got this toolkit installed properly on my own computer and was able
to do the same sort of searches we did at the meetings. Working with
our George Mason partners, we discovered that, although the arcGIS search
interface looks like a simple free-text search (a la Google), it really only
searches the titles of datasets, so it does not find anything if the search
term is in a field other than the title.
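The mismatch is easier to see at the protocol level. The sketch below uses OWSLib, one possible Python CS-W client (not the ESRI plug-in), against a hypothetical catalog endpoint; the first query is constrained to dc:title, which is effectively what the arcGIS search does, while the second uses csw:AnyText and so can also match abstracts and other fields the server indexes:

    # Sketch of the difference between a title-only CS-W query and an
    # "any text" query; the catalog URL is a hypothetical placeholder.
    from owslib.csw import CatalogueServiceWeb
    from owslib.fes import PropertyIsLike

    csw = CatalogueServiceWeb("http://example.edu/csw")

    title_only = PropertyIsLike("dc:title", "%NAM%")
    csw.getrecords2(constraints=[title_only], maxrecords=20)
    print(len(csw.records), "title matches")

    any_text = PropertyIsLike("csw:AnyText", "%NAM%")
    csw.getrecords2(constraints=[any_text], maxrecords=20)
    print(len(csw.records), "any-text matches")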
AccessData Workshop
This year's AccessData Workshop was hosted by Unidata. It was held
downtown in Portland, Oregon, the first one in an urban setting
rather than a resort or retreat type of venue, and Tina did a masterful job
organizing it. The total number of attendees was 57, with 8 of them
from UCAR. The AccessData (originally called DLESE Data Services)
workshops provide an opportunity for data providers, software tool
specialists, scientific researchers, curriculum developers, and educators to
interact with one another in a variety of sessions, all working toward the
common goal of facilitating the use of data in education. In addition
to keynote presentations, hands-on lab sessions (Tool Time) and a Demo
Session/Share Fair, attendees are grouped into teams that include the full
range of roles represented at the meeting. Team members work together
to develop an educational module, drawing upon the expertise of individuals
in each role on the team. This practical exercise enables team members to
learn from each other about the needs, practices, and expectations of the
other groups. Jeff Weber enthralled the participants with one of
the keynote presentations, entitled "Data, data everywhere, but not a bit
to display." He also provided a Tool Time session on the use of the
Unidata IDV.
For more details on
the workshop, there's a web page
at: http://serc.carleton.edu/usingdata/accessdata/index.html
ACCESS Geoscience Project Extension
Our NASA ACCESS Geoscience grant was extended with a modest addition of funds
to carry through this calendar year. As noted above, there has been
(limited) success with the arcGIS client finding datasets on THREDDS servers
via the standard CS-W interface on the George Mason server which harvests
metadata from THREDDS sites. In addition, we've been successful with
the Gi-GO client from the University of Florence. While arcGIS only
searches the Title field, Gi-GO searches both Title and Abstract so it finds
more THREDDS datasets. But neither of them searches in other fields such
as the list of variables in the datasets. Hence a search for "vorticity"
would not get any hits unless that word shows up in the Title or Abstract.
What we are learning, however, is that building interoperable data search
systems that are practical and useful ain't easy. These appear to be the
first experiments in which independently developed clients and servers have
produced successful searches at all. As noted, one difficulty is that the
user interface on the clients tends to be a
simple free text search, but the underlying server capability is more of a
precise database query system for which one has to specify which field(s) are
being searched. This mismatch causes many of the difficulties. The
granularity of the searched objects is another key issue. THREDDS
servers support a hierarchy of catalogs of catalogs, so, if one finds a high
level catalog, it must be possible to "drill down" to individual datasets, but
the clients we are working with do not have this capability yet. We are
working with the Gi-GO team to add that capability to their client. This would
mean we could use the Gi-GO client to find datasets on THREDDS servers via the
CS-W protocol and then download the datasets via the standard WCS protocol.
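As a sketch of what that drill-down entails, a client essentially has to follow catalogRef links recursively until it reaches datasets that carry a urlPath. The Python fragment below illustrates the idea; the top-level catalog URL is hypothetical, while the element and attribute names come from the THREDDS catalog schema:

    # Sketch of drilling down through a hierarchy of THREDDS catalogs to
    # reach individual datasets; the top-level catalog URL is hypothetical.
    import urllib.request
    import xml.etree.ElementTree as ET
    from urllib.parse import urljoin

    TDS = "{http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0}"
    XLINK = "{http://www.w3.org/1999/xlink}"

    def walk(catalog_url, depth=0):
        root = ET.parse(urllib.request.urlopen(catalog_url)).getroot()
        # Datasets with a urlPath are the leaves that access services use.
        for ds in root.iter(TDS + "dataset"):
            if ds.get("urlPath"):
                print("  " * depth + ds.get("name"))
        # catalogRef elements point to nested catalogs; recurse into them.
        for ref in root.iter(TDS + "catalogRef"):
            walk(urljoin(catalog_url, ref.get(XLINK + "href")), depth + 1)

    walk("http://example.edu/thredds/catalog.xml")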