LEAD at Unidata
Status Update,
LEAD had a presence this past January at the 2008 AMS 88th
Annual Meeting in
The workshop was titled "AMS Workshop on Linked Environments for Atmospheric Discovery (LEAD): An Emergent Information Technology Environment for On-Demand, Dynamically Adaptive Interaction with Weather for Research and Education". It was organized by Rich Clark and run by Tom with assistance from Anne. In the workshop, participants invoked a data-mining orchestration that mined archival radar data covering Hurricane Katrina. Participants also ran WRF forecasts over a domain of their choice. Results from both exercises were visualized with the IDV.
Sergio Mendez and David Ribes (University of Michigan School of Information) surveyed workshop participants before and after the workshop and wrote a paper assessing participants’ impressions and attitudes, which is viewable here. There was generally a very positive feeling about the power of LEAD, though participants felt it was not quite ready for classroom use.
There was also an IIPS session devoted to LEAD. Tom gave a talk entitled “The LEAD testbed system at the Unidata program center: a medium term online repository of meteorological data”. Anne gave a talk that was retitled “Programmatic Population of Scientific Data Repository”. Abstracts and recorded presentations from that session are available here.
After the introduction of the LEAD Fault Tolerance and
Recovery (FTR) service in late 2007, the reliability of the LEAD system seemed to
improve significantly. The FTR is designed to detect when a portion of a given
LEAD workflow fails and to restart that portion on another TeraGrid resource.
During December 2007, the reliability of end-to-end LEAD workflows
rose to above 90% for real-time workflows. The addition of the FTR came in
time for the LEAD workshop at the AMS meeting in
Beginning in early January, problems with many of the TeraGrid resources LEAD relies upon (particularly GridFTP and GRAM) increased. This exposed an underlying flaw in these elements of the TeraGrid stack, causing LEAD workflows to fail at an increasing rate despite the FTR. The failures became endemic by early February and motivated extensive cross-project collaboration to address them. The TeraGrid team has been very responsive, providing new releases to support LEAD and other TeraGrid users, but to date the failure rate is still quite high (in excess of 50%).
Given LEAD's successful support of 10 universities in the 2007 WxChallenge, coupled with the increasing reliability of LEAD workflows in late 2007, the LEAD team agreed to support all participants in WxChallenge 2008. Unfortunately, the TeraGrid problems mentioned above led most WxChallenge users to abandon LEAD for their forecasting efforts. With just a few weeks of the WxChallenge remaining at the time of this writing, it seems unlikely that we will be able to leverage the contest.
A service interface that provides programmatic access to the TDR
was designed and written. The interface is described here. A TDR installation was created on the Unidata
LEAD test bed and provided to the LEAD developers at
The crosswalk code was updated to crosswalk special THREDDS keywords to LEAD metadata schema keywords. The immediate benefit is that the LEAD orchestration can know how to display some datasets: entries added to the THREDDS catalogs describe which visualization tools can be applied, for example for those datasets that can be viewed with the IDV. More generally, arbitrary mappings from THREDDS keywords to LEAD keywords can be handled.
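As a rough illustration of the keyword crosswalk idea, the following Python sketch maps THREDDS keywords to LEAD keywords through a lookup table. It is not the actual crosswalk code; the keyword strings, the `CROSSWALK` table, and the `crosswalk_keywords` function are all hypothetical names chosen for this example.

```python
# Hypothetical mapping table: the real THREDDS and LEAD keyword
# vocabularies are not given in this report.
CROSSWALK = {
    "viewer:IDV": "lead:visualization/IDV",
    "datatype:radar": "lead:datatype/radar",
}


def crosswalk_keywords(thredds_keywords, mapping=CROSSWALK):
    """Translate THREDDS keywords to LEAD keywords.

    Returns a tuple (mapped, unmapped): LEAD keywords for every THREDDS
    keyword found in the mapping, and the THREDDS keywords that had no
    entry, so that callers can decide how to handle them.
    """
    mapped, unmapped = [], []
    for keyword in thredds_keywords:
        key = keyword.strip()  # tolerate stray whitespace from catalog text
        if key in mapping:
            mapped.append(mapping[key])
        else:
            unmapped.append(key)
    return mapped, unmapped


if __name__ == "__main__":
    lead_kw, unknown = crosswalk_keywords(["viewer:IDV", "some-other-keyword"])
    print(lead_kw, unknown)
```

Keeping the mapping in a table rather than in code is one way to support arbitrary mappings: adding a new THREDDS-to-LEAD pair only requires a new table entry.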
We have succeeded in bringing the most recent
two days' worth of WSR-88D Level II radar data into the LEAD cataloging system. The
THREDDS catalog for these data is found here: http://lead.unidata.ucar.edu:8080/thredds/lead/leadradarsl2.html
As part of this, radar coverage information was derived for the Level II radar data and encoded into the THREDDS catalog. This allows the LEAD Geo-Gui to define locations of interest and obtain only the radar data that cover that region.
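The region-based selection described above can be sketched as a bounding-box intersection test. This is only an illustration under stated assumptions: it assumes each radar's coverage is stored as a latitude/longitude bounding box, it ignores boxes that cross the dateline, and the `BoundingBox` class, the `radars_covering` function, and the radar identifiers are hypothetical rather than part of the LEAD or THREDDS software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Latitude/longitude box in degrees (assumed not to cross the dateline)."""
    south: float
    north: float
    west: float
    east: float

    def intersects(self, other: "BoundingBox") -> bool:
        # Two boxes overlap when their latitude ranges overlap
        # and their longitude ranges overlap.
        return (self.south <= other.north and other.south <= self.north
                and self.west <= other.east and other.west <= self.east)


def radars_covering(region, coverage_by_radar):
    """Return the sorted IDs of radars whose coverage box overlaps the region."""
    return sorted(radar_id for radar_id, box in coverage_by_radar.items()
                  if box.intersects(region))


if __name__ == "__main__":
    # Hypothetical coverage boxes, not real station data.
    coverage = {
        "RADAR_A": BoundingBox(33.0, 37.0, -100.0, -95.0),
        "RADAR_B": BoundingBox(40.0, 44.0, -90.0, -85.0),
    }
    region_of_interest = BoundingBox(34.0, 35.0, -98.0, -97.0)
    print(radars_covering(region_of_interest, coverage))
```

A box test is coarser than a true range-ring test, so it may select a radar whose circular coverage only nearly reaches the region; it is used here because it is simple to encode in catalog metadata.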
The Unidata LEAD test bed continues to be a primary resource of data for LEAD workflows. This includes:
Grib2 data is now fully supported.
We are exploring how our community can benefit to an even greater extent from this resource.
The LEAD PIs continue to strategize about securing funding for a continued LEAD deployment facility. (Continued LEAD CS research would be pursued under an OCI CDI initiative, though that would likely not involve Unidata.) Currently we are discussing the novel possibility of creating a consortium of projects with similar scientific goals and technical requirements, logically grouped as a TeraGrid Gateway Resource Provider.
Proposing such an RP facility essentially means proposing a Track II (mid-range high-performance computing) system along with the supporting management infrastructure to run it. This is a radical and ambitious idea. TeraGrid is currently undergoing a planning process for its phase 2 and may undergo significant changes, so the timing for this possibility is good. LEAD PIs are writing a position paper to be used in engaging other like-minded organizations in this idea.
In an effort funded by Microsoft, some LEAD PIs have been involved in discussions with Microsoft to deploy LEAD as a demonstration application on their new multicore architectures. An alternative would be to simply package LEAD as part of the Microsoft Weather workbench plan and give it away. It would be valuable to anybody with a 16-core server and another 16- to 60-core cluster. This would be a pure LEAD solution, not the consortium proposed above, and is being led by Prof. Dennis Gannon at Indiana University.
A notice of intent (NOI) has been submitted in response to the Hurricane Science Research solicitation (A.16) of the NASA Research Opportunities in Space and Earth Sciences, with Mohan as the PI and Craig Mattocks, Kelvin Droegemeier, Anne Wilson, and Tom Baltzer as co-PIs.
The NOI is titled “A satellite data access and visualization system to support hurricane Research”. In it we have proposed to:
· Build a database that allows users to create Case Studies of Hurricane data to include all Common Data Model (CDM) supported datasets. These data would be served using the Thematic Real-time Environmental Distributed Data Services (THREDDS) Data Server (TDS) and the Abstract Data Distribution Environment (ADDE).
· Extend the Common Data Model to provide access to the satellite datasets mentioned in the solicitation (TRMM, Aqua, QuikSCAT, Jason, GOES, NPOESS, etc.), providing a common framework for dissemination and, using the Integrated Data Viewer (IDV), a common framework for visualization.
· Leverage the Unidata Next Generation Case Studies project to upload and store these datasets, associated metadata, IDV bundles, and documentation.
· Extend the MADIS WRF-Var3D DA structures to make use of the CDM interfacing so that these hurricane datasets could be used for data assimilation and numerical model initialization.
· Create a facility to crosswalk the datasets into the Linked Environments for Atmospheric Discovery (LEAD, see: http://leadproject.org) so that these datasets will be available to users of LEAD, thus making the database developed by the proposed project available to researchers, educators, and students.
The proposal itself is due May 16.