LEAD at Unidata

Status Update, October 15, 2007

Tom Baltzer, Mohan Ramamurthy, Anne Wilson

LEAD Post Year 5 Planning

The LEAD project is entering the fifth and final year of its period of performance under ITR funding. In anticipation of the expiration of the current award, the LEAD PIs have been engaged in a planning effort to determine the future of the project and seek new funding to continue it beyond year five. For the purposes of this report, we'll refer to this second phase of LEAD as LEAD-2.

LEAD PIs concluded that a two-pronged approach to seeking funding was appropriate, in which deployment of a hardened LEAD facility for a user community is separated from continued computer science research. Thus, at least two separate sources of funding would be targeted. The NSF CISE program in general, and the soon-to-be-announced Cyber-Enabled Discovery and Innovation (CDI) initiative in particular, seem to be potential sources of funding for the computer science research portion of LEAD-2. Potential funding sources for a deployed facility will likely be NSF GEO/ATM, possibly along with the CDI program.

Over the summer LEAD PIs requested a meeting with NSF to engage in a dialogue where LEAD would present its vision for LEAD-2 and NSF officials would provide LEAD with feedback regarding NSF goals and funding possibilities. This meeting took place in Arlington on August 30, 2007.

In planning for a possible LEAD deployment proposal, LEAD team members provided documentation for their software. Additionally, Unidata LEAD team members gathered and compiled information, producing an internal report discussing LEAD software with some observations and recommendations.

A technical meeting was held in early August where options were discussed for creating a hardened, deployed facility. The outcome of this meeting was a partially defined plan identifying which portions of existing LEAD software to keep, which to replace (some with Unidata technology), and which to possibly eliminate. Rough cost estimates for both transition and maintenance were calculated. In a nutshell, the facility would be deployed via Unidata, with Unidata partnering with another institution to provide expertise in the area of high performance computing. Specifically, LEAD team members from NCSA, a TeraGrid facility, are interested in and capable of being that partner.

Subsequently, the LEAD PIs held a meeting to develop a vision for LEAD-2 and plan for the NSF visit. At this meeting, a well-rounded picture of LEAD deployment and computer science research was developed.

LEAD PIs presented their LEAD-2 vision to NSF on Thursday, August 30 in Arlington. As part of the visit, LEAD Project Director Prof. Kelvin Droegemeier of the University of Oklahoma gave an NSF-wide seminar on LEAD, discussing the progress and results of the project to date. The seminar included presentations by a student at the University of Illinois (a LEAD alumnus from Millersville University) and another student from Howard University. Several NSF officials from ATM, OCI, and CISE attended the talk. Following the seminar, a meeting was held to discuss the PIs' vision and plans for LEAD-2. NSF officials present at either of these meetings were: Cliff Jacobs (ATM), Steve Nelson (ATM), Steve Meacham (OCI), Dan Atkins (OCI), Chris Geer (OCI), Frederica Darema (CISE), and Sylvia Spengler (CISE). Other LEAD team members participated via Access Grid.

It was mentioned that NSF views LEAD as one of the most successful ITR projects and that LEAD remains a key Science Gateway project in TeraGrid development and evolution. NSF officials encouraged LEAD PIs to continue the dialog with NSF on potential avenues for funding LEAD-2. Seeking CISE funding for LEAD computer science research is believed to be appropriate. The CISE program has an established funding cycle that LEAD would follow. NSF officials encouraged LEAD PIs to consider multi-agency funding, with sponsors outside the NSF. The PIs were also asked to continue the dialog individually with various NSF program officers, including soliciting advice regarding submission of a proposal to fund various elements of LEAD-2. The PIs also plan to begin writing a proposal to the CDI solicitation, which is expected to be announced by October 2007. At the same time, LEAD is planning a more organized effort to reach out to the WRF community to begin building a base of support for deployment beyond the current base of undergraduate students and educators.

 

The THREDDS Data Repository (TDR)

As a reminder, a motivating use case for the development of the TDR is LEAD, where users need to create personal archives of data generated from their experiments. The TDR allows clients (users or programs) to upload data and metadata to a THREDDS Data Server (TDS). The TDR now provides:

The TDR is currently being hardened to handle multiple simultaneous clients. Also, the human user interface will be upgraded this fall. In addition to LEAD, the TDR technology is being leveraged in the Next Generation Case Studies (NGCS) Project. For more information about the NGCS project, please see the NGCS Status Report.

 

Bringing Radar Data into LEAD

Due to its very large volume (coupled with some early design decisions), radar data has not been available to the LEAD system. LEAD team members are currently working to address this unavailability.

Data is brought into LEAD by running the crosswalk code, which generates LEAD metadata from THREDDS metadata and stores the result in the LEAD Data Catalog. The THREDDS metadata describes data delivered by the IDD and other sources and stored on the LEAD test bed systems. The Data Catalog currently recreates all database indices from scratch every time the crosswalk is run.

The strategy under development is to catalog the most recent day's worth of Level II radar data, and also Level III data on a per-product basis. The Data Catalog is to be reengineered to maintain state information so that metadata about a previous day's radar data will continue to be available for as long as the data is estimated to remain on disk.
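The difference between rebuilding the catalog on every crosswalk run and maintaining state between runs can be sketched as follows. This is a minimal illustration only, not the LEAD Data Catalog implementation: the names `IncrementalCatalog`, `CatalogEntry`, `to_lead_metadata`, and the record fields are all hypothetical, and the fixed retention window is an assumed stand-in for the estimate of how long data remains on disk.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


def to_lead_metadata(thredds_record: dict) -> dict:
    """Hypothetical stand-in for the crosswalk's metadata translation."""
    return {"resourceID": thredds_record["id"], "product": thredds_record.get("product")}


@dataclass
class CatalogEntry:
    metadata: dict
    ingested_at: datetime


@dataclass
class IncrementalCatalog:
    # Assumed estimate of how long radar data stays on disk.
    retention: timedelta
    entries: dict = field(default_factory=dict)

    def crosswalk(self, thredds_records: list, now: datetime) -> list:
        """Merge newly seen records and expire old ones.

        Entries from earlier runs are kept rather than rebuilt from scratch,
        so metadata for a previous day's data stays queryable until it ages
        past the retention window. Returns the ids that were expired.
        """
        for record in thredds_records:
            self.entries[record["id"]] = CatalogEntry(to_lead_metadata(record), now)

        cutoff = now - self.retention
        expired = [key for key, entry in self.entries.items() if entry.ingested_at < cutoff]
        for key in expired:
            del self.entries[key]
        return expired
```

With a two-day retention window, an entry cataloged on one day is still present after the next day's run and is dropped only once it is older than the window. A real catalog would presumably persist this state in its database instead of in memory.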

LEAD Year Five Plans

The LEAD team has been refining and prioritizing our goals for year five of the project. These goals include significant expansion of operational support as well as significant developmental goals.

One of the goals had been to expand our support of the WxChallenge by inviting all 65 participating universities to make use of the LEAD portal for their forecasting in Spring 2008. Two of the key lessons learned during the first phase, which supported 10 universities participating in WxChallenge in Spring 2007, were that the reliability of LEAD workflows was not as high as we'd have liked and that supporting the end users was a resource-intensive task. The team has concluded that by focusing our limited resources on enhancing the reliability and robustness of the LEAD system during the Fall, we will be more likely to achieve the developmental and reliability goals we've established. Thus the team has decided to forgo supporting the Fall 2007 WxChallenge and instead will provide such support in Spring 2008. To that end, a release of LEAD capabilities that includes a fault-tolerant service has occurred, and the capabilities are now being tested by LEAD participants as well as students at both Millersville and Howard Universities.

LEAD's reliability problems have been closely tied to the underlying TeraGrid infrastructure. As it turns out, LEAD is one of the few projects that exercise the TeraGrid in a true "grid" fashion, and thus we have uncovered weaknesses that had otherwise gone unnoticed. The TeraGrid developers have been very receptive to the LEAD team on this point and have been working closely with LEAD developers to address these areas. This work will go a long way toward improving overall LEAD workflow reliability.

Recognizing that support of undergraduate researchers and educators is only part of the LEAD promise, the team has also prioritized goals that will address the needs of researchers at the graduate level in meteorology and beyond. These include the ability to edit namelist input files that are used in LEAD workflows and the ability for end users to wrap their own executables as services that can be incorporated into LEAD workflows. There is a strongly held sentiment that without these capabilities, the community will not support the idea of a LEAD facility beyond year five.

The Unidata LEAD Test Bed Status

The Unidata LEAD test bed continues to be a primary source of data for LEAD workflows. This includes:

We are also exploring the addition of 12 km GRIB2-formatted NAM data for initial and boundary conditions for workflows. At a minimum we will need to support the GRIB2 format by January.

 

LEAD at the Oklahoma Unidata Regional Workshop

As part of the Unidata Regional Workshop, LEAD was both presented to and used by workshop participants. Most of the participants in the workshop submitted two WRF workflows to the LEAD system using the LEAD Portal (http://portal.leadproject.org). Only about 60% of the workflows completed successfully, though most participants did have a successful run whose results they viewed using the IDV. Nearly all of the failed workflows were due to underlying TeraGrid failures (in most of these cases a GridFTP problem). The LEAD team continues to work with our TeraGrid partners to diagnose and resolve these types of problems.

Despite the low percentage of successful workflows, the workshop participants were generally positive about the capabilities they saw. One participant indicated that he could see using LEAD in his classroom right away and inquired about doing so.