LEAD at Unidata
Status Update May 3rd, 2007
Overview
Unidata Policy Committee Meeting
During the Unidata Policy Committee Meeting held March 12-13, 2007, time was set aside to consider the future of the LEAD project and Unidata's role in that future. Mohan and Kelvin Droegemeier (overall LEAD PI) gave a presentation, and the committee discussed LEAD and its future at length. The Policy Committee arrived at the following resolution:
Resolution: The Policy Committee recognizes the strategic long term value of the LEAD project for the Unidata community and encourages the UPC to take a key role in coordinating the development of proposal(s) for evolution of LEAD, to an extent that will not compromise the core activities and resources.
In the time since the Policy Committee meeting, the LEAD PIs have started working on a plan for sustaining the LEAD project beyond the ITR phase.
Near Term LEAD Goals
The first goal was to support the WxChallenge Collegiate Forecast Contest, a collegiate weather forecasting competition. This goal is now complete. The LEAD team provided mesoscale forecast capabilities to WxChallenge participants at 10 institutions, 8 of which are outside of the LEAD umbrella. During the nearly 90-day period in which LEAD was involved, 70 approved participants submitted 1232 mesoscale forecast runs.
The second goal is to support the CAPS Spring Experiment, which has three primary thrusts. The first thrust involves launching ensemble forecasts to study areas of deep convection. Ensemble forecasts allow for specifying uncertainty in model initial conditions and quantifying uncertainty in model output. The second thrust provides dynamic forecasts that are triggered by the receipt of tornado watches and warnings. In the third, forecasters will use the LEAD portal to determine domains for forecasts and launch them on demand.
Beta Users Program
The Beta Users program has been expanded beyond the internal LEAD team to include groups of students from Millersville and Howard universities as well as universities participating in the WxChallenge. Using our existing infrastructure, we have set up a LEAD support venue (support@unidata.ucar.edu) and a users' e-mail list (leadusers@unidata.ucar.edu), both of which have seen significant traffic. Over a three-month period, more than 100 support questions have been received and answered, and many bug reports and feature requests have been passed to the LEAD team as a result. Unidata's role is to help users and to filter and vet bug reports.
Unidata LEAD Test Bed
Status
We continue to maintain a rolling archive of at least 120 days of each of the seven LEAD canonical datasets, and in some cases more (see Data Description for LEAD 7 Datasets). The archive also holds at least 120 days of the remaining IDD feeds. In addition, we maintain a smaller archive of ADAS and steered WRF model output. Two disks in one of our RAID systems failed unexpectedly, resulting in the loss of some of the datasets. We recovered what we could, identified the datasets whose loss is unacceptable to LEAD, and are working on a secondary setup for those key datasets.
There are two TDS catalogs that provide access to the data. The primary catalog provides complete access to all the IDD data. The other catalog is the operational LEAD top catalog. It does not yet include radar data, because the volumes are too great for the current LEAD software and a strategy for handling them is still needed. The LEAD team is working on this.
TDR
TDR Use Cases
At this time, TDR development is being steered by two use cases. The first is the Next Generation Case Study project (NGCS), a case study repository in which archive designers interactively arrange and store items related to a case study, such as data, notes, images, IDV bundles, etc., and make these studies available to their community. The second is the LEAD project, which needs storage and access for items relevant to a user's experiment. These include items involved in running an orchestration, such as input, output, and intermediate files, as well as items that a user wants to publish.
These use cases have in common the need for a repository space that: provides data storage, can be structured by the client, provides integrated metadata management, and can serve the data.
The
In the Unidata TDR deployment, the repository is subdivided on a per-project basis. There are currently two projects: the Next Generation Case Study project and the LEAD project. Each project has a separate partition in the repository space and its own front end to the repository in the form of a servlet interface. Within a project, clients can store data and create catalogs in a hierarchical structure of their own design. A client can add or remove nodes within its space.
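The per-project partitioning and client-defined hierarchy described above can be pictured with a small in-memory tree. This is only an illustrative sketch, not the TDR implementation: the `Node` class, its methods, and the project keys `ngcs` and `lead` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node in a project's partition; all names here are illustrative."""
    name: str
    children: dict = field(default_factory=dict)

    def add(self, path):
        """Create any missing nodes along a slash-separated path; return the leaf."""
        node = self
        for part in path.strip("/").split("/"):
            node = node.children.setdefault(part, Node(part))
        return node

    def remove(self, path):
        """Delete the node at path together with its subtree; True if it existed."""
        *parents, leaf = path.strip("/").split("/")
        node = self
        for part in parents:
            node = node.children.get(part)
            if node is None:
                return False
        return node.children.pop(leaf, None) is not None

    def paths(self, prefix=""):
        """List every node path beneath this node, in sorted order."""
        found = []
        for name in sorted(self.children):
            full = f"{prefix}/{name}"
            found.append(full)
            found.extend(self.children[name].paths(full))
        return found


# One partition per project (hypothetical keys), each structured by its clients.
repository = {"ngcs": Node("ngcs"), "lead": Node("lead")}
repository["lead"].add("experiment1/output")
repository["lead"].add("experiment1/input")
repository["lead"].remove("experiment1/input")
```

Keeping each project under its own root means that adding or removing nodes in one partition cannot affect the other.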
The NGCS project requires an interactive interface. Users communicate with the server via a web input form. This form provides a means to specify a data source, enter metadata, and describe how the storage space should be structured. Users can add or delete nodes in this space via the same form. Once stored, the data is browsable and retrievable via the TDS.
The LEAD orchestration system is based on a Service Oriented Architecture (SOA). Thus the LEAD interface to the TDR must provide a Web API. We currently have a simple HTTP-based client that can communicate with the server and provide the same inputs as the web form.
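As a rough illustration of what a simple HTTP-based client of this kind might do, the sketch below encodes form-style inputs into a POST request using only the Python standard library. The endpoint URL and the field names (`source`, `node`, `title`) are assumptions made for the example; the actual servlet interface is not documented here. The request is built but deliberately not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint; the real servlet URL is not given in this report.
TDR_URL = "http://localhost:8080/tdr/lead"


def build_store_request(data_source, node_path, metadata):
    """Encode the same kind of inputs a web form would send as an HTTP POST."""
    # Field names are illustrative assumptions, not the servlet's actual ones.
    fields = {"source": data_source, "node": node_path}
    fields.update(metadata)
    body = urlencode(fields).encode("utf-8")
    return Request(
        TDR_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


request = build_store_request(
    "file:///tmp/wrf_output.nc", "experiment1/output", {"title": "WRF run"}
)
# Sending is left out on purpose: urllib.request.urlopen(request) would submit it.
```

Because the client sends the same fields as the web form, a single server-side handler could in principle serve both interactive and programmatic callers.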
Functionality was added for storage of collections (currently in the form of a tar.gz file). Collections are unpacked and an associated catalog structure is generated. This allows movement of groups of files into the repository via a single operation.
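The idea of deriving a catalog structure from a tar.gz collection can be sketched as follows. This is a minimal illustration under stated assumptions, not the TDR code: it reads the archive's member list and builds a nested dictionary in memory rather than unpacking to disk or writing a TDS catalog, and the function names and sample file names are hypothetical.

```python
import io
import tarfile


def catalog_from_collection(archive_bytes):
    """Return a nested dict mirroring the directory layout of a tar.gz collection.

    Directories become dicts; regular files map to their size in bytes.
    """
    catalog = {}
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue  # skip directory entries, links, and special files
            *dirs, filename = member.name.strip("/").split("/")
            node = catalog
            for d in dirs:
                node = node.setdefault(d, {})
            node[filename] = member.size
    return catalog


def make_sample_collection():
    """Build a small in-memory tar.gz so the sketch can run without real data."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, payload in [("case1/notes.txt", b"squall line"),
                              ("case1/data/radar.nc", b"\x00" * 16)]:
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


catalog = catalog_from_collection(make_sample_collection())
```

The nested dictionary stands in for the generated catalog hierarchy: one upload yields one subtree whose shape follows the directory layout inside the archive.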
A prototype is available on the LEAD test bed. Tomcat authentication has been implemented for the web input form, which limits who can upload to the server.
TDR Next Tasks
Crosswalk
We have discussed upgrading the crosswalk so that it can handle large data volumes such as radar data. A simple data- and host-specific solution has been outlined that will provide updates to a continuously maintained list of available radar data. Developing this software could provide a relatively quick solution to the problem of integrating radar data into LEAD. However, as the NetCDF Subset Service matures, it would likely provide a better, more general solution with a uniform interface for a variety of dataset types.
The LEAD team held its semiannual All Hands meeting in