Rosetta

Status Report: March 2014 - September 2014

Sean Arms, Jen Oxelson, Jeff Weber

Strategic Focus Areas

The Rosetta group's work supports the following Unidata funding proposal focus areas:

  1. Enable widespread, efficient access to geoscience data

    The initial goal of Rosetta is to transform unstructured ASCII data files into the netCDF format; once in this format, standard tools, such as the THREDDS Data Server, IDV, Python, and other analysis packages, can take advantage of these datasets with relative ease.

  2. Develop and provide open-source tools for effective use of geoscience data

    Although the primary goal of Rosetta is to get data into the netCDF format, the transformation process does not stop there. The Rosetta group realizes that not everyone knows how to work with netCDF files, and that some users may feel more comfortable working with other formats. Therefore, Rosetta includes the ability to transform from one format to another (e.g. netCDF to .xls), thereby reducing data friction.

  3. Provide cyberinfrastructure leadership in data discovery, access, and use

    Metadata contained in a netCDF file (no longer locked away in a separate README file) can be automatically extracted, facilitating the discovery of data in these files. Additionally, the Rosetta development plan includes the creation of standard ASCII and spreadsheet representations of the CF-1.6 discrete sampling geometries (DSGs).

  4. Build, support, and advocate for the diverse geoscience community

    Promote the use of standard formats in the dissemination of data, while allowing flexibility to transform into other formats, as needed, to enable users to "do science". For commonly used formats, such as user-defined ASCII or an unstructured spreadsheet, create and advocate for the use of standard representations based on the CF-1.6 DSGs.
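
As a rough illustration of the ASCII-to-netCDF step described under focus area 1, the sketch below shows the kind of information that has to be paired with an unstructured ASCII file before it can be written out in a self-describing format. It is not Rosetta's implementation: the sample data, the `UNITS` table, and the `parse_ascii` function are all hypothetical, and the final write to netCDF (for example with netCDF-Java or netCDF4-python) is deliberately left out so the example runs with the Python standard library alone.

```python
import csv
import io

# Hypothetical sample: one free-form header line, then a column-name row,
# then comma-delimited data rows.
RAW = """Station: Hypothetical Tower A
time,air_temperature,wind_speed
2014-03-01T00:00:00Z,271.2,3.4
2014-03-01T01:00:00Z,270.8,4.1
"""

# Metadata the file itself does not carry and a user would have to supply.
# The variable names and units are illustrative assumptions.
UNITS = {"air_temperature": "K", "wind_speed": "m s-1"}


def parse_ascii(text, header_lines=1, delimiter=","):
    """Split an ASCII table into global attributes and named variables."""
    lines = text.splitlines()
    # Keep the free-form header text as a global attribute instead of losing it.
    global_attrs = {"header": "\n".join(lines[:header_lines])}

    reader = csv.reader(io.StringIO("\n".join(lines[header_lines:])),
                        delimiter=delimiter)
    names = next(reader)  # first row after the header holds column names
    columns = {name: [] for name in names}
    for row in reader:
        if not row:  # skip blank lines
            continue
        for name, value in zip(names, row):
            columns[name].append(value)

    variables = {}
    for name, values in columns.items():
        if name in UNITS:
            # Numeric column with user-supplied units.
            variables[name] = {"units": UNITS[name],
                               "data": [float(v) for v in values]}
        else:
            # Left as strings (e.g. ISO 8601 timestamps).
            variables[name] = {"data": values}
    return {"global_attributes": global_attrs, "variables": variables}


dataset = parse_ascii(RAW)
```

The resulting dictionary keeps each variable together with its units and keeps the header text as a global attribute, which is the property that lets downstream tools use the data without a separate README.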

Activities Since the Last Status Report

Live demos to various groups

AMS 2014 Presentation

Arms, S. C., J. O. Ganter, J. Weber, and M. K. Ramamurthy, 2014: Rosetta - Unidata’s Web-based Translation Tool: Progress and Future Plans. 30th Conference on Environmental Information Processing Technologies, 94th AMS Annual Meeting, Atlanta, GA, A.84. Available online at https://ams.confex.com/ams/94Annual/webprogram/Paper240011.html

Basic Documentation

Transitioned to using Doxygen for user and developer documentation:

http://www.unidata.ucar.edu/software/rosetta/dox/html/index.html

Accomplishments of Note

  • Added the ability to publish converted files directly to RAMADDA and the ACADIS Gateway
  • Live instance of Rosetta hosted at Unidata for testing
  • Released the Rosetta source code on GitHub
  • Transitioned to Doxygen for documentation

Planned Activities

Ongoing Activities

We plan to continue the following lines of development:

  • Increase the number of CF-1.6 discrete sampling geometries handled by Rosetta
  • Begin collecting metrics for the instance of Rosetta hosted at Unidata
  • Continue documentation efforts, including the creation of screencasts for user documentation
  • Solicit examples from the community (hint, hint...that's you guys!)

New Activities

We plan to enhance Rosetta in the following ways:

  • Investigate CSV and XLS(X) representations of the CF-1.6 discrete sampling geometries
  • Enable desktop (local) use of Rosetta
  • Anyone who sees this and comments on it will get a free cookie, cheese and crackers, veggies, or fruit, to be delivered at the time of the meeting
  • Incorporate TDS capabilities into Rosetta, allowing for TDS services (like point subsetting of grids) to easily be applied to local files
  • Create infrastructure to collect use metrics for Rosetta
  • Move the Rosetta code into the THREDDS codebase. One reason is to keep the Rosetta code in lock-step with developments in the CDM (netCDF-Java) library with regard to point data. The second reason is that, in addition to publishing converted files to RAMADDA and the ACADIS Gateway, we would like Rosetta to publish files to THREDDS Data Servers (TDS). Users would then be able not only to publish files to a TDS, but also to customize the THREDDS catalogs for their datasets in a user-friendly way.

We would love your input as to where our priorities should be in terms of these New Activities. Let's chat! And, yes, please...send example ASCII data :-) K THX BAI!

Relevant Metrics

We've received a handful of support questions regarding the availability of Rosetta, as well as requests for demonstrations.