Data Management Resource Center

How Data Management Helps You

Scientific activities have always relied heavily on data collected from the environment to inform our ideas about how natural processes work. More recently, computer simulations of natural processes have given us powerful tools to enhance our understanding of observational data and make testable predictions. While analysis of our rapidly expanding store of observations and simulations has brought many new insights, it also brings new complexities to the process of scientific investigation. In particular, how do we manage the wide variety of data used to “do science” and ensure that those data are available to and usable by others who want to test the new theories that arise?

In this online resource center, Unidata hopes to provide you with information about evolving data management requirements, techniques, and tools. While we cannot address the specific requirements of every funding agency or evaluate every tool, we can walk you through the common requirements and (with luck) make it easier for you to fulfill them. In addition, we hope to explain how to use some common tools to build a scientific data management workflow that not only makes your life as an investigator easier, but also enhances access to your work.

Note: This resource center is still in its first iteration and continues to evolve. Some sections may appear unfinished. If you have suggestions or other feedback, please direct them to the Data Management Resource Center team. If you have a resource to suggest for potential inclusion in the resource center, you can use this suggestion form.

Why Spend Time Managing Data?

Planning and implementing an effective data collection, storage, sharing, and archiving strategy takes energy and effort. Even so, we believe that the benefits — to your project and to the scientific community in general — outweigh the costs, for the following reasons:

  • Storing data in standard formats allows you, your collaborators, and anyone who wants to study your results to take advantage of a variety of widely used software tools. Self-documenting formats like netCDF or HDF have the added advantage of storing your project metadata alongside the data, helping to ensure the data stay useful in the future.
  • Robust, network-based data access mechanisms like RAMADDA and the THREDDS Data Server make it easier for geographically separated project groups to collaborate using shared project data.
  • Remote data access facilities also allow you to publish your data widely while retaining control. In turn, reliable data access allows others to cite your data as well as your published findings, adding to your publication credits.
  • Many funding agencies now require that you submit a data management plan in order to receive funding. While agency requirements are not particularly stringent today, we expect that the demands on investigators will increase over time. Adopting a robust data management workflow now will position you for future demands.
  • Good data management helps keep your research results from ending up as “dark data,” ensuring that they stay visible to scientists and other potential users into the future.

Setting up a robust data management workflow at the beginning of your project allows you to focus on doing science, not on managing data.

Funding Agency Requirements for Data Management

Many U.S. Government funding agencies now require that a Data Management Plan be submitted along with all applications for funding. While there are too many distinct agencies with specific Data Management Plan requirements to cover here, we will describe three that are quite likely to be important for members of the Unidata community: the National Science Foundation (NSF), the National Oceanic and Atmospheric Administration (NOAA), and the National Aeronautics and Space Administration (NASA).

We aspire to keep the information in this section current, but you should always consult the latest materials available from the agency to which you are applying before completing your Data Management Plan.

Unidata-Supported Tools for Data Management

Unidata supports a variety of tools that can assist you in managing your project data. While these tools cannot solve every problem for every project, they can help you create well-documented data that is easy for your project team to use, useful to other researchers, and readily available to anyone who needs it.

[Figure: Unidata-supported data management tools (netCDF, Rosetta, RAMADDA, TDS, LDM) and the tasks they address (store, transform, metadata, move, share).]

Resources for Creating Data Management Plans

The sections below collect information pertaining to the creation of data management plans for projects funded by the agencies of greatest interest to Unidata's core community members. The list of resources here is not exhaustive, and you should consult your program officer or other agency contact for specific details relevant to your proposal. For each agency, we provide a sample data management plan to give you some ideas about how to take advantage of Unidata technologies to manage your project data and meet agency requirements.

In addition, we provide information about some third-party resources for data management planning, as well as information on options for archival storage of project data.