The Big Data Project (BDP) is an initiative undertaken by the National Oceanic and Atmospheric Administration (NOAA) to increase public availability of large volumes of environmental data collected and generated by the agency. As part of the Big Data Project, Unidata is working in collaboration with Amazon Web Services (AWS) on a demonstration project to provide access to more than twenty years of archived NEXRAD Level II radar data (augmented continuously with new, real-time data) stored in Amazon's Simple Storage Service (S3) environment. In addition to assisting AWS with ingesting new data flowing from the NEXRAD sites, Unidata Program Center staff have set up a THREDDS Data Server in the AWS environment to provide services allowing community access to the stored data.
About the Big Data Project
According to NOAA's BDP web page, “the Big Data Project is an innovative approach to publishing NOAA's vast data resources and positioning them near cost-efficient high performance computing, analytic, and storage services provided by the private sector.” In practice, this means that NOAA is making selected data assets available for five “Infrastructure as a Service” (IaaS) providers to upload to their cloud systems if they choose: Amazon Web Services (AWS), Google, IBM, Microsoft, and the Open Cloud Consortium. NOAA will continue to provide public access to the data via its traditional mechanisms as well.
What Data are Available
The project data collection consists of NEXRAD Level II radar data collected between 1991 and 2015, stored at NOAA's National Centers for Environmental Information (formerly the National Climatic Data Center). The data set consists of more than 250 TB of compressed data (1 Petabyte uncompressed), approximately half of which was stored on magnetic tape. The complete archive is now available on AWS; transfers to some of the other IaaS providers are still in progress.
In addition to the archive data, new Level II data are being added to the collection in near real time. NEXRAD Level II scans are performed continuously at 160 radar sites in North America. At each radar site, as each “chunk” (100 radial degrees, 1 tilt) of a scan is completed, the data are distributed via Unidata's Local Data Manager (LDM) technology to subscribing sites. As part of this project, the individual chunks are delivered to AWS and stored temporarily in an S3 bucket, awaiting the remaining chunks that make up the full 3-dimensional volume scan. Once all of the chunks that make up one scan are determined to be present, the chunks are combined into an aggregate volume dataset and stored permanently in the collection S3 bucket.
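The "wait until every chunk is present, then combine" step can be illustrated with a minimal sketch. This is not the code the project runs. It assumes, purely for illustration, that chunks are numbered consecutively from 1 and carry a type marker of "S" (start), "I" (intermediate), or "E" (end), and it uses plain concatenation as a stand-in for the real combining step. The function names are hypothetical.

```python
from typing import Dict


def volume_is_complete(chunk_types: Dict[int, str]) -> bool:
    """Return True when a start chunk, an end chunk, and every chunk
    number in between have arrived.

    chunk_types maps chunk number -> type marker. The markers
    'S', 'I', 'E' are an assumption made for this sketch.
    """
    if not chunk_types:
        return False
    last = max(chunk_types)
    return (
        chunk_types.get(1) == "S"
        and chunk_types[last] == "E"
        and all(n in chunk_types for n in range(1, last + 1))
    )


def aggregate(chunk_data: Dict[int, bytes]) -> bytes:
    """Join chunk payloads in chunk-number order.

    Concatenation is a placeholder for the real aggregation step.
    """
    return b"".join(chunk_data[n] for n in sorted(chunk_data))
```

A caller would keep adding chunks to the dictionary as they arrive and only call `aggregate` once `volume_is_complete` returns True.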
Accessing the Data via TDS
Members of Unidata's university community can access the
collection via this THREDDS Data Server:
http://thredds-aws.unidata.ucar.edu/thredds/catalog.html
(To connect using the IDV, substitute catalog.xml for catalog.html
when entering the URL in the Data Chooser.)
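For scripts that start from the browser URL, the same substitution can be done in code. The helper below is a small illustrative sketch; the function name is hypothetical and it only handles URLs that end in catalog.html.

```python
def client_catalog_url(browser_url: str) -> str:
    """Swap a trailing catalog.html for catalog.xml, as clients such as the IDV expect.

    URLs that do not end in catalog.html are returned unchanged.
    """
    suffix = "catalog.html"
    if browser_url.endswith(suffix):
        return browser_url[: -len(suffix)] + "catalog.xml"
    return browser_url
```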
We encourage community members to experiment with accessing
the collection via the TDS. Note, however, that because this is
a demonstration project, we cannot guarantee long-term access
to the server. Similarly, because Unidata has limited resources
available for this demonstration, access to this particular
TDS is restricted to connections from .edu domains.
Accessing the Data in the Amazon S3 Environment
For those comfortable with the AWS environment, access to the
collection S3 bucket is unrestricted. If you have an appropriate
client, you can connect to the S3 bucket using this URL:
http://noaa-nexrad-level2.s3.amazonaws.com/
Inside the S3 bucket, data are stored in the format described
in this document.
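File names inside the bucket are not predictable, so a client usually needs to list a station's keys for a given day before downloading anything. The sketch below assumes the Year/Month/Day/Station key layout quoted in the comments on this post, and that the bucket answers unsigned S3 list requests. It uses the standard S3 `prefix` and `delimiter` query parameters; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import date
from typing import List
from urllib.parse import urlencode

BUCKET_URL = "http://noaa-nexrad-level2.s3.amazonaws.com/"
# XML namespace used by S3 bucket-listing responses.
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"


def listing_url(day: date, station: str) -> str:
    """Build a bucket-listing URL for one station on one day.

    Assumes keys look like Year/Month/Day/Station/filename.
    """
    prefix = f"{day:%Y/%m/%d}/{station}/"
    return BUCKET_URL + "?" + urlencode({"prefix": prefix, "delimiter": "/"})


def keys_from_listing(xml_text: str) -> List[str]:
    """Extract the object keys from an S3 listing response body."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iter(f"{S3_NS}Key")]
```

To use it, fetch `listing_url(...)` with any HTTP client (for example `urllib.request.urlopen`), pass the response text to `keys_from_listing`, and append a returned key to the bucket URL to download that file.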
Those who can create an AWS EC2 instance in the US East AWS region can mount the archive S3 bucket directly as described in the Amazon EC2 documentation for S3.
Additionally, those who are interested in the fastest access
to the chunked data before it is aggregated into a 3D volume
scan can connect to this URL:
http://unidata-nexrad-level2-chunks.s3.amazonaws.com/
or mount the temporary S3 bucket directly as described
in the Amazon EC2 documentation for S3. Inside the S3 bucket,
data are stored in the format described in this document. Note that
the chunked radar data persist in this S3 bucket for a
maximum of 24 hours before being scrubbed.
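Chunk object names carry the scan date (YYYYMMDD), scan time (HHMMSS), chunk number (CHUNKNUM), and chunk type (CHUNKTYPE). The sketch below shows one way to split such a name into typed fields. The hyphen separator and the field order are assumptions made for this example; check the format document for the authoritative layout. The class and function names are hypothetical.

```python
from datetime import datetime
from typing import NamedTuple


class ChunkName(NamedTuple):
    scan_time: datetime  # date and time of the volume scan
    number: int          # chunk number within the scan
    kind: str            # chunk type marker, kept as-is


def parse_chunk_name(name: str) -> ChunkName:
    """Parse a name assumed to look like YYYYMMDD-HHMMSS-CHUNKNUM-CHUNKTYPE."""
    ymd, hms, number, kind = name.split("-")
    scan_time = datetime.strptime(ymd + hms, "%Y%m%d%H%M%S")
    return ChunkName(scan_time, int(number), kind)
```

A name that does not have exactly four hyphen-separated fields raises `ValueError`, which is a reasonable signal that the assumed layout does not match.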
Unidata community members who run into issues accessing the AWS NEXRAD archive are encouraged to contact Unidata support for assistance. Additional details regarding this AWS Public Data Set, including links to several tutorials on accessing the data, are available in this post on the Amazon Web Services blog.
Access using Python
Unidata developer Ryan May has created a Jupyter (formerly IPython) notebook to demonstrate how to access the THREDDS Data Server (TDS) instance that is serving archived NEXRAD Level II data hosted on Amazon S3. Check out Using Python to Access NCEI Archived NEXRAD Level 2 Data for details.
Something is badly broken. Once you "visit" a year, you can't return to it later. The "icon 2006/" becomes just "2006". Clicking on it returns you to a download page.
Anything that has something under "Last Modified", including "index.html", has this problem.
OS is Linux/CentOS
Browser is FireFox
Posted by Kevin Thomas on October 28, 2015 at 09:32 AM MDT #
Thanks for the report! There was a bug in THREDDS' S3 code, which is now resolved. Please let us know if you find any more problems.
Posted by Ryan May on October 28, 2015 at 11:28 AM MDT #
Will others beyond the Members of Unidata's university community be able to access the collection via this THREDDS Data Server?
Posted by George Percivall on October 29, 2015 at 02:19 AM MDT #
Short answer: not in the near term.
The initial configuration of the THREDDS Data Server in the AWS cloud does not require the data user to have an AWS account. As a result, charges for data egress via the TDS are billed to Unidata's AWS account. While Amazon has generously provided Unidata with account credit to cover these costs during the demonstration, Unidata does not currently have funding to cover the data retrieval costs in a general way.
There are several possibilities for changes to this situation in the future. One would be to develop a mechanism whereby data users cover their own data retrieval costs. Alternately, Unidata or some other organization could secure funding to provide an open-access TDS server for the NEXRAD data without direct charges to the data user. From Unidata's perspective, an important part of this demonstration project is an investigation into usage patterns within our core university community and the economics of providing public access to data in a commercial cloud environment.
Posted by Unidatanews on October 29, 2015 at 02:40 AM MDT #
Ryan...
I can no longer replicate the problem. Thanks for the quick fix!
Posted by Kevin Thomas on November 03, 2015 at 07:43 AM MST #
In this pdf, it says everything about how to get data, but does not specify just where to find the values for the variables.
For example, to get a 3D volume scan you go here: http://noaa-nexrad-level2.s3.amazonaws.com/Year/Month/Day/NEXRAD Station/filename
All values are self-explanatory except filename, which is the name of the file containing the data. Where do you get the filename? Without it, you get this:
<Error>
<Code>NoSuchKey</Code>
<Message>The specified key does not exist.</Message>
<Key>2016/08/17/KRLX/</Key>
<RequestId>1DA700E4BF35F5F7</RequestId>
<HostId>
LapWfi97TaSiF6SWVlL5bc0JqTwVeY/miTJjlq023AqkqeUxuZT95u8qvJHpMoihICkJclw0Y50=
</HostId>
</Error>
Same question goes for the chunk files: where do you get the values for:
YYYYMMDD is the date of the volume scan
HHMMSS is the time of the volume scan
CHUNKNUM is the chunk number
CHUNKTYPE is the chunk type
Thanks!
Posted by Jonathan on August 17, 2016 at 06:26 PM MDT #
Hi Jonathan,
Yes, that would be quite difficult to know in advance the chunk numbers, chunktypes, or seconds of the volume scan :)
AWS has provided a very nice service to browse the bucket and download files. The NEXRAD files are stored in year/month/day directories by station and are easily navigated using the tool found here:
https://s3.amazonaws.com/noaa-nexrad-level2/index.html
or if from a .edu domain, you can use our THREDDS server at:
http://thredds-aws.unidata.ucar.edu/thredds/catalog.html (.xml for clients)
I hope this makes accessing the NEXRAD files easier for you.
Jeff
Posted by Jeff Weber on August 19, 2016 at 04:42 AM MDT #
Hi Jeff,
I did stumble upon that; however, I was hoping to set up a real-time feed to the latest data and I'm not on a .edu domain. I can't find anywhere that spits out this information programmatically, for example, accessing via a python script.
Posted by Jonathan on August 22, 2016 at 12:58 AM MDT #
Will others beyond the Members of Unidata's university community be able to access the collection via this THREDDS Data Server?
Posted by enterbuy.com.vn on April 09, 2017 at 08:12 PM MDT #
Access to Unidata's THREDDS Data Server is limited to .edu addresses. Those outside the university community are able to access the Amazon S3 storage directly.
Posted by unidatanews on April 10, 2017 at 03:31 AM MDT #
Finding a list of files is a pain, but you can do it programmatically via the Open Science Data Cloud (https://www.opensciencedatacloud.org/publicdata/noaa-nexrad-l2/). That page provides a simple html interface; if you examine the requests it makes you can use the underlying api to find files.
Posted by Nat Kale on October 11, 2017 at 06:08 AM MDT #
Good to know that those outside the university community can access it directly from Amazon S3 storage.
Posted by cio whitepaper on January 11, 2018 at 07:05 PM MST #
If you want programmatic access to new NEXRAD data, the simple method is to subscribe to SNS notifications. Each time either a chunk or an archive of data lands in Amazon S3, SNS will send you the S3 URL. Search for SNS here: https://aws.amazon.com/public-datasets/nexrad/ You will, however, need an AWS account to do this.
Posted by Mark Korver on February 21, 2018 at 10:01 AM MST #