This is the first of what we hope will be a series of posts showing how to use Python for weather analysis and create graphics for a variety of purposes. In this two-part post, we demonstrate plotting a water vapor satellite image, specifically using GINI formatted data. GINI is the format currently used for satellite data transmitted across NOAAPORT, and is available on Unidata's demonstration THREDDS server. This first part focuses on accessing the data using Siphon and MetPy; the second part will introduce plotting using CartoPy. To whet your appetite, though, here is a sample of what we'd like to produce:
If you'd like to follow along at home, this post is available as a Jupyter notebook. The README file there has instructions on how to set up a Python environment and run the notebooks we'll be creating for these and future posts. (Note: This requires at least metpy 0.3 for the GINI functionality.)
Getting the data file
The first step is to find the satellite data. If we browse over to http://thredds.ucar.edu/thredds/, we're presented with the top-level TDS catalog; this has a helpful looking link that says "Satellite Data". Navigating here, we see a large collection of different satellite datasets. We're interested in looking at water vapor imagery, so the "Water Vapor (6.5 / 5.7 um)" link seems most useful. We'll dig further in by navigating into "EAST-CONUS_4km" and then "current", so that we can look at current full CONUS (CONtiguous US) images from GOES East.
Here, we find a large listing of individual files. We could manually download a file and open it, but:
- That's no fun whatsoever
- It means we'd have to go through the same manual process to get data tomorrow
Instead, Python to the rescue! We can use Unidata's Siphon package to parse the catalog from the TDS; this provides us a nice programmatic way of accesssing the data. So we start by importing the TDSCatalog
class from siphon and giving it the URL to the catalog we just surfed to manually.
Note: Instead of giving it the link to the HTML catalog, we change the extension to XML, which asks the TDS for the XML version of the catalog. This is much better to work with in code.
from siphon.catalog import TDSCatalog
cat = TDSCatalog('http://thredds.ucar.edu/thredds/catalog/satellite/WV/EAST-CONUS_4km/current/catalog.xml')
From this TDSCatalog
object we now have (cat
), and we want to get the latest file. To find the latest file, we can look at the cat.datasets
attribute. This is a Python dictionary, mapping the name of the dataset to a Python Dataset object (which came from more XML supplied by the TDS — notice a theme?) Since this is a dictionary, we can look at a list of the keys (or actually, just the first 5):
list(cat.datasets)[:5]
['EAST-CONUS_4km_WV_20160128_0530.gini', 'EAST-CONUS_4km_WV_20160128_1930.gini', 'EAST-CONUS_4km_WV_20160128_1715.gini', 'EAST-CONUS_4km_WV_20160128_2015.gini', 'EAST-CONUS_4km_WV_20160128_0115.gini']While we have the names of some GINI files, they're all jumbled and not in any kind of order. Fortunately, they're named appropriately so that sorting the names will yield a list in chronological order:
sorted(cat.datasets)[:5]
['EAST-CONUS_4km_WV_20160127_1845.gini', 'EAST-CONUS_4km_WV_20160127_1900.gini', 'EAST-CONUS_4km_WV_20160127_1915.gini', 'EAST-CONUS_4km_WV_20160127_1930.gini', 'EAST-CONUS_4km_WV_20160127_1945.gini']Much better. Now what we really want is the most recent file, which will be the last one in the list. We can pull that out, and use its name to get the actual Python Dataset object:
dataset_name = sorted(cat.datasets.keys())[-1]
dataset = cat.datasets[dataset_name]
print(dataset)
<siphon.catalog.Dataset object at 0x10810a240>
The catalog.Dataset
class provides access to a lot of information about a dataset, like metadata (e.g. time range, spatial extent). What we want most however, is to know how to access the data. This is provided by the dataset.access_urls
attribute:
dataset.access_urls
{'CdmRemote': 'http://thredds.ucar.edu/thredds/cdmremote/satellite/WV/EAST-CONUS_4km/current/EAST-CONUS_4km_WV_20160128_2115.gini',
'HTTPServer': 'http://thredds.ucar.edu/thredds/fileServer/satellite/WV/EAST-CONUS_4km/current/EAST-CONUS_4km_WV_20160128_2115.gini',
'ISO': 'http://thredds.ucar.edu/thredds/iso/satellite/WV/EAST-CONUS_4km/current/EAST-CONUS_4km_WV_20160128_2115.gini',
'NCML': 'http://thredds.ucar.edu/thredds/ncml/satellite/WV/EAST-CONUS_4km/current/EAST-CONUS_4km_WV_20160128_2115.gini',
'NetcdfSubset': 'http://thredds.ucar.edu/thredds/ncss/satellite/WV/EAST-CONUS_4km/current/EAST-CONUS_4km_WV_20160128_2115.gini',
'OPENDAP': 'http://thredds.ucar.edu/thredds/dodsC/satellite/WV/EAST-CONUS_4km/current/EAST-CONUS_4km_WV_20160128_2115.gini',
'UDDC': 'http://thredds.ucar.edu/thredds/uddc/satellite/WV/EAST-CONUS_4km/current/EAST-CONUS_4km_WV_20160128_2115.gini',
'WCS': 'http://thredds.ucar.edu/thredds/wcs/satellite/WV/EAST-CONUS_4km/current/EAST-CONUS_4km_WV_20160128_2115.gini',
'WMS': 'http://thredds.ucar.edu/thredds/wms/satellite/WV/EAST-CONUS_4km/current/EAST-CONUS_4km_WV_20160128_2115.gini'}
These different urls provide access to the data in different ways; some support different protocols (like OPeNDAP or CDMRemote), others allow harvesting metadata (e.g. ISO). We're going to start simple, so we want to use the HTTPServer
method, which allows downloading the datafile using HTTP. (Other data access methods will be the subject of future posts.) We can take this URL and pass it to the urlopen
function from the urllib.request
module in Python's standard library. This gives us a Python file-like object, which for the most part we can treat just like a file we opened locally.
from urllib.request import urlopen
remote_gini_file = urlopen(dataset.access_urls['HTTPServer'])
Parsing the data
Now that we have this file-like object, we could certainly read()
data and parse it by hand or save to disk, but that's too much work. Instead, we will use MetPy's support for reading GINI files by importing the GiniFile
class, and passing it the file-like object:
from metpy.io.gini import GiniFile
gini = GiniFile(remote_gini_file)
print(gini)
GiniFile: GOES-13 East CONUS WV (6.5/6.7 micron) Time: 2016-01-28 21:15:18 Size: 1280x1280 Projection: lambert_conformal Lower Left Corner (Lon, Lat): (-113.1333, 16.3691) Resolution: 4km
GiniFile
was able to successfully parse the data and we see (as expected) that we have a 4km CONUS water vapor image from GOES-13 East. While GiniFile
itself provides a low-level interface to all the information in the file (useful if checking to see if the file was parsed correctly), we don't need the low level details. Fortunately, the to_dataset()
method can convert the data into a form that resembles the Dataset
object from netCDF4-python.
gini_ds = gini.to_dataset()
print(gini_ds)
root Dimensions: <class 'metpy.io.cdm.Dimension'>: name = time, size = 1 <class 'metpy.io.cdm.Dimension'>: name = x, size = 1280 <class 'metpy.io.cdm.Dimension'>: name = y, size = 1280 Variables: <class 'metpy.io.cdm.Variable'>: int32 time(time) units: milliseconds since 2016-01-28T00:00:00 shape = 1 <class 'metpy.io.cdm.Variable'>: int32 Lambert_Conformal() grid_mapping_name: lambert_conformal_conic standard_parallel: 25.0 longitude_of_central_meridian: -95.0 latitude_of_projection_origin: 25.0 earth_radius: 6371200.0 <class 'metpy.io.cdm.Variable'>: float64 x(x) units: m long_name: x coordinate of projection standard_name: projection_x_coordinate shape = 1280 <class 'metpy.io.cdm.Variable'>: float64 y(y) units: m long_name: y coordinate of projection standard_name: projection_y_coordinate shape = 1280 <class 'metpy.io.cdm.Variable'>: float64 lon(y, x) long_name: longitude units: degrees_east shape = (1280, 1280) <class 'metpy.io.cdm.Variable'>: float64 lat(y, x) long_name: latitude units: degrees_north shape = (1280, 1280) <class 'metpy.io.cdm.Variable'>: uint8 WV(y, x) long_name: WV (6.5/6.7 micron) missing_value: 255 coordinates: y x grid_mapping: Lambert_Conformal shape = (1280, 1280) Attributes: satellite: GOES-13 sector: East CONUS
Or, we can take the gini_ds.variables
attribute, which is a dictionary, and look at the keys()
to see what's in the file:
list(gini_ds.variables.keys())
['time', 'Lambert_Conformal', 'x', 'y', 'lon', 'lat', 'WV']From here, we can pull out the 'WV' variable, which contains the data we want:
print(gini_ds.variables['WV'])
<class 'metpy.io.cdm.Variable'>: uint8 WV(y, x) long_name: WV (6.5/6.7 micron) missing_value: 255 coordinates: y x grid_mapping: Lambert_Conformal shape = (1280, 1280)
We'll stop here for now (to keep this in an easily-digestible chunk). Part 2 will cover plotting this imagery, including pulling all the needed information from the file and setting up CartoPy (and its projections).
Additional resources
- MetPy Documentation
- Siphon Documentation
- Unidata Blog Notebooks (View Here)
- Notebooks from Unidata's Annual Python Training Workshop
Was this too much detail? Too slow? Just right? Do you have suggestions on other topics or examples we should cover? Do you have a notebook you would like us to show off? We'd love to have your feedback. You can send a message to the (python-users AT unidata.ucar.edu) mailing list or send a message to support-python AT unidata.ucar.edu. You can also leave a comment below, directly on the blog post.
thank youfor sharing this post..love it.!
Posted by مبلمان on April 29, 2016 at 06:19 PM MDT #
i like your site
Posted by اتاق کودک on June 19, 2016 at 05:42 PM MDT #
The above blog is really much more informative and helpful to all people...Thanks for sharing these types of informative updates...
Posted by saranya on March 15, 2017 at 01:00 AM MDT #
Hi.
I just like the helpful info you provide to your articles.
Posted by سرویس خواب on August 12, 2017 at 08:27 AM MDT #
thank you for sharing this article
Posted by لوستر on August 14, 2017 at 11:53 PM MDT #