AWIPS II at Unidata :: AWIPS II Manual

Chapter 1 - AWIPS II System Architecture

1.1 AWIPS II Communication Architecture

AWIPS II takes a unified approach to data ingest; most data types follow a standard path through the system. Operational forecast offices have variations of this basic data flow, including local radar and LDAD-delivered products, but non-operational sites that already request IDD data via the Unidata LDM have the standard AWIPS II data delivery process in place.

At a high level, data flow describes the path taken by a piece of data from its source to its display by a client system. This path starts with data requested and stored by an LDM client and includes the decoding of the data and storing of decoded data in a form readable and displayable by the end user.

AWIPS II supports automated processing using a cron-like capability built into the EDEX process, and EDEX can also run scripts in response to data arrival. EDEX manages the database and Processed Data Storage; Raw Data Storage is managed by the Local Data Manager (LDM) process. These topics are covered in Chapters 11 and 12 of this manual.

The AWIPS II ingest and request processes form a highly distributed system that uses messaging for Inter-Process Communication (IPC). Figure 1.1-1 shows the basic IPC architecture for AWIPS II. There are four primary communication channels:

  1. AMQP messages routed between the various processes by Qpid.
  2. Data.
  3. Data requests.
  4. Product notification.


Figure 1.1-1. AWIPS II Inter-Process Communication

1.2 AWIPS II Software Components

The primary AWIPS II application for data ingest, processing, and storage is the Environmental Data EXchange (EDEX) server; the primary AWIPS II application for visualization/data manipulation is the Common AWIPS Visualization Environment (CAVE) client, which is typically installed on a workstation separate from other AWIPS II components.

In addition to programs developed specifically for AWIPS, AWIPS II uses several commercial off-the-shelf (COTS) and Free or Open Source Software (FOSS) products to assist in its operation. The following components, working together and communicating with one another, make up the complete AWIPS II system.

1.2.1 LDM

The LDM (Local Data Manager), developed and supported by Unidata, is a suite of client and server programs designed for data distribution, and is the fundamental component comprising the Unidata Internet Data Distribution (IDD) system. In AWIPS II, the LDM provides data feeds for grids, surface observations, upper-air profiles, satellite and radar imagery and various other meteorological datasets. The LDM writes data directly to file and alerts EDEX via Qpid when a file is available for processing.

http://www.unidata.ucar.edu/software/ldm/
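
As an example, the LDM's notifyme utility can confirm that an upstream IDD host carries a given feed before ingest is configured; the host idd.unidata.ucar.edu and the NEXRAD3 feedtype below are illustrative:

# Watch (without requesting) NEXRAD3 products available from the upstream host
notifyme -vl- -h idd.unidata.ucar.edu -f NEXRAD3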

1.2.2 Qpid

Apache Qpid, the Queue Processor Interface Daemon, is the messaging system used by AWIPS II to facilitate communication between services. When the LDM receives a data file to be processed, it employs edexBridge to send EDEX ingest servers a message via Qpid. When EDEX has finished decoding the file, it sends CAVE a message via Qpid that data are available for display or further processing.

Qpid is controlled by the system script /etc/rc.d/init.d/qpidd.

http://qpid.apache.org
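
As a quick health check, the broker can be queried from the command line; note that the qpid-stat utility comes from the separate qpid-tools package, so its presence is an assumption:

# Check that the Qpid broker is running
service qpidd status

# List queues and message counts (requires the qpid-tools package)
qpid-stat -q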

1.2.3 EDEX Bridge

EDEX Bridge, invoked in the LDM configuration file ~/etc/ldmd.conf, is used by the LDM to post "data available" messages to Qpid, which alerts the EDEX ingest server(s) that a file is ready for processing.
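
In a typical Unidata AWIPS II installation, edexBridge is started from an EXEC line in ~/etc/ldmd.conf; a minimal sketch, assuming the Qpid broker runs on the same machine (localhost):

# ~/etc/ldmd.conf: start edexBridge and point it at the Qpid broker host
EXEC    "edexBridge -s localhost"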

1.2.4 EDEX

EDEX is the main server for AWIPS II. Qpid sends alerts to EDEX when data stored by the LDM is ready for processing. These Qpid messages include file header information which allows EDEX to determine the appropriate data decoder to use. The default ingest server (simply named ingest) handles all data ingest other than grib messages, which are processed by a separate ingestGrib server. After decoding, EDEX writes metadata to the database via Postgres and saves the processed data in HDF5 via PyPIES. A third EDEX server, request, feeds requested data to CAVE clients.

EDEX ingest and request servers are controlled by the system script /etc/rc.d/init.d/edex_camel.
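
For example, the EDEX services can be started and checked with that script (run as root):

# Start all EDEX services (ingest, ingestGrib, request)
service edex_camel start

# Check the status of the EDEX services
service edex_camel status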

1.2.5 PostgreSQL

PostgreSQL, known simply as Postgres, is a relational database management system (DBMS) which handles the storage and retrieval of metadata, database tables and some decoded data. Postgres is controlled by the system script /etc/rc.d/init.d/edex_postgres.

http://www.postgresql.org

1.2.6 Metadata

Metadata is generally defined as "data about data"; the AWIPS II system stores metadata alongside processed data to keep track of what data are currently available. Radar metadata, for example, contains information such as radar ID, volume scan time, elevation angle and product type.

The storage and reading of EDEX metadata is handled by the Postgres DBMS. Users may query the metadata tables using psql, the terminal-based front end for Postgres.
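
For example, assuming the standard database name (metadata) and account (awips), the tables can be listed and queried from the command line; the radar table and productcode column in the second query are typical but should be confirmed against your schema:

# List the metadata tables
psql -U awips -d metadata -c '\dt'

# Count radar records by product code (table and column names assumed)
psql -U awips -d metadata -c 'SELECT productcode, count(*) FROM radar GROUP BY productcode;'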

1.2.7 PyPIES

PyPIES, Python Process Isolated Enhanced Storage, was created for AWIPS II to isolate the management of HDF5 Processed Data Storage from the EDEX processes. PyPIES manages access, i.e., reads and writes, of data in the HDF5 files. In a sense, PyPIES provides functionality similar to a DBMS; all data being written to an HDF5 file is sent to PyPIES, and requests for data stored in HDF5 are processed by PyPIES.

PyPIES is implemented in two parts:

  1. The PyPIES manager is a Python application that runs as part of an Apache HTTP server, and handles requests to store and retrieve data.
  2. The PyPIES logger is a Python process that coordinates logging.

PyPIES is controlled by the system script /etc/rc.d/init.d/httpd-pypies.

1.2.8 HDF5

Hierarchical Data Format (v.5) is the primary data storage format used by AWIPS II for processed grids, satellite and radar imagery and other products. Similar to netCDF (developed and supported by Unidata), HDF5 supports multiple types of data within a single file. For example, a single HDF5 file of radar data may contain multiple volume scans of base reflectivity and base velocity as well as derived products such as composite reflectivity. The file may also contain data from multiple radars.

HDF5 is stored in /awips2/edex/data/hdf5/.

http://www.hdfgroup.org/HDF5/
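
The standard HDF5 command-line tools can be used to look inside these files; a quick inspection, assuming the h5ls utility from the HDF5 distribution is installed (the file path below is illustrative):

# Find HDF5 files written in the last hour
find /awips2/edex/data/hdf5 -name '*.h5' -mmin -60 | head -5

# List the groups and datasets in one file (path is illustrative)
h5ls -r /awips2/edex/data/hdf5/radar/sampleRadarFile.h5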

1.2.9 AlertViz

AlertViz is a modernized version of an AWIPS I application, designed to present various notifications, error messages, and alarms to the user (forecaster). AlertViz can be executed either independently or from CAVE itself.

1.2.10 CAVE

CAVE, the Common AWIPS Visualization Environment, is the main data visualization and manipulation tool for AWIPS II. CAVE consists of a number of different data display configurations called perspectives; perspectives used in operational forecasting environments include D2D (Display 2-Dimensions), GFE (Graphical Forecast Editor), and Hydro.

1.2.11 Quartz

Quartz is a job scheduler included with AWIPS II as a small Java library. Quartz is responsible for triggering recurring tasks, such as the regularly scheduled purge of processed data.

1.2.12 Software Endpoints

In AWIPS II, a software endpoint is a directory that an EDEX data decoder monitors for incoming files. One way (though not necessarily the preferred way) to feed data into EDEX is simply to have another program place a file in the appropriate endpoint directory.

For example, putting a text file that contains METAR data in the endpoint directory of the obs decoder (/awips2/edex/data/sbn/metar) causes the obs decoder to execute using the input text file.
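
For instance, a raw METAR bulletin saved in a local text file (the filename below is hypothetical) can be handed to the obs decoder simply by copying it into the endpoint directory:

# Copy a raw METAR text file into the obs decoder's endpoint;
# EDEX picks it up and decodes it automatically
cp metar_sample.txt /awips2/edex/data/sbn/metar/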

1.3 AWIPS II Software Deployment

1.3.1 Basic Standalone Server Deployment

A standalone AWIPS II configuration installs all EDEX server components onto a single server. The LDM and the Raw and Processed Data Stores are also contained on this machine, either on a separate disk partition or mounted as Network Attached Storage (NAS).


Figure 1.3-1. AWIPS II Standalone Configuration

1.3.2 Clustered Server Deployment

For clustered server deployment, refer to section 2.8 of the Raytheon AWIPS II System Manager's Manual, "AWIPS II Server Clustering".


Figure 1.3-2. AWIPS II Clustered Configuration

1.4 AWIPS II Logging Overview

AWIPS II does not implement a standardized logging strategy; rather, each AWIPS II component produces logs that may be examined to determine the status of processes and assist in troubleshooting. Table 1.4-1 lists log locations for each AWIPS II component.

Process     Log Directory
Postgres    /awips2/data, /awips2/data/pg_log
PyPIES      /awips2/pypies/logs, /awips2/httpd_pypies/var/log/httpd
LDM         /usr/local/ldm/logs
scour       /usr/local/ldm/logs
cron        /var/log
EDEX        /awips2/edex/logs
Qpid        /awips2/qpid/var/log
AlertViz    ~/caveData/logs
CAVE        ~/caveData/etc/user/${USER}/logs

Table 1.4-1. Log Locations for AWIPS II Components
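
Because the EDEX logs are date-stamped, the current ingest log can be followed with tail; the edex-ingest-<yyyymmdd>.log filename below assumes the same naming convention as the purge log described in Section 1.6.2:

# Follow today's EDEX ingest log (filename convention assumed)
tail -f /awips2/edex/logs/edex-ingest-$(date +%Y%m%d).log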

1.5 AWIPS II Data Processing Architecture

AWIPS II separates data processing into four logically separate components:

  1. Data receipt
  2. Data decoding
  3. Data storage
  4. Data request

For operational use at forecast offices, these components are distributed across a number of systems (up to 11 dedicated AWIPS II servers) to provide high data throughput and improved system resiliency. For non-operational standalone installations, these components are installed together in a simple one- or two-server configuration.

1.5.1 AWIPS II Data Receipt Architecture

NWS forecast offices receive data from three sources: the Satellite Broadcast Network (SBN); the LDAD network; and the Open Radar Product Generator (ORPG)/Supplemental Product Generator (SPG) network. Standalone configurations of AWIPS II receive data from a single source: the Unidata IDD. Regardless of the data source, the received data is fed to data ingest, which is performed by the EDEX software. This basic data receipt concept is shown below.


Figure 1.5.1-1. AWIPS II Data Receipt

As shown in Figure 1.5.1-1:

  1. The LDM obtains a data product from an upstream LDM site on the IDD.
  2. The LDM writes the data to file in Raw Data Storage.
  3. The LDM uses edexBridge to post a “data available” message to the Qpid message broker.
  4. The EDEX Ingest process obtains the “data available” message from Qpid and removes the message from the message queue.
  5. The EDEX Ingest process obtains the data files from Raw Data Storage.

This architecture provides separation between data sources and ingest processing. Any data source, not just the LDM/IDD, can follow this architecture to provide data for EDEX to process.

1.5.2 AWIPS II Data Decoding Architecture

Data decoding is defined as the process of converting data from a raw format into a decoded format that is usable by CAVE. In AWIPS II, data decoding is performed by the EDEX ingest processes (ingest and ingestGrib).


Figure 1.5.2-1. AWIPS II Data Decoding

As shown in Figure 1.5.2-1:

  1. EDEX Ingest obtains the “data available” message from the Qpid message broker, and determines the appropriate data decoder based on the message contents. EDEX Ingest then forwards the message to the chosen decoder. Finally, the message is removed from the message queue.
  2. EDEX Ingest reads the data from Raw Data Storage.
  3. EDEX Ingest decodes the data.
  4. EDEX Ingest forwards the decoded data to Processed Data Storage.
  5. EDEX Ingest sends a message to the CAVE client indicating that newly decoded data are available.

It is important to note that in AWIPS II all data types are processed either by the standard ingest process or by the ingestGrib process, which handles all grib message ingestion. Once this data decoding process is complete, clients may obtain the data and perform additional processing on it, including displaying it in CAVE.

1.5.3 AWIPS II Processed Data Storage Architecture

Processed Data Storage refers to the persistence of decoded data and takes two separate forms: 1) metadata and some decoded data, which are stored in Postgres database tables; and 2) imagery data, gridded forecast data, and certain point data, which are stored in HDF5 files managed by PyPIES. The basic process is shown in Figure 1.5.3-1.


Figure 1.5.3-1. AWIPS II Processed Data Storage

As shown in Figure 1.5.3-1, writing to Processed Data Storage involves the following:

  1. EDEX sends the decoded data to PyPIES.
  2. PyPIES writes the data to an HDF5 file.
  3. EDEX sends the metadata to Postgres.
  4. Postgres writes the metadata to the database tables.

For data not stored in HDF5, Steps 1 and 2 are skipped.

For processed data retrieval, the process is reversed:

  1. EDEX requests the metadata from Postgres.
  2. Postgres returns the requested metadata.
  3. EDEX requests the stored data from PyPIES.
  4. PyPIES reads the HDF5 file and returns the data.

In this case, if the data is not stored in HDF5, then Steps 3 and 4 are skipped.

1.5.4 AWIPS II Data Retrieval Architecture

Data retrieval is the process by which the CAVE client obtains data using the EDEX Request server; the Request server obtains the data from processed data storage (Postgres and HDF5) and returns it to CAVE. The basic data retrieval process is shown in Figure 1.5.4-1.


Figure 1.5.4-1. AWIPS II Data Retrieval

As shown above:

  1. CAVE sends a request via TCP to the EDEX Request server.
  2. The EDEX Request server obtains the requested metadata via Postgres and the stored data via PyPIES.
  3. EDEX Request forwards the requested data directly back to the CAVE client.

For clustered EDEX servers using IPVS, this architecture allows CAVE clients to access any available EDEX Request process, providing an improvement in system reliability and speed. Data retrieval from processed data storage follows the same pattern as data storage: retrieval of HDF5 is handled by PyPIES; retrieval of database data is handled by Postgres.
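
As a basic connectivity check from a CAVE workstation, the conventional AWIPS II ports (9581/9582 for EDEX Request, 5672 for Qpid) can be probed; the hostname and port list below are assumptions to confirm for your installation:

# Verify that the EDEX Request and Qpid ports answer (hostname is illustrative)
for port in 9581 9582 5672; do
    nc -z -w 3 edex-server.example.edu $port && echo "port $port open"
done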

1.6 AWIPS II Data Purge Architecture

Raw Data Storage and Processed Data Storage use two different purge mechanisms. For Processed Data Storage, AWIPS II implements a plugin-based purge strategy in which the purge frequency can be changed for each plugin individually.

1.6.1 Raw Data Purge

Purging of Raw Data Storage is managed by a cron job run under the LDM user account, which executes the ldmadmin scour process to remove data files using an age-based strategy. The directories and retention times for Raw Data Storage are controlled by scour.conf, located in the LDM user's ~/etc/ directory. Each entry in scour.conf contains the directory to manage, the retention time, and an optional filename pattern for data files.
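
A representative scour.conf fragment is sketched below; the directories, retention times, and pattern are illustrative only and must match the file actions in your pqact.conf:

# directory                 retention (days)    optional filename pattern
/data_store/radar           1
/data_store/grid            2                   *.grib2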


Figure 1.6.1-1 AWIPS II Raw Data Storage Purge

As shown in Figure 1.6.1-1:

  1. An ldm user cron job executes ldmadmin.
  2. ldmadmin executes the LDM scour program.
  3. The LDM scour program deletes outdated data from AWIPS II Raw Data Storage.

1.6.2 Processed Data Purge

Rules for this version-based purge are contained in XML files located under /awips2/edex/data/utility/. The purge is triggered by a Quartz timer event that fires at 30 minutes after each hour.
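
A minimal purge-rule file is sketched below for illustration; it keeps a fixed number of versions of each product. The exact element layout should be confirmed against the base rules shipped under /awips2/edex/data/utility/:

<purgeRuleSet>
    <!-- Default rule: keep the two most recent versions of each product -->
    <defaultRule>
        <versionsToKeep>2</versionsToKeep>
    </defaultRule>
</purgeRuleSet>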


Figure 1.6.2-1 AWIPS II Processed Data Storage Purge

As shown in Figure 1.6.2-1:

  1. A Quartz event is triggered in the EDEX Ingest process causing the Purge Service to obtain a purge lock. If the lock is already taken, the Purge Service will exit, ensuring that only a single EDEX Ingest process performs the purge.
  2. The EDEX Purge Service sends a delete message to Postgres.
  3. Postgres deletes the specified data from the database.
  4. If HDF5 data is to be purged, the Purge Service messages PyPIES.
  5. PyPIES deletes the specified HDF5 files.

EDEX logs are located in /awips2/edex/logs/, and data purge events are logged to the file edex-ingest-purge-<yyyymmdd>.log, where <yyyymmdd> is the date stamp. The data purge log can be viewed live with the command edexlog purge.

1.7 File Systems on the EDEX Data Server

The major file systems on the Linux-OS EDEX Data Server are as follows:

The following directory can be mounted on the EDEX server from a NAS:

Additionally, if ingest of a new data format is under development, these new data types, which are not yet found on the development or integration systems, are located in /data_store/experimental.

1.8 EDEX Distribution Files

Distribution files tell EDEX how to invoke a decoder plugin on a raw data file.

The base files are in /awips2/edex/data/utility/edex_static/base/distribution/.

Site-level files are in /awips2/edex/data/utility/edex_static/site/<siteID>/distribution/.

Each plugin has a distribution file that contains the regular expressions to match files for the plugin to process.

Raw files are written to /data_store, and the LDM sends a message via Qpid to the EDEX distribution service. When a regular expression match is found in a distribution file, the raw data file is placed in a queue for the matching plugin to decode and process. The distribution files are used to match file headers as well as filenames, which is how files dropped into EDEX's manual endpoint (/awips2/edex/data/manual) are processed.

1.8.1 Editing an EDEX Distribution File

Because these files are in the edex_static/ directory, they must be edited manually with a text editor. You should not edit the base files; rather, copy the base version to your site directory and edit the site version. The regular expressions in the distribution files need to correspond to the regular expressions in the LDM pqact.conf file.
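
For example, to customize the radar distribution patterns (with <siteID> standing in for your site identifier):

# Copy the base file to the site-level directory, then edit the site copy
cp /awips2/edex/data/utility/edex_static/base/distribution/radar.xml \
   /awips2/edex/data/utility/edex_static/site/<siteID>/distribution/radar.xml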

If patterns exist in pqact.conf but not in the distribution files, raw data files will be written to /data_store but will not be ingested and processed by EDEX. Entries for these non-ingested files are written to the unrecognized-files log in /awips2/edex/logs.

1.8.2 EDEX Distribution File Examples

Surface Obs

obs.xml: Processes any file header that starts with SA or SP, which should match any WMO header that contains METAR data (e.g. SAUS, SPUS, SACN, SAMX).

<requestPatterns xmlns:ns2="group">
  <regex>^S[AP].*</regex>
</requestPatterns> 
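
Because these are ordinary regular expressions, a pattern can be sanity-checked from the shell; the WMO header below is illustrative:

# Test the obs.xml pattern against a sample WMO header
echo "SAUS70 KWBC 251200" | grep -E '^S[AP]' && echo "matched: file goes to the obs plugin"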

Text Data

text.xml: Processes many WMO header patterns. The second pattern, ^S[ACEG-Z].*, matches any header that starts with S except SB, SD, and SF, so it also matches the SA and SP headers that the obs.xml plugin matches. This means that METARs are processed by both plugins.

<requestPatterns>
  <regex>^[ACFNRUW][A-Z].*</regex>
  <regex>^S[ACEG-Z].*</regex>
  <regex>^T[BCX].*</regex>
  <regex>^SF[A-OQ-TVZ].*</regex>
  <regex>^SDUS1.*</regex>
  <regex>^SDUS4[1-6].*</regex>
  <regex>^SDUS9[^7].*</regex>
  <regex>^SFU[^S].*</regex>
  <regex>^SFUS4[^1].*</regex> 
  <regex>^SFP[^A].*</regex>
  <regex>^SFPA[^4].*</regex> 
  <regex>^SFPA4[^1].*</regex>
  <regex>^BMBB91.*</regex> 
  <regex>^N.*</regex>
  <regex>^F[EHIJKLMQVWX].*</regex> 
</requestPatterns> 

Radar Data

radar.xml: Matches radar files delivered by the LDM. Files that match the pattern ^SDUS4[1-6].* in text.xml also match ^SDUS[234578]. .* in radar.xml, meaning certain radar products that contain text data are processed by both plugins.

<requestPatterns>
  <regex>^SDUS[234578]. .*</regex>
</requestPatterns> 

Digital Hybrid Reflectivity

dhr.xml: Matches some of the radar files that arrive over the IDD. Product IDs 32 (DHR), 80 (STP), and 138 (digital storm total precipitation) are defined in interface control documents published by the Radar Operations Center at http://www.roc.noaa.gov.

<requestPatterns xmlns:ns2="group">
  <regex>^SDUS8. .... .*</regex>
  <regex>^SDUS5. .... .*</regex>
</requestPatterns>

Grib Data

grib.xml: The grib/grid decoder distribution file matches all numerical grids distributed over the IDD NGRID feed by matching the WMO header, and those delivered via CONDUIT by matching the .grib file extension.

<requestPatterns>
    <regex>^[EHLMOYZ][A-Z]{3}\d{2}</regex>
    <regex>.*grib.*</regex>
</requestPatterns>