Table of contents Previous: Data File Access Next: Applying Lessons Learned

14 Data Ingest and Storage

In this document we describe how you manage McIDAS data files on your system.

Table of Contents

Basic Concepts

File Routing System

Data File Scouring

Access to Data other than the Unidata-Wisconsin Datastream


Basic Concepts

In Data File Access we described what McIDAS data files you can expect to find on your system and how you can access them. In this section we describe how you can alter the types and numbers of data files on your system.

McIDAS data products can be classified into four general types:

These data types are realized in specific file types: AREA, GRID, MD, and TEXT. Several instances of the AREA product type are transmitted by the Unidata IDD to end-user sites. The products included in the transmission are, for the most part, not controllable by the end user. The end user can, however, configure which files are stored locally and how many.

File Routing System

Control of which files are written to end users' file systems is handled by the McIDAS File Routing System. How many files are maintained on those file systems is controlled by both the File Routing System and Data Scouring.

The McIDAS file routing system, as used by Unidata, is composed of two subsystems: the File Routing Table and the System Key Table.

File Routing Table

Data in the Unidata-Wisconsin channel are transmitted in identifiable units known as products. Data products are uniquely identified by a two-character Product Code.

The Product Codes assigned to data objects in the Unidata-Wisconsin datastream are:

Product Code   Product Description                     McIDAS SYSKEY.TAB Vals
--------------+---------------------------------------+----------------------
   CA          CIMSS Cloud Top Pressure                2173 2174 2175
   CB          CIMSS Precipitable Water                2176 2177 2178
   CC          CIMSS Sea Surface Temperature           2179 2180 2181
   CD          CIMSS Lifted Index                      2182 2183 2184
   CE          CIMSS CAPE                              2185 2186 2187
   CF          CIMSS Ozone                             2188 2189 2190
   CG          CIMSS Wildfire ABBA Nth Hemisphere      2191 2192 2193
   CH          CIMSS Wildfire ABBA Sth Hemisphere      2194 2195 2196
   CI          GOES-E/W Infrared Composite             2161 2162 2163
   CV          GOES-E/W Visible Composite              2164 2165 2166
   CW          GOES-E/W Water Vapor Composite          2167 2168 2169
   N1          GOES-E IR/Topography composite          2125 2126 2127
   N2          GOES-E VIS/Topography composite         2128 2129 2130
   N3          GOES-W IR/Topography composite          2131 2132 2133
   N4          GOES-W VIS/Topography composite         2134 2135 2136
   N5          MDR/Topography composite                2137 2138 2139
   N6          Mollweide/Topography composite          2140 2141 2142
   N7          GOES-E/W IR/Topography composite        2155 2156 2157
   N8          GOES-E/W VIS/Topography composite       2158 2159 2160
   RL          NEXRCOMP: 6 km Nat. Base Refl. Comp.    2361 2362 2363
   RN          NEXRCOMP: 10 km RCM Composite           2364 2365 2366
   RO          NEXRCOMP: 1 km Flt. Base Refl. Comp.    2367 2368 2369
   UA          Educational floater I                   2101 2102 2103
   UB          GOES-W Western US Water Vapor           2104 2105 2106
   UC          Educational floater II                  2170 2171 2172
   UI          GOES-E North America Infrared           2143 2144 2145
   UR          Research floater                        2107 2108 2109
   UV          GOES-E North America Visible            2146 2147 2148
   UW          GOES-E North America Water Vapor        2149 2150 2151
   UX          Global Mollweide Infrared Composite     2122 2123 2124
   UY          Global Mollweide Water Vapor Composite  2152 2153 2154
   U1          Antarctic composite                     2119 2120 2121
   U3          Manually Digitized Radar                2110 2111 2112
   U5          GOES-W Western US Infrared              2113 2114 2115
   U9          GOES-W Western US Visible               2116 2117 2118
Decoders convert the image objects in the datastream to data files of the type AREA. On Unix, the decoders for doing this are provided in the LDM-McIDAS package.

If told to do so, the image decoder uses the McIDAS File Routing Facility to determine the output name space for the files it creates, provided the facility exists and is writable by the user running the decoder. If the File Routing Facility does not exist (i.e., it has not been created or is not in the directory in which the decoders write their output), the decoders create products using the default name that each product is tagged with in the datastream.

The default names for products in the datastream are presented in Unidata-Wisconsin Data Stream Products.

The result of having a file routing table is conceptualized in the following diagram:

The McIDAS file redirection facility can be used in combination with the file routing table to locate files ingested and decoded from the Unidata-Wisconsin datastream. This is schematically represented in:

The File Routing Facility is created/updated/listed with the McIDAS ROUTE command. The routing facility configuration is embodied in the file ROUTE.SYS.

ROUTE is used to:

Routing initialization is simply the process of creating a new, empty routing table, ROUTE.SYS. Routing information is useful for seeing how the data ingest system is configured. Adding, deleting, releasing, and suspending entries in the routing system are the real heart of data ingest management. We deal with each of these functions below.

Adding and Deleting Routing Entries

The syntax for using ROUTE to ADD products to, or DELete products from, the routing table is:

ROUTE ADD  pcode type begin end file keywords "description
ROUTE DEL  pcode1 pcode2 ... pcoden
where:

pcode - two-character product code
type  - product type: AREA, GRID, MD, or TEXT
begin - beginning file number for types: AREA, GRID, and MD
end   - ending file number for types: AREA, GRID, and MD
file  - file name for type: TEXT
The keywords supported by ROUTE are:

FORM=ALL - for the LIST option, list the expanded form of the product entry

CC=code - decoder condition code, range 1 to 7 (default=1)
    1 - decoder ended normally           -|  If condition is met:
    2 - partial failure during decoding   |
    3 - either 1 or 2 above               |    * - Increment cylinder
    4 - total failure (nothing decoded)   |->  * - Execute Post Process
    5 - either 1 or 4 above               |        (if specified)
    6 - either 2 or 4 above               |
    7 - all conditions                   -|

PP=file xcute - execute a post process batch file, where
    file  = name of batch file to execute, up to 12 characters
    xcute = NO  -> execute batch file independent of condition code
          = YES -> execute batch file based on condition code (default)

SYS=bword eword cword - set file number bounds in SYSKEY table, where
    bword = SYSKEY word for file number beginning value
    eword = SYSKEY word for file number ending value
    cword = SYSKEY word for file number current value

Control of the output name space comes from the ability to set the output file numbers for AREA, GRID, and MD files and the output file name (the file parameter) for TEXT files. Since all non-image products were removed from the Unidata-Wisconsin datastream in summer 1999, ROUTE is mainly used for products of type AREA.

By setting the range of file numbers that, say, AREA files can occupy, one directly controls how many of them are filed on disk. The same holds true for GRID and MD files, but MD files require some further considerations.
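The relationship between a file number range and the number of files kept on disk can be sketched in a few lines of Python. This is an illustration only, not McIDAS code: the function names are hypothetical, and the wrap-around from the ending file number back to the beginning one is an assumption based on the "cylinder" terminology that ROUTE uses.

```python
def files_retained(begin: int, end: int) -> int:
    """Number of file slots in an inclusive file number range."""
    if end < begin:
        raise ValueError("end must not be smaller than begin")
    return end - begin + 1


def next_file_number(current: int, begin: int, end: int) -> int:
    """Advance to the next slot, wrapping to begin after end (assumed behavior)."""
    if not begin <= current <= end:
        # An out-of-range position restarts the cylinder (assumption).
        return begin
    return begin if current == end else current + 1


# A range of 130 to 139 leaves room for ten files.
print(files_retained(130, 139))         # 10
print(next_file_number(139, 130, 139))  # 130
```

Under these assumptions, widening or narrowing the range is the only change needed to keep more or fewer files of a product.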

The file routing system was designed to allow the user to configure it to automatically run McIDAS processes upon successful receipt of data products. This is where the CC= and PP= keywords are used. PP= specifies the name of a McIDAS BATCH file that is to be run upon successful receipt of a product. CC= defines what successful actually means.

BATCH files run automatically by the routing system are referred to as ROUTE PostProcess BATCH files.
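The CC= values are consistent with a three-bit mask in which 1 means normal completion, 2 means partial failure, and 4 means total failure, so that 3, 5, 6, and 7 are combinations. The Python sketch below illustrates that reading together with the PP= xcute flag. Whether the decoders implement it this way is an assumption, and all names here are hypothetical.

```python
NORMAL, PARTIAL, TOTAL = 1, 2, 4  # decoder outcomes (CC values 1, 2, and 4)


def condition_met(cc: int, outcome: int) -> bool:
    """True if the decoder outcome satisfies the product's CC= setting."""
    if not 1 <= cc <= 7:
        raise ValueError("CC= must be in the range 1 to 7")
    if outcome not in (NORMAL, PARTIAL, TOTAL):
        raise ValueError("outcome must be 1, 2, or 4")
    return bool(cc & outcome)


def runs_post_process(cc: int, outcome: int, xcute: str = "YES") -> bool:
    """xcute=NO runs the batch file regardless of the condition code."""
    return xcute == "NO" or condition_met(cc, outcome)


print(condition_met(3, PARTIAL))          # True: 3 means "either 1 or 2"
print(condition_met(3, TOTAL))            # False
print(runs_post_process(3, TOTAL, "NO"))  # True
```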

The invocation syntax for these BATCH files is always:

BATCH pcode fname DAY TIME ccode "bname
Here:

pcode -> the product code of the product received
fname -> the number of the file just written  (AREA, GRID, and MD)
         or the file name (TEXT)
DAY   -> the date the file was received in YYDDD
TIME  -> the time the file was received in HHMMSS
ccode -> the decoder status code
bname -> the name of the BATCH file to run
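When such values are logged or processed outside McIDAS, the DAY and TIME fields can be converted to an ordinary timestamp. The sketch below is a hypothetical helper, not part of McIDAS. It assumes that a two-digit year follows Python's `%y` convention (00 to 68 map to 2000 to 2068, 69 to 99 map to 1969 to 1999) and that TIME may have lost its leading zeros.

```python
from datetime import datetime


def received_at(day: str, time: str) -> datetime:
    """Convert DAY (YYDDD) and TIME (HHMMSS) to a datetime."""
    # Restore leading zeros so that 93000 is read as 09:30:00 (assumption).
    return datetime.strptime(f"{day} {time.zfill(6)}", "%y%j %H%M%S")


print(received_at("02356", "171500"))  # 2002-12-22 17:15:00
```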
The SYS= keyword, if specified, tells the data decoder to update entries in another McIDAS facility, the System Key Table. The System Key Table is embodied in the file SYSKEY.TAB. Unidata uses SYSKEY.TAB as a storage location for information on what data is currently available. SYSKEY.TAB is sharable among all versions of McIDAS.

The number tuple managed by the SYS= keyword represents the beginning, ending, and current file numbers for products of type AREA and the file number, time, and day for products of type GRID and MD. Unidata does not use SYSKEY.TAB entries for files of type TEXT.

Examples of using ROUTE to configure data file ingestion:

ROUTE ADD U1 AREA 190 199 CC=3 SYS=2119 2120 2121 "Antarctic IR Composite
ROUTE ADD U5 AREA 130 139 CC=3 SYS=2113 2114 2115 PP=GW-IR.BAT "GOES-West US IR Band 4
ROUTE ADD U9 AREA 120 129 CC=3 SYS=2116 2117 2118 PP=GW-VIS.BAT "GOES-West US Visible
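Sites that maintain many routing entries may find it convenient to generate these command lines from a table. The following Python sketch only builds the command text following the ROUTE ADD syntax described above. It does not run McIDAS, the function name is hypothetical, and the defaults (type AREA, CC=3) simply mirror the examples.

```python
def route_add(pcode, begin, end, syskey, description,
              cc=3, pp=None, ftype="AREA"):
    """Build a ROUTE ADD command line for one product."""
    parts = ["ROUTE ADD", pcode, ftype, str(begin), str(end), f"CC={cc}",
             "SYS=" + " ".join(str(word) for word in syskey)]
    if pp:
        parts.append(f"PP={pp}")
    # The description follows an opening double quote, as in the examples.
    parts.append('"' + description)
    return " ".join(parts)


print(route_add("U5", 130, 139, (2113, 2114, 2115),
                "GOES-West US IR Band 4", pp="GW-IR.BAT"))
```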
Unidata provides a McIDAS BATCH file, DROUTE.BAT, that is designed to aid the user in configuring their site's routing table for all of the products available through the Unidata IDD.

ROUTE LIST presents the user with a snapshot of the routing table at the time it runs:

ROUTE LIST

S Pd         Description         Range       Last      Received  Post Process C
- -- ------------------------- --------- ------------ ---------- ------------ -
  CA Cloud Top Pressure        1100-1109 AREA1109     02356 1715 CTP.BAT      3
  CB Precipitable Water        1110-1119 AREA1111     02356 1616 PW.BAT       3
  CC Sea Sfc. Temperature      1120-1129 AREA1121     02356 1532 SST.BAT      3
  CD Lifted Index              1130-1139 AREA1137     02356 1633 LI.BAT       3
  CE CAPE                      1140-1149 AREA1141     02356 1634 CAPE.BAT     3
  CF Ozone                     1150-1159 AREA1152     02356 1651 OZONE.BAT    3
  CG NH Wildfire ABBA          1190-1199 AREA1192     02356 1536 WFABBA.BAT   3
  CH SH Wildfire ABBA          1200-1209 AREA1203     02356 1552 WFABBA.BAT   3
  CI GOES-E/W IR Composite       80-89   AREA0081     03202 2221     none     3
  CV GOES-E/W VIS Composite      90-99   AREA0096     03202 2221     none     3
  CW GOES-E/W H2O Composite      70-79   AREA0076     03202 2220     none     3
  LD NLDN Lightning Flashes      71-71   MDXX0072     03202 2231     none     3
s MA Surface MD data            default      none        none    SFC.BAT      3
  N1 GOES-East IR/TOPO Composi  220-229  AREA0228     03202 2131     none     3
  N2 GOES-East VIS/TOPO Compos  230-239  AREA0230     03202 2130     none     3
  N3 GOES-West IR/TOPO Composi  240-249  AREA0240     03202 2221     none     3
  N4 GOES-West VIS/TOPO Compos  250-259  AREA0250     03202 2221     none     3
  N5 MDR/TOPO Composite         260-269  AREA0265     03202 2206     none     3
  N6 Mollweide IR/TOPO Composi  270-279  AREA0272     03202 2237     none     3
  N7 GOES-E/W IR/TOPO Composit  280-289  AREA0287     03202 2221     none     3
  N8 GOES-E/W VIS/TOPO Composi  290-299  AREA0293     03202 2221     none     3
  NF Global Initialization Gri  101-102      none        none    GLOBAL.BAT   3
  NG Early Domestic Products      1-2        none        none    ADDGRID.BAT  3
  R1 Base Reflectivity Tilt 1   300-339      none        none        none     3
  R2 Base Reflectivity Tilt 2   340-379      none        none        none     3
  R3 Base Reflectivity Tilt 3   380-419      none        none        none     3
  R4 Base Reflectivity Tilt 4   420-459      none        none        none     3
  R5 Composite Reflectivity     460-499      none        none        none     3
  R6 Layer Reflect SFC-24 K ft  500-539      none        none        none     3
  R7 Layer Reflect 24-33 K ft   540-579      none        none        none     3
  R8 Layer Reflect 33-60 K ft   580-619      none        none        none     3
  R9 Echo Tops                  620-659      none        none        none     3
  RA Vertical Liquid H2O        660-699      none        none        none     3
  RB 1-hour Surface Rain Total  700-739      none        none        none     3
  RC 3-hour Surface Rain Total  740-779      none        none        none     3
  RD Storm Total Rainfall       780-819      none        none        none     3
  RE Radial Velocity Tilt 1     820-859      none        none        none     3
  RF Radial Velocity Tilt 2     860-899      none        none        none     3
  RG Radial Velocity Tilt 3     900-939      none        none        none     3
  RH Radial Velocity Tilt 4     940-979      none        none        none     3
  RI 248 nm Base Reflectivity   980-1019     none        none        none     3
  RJ Storm-Rel Mean Vel Tilt 1 1020-1059     none        none        none     3
  RK Storm-Rel Mean Vel Tilt 2 1060-1099     none        none        none     3
  RL 6 km Nat. Base Refl. Comp 1160-1169     none        none        none     3
s RM Mandatory Upper Air MD da  default      none        none    MAN.BAT      3
  RN 10 km RCM Composite       1170-1179     none        none        none     3
  RO 1 km Flt. Base Refl. Comp 1180-1189     none        none        none     3
s RS Significant Upper Air MD   default      none        none    SIG.BAT      3
  U1 Antarctic IR Composite     190-199  AREA0192     02356 1512     none     3
  U2 FSL2 hourly wind profiler  default  MDXX0082     02356 1617     none     3
  U3 Manually Digitized Radar   200-209  AREA0201     02356 1705 MDR.BAT      3
  U5 GOES-West US IR Band 4     130-139  AREA0130     02356 1635 GW-IR.BAT    3
  U6 FSL2 6-minute Wind profil  default  MDXX0092     02356 1704     none     3
  U9 GOES-West US Visible       120-129  AREA0120     02356 1635 GW-VIS.BAT   3
  UA Educational Floater I      160-169  AREA0160     02356 1636     none     3
  UB GOES-West US Water Vapor   170-179  AREA0174     02356 1635 GW-WV.BAT    3
  UC Educational Floater II      60-69   AREA0063     02356 1637     none     3
  UI GOES-East US IR Band 4     150-159  AREA0158     02356 1630 GE-IR.BAT    3
  UR Research Floater           180-189      none        none        none     3
s US Undecoded SAO Data         default      none        none        none     1
  UV GOES-East US Visible       140-149  AREA0140     02356 1630 GE-VIS.BAT   3
  UW GOES-East US Water Vapor   210-219  AREA0218     02356 1630 GE-WV.BAT    3
  UX Mollweide Composite IR     100-109  AREA0102     02356 1330 MOLL.BAT     3
  UY Mollweide Composite H2O    110-119  AREA0112     02356  435     none     3
In addition to defining the correspondence between product codes and file names, the routing table records, for each product, its status, the last file written and when it was received, the post process BATCH file, the condition code, and the SYSKEY.TAB entries that will be updated. The short listing above does not show the last of these. To see which SYSKEY.TAB entries will be updated, specify the FORM=ALL keyword in the ROUTE invocation. Here is an example for the GOES-West Water Vapor product:

ROUTE LIST UB FORM=ALL

Product: UB    Type: AREA    Status: Active       Condition Code: 3
Cylinder Begin: 170        Cylinder End: 179        Current Position: 175
Product Description: GOES-West Western US H2O
Creation Day: 100301         Creation Time: 20:00:00
Received Day: 2000301        Received Time: 20:55:27
Post Processing Batch File: GW-WV.BAT       Condition Code Testing: Yes
SYSKEY word | Cylinder Begin: 2104  Cylinder End: 2105  Current Position: 2106
route.k: Done

System Key Table

McIDAS provides a facility for storing system-wide defaults for a variety of McIDAS parameters: the System Key Table. Unidata uses the System Key Table to store information about the most recently decoded data from LDM-McIDAS and XCD processes. The System Key Table entries used by Unidata are listed in the file PRODUCT.DAT, which is located in the data directory of the McIDAS distributions.

The System Key Table, realized in the file SYSKEY.TAB, is created and manipulated with a pair of McIDAS commands, SYSKEY and SYSVAL.

SYSKEY's sole purpose is the creation of a new copy of SYSKEY.TAB. SYSKEY uses an ASCII file, SYSKEY.DOC, as a template for the new version of SYSKEY.TAB. SYSKEY.DOC can be found in the data directory of the user mcidas. The McIDAS command SYSVAL allows users to interrogate and modify entries in SYSKEY.TAB.

To list SYSKEY.TAB entries, one uses the SYSVAL LIST option:

SYSVAL LIST bword eword
Here:

bword - the beginning SYSKEY.TAB word to list (default=1)
eword - the ending SYSKEY.TAB word to list (default=bword)
SYSVAL can also modify entries in SYSKEY.TAB:

SYSVAL CHANGE word value
Here:

word  - the SYSKEY.TAB word to modify (no default)
value - the new value for word (no default)
One last note regarding use of SYSKEY.TAB entries. The McIDAS command interpreter provides access to SYSKEY.TAB entries in the same way as for values in the String Table. The following example demonstrates the use of SYSKEY.TAB entries that contain information on currently received surface SAO/METAR data:

SPC T NA #SYS(2002) MDF=#SYS(2001) DAY=#SYS(2003)
The SYSKEY.TAB entries used in this example refer to:

#SYS(2002) - current SAO/METAR data time [HH]
#SYS(2001) - current SAO/METAR MD file number
#SYS(2003) - current SAO/METAR date [YYDDD]
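The effect of the #SYS(n) substitution can be illustrated with a short Python sketch. It only mimics the textual expansion, not the McIDAS command interpreter itself. The function name and the table values below are hypothetical.

```python
import re


def expand_sys(command: str, syskey: dict[int, int]) -> str:
    """Replace each #SYS(n) token with the value stored in word n."""
    def lookup(match: re.Match) -> str:
        return str(syskey[int(match.group(1))])
    return re.sub(r"#SYS\((\d+)\)", lookup, command)


# Hypothetical SYSKEY.TAB contents: MD file 7, time 18 Z, day 96314.
words = {2001: 7, 2002: 18, 2003: 96314}
print(expand_sys("SPC T NA #SYS(2002) MDF=#SYS(2001) DAY=#SYS(2003)", words))
# SPC T NA 18 MDF=7 DAY=96314
```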

Data File Scouring

In File Routing System, you were introduced to the facility that controls how many data files of each kind are allowed to be filed on your system. In this section, we discuss how you control the number of files that you keep on your system.

McIDAS Scheduler

McIDAS contains a built-in Scheduler facility that allows for the execution of McIDAS and non-McIDAS applications based on the system clock. McIDAS also provides data scouring routines that, when run, will trim files to user-specified limits. The combination of the scheduler running the data scouring routines can be used to control the numbers of files on your system.

The scheduler facility is maintained by a set of three routines: SKU, SKE, and SKL.

The scheduler entries are maintained in the disk file SKEDFILE.

SKU provides five basic functions:

SKE has the sole responsibility for putting new entries in the scheduler. SKE's invocation syntax is:

SKE day time repeat int <keywords> "command
Here:
day     - Julian day to initiate command [CCYYDDD]
time    - time to initiate command [HH:MM:SS]
repeat  - number of times the command should execute; if you enter 999999 the
          word MANY replaces it in the header (default=1)
int     - time interval between command executions [DDDHH:MM:SS]
          (def=1:00:00, meaning 1 hour)
command - command to execute
The keywords supported by SKE are:
FILE= - scheduler file name; valid in McIDAS-X only; see the McIDAS-X
        Users Guide for details (default=SKEDFILE)
ID=   - 4-digit number to identify the command in the scheduler
        (default=system assigned number)
NAME= - user initials (user id) to use for local/host commands
PROJ= - project number command is executed under (default=current)
TOL=  - late tolerance, HH:MM:SS; if the command cannot execute within
        the designated time, it is ignored (default=1 hour)
In practice, one typically does not specify any keyword except TOL= when creating scheduler entries. The TOL= keyword specifies the time interval, starting at the time a command is scheduled to run, during which the command may still be run. If for some reason a command cannot be run within this interval (for instance, because the scheduler is off), the scheduler will not try to run it.
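The late-tolerance rule can be expressed as a simple window test. The Python sketch below illustrates the rule as described here and is not the scheduler's actual code. The function name is hypothetical, and treating both ends of the window as inclusive is an assumption.

```python
from datetime import datetime, timedelta


def may_still_run(scheduled: datetime, now: datetime,
                  tolerance: timedelta = timedelta(hours=1)) -> bool:
    """True if now falls within the TOL= window that opens at scheduled."""
    return scheduled <= now <= scheduled + tolerance


start = datetime(1996, 11, 9, 1, 30, 0)  # day 96314 at 01:30:00
print(may_still_run(start, datetime(1996, 11, 9, 2, 15, 0)))  # True
print(may_still_run(start, datetime(1996, 11, 9, 3, 0, 0)))   # False
```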

For data scouring, one specifies that a scouring command is to be run at least once per day. The time that the command runs is user-selectable, but Unidata recommends that it be sometime close to, but after, 0 Z.

SKL allows one to list out the entries in the user's scheduler easily. For instance, the following shows a listing of our recommended entries for the mcidas supervisory user:

SKL FORM=ALL
 *** SCHEDULER IS OFF ***          MESSAGE DEVICE IS: N
T#  ID  XS NEXT EXECUTN # REM INTERVAL  TOL  NAME PROJ COMMAND TEXT...
-- ---- -- ------------ ----- -------- ----- ---- ---- ------------
 1    1    96314  13000  MANY  1000000  BIG  USER    0 DOQTL  1 30 3
 1    2    96314  13500  MANY  1000000  BIG  USER    0 DOQTL 31 60 3
 1    3    96314  14000  MANY  1000000  BIG  USER    0 DOQTL 71 80 3
 1    4    96314  14500  MANY  1000000  BIG  USER    0 DOQTL 81 90 3
 1    5    96314  15000  MANY  1000000  BIG  USER    0 IGU DEL #+GFILE
 1    6    96314  15500  MANY  1000000  BIG  USER    0 LWU DELETE VIRT9001
 1    7    96314  15600  MANY  1000000  BIG  USER    0 LWU DELETE VIRT9002
 1    8  S 96314  10000  MANY    30000   259 USER    0 RUN UX 998 DEBUG FILE=G
                                                      ETBYFTP.MCB
--END OF LIST
For sites running McIDAS-XCD, the recommended scheduler entries look like:

SKL FORM=ALL
 *** SCHEDULER IS OFF ***          MESSAGE DEVICE IS: N
T#  ID  XS NEXT EXECUTN # REM INTERVAL  TOL  NAME PROJ COMMAND TEXT...
-- ---- -- ------------ ----- -------- ----- ---- ---- ------------
 1    1    96314  13500  MANY  1000000  BIG  USER    0 QRTMDG MD 1 90 3
 1    2    96314  14000  MANY  1000000  BIG  USER    0 QRTMDG GRID 1 10 2
 1    3    96314  14000  MANY  1000000  BIG  USER    0 QRTMDG GRID 101 102 2
 1    4    96314  14000  MANY  1000000  BIG  USER    0 QRTMDG GRID 5100 5400 1
 1    5    96314  15000  MANY  1000000  BIG  USER    0 IGU DEL #+GFILE
 1    6    96314  15500  MANY  1000000  BIG  USER    0 LWU DELETE VIRT9001
 1    7    96314  15600  MANY  1000000  BIG  USER    0 LWU DELETE VIRT9002
 1    8  S 96314  10000  MANY    30000   259 USER    0 RUN UX 998 DEBUG FILE=G
                                                      ETBYFTP.MCB
--END OF LIST
You will note from the listings above that the scheduler facility is not ON (active). In order to activate scheduler entries, one has to run SKU as follows:

SKU ON

Running McIDAS Scouring Routines from Unix cron

McIDAS-X allows one to run McIDAS commands from the Unix command line. This means that one can run the data scouring commands illustrated in the scheduler listing above from a Unix shell script kicked off by the Unix cron facility.

Since the scouring routines are McIDAS applications, they need McIDAS environment information in order to run. This is easily set up in a shell script by defining the environment variables MCDATA, MCPATH, and PATH.

Unidata provides an example shell script, mcscour.sh, that includes all of the scouring functionality in the scheduler listing above. The example scouring contained in mcscour.sh is intended to be run once per day, typically in the middle of the night.
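The same environment setup can be done from any language that launches the scouring commands. The Python sketch below builds such an environment. It is not a replacement for mcscour.sh, and every directory shown is a hypothetical placeholder that must be replaced with the locations used by your own installation.

```python
import os


def mcidas_env(home="/home/mcidas", base=None):
    """Return a copy of the environment with MCDATA, MCPATH, and PATH set.

    All directory names are illustrative placeholders (assumptions).
    """
    env = dict(os.environ if base is None else base)
    env["MCDATA"] = f"{home}/workdata"
    env["MCPATH"] = ":".join([env["MCDATA"], f"{home}/data", f"{home}/help"])
    env["PATH"] = f"{home}/bin:" + env.get("PATH", "")
    return env


env = mcidas_env(base={"PATH": "/usr/bin"})
print(env["PATH"])  # /home/mcidas/bin:/usr/bin
```

The resulting dictionary can be passed as the `env` argument of `subprocess.run` when starting a command from a script that cron runs.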

Access to Data other than the Unidata-Wisconsin Datastream

Do I have access to data other than that in the Unidata-Wisconsin Datastream?

The Unidata Internet Data Distribution system (IDD) gives the Unidata user community free access to all of the Family Of Services (FOS)/NOAAPORT datastreams that originate from the National Weather Service:

National Lightning Detection Network lightning flash data by way of:

Users can also subscribe to WSI Corporation for Level III NEXRAD data products.

The twenty Level III NEXRAD products include:

All of the above data services are described in more detail at:

Data Available Through Unidata

For many years, Unidata McIDAS users were unable to take advantage of this wealth of data and instead had to live with only the products available in the Unidata-Wisconsin datastream. With the LDM-McIDAS package developed at Unidata and SSEC's McIDAS-XCD package, this situation has changed dramatically. Unidata McIDAS users can now decode all the data available in these streams. The only remaining limits are their departmental processing capabilities and storage capacities.

The Unidata McIDAS distribution also provides routines that allow for easy conversion of non-McIDAS data into McIDAS compatible formats.

The McIDAS Abstract Data Distribution Environment (ADDE) provides perhaps the simplest access to the vast stores of data contained in NOAAPORT. Users allowed by cooperating institutions can display and analyze data without having to transfer it to local storage (although ADDE also supports copying the data to local storage).

Users can also get access to archived Unidata-Wisconsin datastream products. Interested users should consult the Data Recovery page maintained by Unidata.

