IDD Status

Russ Rew
September 25, 2001


This report updates the Internet Data Distribution (IDD) Status and Progress Presentation (http://www.unidata.ucar.edu/staff/russ/status/idd-2001-02/) from the February 2001 Unidata Policy Committee meeting.

How Big is the IDD?

Participation in the IDD means running Unidata's Local Data Manager (LDM) software to inject, relay, or receive near real-time Unidata data feeds. As of September 25, 2001, the IDD comprised:

There are 20 more LDM hosts than 8 months ago, an increase of about 9%. The number of institutions running LDMs has increased by only about 5%, so some of the growth is due to institutions running multiple LDMs.

The IDD now includes 93 educational institutions (86 in the US, 4 in Canada, 1 in Taiwan, 1 in Costa Rica, and 1 in Brazil), a modest increase over the 90 educational sites (including 6 Canadian sites) in the previous report. Many universities run multiple LDMs. For example, there are 10 sites that run 4 or more LDMs:

 32 LDMs at ucar.edu (RAP(11), UNAVCO(5), COSMIC(4), Unidata(4),
                      MMM(2), COMET(2), JOSS(2), CGD(1), SCD(1))
  8 LDMs at wisc.edu
  7 LDMs at iastate.edu
  7 LDMs at ou.edu
  7 LDMs at uiuc.edu
  5 LDMs at psu.edu
  4 LDMs at washington.edu
  4 LDMs at plymouth.edu
  4 LDMs at lsu.edu
  4 LDMs at cornell.edu
   ...

Of 119 sites reporting hourly IDD statistics, 81% are running a current version (5.1.x), 9% are still running an older version, and 10% aren't reporting which version they run. This is a significant improvement over 8 months ago, when 20% were running older versions and 14% were running unknown versions.

What Data Feeds are Most Requested?

Of 119 sites reporting hourly statistics, here is the number of sites requesting each feed type:

 104 IDS|DDPLUS  NOAAPORT text products
  97 HDS         model outputs
  87 MCIDAS      satellite imagery
  63 NEXRAD      Level 3 products
  60 PROFILER    Wind profiler data from FSL
  50 NLDN        Lightning
  25 DIFAX       generated replacement for DIFAX
  24 FNEXRAD     NEXRAD floater
  18 ACARS       Commercial aircraft data
  17 WSI         Level 3 products and composites
  14 CONDUIT     High-resolution model outputs
  11 AFOS        supposedly obsolete NWS feed
  10 NOGAPS      FNMOC model outputs, NOGAPS and COAMP
   9 EXP         experimental feeds
   8 CONDUIT2    additional CONDUIT data
   4 NEXRD2      CRAFT level II radar data
   4 GPS         SuomiNet
   3 FSL3        reserved for NOAA/FSL use

NEXRAD has recently moved up from 5th to 4th in this list, while WSI has moved down from 7th to 10th. FNEXRAD also moved into the top 10, displacing CONDUIT.

Data Feed Volumes

Only a few of the data feeds are responsible for the bulk of IDD data in terms of both numbers of products and Mbytes of data. Here are the average data volumes from the last complete month on the UCAR LDM server (motherlode.ucar.edu), first sorted by number of products/hour, then by number of Mbytes/hour:

      Feed  Products      % of          Feed    Mbytes     % of
            per hour  Products                per hour   Mbytes

       HDS      8549     29.6        CONDUIT     148.5     43.4
    NEXRAD      7591     26.3            HDS      89.3     26.1
IDS|DDPLUS      6626     22.9         NEXRAD      75.4     22.0
   CONDUIT      5840     20.2     IDS|DDPLUS       6.4      1.9
   FNEXRAD       187      0.6         MCIDAS       4.9      1.4
       WSI        38      0.1        FNEXRAD       4.9      1.4
    MCIDAS        12      0.04         ACARS       4.2      1.2
  PROFILER        10      0.03         DIFAX       4.2      0.9
      NLDN        10      0.03           WSI       3.3      1.0
     ACARS         7      0.02      PROFILER       1.8      0.5
     DIFAX         6      0.01          NLDN       0.2      0.1

     total     28874    100.0          total     342.2    100.0
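As a rough illustration of how the "% of Products" column relates to the per-hour counts, the following Python sketch recomputes each feed's share from the figures in the table above. The variable and function names are assumptions made for this example, not part of any Unidata tool.

```python
# Products per hour by feed, copied from the table above.
products_per_hour = {
    "HDS": 8549, "NEXRAD": 7591, "IDS|DDPLUS": 6626, "CONDUIT": 5840,
    "FNEXRAD": 187, "WSI": 38, "MCIDAS": 12, "PROFILER": 10,
    "NLDN": 10, "ACARS": 7, "DIFAX": 6,
}
REPORTED_TOTAL = 28874  # "total" row of the table


def percent_shares(per_hour, total):
    """Return each feed's share of the total, in percent (hypothetical helper)."""
    return {feed: 100.0 * count / total for feed, count in per_hour.items()}


shares = percent_shares(products_per_hour, REPORTED_TOTAL)

# Combined share of the four busiest feeds.
top_four = sum(sorted(shares.values(), reverse=True)[:4])

for feed, pct in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{feed:>10} {pct:6.2f}")
print(f"top four feeds: {top_four:.1f}% of products")
```

Summing the four largest shares is one way to put a number on "only a few of the data feeds are responsible for the bulk" of the products.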

Most sites do not get the CONDUIT feed, which is currently considered experimental.

There is a general perception that the amount of data on the IDD is increasing, but over the past year that has not been the case for data passing through the UCAR server, because of somewhat special circumstances. UCAR had been subscribing to the WSI radar feed, and all the WSI data had been handled through the UCAR LDM. After March 2001 the contract was changed to get only composites from WSI, which significantly decreased both the number and volume of products. Here are the monthly averages for all feeds and for the WSI feed through the UCAR server over the last year:

                  All feeds           WSI feed   
    Month       Prods  Mbytes       Prods  Mbytes
                /hr     /hr         /hr     /hr  
                                                 
    2000/09    47484  604.9        25709  282.2  
    2000/10    46422  560.4        25828  267.7  
    2000/11    44197  518.1        24368  235.4  
    2000/12    30425  245.0        19204  173.7  
    2001/01    40430  472.3        19835  180.3  
    2001/02    46121  512.0        20982  195.4  
    2001/03    48192  574.8        20314  197.0  
    2001/04    25715  330.8           30    2.1  
    2001/05    28939  395.7           31    2.7  
    2001/06    29465  395.2           32    2.6  
    2001/07    28085  341.8           32    2.7  
    2001/08    28874  342.2           38    3.3  
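To separate the effect of the WSI contract change from the rest of the traffic, one can subtract the WSI columns from the all-feed columns in the table above. The Python sketch below does only that subtraction. The derived "without WSI" figures are not tabulated in the original report, and the names used are assumptions for this example.

```python
# (month, all-feed products/hr, all-feed Mbytes/hr, WSI products/hr, WSI Mbytes/hr)
# Values copied from the table above.
monthly = [
    ("2000/09", 47484, 604.9, 25709, 282.2),
    ("2000/10", 46422, 560.4, 25828, 267.7),
    ("2000/11", 44197, 518.1, 24368, 235.4),
    ("2000/12", 30425, 245.0, 19204, 173.7),
    ("2001/01", 40430, 472.3, 19835, 180.3),
    ("2001/02", 46121, 512.0, 20982, 195.4),
    ("2001/03", 48192, 574.8, 20314, 197.0),
    ("2001/04", 25715, 330.8, 30, 2.1),
    ("2001/05", 28939, 395.7, 31, 2.7),
    ("2001/06", 29465, 395.2, 32, 2.6),
    ("2001/07", 28085, 341.8, 32, 2.7),
    ("2001/08", 28874, 342.2, 38, 3.3),
]


def without_wsi(rows):
    """Return (month, products/hr, Mbytes/hr) with the WSI feed subtracted."""
    return [
        (month, all_prods - wsi_prods, round(all_mb - wsi_mb, 1))
        for month, all_prods, all_mb, wsi_prods, wsi_mb in rows
    ]


non_wsi = without_wsi(monthly)
for month, prods, mbytes in non_wsi:
    print(f"{month}  {prods:6d}  {mbytes:6.1f}")
```

For the first and last rows, the subtraction gives 322.7 and 338.9 Mbytes/hour respectively.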

Other changes in IDD data volume reflect

Product Latencies

A data product's latency is defined as the time between the product's arrival at its destination and the time at which the product was first added to the source site's queue for delivery to downstream hosts. Currently about 100 IDD hosts (out of 224 in the IDD) send in hourly product statistics, including the maximum product latency that occurred during the hour for each data stream. We are also testing a new program that sends product latencies for each feed every 15 seconds (see, for example, NOGAPS, DIFAX, NLDN latencies to the UCAR server), in preparation for using near real-time latency information to optimize routing topologies.
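The latency definition and the hourly maximum described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the `Delivery` record, its field names, and `hourly_max_latency` are hypothetical and do not reflect the LDM's actual data structures or statistics format. Because the two timestamps are taken on different hosts, any clock skew between them shifts the computed latency.

```python
from dataclasses import dataclass


@dataclass
class Delivery:
    """One product delivery (hypothetical record, not an LDM structure)."""
    feed: str
    queued_at: float   # seconds since epoch: first added to the source site's queue
    arrived_at: float  # seconds since epoch: arrival at the destination host


def latency(delivery):
    """Latency in seconds: arrival time minus time first queued at the source."""
    return delivery.arrived_at - delivery.queued_at


def hourly_max_latency(deliveries):
    """Return the maximum latency per (feed, hour of arrival)."""
    worst = {}
    for d in deliveries:
        key = (d.feed, int(d.arrived_at // 3600))  # hour index since epoch
        worst[key] = max(worst.get(key, float("-inf")), latency(d))
    return worst


# Made-up sample deliveries, for illustration only.
sample = [
    Delivery("IDS|DDPLUS", 1000.0, 1000.5),
    Delivery("IDS|DDPLUS", 1200.0, 1260.0),
    Delivery("HDS", 3500.0, 3700.0),
]
for (feed, hour), seconds in sorted(hourly_max_latency(sample).items()):
    print(f"{feed:>10} hour {hour}: max latency {seconds:.1f} s")
```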

As an example of hourly product latencies, for the IDS|DDPLUS text data feed from 7:00am to 8:00am MDT on Sept 24, 2001, of all the products delivered:

 58% had latency less than  1 second
 67% had latency less than  2 seconds
 76% had latency less than  5 seconds
 79% had latency less than 15 seconds
 85% had latency less than 30 seconds
 90% had latency less than 60 seconds
 97% had latency less than  5 minutes
 99% had latency less than  6 minutes
100% had latency less than 23 minutes
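A cumulative breakdown like the one above can be produced from a list of per-product latencies. The Python sketch below shows one possible way to do it. The threshold list mirrors the first seven rows of the table, while the function name and the sample latencies are invented for illustration and are not measurements from the IDD.

```python
from bisect import bisect_left

# Thresholds in seconds, mirroring the first rows of the breakdown above.
THRESHOLDS = [1, 2, 5, 15, 30, 60, 300]


def percent_below(latencies, thresholds=THRESHOLDS):
    """Return {threshold: percent of latencies strictly less than threshold}."""
    ordered = sorted(latencies)
    count = len(ordered)
    if count == 0:
        return {}
    # bisect_left gives the number of values strictly below the threshold.
    return {t: 100.0 * bisect_left(ordered, t) / count for t in thresholds}


# Invented latencies in seconds, for illustration only.
sample_latencies = [0.2, 0.4, 0.9, 1.5, 3.0, 12.0, 45.0, 70.0, 400.0, 1300.0]

distribution = percent_below(sample_latencies)
for threshold, pct in distribution.items():
    print(f"{pct:3.0f}% had latency less than {threshold} seconds")
```

Sorting once and using a binary search per threshold keeps the cost low even for very large numbers of delivered products.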

There are multiple causes for high product latencies, some of them feed-specific. For example, when all the outputs from a model run are dumped into the product queue at the source site at the same time, product latencies will rise as products sit in the queue waiting for bandwidth to downstream hosts. Other factors that can lead to large latencies include:

Here's an example of product latencies over a more extended period (one week ending on Sept 24, 2001) from the IDS|DDPLUS text data feed, for over 100,000,000 data products delivered to 98 sites. (This may not have been a typical week, because the LDM on the UCAR server was rebooted and a 75-second clock skew was noticed on another high-level source site.) Of all the products delivered:

 43% had latency less than  1 second
 48% had latency less than  2 seconds
 54% had latency less than  5 seconds
 61% had latency less than 15 seconds
 66% had latency less than 30 seconds
 73% had latency less than 60 seconds
 85% had latency less than  5 minutes
 93% had latency less than 15 minutes
 99% had latency less than 52 minutes
100% had latency less than 89 minutes

It can be difficult to interpret aggregate statistics such as these for long time periods that include multiple confounding factors from transient problems. Latency statistics for individual sites sometimes show simpler patterns that can aid in diagnosing real problems. Ultimately, users' degree of satisfaction with the timeliness of the data has often been a better indicator of problems with the IDD than aggregate latencies.


Last modified: Fri Sep 28 12:44:19 MDT 2001