This report updates the Internet Data Distribution (IDD) Status and Progress Presentation http://www.unidata.ucar.edu/staff/russ/status/idd-2001-02/ from the February 2001 Unidata Policy Committee meeting.
Participation in the IDD means running Unidata's Local Data Manager (LDM) software to inject, relay, or receive near real-time Unidata data feeds. As of September 25, 2001, the IDD comprised:
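Concretely, a site participates by listing "request" and "allow" entries in the LDM's ldmd.conf configuration file, which control which feeds are received from upstream hosts and which downstream hosts may relay data onward. A minimal hypothetical fragment (host names are placeholders, not actual IDD relays):

```
# Request NOAAPORT text and model data from an upstream relay,
# matching all product identifiers (".*"):
request IDS|DDPLUS|HDS ".*" upstream.relay.edu

# Allow two downstream sites to request anything this host carries:
allow   ANY  ^ldm\.downstream1\.edu$
allow   ANY  ^ldm\.downstream2\.edu$
```

A relay node is simply a host that both requests feeds from upstream and allows requests from downstream, which is how the IDD's distribution tree is built.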
The 93 educational institutions on the IDD (86 in the US, 4 in Canada, 1 in Taiwan, 1 in Costa Rica, and 1 in Brazil) represent a modest increase over the 90 educational sites (including 6 Canadian sites) in the previous report. Many universities run multiple LDMs. For example, 10 sites run 4 or more LDMs:
  32 LDMs at ucar.edu (RAP(11), UNAVCO(5), COSMIC(4), Unidata(4), MMM(2), COMET(2), JOSS(2), CGD(1), SCD(1))
   8 LDMs at wisc.edu
   7 LDMs at iastate.edu
   7 LDMs at ou.edu
   7 LDMs at uiuc.edu
   5 LDMs at psu.edu
   4 LDMs at washington.edu
   4 LDMs at plymouth.edu
   4 LDMs at lsu.edu
   4 LDMs at cornell.edu
   ...
Of 119 sites reporting hourly IDD statistics, 81% are running a current version (5.1.x), 9% are still running an older version, and 10% aren't reporting which version they run. This is a significant improvement over 8 months ago, when 20% were running older versions and 14% were running unknown versions.
Of 119 sites reporting hourly statistics, here are the number of sites requesting various feed types:
  104  IDS|DDPLUS  NOAAPORT text products
   97  HDS         model outputs
   87  MCIDAS      satellite imagery
   63  NEXRAD      Level 3 products
   60  PROFILER    wind profiler data from FSL
   50  NLDN        lightning data
   25  DIFAX       generated replacement for DIFAX
   24  FNEXRAD     NEXRAD floater
   18  ACARS       commercial aircraft data
   17  WSI         Level 3 products and composites
   14  CONDUIT     high-resolution model outputs
   11  AFOS        supposedly obsolete NWS feed
   10  NOGAPS      FNMOC model outputs, NOGAPS and COAMPS
    9  EXP         experimental feeds
    8  CONDUIT2    additional CONDUIT data
    4  NEXRD2      CRAFT Level II radar data
    4  GPS         SuomiNet
    3  FSL3        reserved for NOAA/FSL use
NEXRAD has recently moved up from 5th to 4th in this list, while WSI has moved down from 7th to 10th. FNEXRAD also moved into the top 10, displacing CONDUIT.
Only a few of the data feeds are responsible for the bulk of IDD data in terms of both numbers of products and Mbytes of data. Here are the average data volumes from the last complete month on the UCAR LDM server (motherlode.ucar.edu), first sorted by number of products/hour, then by number of Mbytes/hour:
  Sorted by products/hour               Sorted by Mbytes/hour

  Feed        Products  % of            Feed        Mbytes    % of
              per hour  products                    per hour  Mbytes
  HDS             8549   29.6           CONDUIT        148.5   43.4
  NEXRAD          7591   26.3           HDS             89.3   26.1
  IDS|DDPLUS      6626   22.9           NEXRAD          75.4   22.0
  CONDUIT         5840   20.2           IDS|DDPLUS       6.4    1.9
  FNEXRAD          187    0.6           MCIDAS           4.9    1.4
  WSI               38    0.1           FNEXRAD          4.9    1.4
  MCIDAS            12    0.04          ACARS            4.2    1.2
  PROFILER          10    0.03          DIFAX            4.2    0.9
  NLDN              10    0.03          WSI              3.3    1.0
  ACARS              7    0.02          PROFILER         1.8    0.5
  DIFAX              6    0.01          NLDN             0.2    0.1
  total          28874  100.0           total          342.2  100.0
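The percentage columns can be rechecked from the per-hour counts. The sketch below copies the products-per-hour figures from the table; note that the recomputed total (28876) differs slightly from the reported 28874, presumably because the per-feed figures are themselves rounded hourly averages:

```python
# Per-feed products/hour, copied from the table above.
products_per_hour = {
    "HDS": 8549, "NEXRAD": 7591, "IDS|DDPLUS": 6626, "CONDUIT": 5840,
    "FNEXRAD": 187, "WSI": 38, "MCIDAS": 12, "PROFILER": 10,
    "NLDN": 10, "ACARS": 7, "DIFAX": 6,
}

total = sum(products_per_hour.values())  # 28876 vs the reported 28874

# Recompute the "% of products" column, sorted by products/hour.
for feed, count in sorted(products_per_hour.items(), key=lambda kv: -kv[1]):
    print(f"{feed:<11} {count:>6} {100 * count / total:6.1f}")
```

The same calculation applied to the Mbytes column reproduces its percentages as well.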
It should be noted that most sites do not get the CONDUIT feed, which is currently considered experimental.
There is a general perception that the amount of data on the IDD is increasing, but that has not actually been the case through the UCAR server during the past year, owing to somewhat special circumstances. UCAR had been subscribing to the WSI radar feed, and all of the WSI data had been relayed through the UCAR LDM; after March 2001 the contract was changed to receive only composites from WSI, which significantly decreased both the number and volume of products. Here are the monthly totals for all feeds and for the WSI feed through the UCAR server over the last year:
             All feeds           WSI feed
  Month     Prods   Mbytes     Prods   Mbytes
            /hr     /hr        /hr     /hr
  2000/09   47484   604.9      25709   282.2
  2000/10   46422   560.4      25828   267.7
  2000/11   44197   518.1      24368   235.4
  2000/12   30425   245.0      19204   173.7
  2001/01   40430   472.3      19835   180.3
  2001/02   46121   512.0      20982   195.4
  2001/03   48192   574.8      20314   197.0
  2001/04   25715   330.8         30     2.1
  2001/05   28939   395.7         31     2.7
  2001/06   29465   395.2         32     2.6
  2001/07   28085   341.8         32     2.7
  2001/08   28874   342.2         38     3.3
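As a quick check on the effect of the WSI contract change, the feed's share of total volume can be computed from the monthly figures above, comparing the first and last months of the period:

```python
# Monthly Mbytes/hr figures copied from the table above.
wsi_before, total_before = 282.2, 604.9   # 2000/09
wsi_after,  total_after  = 3.3,   342.2   # 2001/08

# WSI went from nearly half of all IDD volume through the UCAR
# server to about one percent.
print(f"WSI share before: {100 * wsi_before / total_before:.1f}%")  # ~46.7%
print(f"WSI share after:  {100 * wsi_after / total_after:.1f}%")    # ~1.0%
```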
Other changes in IDD data volume reflect:
A data product's latency is defined as the time between when the product was first added to the queue at the source site for delivery to downstream hosts and the product's arrival at its destination. Currently about 100 IDD hosts (out of 224 in the IDD) send in hourly product statistics, including the maximum product latency that occurred during the hour for each data stream. We are also testing a new program that sends product latencies for each feed every 15 seconds (see, for example, NOGAPS, DIFAX, and NLDN latencies to the UCAR server), in preparation for using near real-time latency information to optimize routing topologies.
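The per-hour maximum latency reported in those statistics can be sketched in a few lines of Python; the tuple layout and function name below are illustrative, not the LDM's actual internals:

```python
from collections import defaultdict

def max_latency_by_hour(products):
    """products: iterable of (feed, queued_time, arrival_time) tuples,
    times in seconds since the epoch. Returns a dict mapping
    (feed, hour) to the maximum latency seen in that hour."""
    maxima = defaultdict(float)
    for feed, queued, arrived in products:
        latency = arrived - queued       # the definition of latency above
        hour = int(arrived // 3600)      # bucket by hour of arrival
        maxima[feed, hour] = max(maxima[feed, hour], latency)
    return dict(maxima)

# Three hypothetical product deliveries spanning two hours:
sample = [("IDS|DDPLUS", 0.0, 1.5),
          ("IDS|DDPLUS", 10.0, 40.0),
          ("HDS", 3600.0, 3900.0)]
print(max_latency_by_hour(sample))
# {('IDS|DDPLUS', 0): 30.0, ('HDS', 1): 300.0}
```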
As an example of hourly product latencies, for the IDS|DDPLUS text data feed from 7:00am to 8:00am MDT on Sept 24, 2001, of all the products delivered:
   58% had latency less than 1 second
   67% had latency less than 2 seconds
   76% had latency less than 5 seconds
   79% had latency less than 15 seconds
   85% had latency less than 30 seconds
   90% had latency less than 60 seconds
   97% had latency less than 5 minutes
   99% had latency less than 6 minutes
  100% had latency less than 23 minutes
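A cumulative breakdown like the one above is straightforward to derive from raw per-product latencies. The ten sample latencies below are made up purely to illustrate the computation:

```python
def cumulative_shares(latencies, thresholds):
    """Return {threshold: percent of latencies below that threshold}."""
    n = len(latencies)
    return {t: 100 * sum(1 for x in latencies if x < t) / n
            for t in thresholds}

# Hypothetical per-product latencies in seconds (one per product):
lats = [0.5, 0.8, 1.5, 4.0, 12.0, 25.0, 50.0, 200.0, 340.0, 1300.0]

# Thresholds matching the report's buckets (seconds):
print(cumulative_shares(lats, [1, 2, 5, 15, 30, 60, 300, 360, 1380]))
```

Each entry in the result gives the percentage of products delivered with latency under that threshold, exactly as in the listing above.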
There are multiple causes for high product latencies, some of them feed-specific. For example, when all the outputs from a model run are dumped into the product queue at the source site at the same time, product latencies will rise as products sit in the queue waiting for bandwidth to downstream hosts. Other factors that can lead to large latencies include:
Here's an example of product latencies over a more extended period (one week ending on Sept 24, 2001) from the IDS|DDPLUS text data feed, for over 100,000,000 data products delivered to 98 sites. (This may not have been a typical week, because the LDM on the UCAR server was rebooted and a 75-second clock skew was noticed on another high-level source site.) Of all the products delivered:
   43% had latency less than 1 second
   48% had latency less than 2 seconds
   54% had latency less than 5 seconds
   61% had latency less than 15 seconds
   66% had latency less than 30 seconds
   73% had latency less than 60 seconds
   85% had latency less than 5 minutes
   93% had latency less than 15 minutes
   99% had latency less than 52 minutes
  100% had latency less than 89 minutes
It can be difficult to interpret aggregate statistics such as these for long time periods that include multiple confounding factors from transient problems. Latency statistics for individual sites sometimes show simpler patterns that can aid in diagnosing real problems. Ultimately, users' degree of satisfaction with the timeliness of the data has often been a better indicator of problems with the IDD than aggregate latencies.