NLDM Progress Report

October 12, 2004

Anne Wilson



The NLDM network is successfully relaying seven major IDD data feeds among six relay machines.  Two additional machines use the same relay technology to receive statistics for display.  The network has been robust and reliable.

The network currently consists of the following eight machines:

Hostname                  Location             Function               OS
imogene.unidata.ucar.edu  Boulder, CO          ingest and relay       Linux
atm.geo.nsf.gov           Washington, D.C.     ingest and relay       Solaris
ldm.iihr.uiowa.edu        Iowa City, IA        relay                  Linux
tempest.aos.wisc.edu      Madison, WI          relay                  Linux
bigbird.tamu.edu          College Station, TX  relay                  Linux
methost24.met.sjsu.edu    San Jose, CA         relay                  Linux
joey.unidata.ucar.edu     Boulder, CO          statistics processing  Linux
conan.unidata.ucar.edu    Boulder, CO          statistics display     Solaris


The network is relaying the CONDUIT, CRAFT, HDS, NEXRAD, IDS|DDPLUS, UNIWISC and NIMAGE data feeds.  This selection of feed types shows that the network can handle a variety of stream types.  The CONDUIT stream is a bursty, high volume binary stream.  CRAFT, HDS, and NEXRAD are more continuous, high volume binary streams.  IDS|DDPLUS is a continuous stream of very small text products.  UNIWISC is a sparse, relatively low volume binary stream of products of around 3 MB.  NIMAGE is also a sparse binary stream, but contains the largest data products in the IDD, with product sizes up to around 20 MB.

The statistics page has been augmented to allow comparison of the same statistics across feed types for a particular machine.  Bringing up multiple copies of the statistics pages allows comparison across machines and provides a window into the performance of the network as a whole.

Latencies and reception are good.  A recent examination showed that, for the most part, 99% of all products arrived at all sites within 5 seconds.  The exceptions were machines with sporadic network issues or periods of high load.  Also, the large NIMAGE products typically take around 45 to 55 seconds.  I suspect this is due to having to push 20 MB over a single connection, as opposed to pushing the same volume over multiple connections, as occurs at times with the CONDUIT stream, which does not show the higher latencies.  (This would also be true for LDM.)
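
As a rough illustration of the two figures above, the following Python sketch computes the fraction of products arriving within a latency threshold and the effective single-connection throughput implied by a 20 MB product taking 45 to 55 seconds.  The function names and the sample latencies are hypothetical and are not taken from the NLDM statistics code or data; 1 MB is assumed to mean 10^6 bytes.

```python
def fraction_within(latencies_s, threshold_s=5.0):
    """Return the fraction of latencies at or below threshold_s seconds."""
    if not latencies_s:
        raise ValueError("no latencies supplied")
    on_time = sum(1 for x in latencies_s if x <= threshold_s)
    return on_time / len(latencies_s)


def effective_throughput_mbit_s(size_mb, seconds):
    """Megabits per second for size_mb megabytes (10**6 bytes) sent in `seconds`."""
    return size_mb * 8 / seconds


# Hypothetical latencies in seconds, invented for illustration only.
sample = [0.4, 0.9, 1.2, 2.5, 3.1, 4.8, 0.7, 1.1, 50.0, 0.3]
print(fraction_within(sample))  # 0.9

# A 20 MB NIMAGE product arriving in 45 to 55 seconds.
for secs in (45, 55):
    print(secs, round(effective_throughput_mbit_s(20, secs), 2))
```

Under these assumptions, the observed NIMAGE latencies correspond to roughly 2.9 to 3.6 Mbit/s sustained over one connection.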

I have been working on a comprehensive white paper describing the research results and features of INN and NNTP, which is now about 35 pages long.  The final section of the paper, which remains to be written, is "Recommendations".  However, I will make the general case here.

INN relays data at least as well as LDM.  With latencies for both generally quite small and with statistics being calculated differently for each, it is difficult to argue at this point that one is better in this regard than the other.

However, INN has additional features that the LDM does not have, which could be useful to us:
We are discussing how to proceed.  There is no urgent need to act, as LDM6 is well positioned to handle data flow in the near future.  Steve Emmerson and I are researching the question of what an ideal data relay system would do and which technology would best move us in that direction.

With the NLDM network already in place, another approach would be to give the community a choice of LDM or NLDM.  Because INN is robust and reliable, it is possible that, after some improvement of the release engineering, this could be accomplished with a fairly minimal commitment of resources in the short term.