Current NLDM Research: Using INN to Relay Data
Anne Wilson
February 18, 2003
In January 2003, promising results were achieved relaying the CONDUIT data stream from Boulder to Washington, D.C. using INN (InterNetNews), an open source news server based on NNTP (Network News Transfer Protocol). The CONDUIT data stream is our largest stream, consisting of approximately 25 gigabytes per day, roughly 75% of the entire volume of the IDD. In a data relay test lasting just over 24 hours, more than half a million products were relayed. All products were received: 99% arrived within four seconds, and none took longer than 50 seconds.
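To put the volume figures in perspective, the following minimal sketch works out what they imply. It assumes decimal gigabytes (10^9 bytes) and a uniform rate over the day, neither of which is stated above; the function names are illustrative only.

```python
# Back-of-the-envelope arithmetic from the figures quoted above.
# Assumptions: 1 GB = 10**9 bytes, and traffic spread evenly over 24 hours.

CONDUIT_GB_PER_DAY = 25.0   # approximate CONDUIT volume
CONDUIT_SHARE = 0.75        # CONDUIT as a fraction of total IDD volume
SECONDS_PER_DAY = 86400


def implied_idd_gb_per_day(conduit_gb=CONDUIT_GB_PER_DAY, share=CONDUIT_SHARE):
    """Total IDD volume implied by CONDUIT's volume and its share."""
    return conduit_gb / share


def average_mbit_per_s(gb_per_day=CONDUIT_GB_PER_DAY):
    """Average sustained rate, in megabits per second, for a daily volume."""
    bits_per_day = gb_per_day * 1e9 * 8
    return bits_per_day / SECONDS_PER_DAY / 1e6


if __name__ == "__main__":
    print(f"Implied IDD volume: {implied_idd_gb_per_day():.1f} GB/day")
    print(f"Average CONDUIT rate: {average_mbit_per_s():.2f} Mbit/s")
```

Under these assumptions the whole IDD carries roughly 33 GB per day, and CONDUIT alone averages about 2.3 Mbit/s. Real traffic is bursty, so peak rates would be higher.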
Unidata is currently relaying the CONDUIT stream to Washington, D.C. on a continuous basis. In addition to a direct path between Boulder and D.C., products are being relayed via Usenet. Statistics regarding current product count and latencies are generated and relayed back to Unidata every five minutes and are now available via the web. Also, a description of this project is available for the information of the Usenet community and others.
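As an illustration of the kind of summary such a five-minute report might contain, here is a minimal sketch that reduces a list of per-product latencies to a count, a 99th percentile, and a maximum. The function name and the nearest-rank percentile method are assumptions made for this example, not a description of the actual reporting software.

```python
def summarize_latencies(latencies_s):
    """Summarize per-product latencies (in seconds) for one reporting interval.

    Returns the product count, the 99th-percentile latency (nearest-rank
    method) and the maximum latency.
    """
    if not latencies_s:
        return {"count": 0, "p99": None, "max": None}
    ordered = sorted(latencies_s)
    n = len(ordered)
    # Nearest rank: smallest 1-based rank r with r >= 0.99 * n,
    # computed with integer arithmetic to avoid floating-point rounding.
    rank = -(-99 * n // 100)
    return {"count": n, "p99": ordered[rank - 1], "max": ordered[-1]}


if __name__ == "__main__":
    # Synthetic sample data, not measurements from the test.
    sample = [1.0] * 99 + [50.0]
    print(summarize_latencies(sample))
```

With this summary, a statement such as "99% of products arrived within four seconds" corresponds to the `p99` value being at most 4.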
Like the LDM, INN allows local management of data by filing articles and piping them to other programs. Additionally, INN and NNTP support other features such as an automated routing scheme, a virtually unlimited number of hierarchically structured newsgroups for product categorization, cross posting to multiple groups, addition of nonstandard headers, and flexible server management software.
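The sketch below shows how a product might be wrapped as a news article that uses hierarchical group names, cross posting, and a nonstandard header. It uses Python's standard `email.message` module only to assemble headers. The group names, the `X-Product-Id` header, and the sender address are hypothetical and do not reflect the names actually used in this project.

```python
from email.message import EmailMessage


def build_article(body_text, groups, product_id, sender="ldm@example.invalid"):
    """Assemble a news article for a single product.

    groups: list of newsgroup names; more than one means cross posting.
    product_id: identifier stored in the subject and in a custom header.
    """
    msg = EmailMessage()
    msg["From"] = sender
    # Cross posting: a comma-separated list in the Newsgroups header.
    msg["Newsgroups"] = ",".join(groups)
    msg["Subject"] = product_id
    # Hypothetical nonstandard header carrying product metadata.
    msg["X-Product-Id"] = product_id
    msg.set_content(body_text)
    return msg


if __name__ == "__main__":
    article = build_article(
        "encoded product payload",
        ["example.conduit.model", "example.conduit.all"],  # hypothetical groups
        "product-0001",
    )
    print(article.as_string())
```

The dotted group names illustrate how a hierarchy can categorize products, so that a downstream site could subscribe to a whole subtree or to a single group.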
So far, INN is being used without any modifications. Additional supporting software was developed to format and feed products from an LDM to a news server and to log relevant information about each product upon arrival. Also, an encoding scheme was developed for transmission compatibility. In the future, the message ID generation algorithm will need to be modified to support injection of the same product from multiple sites. Also, the article cutoff time, which is currently specified in whole days, must be made finer grained so that articles younger than one day can be rejected if necessary. No other software changes are anticipated.
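The two anticipated changes can be illustrated with a minimal sketch. One possible way to let several sites inject the same product is to derive the message ID from the product's content, so that every site produces an identical ID and duplicates are recognized by it. The cutoff check likewise compares ages in seconds rather than days. Both functions are assumptions made for illustration: the choice of SHA-256, the fixed domain, and the function names are not taken from INN or from the planned modifications.

```python
import hashlib


def content_message_id(product_bytes, domain="example.invalid"):
    """Derive a message ID from product content alone.

    Identical products yield identical IDs regardless of the injecting
    site, provided all sites agree on the same (hypothetical) domain.
    """
    digest = hashlib.sha256(product_bytes).hexdigest()
    return f"<{digest}@{domain}>"


def is_past_cutoff(created_epoch_s, now_epoch_s, cutoff_s):
    """Return True if an article is older than a cutoff given in seconds."""
    return (now_epoch_s - created_epoch_s) > cutoff_s


if __name__ == "__main__":
    product = b"example product bytes"
    # Two sites injecting the same product compute the same ID.
    print(content_message_id(product) == content_message_id(product))
    # A one-hour cutoff rejects an article created two hours ago.
    print(is_past_cutoff(created_epoch_s=0, now_epoch_s=7200, cutoff_s=3600))
```

A content-derived ID trades uniqueness per injection for uniqueness per product, which is the property needed when redundant sources feed the same stream.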
Future plans are to enhance statistical reporting, ready the supporting Unidata software, then build a beta distribution. At that point beta sites will be solicited to test within a small network of sites. Robust compilation and display of statistics will be essential for evaluating those tests. In particular, we will attempt to evaluate the benefit of relaying within our own private network compared with relaying via Usenet with its attendant difficulties.
For more details, including a side-by-side comparison of INN and LDM, please see this presentation to Guy Almes.