
20031202: LDM Latency Question



Chris,

>Date: Tue, 02 Dec 2003 14:10:36 -0600
>From: "Chris D Gilbert" <address@hidden>
>Organization: US DoC/NOAA/NWS/NEXRAD OSF
>To: address@hidden,
>To: address@hidden
>Subject: LDM Latency Question

The above message contained the following:

> Steve & Tom,
> 
> The Radar Operation Center (ROC) here in Norman heard about a possible
> latency problem when writing to the LDM queue. The problem was described
> as:
> 
> The LDM ingest task should be started from within LDM (ldmd.conf).  The
> reason is because the pq_insert() command sends a signal to LDM to
> notify of new products in the queue. Otherwise, LDM would have the new
> products in the queue, but only checks every 30 seconds unless otherwise
> notified. That can cause large latencies.

That's correct.  In general, starting data-product ingesters "outside"
the LDM system results in a mean latency of about 15 seconds: one-half
of the 30-second polling interval that the LDM falls back on when it
isn't signaled.
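
To make the arithmetic concrete, here is a minimal Python sketch.  It
assumes that insertions are spread uniformly in time relative to the
poll ticks, which is a simplification and not something measured on a
real system; the function names are illustrative and are not part of
the LDM.

```python
import random

POLL_INTERVAL_S = 30.0  # fallback polling interval discussed above


def wait_until_next_poll(insert_time_s, interval_s=POLL_INTERVAL_S):
    """Seconds from an insertion until the next poll tick.

    Ticks are assumed to occur at 0, interval_s, 2*interval_s, ...
    An insertion landing exactly on a tick waits one full interval.
    """
    return interval_s - (insert_time_s % interval_s)


def mean_wait(n_samples=100_000, interval_s=POLL_INTERVAL_S, seed=0):
    """Estimate the mean wait for uniformly distributed insertion times."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        insert_time_s = rng.uniform(0.0, 1000.0 * interval_s)
        total += wait_until_next_poll(insert_time_s, interval_s)
    return total / n_samples


if __name__ == "__main__":
    print("estimated mean wait: %.2f s" % mean_wait())
```

Under that assumption the wait is uniform between 0 and 30 seconds, so
the estimate should settle near one-half of the interval.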

> Currently, our Build 5.0 BDDS code starts the ingest task outside of LDM
> using a script in the /etc/rc2.d directory.  Does this mean we have the
> 30 second latency problem?

Yes, I'm sorry to say.  Fortunately, the fix is very simple:

    1.  Don't start pqing_bdds(1) in the /etc/rc2.d/ldm script.

    2.  Add an "EXEC pqing_bdds" entry to the LDM configuration-file
        (ldmd.conf) so that the LDM starts the ingester itself.

We also changed the LDM boot-time startup script so that it checks the
LDM product-queue more efficiently and recreates it, if necessary.  You
might want to look at those changes as well.

> Is there an easy way to test this?  Can we do a "ldmadmin watch" and see
> if it takes up to 30 seconds before we see the data written to the
> queue?

An "ldmadmin watch" starts up the pqutil(1) utility, which runs outside
the LDM process-group.  That utility has its own polling interval, so it
wouldn't be a good way to investigate this issue.

A better way (and how we discovered the problem at one of the Central
Region's radar sites) would be to execute the notifyme(1) utility, e.g.,

    $ notifyme -vl- -h <<host>>

where <<host>> is the fully-qualified hostname or IP address of the
system running the data-product ingester.  This will cause an upstream
LDM to run on <<host>>.  This upstream LDM will only send data-product
metadata to the notifyme(1) process.  The notifyme(1) process will log
this metadata to the screen (-vl-).  If the log messages arrive in
groups approximately 30 seconds apart, then the problem exists.
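
If eyeballing the output is inconvenient, the grouping check can be
sketched in Python as below.  This is only an illustration: it assumes
you have already extracted the times at which notifyme(1) logged each
product, as seconds, into a list (parsing the log lines is left out
because the format isn't shown here), and the function names, the
2-second burst window, and the 3-second tolerance are all arbitrary
choices of this sketch.

```python
from statistics import median


def burst_gaps(arrival_times_s, burst_window_s=2.0):
    """Return the gaps (seconds) between the starts of successive bursts.

    Arrivals separated by no more than burst_window_s are treated as
    belonging to the same burst.
    """
    times = sorted(arrival_times_s)
    if not times:
        return []
    burst_starts = [times[0]]
    previous = times[0]
    for t in times[1:]:
        if t - previous > burst_window_s:
            burst_starts.append(t)  # a quiet spell ended; new burst begins
        previous = t
    return [b - a for a, b in zip(burst_starts, burst_starts[1:])]


def looks_polled(arrival_times_s, interval_s=30.0, tolerance_s=3.0):
    """True if the typical gap between bursts is close to interval_s."""
    gaps = burst_gaps(arrival_times_s)
    if len(gaps) < 2:
        return False  # too little data to say anything
    return abs(median(gaps) - interval_s) <= tolerance_s


if __name__ == "__main__":
    clumped = [0.0, 0.3, 0.9, 30.1, 30.4, 60.0, 60.2, 90.3]
    steady = [0.0, 4.0, 8.0, 12.0, 16.0, 20.0]
    print(looks_polled(clumped), looks_polled(steady))
```

A positive result is only a hint: an ingester that itself produces data
about every 30 seconds would look the same, so compare against what you
know of the data source.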

> Any help will be appreciated,

Please feel free to contact us at any time for any reason.

> Chris Gilbert, ROC Software Engineer
> Phone: (405) 366-6520 Ext. 4246 / Fax (405) 366-6543
> E-mail: address@hidden

Regards,
Steve Emmerson
LDM Developer