
Re: 20031202: LDM Latency Question



Steve,

Using the notifyme tool, I was able to confirm that the log messages 
are grouped 30 seconds apart when I start my "pq_insert task" from 
the /etc/rc2.d script.

However, when I moved my "pq_insert task" into the ldmd.conf script 
(using the exec command), I still get the same results. The notifyme 
tool still shows the log messages 30 seconds apart. Do I need to do 
anything else? Is anyone available to talk to over the phone this week?

We are using LDM 6.0.14 on both our client and server. 

Thanks,
Chris Gilbert
LAB Phone (405) 366-6500 x2271



----- Original Message -----
From: Steve Emmerson <address@hidden>
Date: Tuesday, December 2, 2003 3:11 pm
Subject: 20031202: LDM Latency Question 

> Chris,
> 
> >Date: Tue, 02 Dec 2003 14:10:36 -0600
> >From: "Chris D Gilbert" <address@hidden>
> >Organization: US DoC/NOAA/NWS/NEXRAD OSF
> >To: address@hidden,
> >To: address@hidden
> >Subject: LDM Latency Question
> 
> The above message contained the following:
> 
> > Steve & Tom,
> > 
> > The Radar Operation Center (ROC) here in Norman heard about a possible
> > latency problem when writing to the LDM queue. The problem was described
> > as:
> > 
> > The LDM ingest task should be started from within LDM (ldmd.conf).  The
> > reason is because the pq_insert() command sends a signal to LDM to
> > notify of new products in the queue. Otherwise, LDM would have the new
> > products in the queue, but only checks every 30 seconds unless otherwise
> > notified. That can cause large latencies.
> 
> That's correct.  In general, the mean latency resulting from starting
> data-product ingesters "outside" the LDM system will be 15 seconds
> (one-half of the 30 second polling interval of a fallback strategy).
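The 15-second figure can be checked with a little arithmetic. The sketch below is only an illustration, assuming products arrive evenly spread over one polling interval and each waits until the next poll; the function name mean_wait is made up for this example and is not part of the LDM.

```python
def mean_wait(poll_interval: float, samples: int) -> float:
    """Average wait until the next poll for arrivals spread evenly over one interval."""
    step = poll_interval / samples
    # Arrival k lands at the midpoint of its slot; it waits until the interval ends.
    waits = [poll_interval - (k + 0.5) * step for k in range(samples)]
    return sum(waits) / samples


# 30-second fallback polling interval, as described in the message above.
print(mean_wait(30.0, 30))
```

The worst case in this model approaches the full 30 seconds, for a product inserted just after a poll.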
> 
> > Currently, our Build 5.0 BDDS code starts the ingest task outside of LDM
> > using a script in the /etc/rc2.d directory.  Does this mean we have the
> > 30 second latency problem?
> 
> Yes, I'm sorry to say.  Fortunately, the fix is very simple:
> 
>    1.  Don't start pqing_bdds(1) in the /etc/rc2.d/ldm script.
> 
>    2.  Add an "EXEC pqing_bdds" to the LDM configuration-file
>       (ldmd.conf).
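After making the two changes above, it may help to confirm that the ingester really is listed in the configuration file. The following is a rough sketch only: it assumes EXEC entries sit one per line as the keyword followed by an optionally double-quoted command, with "#" starting a comment. The helper name exec_commands and the sample text are hypothetical.

```python
def exec_commands(conf_text: str) -> list:
    """Collect the command strings of EXEC entries from ldmd.conf-style text."""
    commands = []
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        fields = line.split(None, 1)         # keyword, then the rest of the line
        if len(fields) == 2 and fields[0].upper() == "EXEC":
            commands.append(fields[1].strip().strip('"'))
    return commands


# Hypothetical excerpt; a real ldmd.conf will contain other entries too.
SAMPLE = """
# ingesters started by the LDM itself
EXEC "pqing_bdds"
#EXEC "disabled_task"
"""

print("pqing_bdds" in exec_commands(SAMPLE))
```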
> 
> We did make some other changes to the LDM boot-time startup script
> having to do with more efficiently checking the LDM product-queue and
> recreating it, if necessary.  You might want to look at those changes as
> well.
> 
> > Is there an easy way to test this?  Can we do a "ldmadmin watch" and see
> > if it takes up to 30 seconds before we see the data written to the
> > queue?
> 
> An "ldmadmin watch" starts up the pqutil(1) utility running outside the
> LDM process-group.  This utility has its own polling interval and so
> wouldn't be a good way to investigate this issue.
> 
> A better way (and how we discovered the problem at one of the Central
> Region's radar sites) would be to execute the notifyme(1) utility, e.g.,
>    $ notifyme -vl- -h <<host>>
> 
> where <<host>> is the fully-qualified hostname or IP address of the
> system running the data-product ingester.  This will cause an upstream
> LDM to run on <<host>>.  This upstream LDM will only send data-product
> metadata to the notifyme(1) process.  The notifyme(1) process will log
> this metadata to the screen (-vl-).  If the log messages are grouped
> approximately 30 seconds apart, then the problem exists.
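Judging "grouped approximately 30 seconds apart" by eye can be replaced with a small check once the arrival times have been pulled out of the notifyme(1) output. The sketch below deliberately takes plain timestamps in seconds and does not parse the log format; the function names, the 5-second quiet gap, and the 3-second tolerance are all assumptions chosen for illustration.

```python
def burst_starts(times, quiet=5.0):
    """Return the first timestamp of each burst; a burst ends after `quiet` idle seconds."""
    starts = []
    previous = None
    for t in sorted(times):
        if previous is None or t - previous > quiet:
            starts.append(t)
        previous = t
    return starts


def looks_polled(times, interval=30.0, tolerance=3.0, quiet=5.0):
    """True if the bursts begin roughly `interval` seconds apart."""
    starts = burst_starts(times, quiet)
    gaps = [b - a for a, b in zip(starts, starts[1:])]
    return bool(gaps) and all(abs(g - interval) <= tolerance for g in gaps)


# Synthetic arrival times (seconds), not real notifyme output.
grouped = [0.0, 0.2, 0.4, 30.1, 30.3, 60.0, 60.5]
steady = [4.0 * k for k in range(16)]
print(looks_polled(grouped), looks_polled(steady))
```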
> 
> > Any help will be appreciated,
> 
> Please feel free to contact us at any time for any reason.
> 
> > Chris Gilbert, ROC Software Engineer
> > Phone: (405) 366-6520 Ext. 4246 / Fax (405) 366-6543
> > E-mail: address@hidden
> 
> Regards,
> Steve Emmerson
> LDM Developer
> 

From: "Chris D Gilbert" <address@hidden>
To: Chris D Gilbert <address@hidden>
Cc: Steve Emmerson <address@hidden>, address@hidden,
        address@hidden
Date: Mon, 29 Dec 2003 14:15:08 -0600
Subject: Re: 20031202: LDM Latency Question 

Steve,

I answered my own question.  My task was running as a daemon. So, it 
appears the LDM signal was not getting through. I modified my code to 
run as a regular task, and now notifyme shows no latencies. 
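One possible explanation, offered as an assumption rather than something confirmed in this thread: Steve's earlier reply speaks of the "LDM process-group", and a task that daemonizes itself commonly calls setsid(), which moves it into a new session and process group. A notification addressed to the sender's own process group would then no longer reach the LDM. The POSIX-only sketch below just shows the group change; child_pgid is a made-up helper.

```python
import os


def child_pgid(detach: bool) -> int:
    """Fork a child and return its process-group ID, optionally after setsid()."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:  # child process
        try:
            os.close(read_fd)
            if detach:
                os.setsid()  # what a typical daemonizing routine does
            os.write(write_fd, str(os.getpgrp()).encode())
        finally:
            os._exit(0)  # leave without running the parent's cleanup
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        pgid = int(reader.read())
    os.waitpid(pid, 0)
    return pgid


print("regular task shares our group:", child_pgid(False) == os.getpgrp())
print("daemonized task shares our group:", child_pgid(True) == os.getpgrp())
```

If this is what happened, running the ingester as a regular task under EXEC keeps it in the same group as the LDM, which would fit the result reported above.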

What signal does pq_insert use? It doesn't say in the pqinsert man 
page. Where would I find this documented?


Chris Gilbert

