
20040605: LDM changes on Rossby, latencies



>From: Mike Voss <address@hidden>
>Organization: SJSU
>Keywords: 200406052141.i55LfstK011909 IDD HDS latency

Mike,

>Regarding the changes you made on rossby. I did send a lengthy email a few
>days back, this is a different subject.

I got the other email, thanks.

>Since you made the changes to feed
>rossby CRAFT from OU (and to the queue size), rossby just hasn't been the
>same. For example, after
>running flawlessly for months, look at the HDS latencies. I'm not sure
>what's causing this, but since the change coincides with the changes you
>made on rossby, I'm looking there first. I changed the CRAFT feed back
>to thelma temporarily to see if that solves the problem. Let me know if
>you think of anything.

I looked at the latencies right after seeing your note yesterday
afternoon.  Since you switched the CRAFT feed, I decided to wait until
this morning to respond to see if there was any major change after the
feed was switched back to thelma.  After a reasonably quiet night, the
latencies seem to be heading back up (I am looking at the latencies
for 16Z), but it is too soon to tell if this is a trend that will
continue or not.

I will continue to watch rossby today to see if the latencies stay
better than they were for Friday/Saturday.  If they do, I will want to
switch back to feeding off of OU this evening to see if the latencies
climb.  If they do, it should mean that rossby is simply at the limits
of the number of ingest feeds it can process.  The only thing that is
different when feeding off of thelma instead of directly off of OU
(thelma feeds from OU) is the number of request lines in ldmd.conf.
For thelma, you had a single request; for OU, I set up the full six that
are required: four, one for each NWS regional HQ, plus two for the
backup servers.
During our stress testing of LDM-6 on thelma, we learned that the
number of ingest feeds is a strong determinant of the overall
performance of the LDM.  The reason for this is that each ingest feed
must take a write lock on the queue to insert the product it just received.
Feeding downstream sites is much less of a factor since those rpc.ldmd
processes only have to do a read lock.  If it looks like the number
of ingest rpc.ldmd processes is causing problems, I suggest reorganizing
the data requests to cut the number.  As a first test of this, I
would collect IDS|DDPLUS, FNEXRAD, and UNIWISC into a single request.
This would free up two ingest processes.  Other than that, the only
thing I can think of doing is reducing the number of CONDUIT requests
from 5 to 3.
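
A rough sketch of what that consolidation might look like in ldmd.conf
follows.  The hostname upstream.example.edu is a placeholder, not
rossby's actual upstream, and the CONDUIT patterns assume the existing
five requests are split on the last digit of the product sequence
number; adjust both to match what is really in the file.

```
# Before: three separate requests, so three ingest rpc.ldmd processes
# (hostname is a placeholder)
# request IDS|DDPLUS  ".*"  upstream.example.edu
# request FNEXRAD     ".*"  upstream.example.edu
# request UNIWISC     ".*"  upstream.example.edu

# After: one combined request, so one ingest process
request IDS|DDPLUS|FNEXRAD|UNIWISC  ".*"  upstream.example.edu

# If still needed: CONDUIT cut from five requests to three.
# Assumption: the split is on the final digit of the sequence number;
# together these three patterns cover digits 0 through 9.
request CONDUIT  "[0369]$"  upstream.example.edu
request CONDUIT  "[147]$"   upstream.example.edu
request CONDUIT  "[258]$"   upstream.example.edu
```

The LDM has to be restarted (for example with 'ldmadmin restart')
before changed request lines take effect.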

The only other possibility is that the 2 GB queue is simply too big for
the amount of memory on rossby.  I tried to use 'top' to check this
possibility, but I couldn't find the program.  Is it installed somewhere
not in the PATH for 'ldm'?  If 'top' is not available, can it be installed?

More later...

Tom
--
NOTE: All email exchanges with Unidata User Support are recorded in the
Unidata inquiry tracking system and then made publicly available
through the web.  If you do not want to have your interactions made
available in this way, you must let us know in each email you send to us.