
Re: 20001030: More on queue problem



Hi again, Tom.

We're developing a theory that we will be testing this afternoon.  On our 
machine we started seeing this problem when we replaced a RAID disk with a 
slower disk (due to a disk failure) and, coincidentally, turned on all the WSI 
data.  We had our pqact configured to write every WSI product to a file.  We 
now think that pqact was unable to keep up with the volume of products because 
of the I/O it had to perform.  By the time pqact finally reached the end of 
the queue, it was "colliding" with one of the server processes that wanted to 
write to the queue, e.g., the server had a lock on a product while pqact wanted 
to process it, or perhaps vice versa.
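
To make the theory concrete, here is a rough Python sketch of a reader that 
falls behind a writer in a fixed-size queue.  It is not LDM code and it does 
not model the real product queue or its locking.  The function name, the 
per-tick rates, and the capacity counted in products are all made up for 
illustration.

```python
def ticks_until_collision(capacity, arrivals_per_tick, processed_per_tick,
                          max_ticks=100_000):
    """Return the first tick at which the backlog fills the queue, or None.

    Hypothetical model: the writer inserts a fixed number of products per
    tick and the reader (standing in for pqact) processes a fixed number.
    When the unprocessed backlog reaches the queue capacity, the writer
    needs the slot the reader is still working on.
    """
    backlog = 0  # products inserted but not yet processed
    for tick in range(1, max_ticks + 1):
        backlog += arrivals_per_tick
        backlog = max(0, backlog - processed_per_tick)
        if backlog >= capacity:
            return tick
    return None  # the reader kept up for the whole simulated period


if __name__ == "__main__":
    # Reader slower than writer: the backlog grows by 10 products per tick.
    print(ticks_until_collision(1000, 50, 40))
    # Reader as fast as writer: the backlog never builds up.
    print(ticks_until_collision(1000, 50, 50))
```

The point of the sketch is only that a small, steady shortfall in what the 
reader can process eventually fills any fixed-size queue, however large.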

If this is the case, it's not a bug in the LDM per se.  Rather, it's a problem 
of too much data for the hardware and software.  Have you increased your data 
stream lately?

If so, there are a few things you can do to mitigate the problem.  If 
possible, you could move some decoders to other machines, leaving more CPU 
free for pqact.  Similarly, if you have other processes running on your 
machine, perhaps you could get rid of some of them.  Then there's always the 
old standby: upgrading the hardware or requesting less data.

You can use the 'iostat' command to report terminal, disk, and tape I/O 
activity, and CPU utilization.  The command 'iostat -D 5' will report on all 
disk activity every 5 seconds.  Perhaps this will reveal some more information 
for you.

I'll let you know the results of our testing later today.

Anne
-- 
***************************************************
Anne Wilson                     UCAR Unidata Program            
address@hidden                 P.O. Box 3000
                                  Boulder, CO  80307
----------------------------------------------------
Unidata WWW server       http://www.unidata.ucar.edu/
****************************************************