
Re: 19991109: LDM memory leak on IRIX64



On Fri, 12 Nov 1999, Tom Yoksas wrote:

> 
> Robb et al.,
> 
> Tom Engel of SCD sent me this note on the 9th.  I only got to log in
> and see it today.  Can you respond to Tom?
> 
> Tom
> 
> ------- Forwarded Message
> 
> Return-Path: address@hidden
> Received: from ncar.UCAR.EDU (ncar.ucar.edu [192.52.106.6])
>       by unidata.ucar.edu (8.8.8/8.8.8) with ESMTP id HAA17504
>       for <address@hidden>; Tue, 9 Nov 1999 07:38:30 -0700 (MST)
> Organization: .
> Keywords: 199911091438.HAA17504
> Received: from niwot.scd.ucar.edu (niwot.scd.ucar.edu [128.117.8.223])
>         by ncar.UCAR.EDU (8.9.1a/) with ESMTP id HAA14176;
>         Tue, 9 Nov 1999 07:38:29 -0700 (MST)
> Received: (from engel@localhost)
>       by niwot.scd.ucar.edu (8.9.1a/8.9.1) id HAA02568;
>       Tue, 9 Nov 1999 07:38:29 -0700 (MST)
> Date: Tue, 9 Nov 1999 07:38:29 -0700 (MST)
> From: Tom Engel <address@hidden>
> Message-Id: <address@hidden>
> To: address@hidden, address@hidden
> Subject: LDM processes on dataproc
> 
> Tom Yoksas, et al.:
> 
> It appears as if there's a relatively severe memory leak in the
> LDM processes ... specifically:
> 
> Below are snapshots of the size of four LDM processes on dataproc
> at the beginning of each day over the last week.  As you can see,
> each of the processes "rpc.ldmd", "pqexpire", "pqact" and "pqbinstat"
> has doubled in size during the last week.
> 
Tom,

This is not a memory leak; here's the scoop. On IRIX platforms the LDM
queue is permitted to grow as needed to accommodate the increased size of
the data streams received. The LDM queue is a memory-mapped file that is
accessed by the programs mentioned above, so ps reports it as part of the
size of each of these processes. As the LDM queue grows, the reported size
of each of these programs increases as well. The fix is to stop the LDM
and increase the size of the LDM queue to accommodate the current data
streams. I'll let Tom Yoksas take care of this when he returns from down
under, as it is not a critical problem.
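
To illustrate why a memory-mapped file shows up in the size of every
program that maps it, here is a minimal Python sketch. It is not LDM code:
the file is a temporary stand-in for the queue, and QUEUE_SIZE,
make_queue_file and open_mapping are hypothetical names chosen for the
example. Two independent mappings of the same file each span the whole
file, yet both see the same bytes, because they share one underlying file
instead of holding private copies.

```python
import mmap
import os
import tempfile

# Hypothetical, tiny stand-in for a product queue (the real queue is far larger).
QUEUE_SIZE = 4 * mmap.PAGESIZE


def make_queue_file(size):
    """Create a temporary file of `size` bytes to play the role of the queue."""
    fd, path = tempfile.mkstemp()
    os.ftruncate(fd, size)
    os.close(fd)
    return path


def open_mapping(path):
    """Map the whole file read/write; the mapping is shared, not a private copy."""
    with open(path, "r+b") as f:
        # Length 0 means "map the entire file". mmap keeps its own descriptor,
        # so the file object can be closed once the mapping exists.
        return mmap.mmap(f.fileno(), 0)


path = make_queue_file(QUEUE_SIZE)
writer = open_mapping(path)   # plays the role of one program using the queue
reader = open_mapping(path)   # plays the role of a second program

writer[0:5] = b"hello"                 # a write through one mapping...
shared_bytes = bytes(reader[0:5])      # ...is visible through the other

# Each mapping spans the full file, so a per-process size report that counts
# mapped files would count the whole file once for every mapping.
mapped_sizes = (len(writer), len(reader))
file_size = os.path.getsize(path)

writer.close()
reader.close()
os.remove(path)
```

Both mappings report the full file size even though only one file exists,
which is the sense in which adding up per-process sizes overstates the
memory actually in use.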

Robb...


> Since dataproc is a production, interactive system, shared among
> numerous users, such unrestricted growth of daemons isn't good.
> 
> A question:
> * Has UNIDATA seen such behavior in other systems?  If so, what
>   is UNIDATA's advice for dealing with this in an operational
>   setting?
> 
> Two suggestions:
> 1. the LDM system should be shut down and restarted on dataproc
>    in order to return the memory back to users
> 2. UNIDATA should investigate and repair the memory leak.
> 
> Thanks,
> Tom
> - ------------------------ Tom Engel --- Head, High Performance Systems Section
> Scientific Computing Division   NCAR, P. O. Box 3000, Boulder, CO  80307-3000
> Phone 303-497-1270  Fax 303-497-1848  Pager 800-306-1988 (address@hidden)
> 
> 
> 991103
> APS: RESM(kB) TOTM(kB) TTY_____ USERTIME USER____ COMMAND________
> APS:   110208   254688 ?           33:51 ldmdp    rpc.ldmd
> APS:   110016   254496 ?           22:07 ldmdp    pqexpire
> APS:   109552   254560 ?           26:33 ldmdp    pqact
> APS:   103392   254864 ?           28:30 ldmdp    pqbinstat
> 
> 991104
> APS: RESM(kB) TOTM(kB) TTY_____ USERTIME USER____ COMMAND________
> APS:   110352   254688 ?           42:16 ldmdp    rpc.ldmd
> APS:   110160   254496 ?           25:54 ldmdp    pqexpire
> APS:   109648   254560 ?           32:15 ldmdp    pqact
> APS:   107584   254864 ?           33:45 ldmdp    pqbinstat
> 
> 991105
> APS: RESM(kB) TOTM(kB) TTY_____ USERTIME USER____ COMMAND________
> APS:   124976   254688 ?           59:13 ldmdp    rpc.ldmd
> APS:   124784   254496 ?           34:53 ldmdp    pqexpire
> APS:   123856   254560 ?           42:02 ldmdp    pqact
> APS:   120656   254864 ?           42:03 ldmdp    pqbinstat
> 
> 991106
> APS: RESM(kB) TOTM(kB) TTY_____ USERTIME USER____ COMMAND________
> APS:   125008   254688 ?           72:07 ldmdp    rpc.ldmd
> APS:   124784   254496 ?           41:24 ldmdp    pqexpire
> APS:   123856   254560 ?           50:35 ldmdp    pqact
> APS:   123632   254864 ?           49:03 ldmdp    pqbinstat
> 
> 991107
> APS: RESM(kB) TOTM(kB) TTY_____ USERTIME USER____ COMMAND________
> APS:   214992   254688 ?           87:11 ldmdp    rpc.ldmd
> APS:   214768   254496 ?           49:05 ldmdp    pqexpire
> APS:   213840   254560 ?           59:59 ldmdp    pqact
> APS:   200240   254864 ?           57:03 ldmdp    pqbinstat
> 
> 991108
> APS: RESM(kB) TOTM(kB) TTY_____ USERTIME USER____ COMMAND________
> APS:   219376   254688 ?           93:27 ldmdp    rpc.ldmd
> APS:   219152   254496 ?           53:15 ldmdp    pqexpire
> APS:   218224   254560 ?           64:35 ldmdp    pqact
> APS:   206016   254864 ?           61:25 ldmdp    pqbinstat
> 
> 991109
> APS: RESM(kB) TOTM(kB) TTY_____ USERTIME USER____ COMMAND________
> APS:   248432   254688 ?          101:10 ldmdp    rpc.ldmd
> APS:   248176   254496 ?           58:07 ldmdp    pqexpire
> APS:   247248   254560 ?           70:12 ldmdp    pqact
> APS:   236368   254864 ?           66:45 ldmdp    pqbinstat
> 
> ------- End of Forwarded Message
> 
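
The snapshots in the forwarded message can be checked with a little
arithmetic. The sketch below copies the RESM and TOTM columns for rpc.ldmd
from each day and computes how much the resident size grew and what
fraction of the total size was resident. The variable names are just for
this example. In these figures the total size never changes while the
resident size climbs toward it, which is at least consistent with pages of
a mapped file being counted as they are touched, though the numbers alone
do not prove it.

```python
# (date, RESM kB, TOTM kB) for rpc.ldmd, copied from the forwarded snapshots.
RPC_LDMD = [
    ("991103", 110208, 254688),
    ("991104", 110352, 254688),
    ("991105", 124976, 254688),
    ("991106", 125008, 254688),
    ("991107", 214992, 254688),
    ("991108", 219376, 254688),
    ("991109", 248432, 254688),
]

first_resident = RPC_LDMD[0][1]
last_resident = RPC_LDMD[-1][1]

# Growth factor of the resident size over the week.
growth = last_resident / first_resident

# Distinct values seen in the total-size column.
total_sizes = {total for _, _, total in RPC_LDMD}

# Fraction of the total size that was resident on each day.
resident_fraction = [(date, res / total) for date, res, total in RPC_LDMD]

for date, fraction in resident_fraction:
    print(f"{date}  resident/total = {fraction:.2f}")
print(f"resident growth over the week: {growth:.2f}x")
print(f"distinct total sizes (kB): {sorted(total_sizes)}")
```

The same check can be repeated for pqexpire, pqact and pqbinstat by
swapping in their rows.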

===============================================================================
Robb Kambic                                Unidata Program Center
Software Engineer III                      Univ. Corp for Atmospheric Research
address@hidden             WWW: http://www.unidata.ucar.edu/
===============================================================================