
19991109: LDM memory not leaking on IRIX64



Tom,

You are not seeing a memory leak.
The LDM product queue on dataproc is set to 250 MB, and
the programs within the LDM process group all share a memory map
of the ~ldmdp/data/ldm.pq file.
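
As a rough illustration of what sharing a memory map of one file means,
here is a minimal Python sketch. It is not LDM code: the 1 MiB size, the
temporary file, and the function name demo_shared_mapping are all
assumptions made for the example, and the two mappings live in one
process purely to keep the sketch short.

```python
import mmap
import os
import tempfile

QUEUE_SIZE = 1024 * 1024  # hypothetical 1 MiB stand-in for the 250 MB queue


def demo_shared_mapping():
    """Map one fixed-size file twice and show both views share the same data."""
    # Create a preallocated backing file, loosely analogous to a queue file.
    fd, path = tempfile.mkstemp(suffix=".pq")
    try:
        os.ftruncate(fd, QUEUE_SIZE)
        # Two independent mappings of the same file (stand-ins for two programs).
        writer = mmap.mmap(fd, QUEUE_SIZE, access=mmap.ACCESS_WRITE)
        reader = mmap.mmap(fd, QUEUE_SIZE, access=mmap.ACCESS_READ)
        writer[0:5] = b"hello"          # write through the first mapping
        seen = bytes(reader[0:5])       # read it back through the second
        size_on_disk = os.fstat(fd).st_size
        writer.close()
        reader.close()
        return seen, size_on_disk
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    seen, size_on_disk = demo_shared_mapping()
    print(seen, size_on_disk)
```

Both mappings refer to the same underlying pages, and the file stays at
its preallocated size no matter how much of it has been touched.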

When the LDM initially starts, very little of that memory map is 
in use. What you should be seeing is that the amount of the data queue
in use reaches the high water mark over time.
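
To make that trend concrete, the sketch below takes the rpc.ldmd RESM
and TOTM figures from the snapshots quoted further down and computes the
resident fraction for each day. The function name resident_fractions is
made up for this example; only the numbers come from the report.

```python
# rpc.ldmd resident size (RESM, kB) per day, copied from the quoted snapshots.
RPC_LDMD_RESM_KB = [
    ("991103", 110208),
    ("991104", 110352),
    ("991105", 124976),
    ("991106", 125008),
    ("991107", 214992),
    ("991108", 219376),
    ("991109", 248432),
]
RPC_LDMD_TOTM_KB = 254688  # total size (TOTM, kB); identical in every snapshot


def resident_fractions(samples, total_kb):
    """Return (date, resident/total) pairs for each daily sample."""
    return [(date, resm_kb / total_kb) for date, resm_kb in samples]


if __name__ == "__main__":
    for date, fraction in resident_fractions(RPC_LDMD_RESM_KB, RPC_LDMD_TOTM_KB):
        print(f"{date}  {fraction:6.1%} of the mapping resident")
```

The total size never changes, while the resident fraction rises from
roughly 43% toward 98%. That is the pattern of a fixed-size mapping
filling up rather than of a process growing without bound.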

The ldm.pq data file has not increased in size as Robb Kambic might
have hypothesized. It appears to still be at 250MB.

Frequently bumping the LDM as you mention below would not be a good
choice, and would possibly be detrimental if corruption of the
memory-mapped file occurs. The LDM usage of the product queue is
monitored by the pqexpire program to maintain the last hour's worth of
data in the product queue.
Although the programs each report they are using 250MB, it really isn't
as bad as that sounds, since they are all sharing the same memory map.
Most of the time, all the processes are utilizing the same page of
memory as well, so page faults are generally minimal. 
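
A small arithmetic sketch of why adding up the per-process figures is
misleading, using the 991109 RESM values from the snapshots quoted
below. It assumes that nearly all of each process's resident memory is
the shared queue mapping, which is an assumption of this example and
not something measured; the function name summarize is also made up.

```python
# RESM (kB) for the four LDM processes in the 991109 snapshot.
RESM_KB_991109 = {
    "rpc.ldmd": 248432,
    "pqexpire": 248176,
    "pqact": 247248,
    "pqbinstat": 236368,
}


def summarize(resm_kb):
    """Compare the naive per-process sum with the largest single figure."""
    naive_sum_kb = sum(resm_kb.values())
    # Assumption: resident pages are mostly the one shared mapping, so the
    # largest single figure is a closer guide to physical memory in use.
    largest_kb = max(resm_kb.values())
    return naive_sum_kb, largest_kb


if __name__ == "__main__":
    naive_sum_kb, largest_kb = summarize(RESM_KB_991109)
    print(f"naive sum of RESM: {naive_sum_kb} kB")
    print(f"largest single RESM: {largest_kb} kB")
```

Under that assumption, the physical memory involved is on the order of
the largest single figure (about the size of the queue), not the
four-fold sum.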

Steve Chiswell
Unidata User Support


>From: Tom Yoksas <address@hidden>
>Organization: .
>Keywords: 199911120843.BAA12135

>
>Robb et al.,
>
>Tom Engel of SCD sent me this note on the 9th.  I only got to log in
>and see it today.  Can you respond to Tom?
>
>Tom
>
>------- Forwarded Message
>
>Return-Path: address@hidden
>Received: from ncar.UCAR.EDU (ncar.ucar.edu [192.52.106.6])
>       by unidata.ucar.edu (8.8.8/8.8.8) with ESMTP id HAA17504
>       for <address@hidden>; Tue, 9 Nov 1999 07:38:30 -0700 (MST)
>Organization: .
>Keywords: 199911091438.HAA17504
>Received: from niwot.scd.ucar.edu (niwot.scd.ucar.edu [128.117.8.223])
>        by ncar.UCAR.EDU (8.9.1a/) with ESMTP id HAA14176;
>        Tue, 9 Nov 1999 07:38:29 -0700 (MST)
>Received: (from engel@localhost)
>       by niwot.scd.ucar.edu (8.9.1a/8.9.1) id HAA02568;
>       Tue, 9 Nov 1999 07:38:29 -0700 (MST)
>Date: Tue, 9 Nov 1999 07:38:29 -0700 (MST)
>From: Tom Engel <address@hidden>
>Message-Id: <address@hidden>
>To: address@hidden, address@hidden
>Subject: LDM processes on dataproc
>
>Tom Yoksas, et al.:
>
>It appears as if there's a relatively severe memory leak in the
>LDM processes ... specifically:
>
>Below are snapshots of the size of four LDM processes on dataproc
>at the beginning of each day over the last week.  As you can see,
>each of the processes "rpc.ldmd", "pqexpire", "pqact" and "pqbinstat"
>have doubled in size during the last week.
>
>Since dataproc is a production, interactive system, shared among
>numerous users, such unrestricted growth of daemons isn't good.
>
>A question:
>* Has UNIDATA seen such behavior in other systems?  If so, what
>  is UNIDATA's advice for dealing with this in an operational
>  setting?
>
>Two suggestions:
>1. the LDM system should be shut down and restarted on dataproc
>   in order to return the memory back to users
>2. UNIDATA should investigate and repair the memory leak.
>
>Thanks,
>Tom
>- ------------------------ Tom Engel --- Head, High Performance Systems Section
>Scientific Computing Division   NCAR, P. O. Box 3000, Boulder, CO  80307-3000
>Phone 303-497-1270  Fax 303-497-1848  Pager 800-306-1988 (address@hidden)
>
>
>991103
>APS: RESM(kB) TOTM(kB) TTY_____ USERTIME USER____ COMMAND________
>APS:   110208   254688 ?           33:51 ldmdp    rpc.ldmd
>APS:   110016   254496 ?           22:07 ldmdp    pqexpire
>APS:   109552   254560 ?           26:33 ldmdp    pqact
>APS:   103392   254864 ?           28:30 ldmdp    pqbinstat
>
>991104
>APS: RESM(kB) TOTM(kB) TTY_____ USERTIME USER____ COMMAND________
>APS:   110352   254688 ?           42:16 ldmdp    rpc.ldmd
>APS:   110160   254496 ?           25:54 ldmdp    pqexpire
>APS:   109648   254560 ?           32:15 ldmdp    pqact
>APS:   107584   254864 ?           33:45 ldmdp    pqbinstat
>
>991105
>APS: RESM(kB) TOTM(kB) TTY_____ USERTIME USER____ COMMAND________
>APS:   124976   254688 ?           59:13 ldmdp    rpc.ldmd
>APS:   124784   254496 ?           34:53 ldmdp    pqexpire
>APS:   123856   254560 ?           42:02 ldmdp    pqact
>APS:   120656   254864 ?           42:03 ldmdp    pqbinstat
>
>991106
>APS: RESM(kB) TOTM(kB) TTY_____ USERTIME USER____ COMMAND________
>APS:   125008   254688 ?           72:07 ldmdp    rpc.ldmd
>APS:   124784   254496 ?           41:24 ldmdp    pqexpire
>APS:   123856   254560 ?           50:35 ldmdp    pqact
>APS:   123632   254864 ?           49:03 ldmdp    pqbinstat
>
>991107
>APS: RESM(kB) TOTM(kB) TTY_____ USERTIME USER____ COMMAND________
>APS:   214992   254688 ?           87:11 ldmdp    rpc.ldmd
>APS:   214768   254496 ?           49:05 ldmdp    pqexpire
>APS:   213840   254560 ?           59:59 ldmdp    pqact
>APS:   200240   254864 ?           57:03 ldmdp    pqbinstat
>
>991108
>APS: RESM(kB) TOTM(kB) TTY_____ USERTIME USER____ COMMAND________
>APS:   219376   254688 ?           93:27 ldmdp    rpc.ldmd
>APS:   219152   254496 ?           53:15 ldmdp    pqexpire
>APS:   218224   254560 ?           64:35 ldmdp    pqact
>APS:   206016   254864 ?           61:25 ldmdp    pqbinstat
>
>991109
>APS: RESM(kB) TOTM(kB) TTY_____ USERTIME USER____ COMMAND________
>APS:   248432   254688 ?          101:10 ldmdp    rpc.ldmd
>APS:   248176   254496 ?           58:07 ldmdp    pqexpire
>APS:   247248   254560 ?           70:12 ldmdp    pqact
>APS:   236368   254864 ?           66:45 ldmdp    pqbinstat
>
>------- End of Forwarded Message
>