
Re: 19990322: ldm problem



On Mon, 22 Mar 1999, Jim Hines   (awdnsun)  472-6708 wrote:

> Robb
> 
> I think I still have a problem...
> You were right, I ran out of disk space. I store
> the complete feed, it usually runs about 80,000,000
> but sometimes Friday it started getting better, I don't
> think anything was changed on this end.
> 
> 
> /data/ldm/zephyr/ARCHIVES> ls -l
> total 2055408
> -rw-r--r--   1 ldm      ldmgrp   78783582 Mar 15 17:59 wxfiles.990315
> -rw-r--r--   1 ldm      ldmgrp   75313866 Mar 16 17:59 wxfiles.990316
> -rw-r--r--   1 ldm      ldmgrp   79378339 Mar 17 18:00 wxfiles.990317
> -rw-r--r--   1 ldm      ldmgrp   82350614 Mar 18 17:59 wxfiles.990318
> -rw-r--r--   1 ldm      ldmgrp   129895221 Mar 19 18:50 wxfiles.990319
> -rw-r--r--   1 ldm      ldmgrp   302479362 Mar 20 18:50 wxfiles.990320
> -rw-r--r--   1 ldm      ldmgrp   217571328 Mar 21 15:49 wxfiles.990321
> -rw-r--r--   1 ldm      ldmgrp   85962543 Mar 22 12:47 wxfiles.990322
> /data/ldm/zephyr/ARCHIVES>
> 
> You can see how the files got big.....
> 
> also I got this email...
> > 
> > From ldm Fri Mar 19 12:54 CST 1999
> > Date: Fri, 19 Mar 1999 12:54:30 -0600
> > From: ldm (Unidata LDM)
> > Subject: Local LDM is down - stop/start failed
> > 
> > ldmfail: Mar 19 18:54:30 UTC
> > 
> > LDM status report from the logs for the last 24 hours.
> > 
> > Currently hpccsun is running 43 percent idle
> > load average: 1.51, 0.64, 0.34
> > Running version number 5.0.
> > LDM was restarted 1 time(s)
> >     Last LDM restart at Mar 19 18:50:09
> > Max Queue usage is 25001984 bytes, it occurred at Mar 19 18:50:05
> > 
> > Critical LDM problems that need immediate attention:
> > 
> > Potential LDM Problems:
> > 
> > Decoder LDM Problems:
> > 
> > 
> > 
> 
>  I don't understand what the Critical LDM problem is????

Jim, 

The script that sends this report is becoming outdated because the LDM log
messages have changed so much.  Nothing is actually listed under the
"Critical LDM problems" heading, so don't worry about it for now.


> 
> My guess is that when I got the Critical LDM problem
> my files started growing faster!!!!
> 
> 
> also now when I stop and start the ldm I get.....
> 
> /usr/local/ldm> ldmadmin stop
> stopping the LDM server...
> LDM server stopped
> /usr/local/ldm> ldmadmin start
> starting the LDM server...
> Mar 22 19:00:17 UTC hpccsun.unl.edu : stop_ldm: Server not started or 
> registered after 61 seconds
> /usr/local/ldm>
> 
> Why am I getting Server not started or registered????
> the server is running because my files are growing....



This will help the LDM start; the failure is due to an HP security problem.
In bin/ldmadmin, change check_registered from:

sub check_registered {

    $rpcinfo_cmd = "rpcinfo -t localhost 300029";
    `$rpcinfo_cmd 5 > /dev/null 2>&1`;
    if($?) {
        `$rpcinfo_cmd 4 > /dev/null 2>&1`;
        if($?) {
             return 1;
        }
    }
    return 0;
}


to

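sub check_registered {

    # List every registered RPC program and look for the LDM number (300029).
    $rpcinfo_cmd = "rpcinfo -p | grep  300029";
    `$rpcinfo_cmd  > /dev/null 2>&1`;
    if($?) {
        # Non-zero status from grep: no matching line, so not registered.
        return 1;
    }
    return 0;
}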
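
If you want to run the same check by hand, outside ldmadmin, here is a
minimal shell sketch.  The function name ldm_registered is hypothetical and
not part of the LDM distribution.  It assumes rpcinfo -p prints the program
number in the first column, and it matches that column exactly instead of
grepping the whole line.

```shell
#!/bin/sh
# Hypothetical helper, not part of the LDM distribution.
# Succeeds (exit 0) if RPC program 300029 appears in the portmapper listing.
ldm_registered() {
    # rpcinfo -p prints one line per registered program; column 1 is the
    # program number.  Comparing column 1 avoids matching the digits
    # elsewhere on a line (for example inside a port number).
    rpcinfo -p 2>/dev/null |
        awk -v prog=300029 '$1 == prog { found = 1 } END { exit !found }'
}
```

Usage would look something like:

    if ldm_registered; then echo "LDM is registered"; else echo "LDM is not registered"; fi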

Also, since your disk filled up, it's possible that your LDM queue is
corrupted.  To be safe, I would run ldmadmin stop, ldmadmin delqueue,
ldmadmin mkqueue, and ldmadmin start, in that order.  You can then check
that data is arriving with ldmadmin watch.
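
The queue rebuild can be sketched as a small shell function.  This is only
an illustration: the name rebuild_ldm_queue is hypothetical, and it assumes
that ldmadmin returns a non-zero exit status when a step fails, which is
worth verifying for your version before relying on it.

```shell
#!/bin/sh
# Hypothetical wrapper; only the ldmadmin subcommands come from the message.
rebuild_ldm_queue() {
    # Run the steps in order and stop at the first failure, so a new queue
    # is never created when the old one could not be deleted.
    for step in stop delqueue mkqueue start; do
        echo "ldmadmin $step"
        ldmadmin "$step" || {
            echo "ldmadmin $step failed" >&2
            return 1
        }
    done
}
```

After a successful run, follow it with ldmadmin watch to confirm that
products are arriving:

    rebuild_ldm_queue && ldmadmin watch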


Robb...


> 
> 
> Thanks again
> Jim Hines 
> 

===============================================================================
Robb Kambic                                Unidata Program Center
Software Engineer III                      Univ. Corp for Atmospheric Research
address@hidden             WWW: http://www.unidata.ucar.edu/
===============================================================================