
Re: pqact not starting



"Brian S. Miller" wrote:
> 
> Anne-
> I can't stop the LDM server.
> I run ldmadmin stop then ldmadmin delqueue and it tells me that the server
> is still running.
> If I try to kill the process, it does not kill.  What is going on?  Do I
> need to do something else?
> 
> Thanks.
> Brian
> 
Hi Brian,

First, try 'ldmadmin stop' at least two times.  The ldm uses two "flag"
files: ~ldm/ldmd.pid and /tmp/.ldmadmin.lck.  If these files are left in
place when the ldm should be stopped, then ldmadmin won't start the
ldm.  But a sequence of calls to 'ldmadmin stop' will ultimately clean
up these files.  Another option to try is 'ldmadmin clean', which also
cleans up those files.  If this works, you can start over by deleting
and remaking the queue.
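
If you want to see whether those flag files are still lying around
after 'ldmadmin stop', something like the following may help.  It is
only a sketch: the function name ldm_flag_files is made up for this
note, and the default paths are simply the two mentioned above, which
may differ on your installation.

```shell
# Hypothetical helper: print each flag file that still exists.
# With no arguments it checks the two paths named in this message;
# pass other paths if your installation keeps them elsewhere.
ldm_flag_files() {
    if [ "$#" -eq 0 ]; then
        set -- ~ldm/ldmd.pid /tmp/.ldmadmin.lck
    fi
    for f in "$@"; do
        # -e is true for any existing file, whatever its type
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

If it prints nothing, neither file is present.  If it prints a path
while no ldm processes are running, that file is probably stale and
another 'ldmadmin stop' or 'ldmadmin clean' is worth a try.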

But if 'ldmadmin stop' won't work and rpc.ldmd processes are still
hanging around, then you may have to kill them individually.  If an
rpc.ldmd is writing or transmitting a large product when it gets a
terminate signal, it will stop only after it has completed its job,
which may take up to a few minutes in an extreme case.  But if you've
waited a "sufficiently long time" and it's still not dead, then it's
wedged.  If that's the case, you'll need to murder by hand the
processes it spawned.  You'll have to get the process ID, i.e., the
PID, from the 'ps' command.  Do 'ps -ef | grep ldm'.  Each line will
show you, among other things, two process ID numbers.  The first is the
PID of that process, and the second is the PID of its parent process.
To kill the processes, do 'kill -9 <PID1> <PID2> ...', where PIDx is the
PID of a process you want to kill.  You'll want to kill all the rpc.ldmd
processes.  If there's still a pqact running, kill that too.  Same for
pqbinstats.
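
To save picking the PIDs out by eye, the second column of 'ps -ef' can
be pulled out with awk.  This is a rough sketch, not part of the ldm:
the function name ldm_pids is invented here, it assumes the usual
'ps -ef' layout with the PID in column two, and it matches the three
program names anywhere on the line, so look over what it prints before
killing anything.

```shell
# Hypothetical helper: read `ps -ef`-style lines on stdin and print the
# PID (second column) of every line mentioning rpc.ldmd, pqact, or
# pqbinstats.  The bracketed characters keep the awk process itself,
# whose command line contains this pattern, from matching.
ldm_pids() {
    awk '/rpc[.]ldmd|pqac[t]|pqbinstat[s]/ { print $2 }'
}

# Intended use (left commented out so nothing is killed by accident):
#   ps -ef | ldm_pids               # review the list first
#   kill -9 $(ps -ef | ldm_pids)    # then kill what is left
```

Note that this filter is narrower than 'grep ldm': it will not list
your login shell or other processes that merely belong to the ldm user.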

After this massacre you'll definitely need to rebuild your queue, as the
processes were killed in such a way that they are unlikely to have
terminated gracefully, and thus the queue is likely to be corrupted.

Let me know how it goes.  

Anne

-- 
***************************************************
Anne Wilson                     UCAR Unidata Program            
address@hidden                 P.O. Box 3000
                                  Boulder, CO  80307
----------------------------------------------------
Unidata WWW server       http://www.unidata.ucar.edu/
****************************************************