
19990329: waldo's ldm at stc



>From: address@hidden
>Organization: St. Cloud State
>Keywords: 199903291714.KAA17553 LDM pqsurf ldmd.conf

Alan-

>I may be calling for help too soon here, but I really like
>the service you provide, so maybe you can fix us.

Never hesitate to call on us if you have exhausted your limits
of patience. ;-)

>The ldm has been set up and prior to configuration, it would 
>start and stop as expected.

Okay. That's a good start.

>After modifying the ldmd.conf file, and trying to set things so
>waldo's ldm will be fed data from hobbes, we are stalled.

Here's where problems arise.  The best thing to do when you are having
problems is to look at ldmd.log.  It is better to read the file directly
with cat or more than to use ldmadmin log.  Sometimes you have to look
in ldmd.log.1 or ldmd.log.2, because the log gets rotated each time you
start the ldm.
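
If it helps, here is a rough sketch of checking the current log and its
rotated copies in one pass.  The function name scan_ldm_logs and the
patterns it searches for are only my illustration, and the log directory
is an assumption: pass in wherever your ldmd.log actually lives.

```shell
# scan_ldm_logs DIR: print error-looking lines from ldmd.log and its
# rotated copies in DIR.  (Hypothetical helper, not part of the LDM.)
scan_ldm_logs() {
    logdir="$1"
    for f in "$logdir"/ldmd.log "$logdir"/ldmd.log.1 "$logdir"/ldmd.log.2; do
        [ -f "$f" ] || continue      # skip rotations that do not exist
        echo "== $f"
        # Patterns chosen to catch messages such as "pq_open failed",
        # "Access denied" and "Exiting"; adjust to taste.
        grep -E 'failed|denied|Exiting' "$f" || true
    done
}
```

You would call it as scan_ldm_logs followed by your log directory.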

>We tried an ldmadmin start, got an error msg. so used ldmadmin stop
>
>1. After changing a couple of things, did an ldmadmin start again, message
>   returned is:  cannot start, there is already another server running.

When you run ldmadmin start, a lock file is created.  If the server
goes away but does not do so nicely, the lock file remains.  The easiest
thing is to run ldmadmin stop, which will delete the lock file.

>   Have looked at processes for ldm, and all have gone away except for a 
>   pqact and pqexpire.  We don't understand why another server is running

In this case, the server died, but these processes did not.  The best
thing to do is kill the processes with the Unix kill command and start
again (which is what I did).
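
A minimal sketch of how you might find those leftovers, assuming your ps
accepts the POSIX "-o pid= -o comm=" options.  The helper name
leftover_pids is made up for illustration, and the list of program names
is just the set of LDM helpers that show up in your log.

```shell
# leftover_pids: read "PID COMMAND" lines on stdin and print the PIDs of
# LDM helper programs.  (Hypothetical helper, not part of the LDM.)
leftover_pids() {
    # Match the bare program name or a full path ending in it.
    awk '$2 ~ /(^|\/)(pqact|pqexpire|pqbinstats|pqsurf)$/ { print $1 }'
}

# Example (commented out so nothing is killed by accident):
#   ps -e -o pid= -o comm= | leftover_pids
#   kill <pids printed above>
```

Check the list by eye before passing anything to kill, then run
ldmadmin stop to clear the lock file before starting again.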

Once I got rid of these processes, I ran ldmadmin start and it just
kind of hung there for a while.  I looked at ldmd.log and saw the following:

Mar 29 21:06:58 waldo pqexpire[21986]: Starting Up
Mar 29 21:06:58 waldo pqbinstats[21987]: Starting Up (21985)
Mar 29 21:06:58 waldo pqact[21988]: Starting Up
Mar 29 21:06:58 waldo pqsurf[21989]: Starting Up (21985)
Mar 29 21:06:58 waldo pqsurf[21989]: pq_open failed: /usr/local/ldm/data/pqsurf.pq: No such file or directory
Mar 29 21:06:58 waldo pqsurf[21989]: Exiting
Mar 29 21:06:58 waldo pqexpire[21986]: Exiting

I think this is your problem.  pqsurf is set up to be run in ldmd.conf:

exec   "pqsurf"

but no queue for this process was created.  Since you are not
running surf_split in your pqact.conf, there is no need to run
this process.  So I commented this out in ldmd.conf and then
ran ldmadmin start.    Now, when the LDM starts up, these messages
come up:

Mar 29 21:14:28 waldo hobbes[22046]: FEEDME(hobbes.stcloudstate.edu): 7: Access denied by remote server
Mar 29 21:14:58 waldo hobbes[22046]: run_requester: 19990329201458.990 TS_ENDT {{MCIDAS|IDS|DDPLUS,  ".*"}}
Mar 29 21:14:59 waldo hobbes[22046]: FEEDME(hobbes.stcloudstate.edu): 7: Access denied by remote server
Mar 29 21:15:29 waldo hobbes[22046]: run_requester: 19990329201529.012 TS_ENDT {{MCIDAS|IDS|DDPLUS,  ".*"}}
Mar 29 21:15:29 waldo hobbes[22046]: FEEDME(hobbes.stcloudstate.edu): 7: Access denied by remote server

which indicates that hobbes is not set up to feed waldo.  See below.
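
Going back to the pqsurf failure for a moment: a check along these lines
could catch it before you start the LDM.  This is only a sketch.  The
function name check_pqsurf is hypothetical, and you have to supply the
paths to your own ldmd.conf and pqsurf queue (the log above shows the
queue was expected at /usr/local/ldm/data/pqsurf.pq).

```shell
# check_pqsurf CONF QUEUE: warn if CONF still runs pqsurf but QUEUE is
# missing.  (Hypothetical helper, not part of the LDM.)
check_pqsurf() {
    conf="$1"
    queue="$2"
    # An active entry begins with "exec"; a commented-out one begins
    # with "#", so it does not match this pattern.
    if grep -Eq '^[[:space:]]*exec[[:space:]]+"pqsurf' "$conf" && [ ! -f "$queue" ]; then
        echo "pqsurf is enabled in $conf but $queue does not exist"
        return 1
    fi
    return 0
}
```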

>2. I believe we have waldo correctly listed as a known host on hobbes, and
>   hobbes is a known host on waldo.  Can ping, ftp and telnet from waldo to
>   hobbes.  ldmping hobbes (run from waldo) gives an error of 
>
>   ...............   RPC: unable to receive   errno = connection reset by peer


I took a look at hobbes and saw that you have an allow line in ldmd.conf
for waldo, so you should be able to feed.  However, when I looked at ldmd.pid
in ~ldm (the file that contains the process id for the ldm), it was dated Jan 4:

-rw-r--r--   1 ldm      data           6 Jan 04 15:05 ldmd.pid

To me, this indicates that you did not restart the ldm on hobbes after
changing the ldmd.conf file.  If you want changes in ldmd.conf to take
effect, you have to stop and restart the LDM.
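
The same date comparison can be done mechanically.  Here is a small
sketch, assuming that ldmd.pid is rewritten each time the LDM starts
(which is what the reasoning above relies on).  The function name
needs_restart is made up, and you pass in the locations of your own
ldmd.conf and ldmd.pid.

```shell
# needs_restart CONF PIDFILE: succeed if CONF was modified after PIDFILE
# was written, i.e. the running LDM predates the last edit of CONF.
# (Hypothetical helper, not part of the LDM.)
needs_restart() {
    conf="$1"
    pidfile="$2"
    # find prints CONF only when it is newer than PIDFILE.
    [ -n "$(find "$conf" -newer "$pidfile" 2>/dev/null)" ]
}
```

If it succeeds, stop and restart the LDM so the edited ldmd.conf is
actually read.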

>How about a fix?

I stopped the LDM on hobbes and restarted it.  However, while I was
doing this, you must have been editing ldmd.conf on waldo because
pqsurf was uncommented again.  So I commented it out again and
stopped and restarted the LDM on waldo. Now, you are getting data, but
you have some problems with your pqact.conf.  You must have copied over
the pqact.conf from hobbes, but you don't have any of the decoders
installed.   You will need to install the ldm-mcidas decoders on your
system.  For now, you might want to comment out the entries for the
ldm-mcidas decoders and any other decoders you are running until you get them
installed.  It might be easier to start over with a fresh pqact.conf
and add entries for the decoders as you get them installed.

Don Murray