
Re: 20000117: LDM Data Problems/Aqua Outage?



On Wed, 19 Jan 2000, Karli Lopez (McIDAS) wrote:

> We've been having problems with the network lines for several months and
> hopefully our admins should be working on that.  The problem is that we don't
> know when they'll be done upgrading/fixing and what's worse we have no
> control over that.
>
> I have noticed that after stopping LDM a couple of rpc.ldmd processes stay
> running for about a minute or two before they finally die on their own.

Karli,

The ldm processes that are still running are the data receivers, which
finish receiving their last product before they exit. This is normal and
nothing to worry about. But before starting the ldm again, make sure these
processes are gone, or the start will be abnormal.  An abnormal LDM start
rarely works correctly.
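
As a rough sketch of that check, the shell functions below poll the process
table until no rpc.ldmd processes are left. The function names, the 10 second
polling interval, and the default of 12 tries are illustrative assumptions,
not part of the LDM distribution.

```shell
#!/bin/sh
# count_ldm: read `ps` output on stdin and print how many lines mention
# rpc.ldmd. `|| true` keeps the exit status at 0 when the count is zero.
count_ldm() {
    grep -c 'rpc\.ldmd' || true
}

# wait_for_ldm_exit [TRIES]: return 0 once no rpc.ldmd process is left,
# or 1 if some are still running after TRIES polls (hypothetical default: 12).
wait_for_ldm_exit() {
    tries=${1:-12}
    while [ "$tries" -gt 0 ]; do
        n=$(ps -eaf | count_ldm)
        if [ "$n" -eq 0 ]; then
            return 0
        fi
        sleep 10    # assumed polling interval
        tries=$((tries - 1))
    done
    return 1
}
```

Running `wait_for_ldm_exit` after a stop and restarting only when it returns 0
keeps the restart from overlapping with receivers that are still finishing up.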


> I might've been trying to run the new configuration at this point.  I don't
> know why this is or if it's normal, I've only seen it happen since I've
> installed ldm5.0.9.  I tried reconnecting to pluto after making sure that the
> processes were inactive and that the port was active but I still got an
> access denied error.
>

That's because pluto doesn't have an allow line for breeze.uprm.edu in
their ldmd.conf file.  I'll contact FSU to have one added so you can
receive data from FSU.
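
For illustration only, an allow entry on the upstream host might look like
`allow UNIDATA|FSL2|MCIDAS ^breeze\.uprm\.edu$`. That exact entry is an
assumption based on the feeds requested in the log, not a copy of FSU's file.
The sketch below checks whether any allow line in a given ldmd.conf has a host
pattern matching a hostname. The function name `host_allowed` is hypothetical,
and it assumes the pattern is the third whitespace-separated field.

```shell
#!/bin/sh
# host_allowed CONF HOST: succeed if some "allow" line in CONF has a host
# pattern (assumed to be field 3) that matches HOST as an extended regex.
# Commented-out lines are skipped because their first field is not "allow".
host_allowed() {
    conf=$1
    host=$2
    awk -v host="$host" '
        tolower($1) == "allow" && NF >= 3 && host ~ $3 { found = 1 }
        END { exit found ? 0 : 1 }
    ' "$conf"
}
```

For example, `host_allowed ldmd.conf breeze.uprm.edu` run by the upstream
administrator would show whether a request from breeze could be accepted.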


> I've included the tracerts for pluto, sapodilla as well as aqua in three
> attachments.


It looks like FSU has the best connection of the three.

Robb...

> 
> Thank You,
> 
> Karli Lopez
> 
> Robb Kambic wrote:
> 
> > On Mon, 17 Jan 2000, Unidata Support wrote:
> >
> > >
> > > ------- Forwarded Message
> > >
> > > >To: address@hidden
> > > >From: "Karli Lopez (McIDAS)" <address@hidden>
> > > >Subject: LDM Data Problems/Aqua Outage?
> > > >Organization: .
> > > >Keywords: 200001160730.AAA12839
> > >
> > > I've got LDM back up and running, but I'm only currently getting NLDN
> > > and WSI feeds.  For some reason I'm not able to properly log on to
> > > aqua.atmos.uah.edu (now our primary server), as you can see (Is anybody
> > > having the same problems with the server?):
> > > ---------------------------------------------------------------------
> > > breeze 27% cat ldmd.log
> > > Jan 16 04:34:36 5Q:breeze rpc.ldmd[9727]: Starting Up (built: Jan  9 2000 02:16:05)
> > > Jan 16 04:34:36 5Q:breeze pqexpire[9583]: Starting Up
> > > Jan 16 04:34:36 5Q:breeze pqexpire[9583]: > Recycled   2423.362 kb/hr (   256.381 prods per hour)
> > > Jan 16 04:34:36 5Q:breeze pqact[9772]: Starting Up
> > > Jan 16 04:34:36 5Q:breeze pqbinstats[9629]: Starting Up (9727)
> > > Jan 16 04:34:36 5Q:breeze aqua[9693]: run_requester: Starting Up: aqua.atmos.uah.edu
> > > Jan 16 04:34:36 5Q:breeze striker[9681]: run_requester: Starting Up: striker.atmos.albany.edu
> > > Jan 16 04:34:36 5Q:breeze aqua[9693]: run_requester: 20000116033436.535 TS_ENDT {{UNIDATA,  ".*"},{FSL2|MCIDAS,  ".*"}}
> > > Jan 16 04:34:36 5Q:breeze striker[9681]: run_requester: 20000116033436.579 TS_ENDT {{NLDN,  ".*"}}
> > > Jan 16 04:34:38 5Q:breeze localhost[9715]: Connection from localhost
> > > Jan 16 04:34:38 5Q:breeze localhost[9715]: Connection reset by peer
> > > Jan 16 04:34:38 5Q:breeze localhost[9715]: Exiting
> > > Jan 16 04:35:11 5Q:breeze sysu1[9779]: Connection from sysu1.uni.wsicorp.com
> > > Jan 16 04:35:11 5Q:breeze sysu1[9779]: hiya: 20000116042811.562 TS_ENDT {{WSI,  ".*"}}
> > > Jan 16 04:35:11 3Q:breeze aqua[9693]: FEEDME(aqua.atmos.uah.edu): can't contact portmapper: Timed out
> > > Jan 16 04:35:36 3Q:breeze striker[9681]: FEEDME(striker.atmos.albany.edu): h_clnt_create(striker.atmos.albany.edu): Timed out while creating connection
> >
> > Karli,
> >
> > It seems your network connection to uah/striker is not good enough to get
> > the packets through. First it was the hostname lookup, but now it appears
> > that it's the network connection.  I would contact your sysadmin and your
> > ISP about this problem for a short term solution.  For the long term, let's
> > try to get a connection from sapodilla.rsmas.miami.edu or
> > pluto.met.fsu.edu.  Do traceroutes to both and then send the results back
> > to me.
> >
> > >
> > > ---------------------------------------------------------------------
> > > So I tried going to what was supposed to be our backup server but it
> > > seems we have no access:
> > >
> > > ---------------------------------------------------------------------
> > >
> > > Jan 16 04:28:24 5Q:breeze rpc.ldmd[9518]: Starting Up (built: Jan  9 2000 02:16:05)
> > > Jan 16 04:28:24 5Q:breeze pqbinstats[9560]: Starting Up (9518)
> > > Jan 16 04:28:24 5Q:breeze pqexpire[5958]: Starting Up
> > > Jan 16 04:28:24 5Q:breeze pqexpire[5958]: > Recycled   3658.179 kb/hr (   251.259 prods per hour)
> > > Jan 16 04:28:24 5Q:breeze pluto[7735]: run_requester: Starting Up: pluto.met.fsu.edu
> > > Jan 16 04:28:24 5Q:breeze striker[9542]: run_requester: Starting Up: striker.atmos.albany.edu
> > > Jan 16 04:28:24 3Q:breeze rpc.ldmd[9518]: bind: 388: Address already in use
> >
> > This is caused by an abnormal ldm shutdown: not all of the ldm processes
> > have been killed off. I would do:
> >
> > % ps -eaf | grep ldm
> >
> > Make sure all the ldm processes are gone. Also do a:
> >
> > % rpcinfo -p
> >
> > and make sure port 388 is not in use before restarting the ldm.
> >
> > Robb...
> >
> > > Jan 16 04:28:24 5Q:breeze rpc.ldmd[9518]: Exiting
> > > Jan 16 04:28:24 5Q:breeze rpc.ldmd[9518]: Terminating process group
> > > Jan 16 04:28:24 5Q:breeze pqbinstats[9560]: Exiting
> > > Jan 16 04:28:24 5Q:breeze pqexpire[5958]: Exiting
> > > Jan 16 04:28:24 5Q:breeze pqexpire[5958]: > Up since: 20000116042824.791
> > > Jan 16 04:28:24 5Q:breeze rpc.ldmd[9518]: child 9504 terminated by signal 15
> > > Jan 16 04:28:24 5Q:breeze pqexpire[5958]: > Queue usage (bytes): 3047624
> > > Jan 16 04:28:24 5Q:breeze pqexpire[5958]: >          (nregions):     333
> > > Jan 16 04:28:24 5Q:breeze pqexpire[5958]: > nbytes recycle:        74544 (  3658.179 kb/hr)
> > > Jan 16 04:28:24 5Q:breeze pqexpire[5958]: > nprods deleted:            5 (   251.259 per hour)
> > > Jan 16 04:28:24 5Q:breeze pqexpire[5958]: > First deleted: 20000116032209.576
> > > Jan 16 04:28:24 5Q:breeze pqexpire[5958]: > Last  deleted: 20000116032321.215
> > > Jan 16 04:28:24 5Q:breeze rpc.ldmd[9518]: child 9354 terminated by signal 15
> > > Jan 16 04:28:24 5Q:breeze pluto[7735]: run_requester: 20000116032824.812 TS_ENDT {{UNIDATA,  ".*"},{FSL2|MCIDAS,  ".*"}}
> > > Jan 16 04:28:24 5Q:breeze striker[9542]: run_requester: 20000116032824.830 TS_ENDT {{NLDN,  ".*"}}
> > > Jan 16 04:28:26 3Q:breeze pluto[7735]: FEEDME(pluto.met.fsu.edu): 7: Access denied by remote server
> > > Jan 16 04:28:46 5Q:breeze aqua[29748]: Exiting
> > > Jan 16 04:28:56 5Q:breeze pluto[7735]: Exiting
> > > Jan 16 04:29:21 5Q:breeze striker[29118]: Exiting
> > > Jan 16 04:29:24 3Q:breeze striker[9542]: FEEDME(striker.atmos.albany.edu): h_clnt_create(striker.atmos.albany.edu): Timed out while creating connection
> > > Jan 16 04:29:54 5Q:breeze striker[9542]: Exiting
> > >
> > > ---------------------------------------------------------------------
> > >
> > > (Also striker.atmos.albany.edu, our NLDN server, is down for the time
> > > being).
> > > Any help in getting the feeds back up will be greatly appreciated.
> > >
> > > Karli Lopez
> > >
> > >
> > >
> > > ------- End of Forwarded Message
> > >
> >
> > ===============================================================================
> > Robb Kambic                                Unidata Program Center
> > Software Engineer III                      Univ. Corp for Atmospheric Research
> > address@hidden                   WWW: http://www.unidata.ucar.edu/
> > ===============================================================================
> 

===============================================================================
Robb Kambic                                Unidata Program Center
Software Engineer III                      Univ. Corp for Atmospheric Research
address@hidden             WWW: http://www.unidata.ucar.edu/
===============================================================================