
[LDM #UAK-912261]: data flow problems



Hi Karen,

> I'm having an interesting problem with a couple of my machines.  I have
> "pluto", which gets its data over a private connection that seems to be
> working fine.  It should feed the data to dontpanic but that doesn't
> seem to be working.  It was fine last week, and even this morning part
> of the data was getting through, but not all... same feed type,
> different patterns, but only 1 feed was getting through even though
> dontpanic only has one request line to pluto.
> 
> Now I've restarted ldm on both machines, and even stopped ldm, rebuilt the
> queues, and restarted on both machines, and I have no data flowing now.
> I even rebooted the upstream machine.  I haven't rebooted the downstream
> machine yet,  I can but I hope not to as it is my primary data server
> for a number of realtime systems.
> 
> Pings and ldmpings work between the machines, I'm actually concerned
> that on the upstream machine I can see the data coming in using ldmadmin
> watch, but when I try to run notifyme against the local queue using this
> command:
> 
> notifyme -v -l - -h localhost
> 
> I don't get any notifications of the data arriving in the queue.  I have
> a feeling this is why the data isn't getting downstream.  I am seeing
> this in the log:
> 
> May 27 19:39:45 pluto localhost(noti)[12784]: Starting Up(6.0.14/5):
> 20080527193945.809 TS_ENDT {{ANY,  ".*"}}
> May 27 19:39:45 pluto localhost(noti)[12784]: topo:
> localhost.localdomain ANY
> May 27 19:43:18 pluto localhost(noti)[12528]:
> nullproc5(localhost.localdomain): RPC: Unable to receive

The first two log messages show normal startup of an upstream LDM
process in response to a notifyme(1) process.  The third log message
is from a different upstream LDM process (different PID).

Were there any other log messages from PID 12784?
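
One quick way to see which PIDs appear in the log, and how many messages
each one wrote, is a small shell function like the sketch below.  It
assumes syslog-style lines with the PID in square brackets followed by a
colon, as in the excerpt above; count_by_pid is a hypothetical helper
name, not part of the LDM.

```shell
# Count log lines per process ID, reading log lines on standard input.
# Assumption: each message carries its PID as "[digits]:" after the
# program name, e.g. "localhost(noti)[12784]:".
count_by_pid() {
    # Print only the PID from each line that has one, then tally them.
    sed -n 's/.*\[\([0-9][0-9]*\)\]:.*/\1/p' | sort | uniq -c
}
```

For example, "count_by_pid < $HOME/logs/ldmd.log" prints one line per
PID with its message count in front.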

Try running two xterm(1) windows.  In one, run "ldmadmin watch"; in
the other, run notifyme(1).  They should show the same data-products,
although the notifyme(1) might lag the "ldmadmin watch".  If there's
a discrepancy, then find the PID of the upstream LDM that was started
in response to the notifyme(1), fgrep(1) just its log messages (e.g.,
"fgrep '[nnnnn]' $HOME/logs/ldmd.log"), and send them to me.
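
As a sketch of that last step, the function below wraps the same
fixed-string search.  ldm_log_for_pid is a hypothetical helper name, and
it uses "grep -F", the standard equivalent of fgrep(1); the default log
path is the one from the example above and may differ on your system.

```shell
# Print only the log lines written by one process ID.
# Usage: ldm_log_for_pid PID [LOGFILE]
ldm_log_for_pid() {
    pid=$1
    # Assumption: the LDM log lives at $HOME/logs/ldmd.log unless a
    # second argument names another file.
    logfile=${2:-$HOME/logs/ldmd.log}
    # -F treats "[PID]" as a literal string, so the brackets are not
    # interpreted as a regular-expression character class.
    grep -F "[$pid]" "$logfile"
}
```

Running "ldm_log_for_pid 12784 > pid12784.log" would save just that
process's messages to a file that can be attached to a reply.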

> Same kind of response when I try from the downstream machine.
> 
> This was working last week, but as a sanity check I rebuilt my ldm from
> source and checked all the configurations.  It is a slightly older
> version 6.0.14.   I even double checked to make sure iptables and
> SELinux weren't running.   There is no firewall between the machines as
> they are both on our internal network.  I also checked to make sure the
> rpc/services files still had the proper settings.
> 
> I have exhausted all my ideas, looking for any ideas of what to try
> next.  I'd rather exhaust all my options on the upstream machine
> (especially as it seems that is where the problem is -- considering the
> notifyme failures) before trying anything on the downstream machine.
> 
> --
> -------------------------------------------
> 
> There are 2 kinds of people in the world:
> 
> 1) Those who can extrapolate from incomplete data.
> 
> -------------------------------------------
> address@hidden
> 
> Phone:  405-325-6982
> Cell: 405-834-8559
> SAIC/Systems Analyst
> National Severe Storms Laboratory

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: UAK-912261
Department: Support LDM
Priority: Normal
Status: On Hold