
20040708: LDM - Linux Red Hat 9 - LDM death caused by upstream LDM?



Tom,

> To: address@hidden
> From: "Tom Baltzer" <address@hidden>
> Subject: LDM - Linux Red Hat 9 - LDM death caused by upstream LDM?
> Organization: UCAR/Unidata
> Keywords: 200407081333.i68DXxBv002726 LDM

The above message contained the following:

> Institution: Unidata
> Package Version: 6.0.14
> Operating System: Linux Red Hat 9
> Hardware Information: Dual AMD 2 GHz w/2GB main memory
> Inquiry: Hey Steve,
> 
> While we were away, the LDM on lead1 up and died seemingly triggered
> by the upstream system (emo) - here is the log info:
> 
> Jul 02 07:09:29 lead1 pqact[4124]: pbuf_flush 10: time elapsed   3.183142 
> Jul 02 08:00:34 lead1 pqact[4124]: pbuf_flush 10: time elapsed   2.211820 
> Jul 02 09:57:07 lead1 pqact[4124]: pbuf_flush 8: time elapsed   4.056640 
> Jul 02 11:00:49 lead1 pqact[4124]: pbuf_flush 8: time elapsed   4.893119 
> Jul 02 13:28:00 lead1 emo[4130]: assertion "rlix != RL_NONE" failed: file "pq.c", line 4092
> Jul 02 13:28:05 lead1 emo[4129]: assertion "rlix != RL_NONE" failed: file "pq.c", line 4092
> Jul 02 13:28:12 lead1 rpc.ldmd[4121]: child 4129 terminated by signal 6 
> Jul 02 13:28:12 lead1 rpc.ldmd[4121]: Killing (SIGINT) process group 
> Jul 02 13:28:12 lead1 rpc.ldmd[4121]: SIGINT 
> Jul 02 13:28:12 lead1 pqact[4124]: Interrupt 
> Jul 02 13:28:12 lead1 pqact[4123]: Interrupt 
> Jul 02 13:28:12 lead1 pqact[4125]: Interrupt 
> Jul 02 13:28:12 lead1 pqact[4127]: Interrupt 
> Jul 02 13:28:12 lead1 pqact[4126]: Interrupt 
> Jul 02 13:28:12 lead1 pqact[4127]: Exiting 
> Jul 02 13:28:12 lead1 rtstats[4128]: Interrupt 
> Jul 02 13:28:12 lead1 eldm4[4131]: SIGINT 
> Jul 02 13:28:12 lead1 rtstats[4128]: Exiting 
> Jul 02 13:28:12 lead1 pqact[4124]: Exiting 
> Jul 02 13:28:12 lead1 pqact[4125]: Exiting 
> Jul 02 13:28:12 lead1 pqact[4126]: Exiting 
> Jul 02 13:28:12 lead1 pqact[4123]: Exiting 
> Jul 02 13:28:14 lead1 rpc.ldmd[4121]: Terminating process group 
> Jul 02 13:28:14 lead1 rpc.ldmd[4121]: child 4130 terminated by signal 6 
> Jul 02 13:28:14 lead1 rpc.ldmd[4121]: Killing (SIGINT) process group 

Interesting.  I don't recall seeing this before.
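
For reference, signal 6 on Linux is SIGABRT, which is what abort() raises
after a failed assert(), so the two "terminated by signal 6" entries line up
with the two assertion failures in pq.c. A minimal sketch of that correlation
follows. It is an illustration only and not part of the LDM: the function
name, the regular expressions, and the trimmed log excerpt are assumptions
made for the example.

```python
import re

# Excerpt of the log quoted above, with wrapped lines rejoined.
LOG = """\
Jul 02 13:28:00 lead1 emo[4130]: assertion "rlix != RL_NONE" failed: file "pq.c", line 4092
Jul 02 13:28:05 lead1 emo[4129]: assertion "rlix != RL_NONE" failed: file "pq.c", line 4092
Jul 02 13:28:12 lead1 rpc.ldmd[4121]: child 4129 terminated by signal 6
Jul 02 13:28:14 lead1 rpc.ldmd[4121]: child 4130 terminated by signal 6
"""

# Hypothetical patterns, written only to match the two message shapes above.
ASSERT_RE = re.compile(
    r'(\w+)\[(\d+)\]: assertion "(.+)" failed: file "(.+)", line (\d+)'
)
CHILD_RE = re.compile(r'child (\d+) terminated by signal (\d+)')

# Linux numbering (assumption: the log comes from a Linux host).
SIGNAL_NAMES = {6: "SIGABRT"}


def correlate(log_text):
    """Map each PID that logged a failed assertion to the signal that ended it."""
    asserted = {}  # pid -> (process name, expression, file, line)
    killed = {}    # pid -> signal number reported by rpc.ldmd
    for line in log_text.splitlines():
        m = ASSERT_RE.search(line)
        if m:
            asserted[int(m.group(2))] = (
                m.group(1), m.group(3), m.group(4), int(m.group(5))
            )
            continue
        m = CHILD_RE.search(line)
        if m:
            killed[int(m.group(1))] = int(m.group(2))
    # A PID with no matching "child ... terminated" entry maps to None.
    return {pid: (info, killed.get(pid)) for pid, info in asserted.items()}


result = correlate(LOG)
for pid, (info, signum) in sorted(result.items()):
    name = SIGNAL_NAMES.get(signum, str(signum))
    print(f"pid {pid} ({info[0]}): {info[1]!r} at {info[2]}:{info[3]} -> {name}")
```

Both emo children (4129 and 4130) appear in each list, which is consistent
with the top-level rpc.ldmd shutting down the process group once a child
aborted.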

> I did a queuecheck and it indicated that the queue was corrupt, so I
> saved it in case that might be useful.

Good.

> What do you think?

I don't know yet.

Would you please send me the output of the command "ldmadmin config"?

> Thanks,
> Tom.

Regards,
Steve Emmerson
> NOTE: All email exchanges with Unidata User Support are recorded in the
> Unidata inquiry tracking system and then made publicly available
> through the web.  If you do not want to have your interactions made
> available in this way, you must let us know in each email you send to us.