
20040611: LDM 6.0.15 crashes again...



Gilbert,

> To: General Support <address@hidden>
> From: Gilbert Sebenste <address@hidden>
> Subject: LDM 6.0.15 crashes again...
> Organization: NIU
> Keywords: 200405310600.i4V60mtK022724 LDM

The above message contained the following:

> Another sigbus whatever it is...on weather3.admin.niu.edu...

A SIGBUS means that an attempt was made to access an undefined portion
of a memory object.  This could be due to a bug in the program or, I'm
sorry to say, a bug in the operating system.
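
To make "an undefined portion of a memory object" concrete, here is a
minimal sketch that provokes a SIGBUS on purpose.  It is not taken from
the LDM; it assumes a Linux host with Python 3, and the names CHILD,
run_child and describe are made up for the illustration.  A child
process maps one page of a file, the file is then truncated to zero
bytes, and the child touches the mapping, which no longer has any file
behind it.  The parent reports how the child died, much as rpc.ldmd
does in its log.

```python
import os
import signal
import subprocess
import sys
import tempfile
import textwrap

# Source of the child process: map one page of a file, shrink the file
# to zero bytes, then read from the now-unbacked mapping.
CHILD = textwrap.dedent("""
    import mmap, os, sys
    fd = os.open(sys.argv[1], os.O_RDWR)
    os.ftruncate(fd, mmap.PAGESIZE)   # give the file one page
    m = mmap.mmap(fd, mmap.PAGESIZE)  # shared mapping of that page
    os.ftruncate(fd, 0)               # the page is no longer backed
    m[0]                              # access it: expect SIGBUS
""")


def run_child():
    """Run the child and return its return code (negative = signal)."""
    with tempfile.NamedTemporaryFile() as tmp:
        proc = subprocess.run([sys.executable, "-c", CHILD, tmp.name])
    return proc.returncode


def describe(returncode):
    """Render a return code the way a parent process might log it."""
    if returncode < 0:
        signum = -returncode
        name = signal.Signals(signum).name
        return "terminated by signal %d (%s)" % (signum, name)
    return "exited with status %d" % returncode


if __name__ == "__main__":
    print(describe(run_child()))
```

On Linux for x86, SIGBUS is signal number 7; the number differs on some
other systems, so the sketch compares against signal.SIGBUS rather than
a literal 7.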

> Gilbert
> 
> ******************************************************************************
> Gilbert Sebenste                                                     ********
> (My opinions only!)                                                  ******
> Staff Meteorologist, Northern Illinois University                      ****
> E-mail: address@hidden                                               ***
> web: http://weather.admin.niu.edu                                      **
> Work phone: 815-753-5492                                                *
> ******************************************************************************
> 
> ---------- Forwarded message ----------
> Date: Thu, 10 Jun 2004 16:58:47 -0500 (CDT)
> From: root <address@hidden>
> To: address@hidden
> 
> Jun 10 21:40:41 weather3 weather2(feed)[14768]: topo:  weather2.admin.niu.edu NLDN
> Jun 10 21:42:35 weather3 pqact[7130]: pbuf_flush 4: time elapsed   6.537632
> Jun 10 21:42:39 weather3 pqact[7130]: pbuf_flush 4: time elapsed   2.653611
> Jun 10 21:42:52 weather3 pqact[7130]: pbuf_flush 32: time elapsed   9.039444

The last message above means that it took the pqact(1) process over 9
seconds to write a block of data to a UNIX pipe.  This is a rather long
time and means that either the process reading the pipe is slow or the
machine is overloaded.
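
If it helps to see how often this happens, slow writes can be pulled
out of the log with a few lines of script.  The following is only a
sketch: the function name slow_flushes and the 5-second threshold are
my assumptions, not anything the LDM defines, and the sample lines are
the ones quoted in this message with the mail wrapping undone.

```python
import re

# Matches e.g. "pqact[7130]: pbuf_flush 32: time elapsed   9.039444"
PBUF = re.compile(
    r"pqact\[(\d+)\]: pbuf_flush (\d+): time elapsed\s+([0-9.]+)"
)


def slow_flushes(lines, threshold=5.0):
    """Return (pid, pbuf_flush number, seconds) for each slow write.

    threshold is an arbitrary cutoff in seconds chosen for this sketch.
    """
    hits = []
    for line in lines:
        m = PBUF.search(line)
        if m and float(m.group(3)) >= threshold:
            hits.append((int(m.group(1)), int(m.group(2)), float(m.group(3))))
    return hits


LOG = [
    "Jun 10 21:40:41 weather3 weather2(feed)[14768]: topo:  weather2.admin.niu.edu NLDN",
    "Jun 10 21:42:35 weather3 pqact[7130]: pbuf_flush 4: time elapsed   6.537632",
    "Jun 10 21:42:39 weather3 pqact[7130]: pbuf_flush 4: time elapsed   2.653611",
    "Jun 10 21:42:52 weather3 pqact[7130]: pbuf_flush 32: time elapsed   9.039444",
    "Jun 10 21:46:35 weather3 pqact[7130]: pbuf_flush 5: time elapsed   2.109481",
]

if __name__ == "__main__":
    for pid, number, seconds in slow_flushes(LOG):
        print("pqact[%d] pbuf_flush %d took %.6f s" % (pid, number, seconds))
```

With the default threshold this flags the 6.5-second and 9.0-second
writes and skips the two that took a little over 2 seconds.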

> Jun 10 21:46:35 weather3 pqact[7130]: pbuf_flush 5: time elapsed   2.109481
> Jun 10 21:49:28 weather3 rpc.ldmd[7126]: child 7135 terminated by signal 7
> Jun 10 21:49:28 weather3 rpc.ldmd[7126]: Killing (SIGINT) process group
...

Naturally, the 6.0.15 LDM we're running on host Rodney (which is very
similar to Weather3) isn't having any problems.

By any chance, was this a debug version of the LDM and is there a core
file?

Regards,
Steve Emmerson
> NOTE: All email exchanges with Unidata User Support are recorded in the
> Unidata inquiry tracking system and then made publicly available
> through the web.  If you do not want to have your interactions made
> available in this way, you must let us know in each email you send to us.