
Re: 19990706: ldm -- Solaris 2.6 (fwd)




===============================================================================
Robb Kambic                                Unidata Program Center
Software Engineer III                      Univ. Corp for Atmospheric Research
address@hidden             WWW: http://www.unidata.ucar.edu/
===============================================================================

---------- Forwarded message ----------
Date: Thu, 8 Jul 1999 15:43:59 -0500 (CDT)
From: Carl Sinclair <address@hidden>
To: address@hidden
Subject: Re: 19990706: ldm -- Solaris 2.6


> X-Authentication-Warning: wcfields.unidata.ucar.edu: rkambic owned process doing -bs
> Date: Thu, 8 Jul 1999 09:00:21 -0600 (MDT)
> From: Robb Kambic <address@hidden>
> To: Carl Sinclair <address@hidden>
> cc: support-ldm <address@hidden>
> Subject: Re: 19990706: ldm -- Solaris 2.6
> MIME-Version: 1.0
> 
> Carl,
> 
> Could you give a brief description of the solution so I can enter it in
> the e-mail archives for future reference.
> 
> Thanks,
> Robb...
> 
> 
> On Thu, 8 Jul 1999, Carl Sinclair wrote:
> 
> > 
> > > X-Authentication-Warning: wcfields.unidata.ucar.edu: rkambic owned process doing -bs
> > > Date: Wed, 7 Jul 1999 14:55:54 -0600 (MDT)
> > > From: Robb Kambic <address@hidden>
> > > To: Carl Sinclair <address@hidden>
> > > cc: support-ldm <address@hidden>
> > > Subject: Re: 19990706: ldm -- Solaris 2.6
> > > MIME-Version: 1.0
> > > 
> > > On Tue, 6 Jul 1999, Unidata Support wrote:
> > > 
> > > > >To: address@hidden
> > > > >cc: address@hidden
> > > > >From: Carl Sinclair <address@hidden>
> > > > >Subject: ldm -- Solaris 2.6
> > > > >Organization: .
> > > > >Keywords: 199907062032.OAA21474
> > > > 
> > > > We've come across a problem: our ldm server machine no longer sends any
> > > > data. We installed a patch on the client machines, but not on the server.
> > > > After the patch was installed, the data stopped. The patch was 105529-07: it
> > > > replaces /kernel/drv/tcp and fixes the "recursive mutex_enter" problem, which
> > > > caused a panic when /usr/ucb/shutdown was run. The previous version was
> > > > 105529-05. I don't know if this is the cause of our data stopping. Have you
> > > > had any reported problems with this patch? We're running out of ideas. Thanks.
> > > 
> > > Carl,
> > > 
> > > I'm not aware of any patch problems like the one you described above. You
> > > didn't mention what platform or OS. I'm assuming it's an HP. I would scan the
> > > ldmd.log file for any unusual messages, as well as the system log files, i.e.
> > > /var/adm/messages or the log file for your system. There has to be some
> > > error message produced.  Also, you can put the ldm in verbose mode by
> > > sending it a USR2 signal, i.e. kill -USR2 <ldmd.pid>.  It's explained in
> > > detail in the man page, "man ldmd".  This should at least give us some clues
> > > to the problem.  Also, did you try to send data to a downstream machine
> > > that didn't have the patch?
> > > 
> > > Robb...
> > > 
> > > 
> > > > 
> > > > 
> > > > Carl Sinclair
> > > > RIDDS Systems Analyst
> > > > NSSL -- Norman, OK
> > > > address@hidden
> > > > 
> > > 
> > > 
> > > ===============================================================================
> > > Robb Kambic                                  Unidata Program Center
> > > Software Engineer III                        Univ. Corp for Atmospheric Research
> > > address@hidden               WWW: http://www.unidata.ucar.edu/
> > > ===============================================================================
> > > 
> > 
> > 
> > We actually found the problem: it's Solaris 2.6. It had nothing to do with
> > the patch. Thanks for your time.
> > 
> > 
> > Carl Sinclair
> > RIDDS Systems Analyst
> > NSSL -- Norman, OK
> > address@hidden
> > 
> 
> 
> ===============================================================================
> Robb Kambic                              Unidata Program Center
> Software Engineer III                    Univ. Corp for Atmospheric Research
> address@hidden                   WWW: http://www.unidata.ucar.edu/
> ===============================================================================
> 

The problem was internal to the way our data was being sent out. It's a UDP
broadcast that is sent in 2 packets: 1450 and 998 bytes long. Our software was
broadcasting the 1450-byte packet and then starting over with the 998-byte
packet as the first and the 1450-byte packet as the second. This is what caused
it to "lock up".  Problem solved.  Thanks again for your time.

Carl Sinclair
RIDDS Systems Analyst
NSSL -- Norman, OK
address@hidden
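
For the archive, the out-of-phase pairing described above can be illustrated
with a small sketch. This is not the RIDDS software. It only assumes, from the
message above, that each logical message is a 1450-byte packet followed by a
998-byte packet. The function name reassemble and the policy of skipping stray
packets are hypothetical choices made for illustration.

```python
FIRST_LEN = 1450   # first packet of each message, per the thread
SECOND_LEN = 998   # second packet of each message, per the thread


def reassemble(packets):
    """Pair packets into messages, skipping any that arrive out of phase.

    Returns (messages, skipped): the complete messages, and the number of
    packets discarded because they did not fit the 1450-then-998 pattern.
    """
    messages = []
    skipped = 0
    pending = None  # a 1450-byte first half waiting for its second half
    for pkt in packets:
        if len(pkt) == FIRST_LEN:
            if pending is not None:
                skipped += 1  # earlier first half never got its partner
            pending = pkt
        elif len(pkt) == SECOND_LEN and pending is not None:
            messages.append(pending + pkt)
            pending = None
        else:
            skipped += 1  # second half with no first half, or unknown size
    return messages, skipped


if __name__ == "__main__":
    a, b = b"A" * FIRST_LEN, b"B" * SECOND_LEN
    in_order = [a, b, a, b]
    swapped = [b, a, b, a]  # each pair sent second-half first
    for name, stream in (("in order", in_order), ("swapped", swapped)):
        msgs, skipped = reassemble(stream)
        print(name, len(msgs), "messages,", skipped, "skipped")
```

A receiver written this way skips a stray packet instead of waiting on it, so
a sender that starts with the wrong half does not stall it. Length alone
cannot tell which message a half belongs to, so in the swapped stream the
halves it joins come from different pairs. A sequence number in each packet
would be needed for that, and the thread does not say whether one exists.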