
19991021: What is an assertion failure?



>From: "Neil R. Smith" <address@hidden>
>Organization: Dept. Meteorology, TAMU
>Keywords: 199910211601.KAA01877

>I've been getting assertion failures:
>
>ldmd.log.2:Oct 20 19:54:40 3Q:coriolis pqsurf[5540]: assertion
>"pq->ctlp->magic == PQ_MAGIC" failed: file "pq.c", line 2515
>
>I've had these before and your advice was to remake the surf
>queue, as it had become corrupt.
>
>I'll remake again, but:
>
>1.  Just what is an assertion failure?
>
>2.  What usually is the cause?  
>
>3.  If the cause is a corrupt queue, what could cause the queue
>to become corrupt?  I've got a mounting number of 
>"surface_split: Can't handle MESSAGE_TYPE_UNKOWN" messages, 
>which at previous levels of occurrence I assumed to be benign 
>relative to queue health.
>
>Thanks,  -Neil
>-- 
>Neil R. Smith                          address@hidden
>Computer Systems Manager               409/845-6272 FAX:409/862-4466
>Dept. Meteorology, Texas A&M Univ.
>

Neil,

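An assertion is a check built into the program that some condition must
be true; when the check fails, the program logs the condition, the file,
and the line, and then aborts. For the product queue, an assertion
failure means that data at a certain queue location is not as expected.
Typically, this is because the queue was corrupted when the LDM was not
shut down cleanly.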
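
As a rough illustration of the kind of check that fired in your log, here
is a minimal sketch in C. It is not the LDM source: DEMO_MAGIC, struct
demo_ctl, demo_ctl_is_sane() and demo_open() are hypothetical names made up
for this example, and the real PQ_MAGIC value and control structure are
defined in pq.c. The idea is only that a known value is stored when the
queue is created, and the program refuses to continue if that value is
later found to be different.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical magic value; the real PQ_MAGIC is defined in the LDM source. */
#define DEMO_MAGIC 0x50514d47u

/* Hypothetical stand-in for the queue's control block. */
struct demo_ctl {
    uint32_t magic;   /* stored when the queue is created */
};

/* Returns 1 if the control block still carries the expected magic number. */
static int demo_ctl_is_sane(const struct demo_ctl *ctl)
{
    return ctl->magic == DEMO_MAGIC;
}

/*
 * Same condition, written as an assertion. If it is false, the standard
 * assert() macro prints the failed expression with the file name and line
 * number, then aborts the program.
 */
static void demo_open(const struct demo_ctl *ctl)
{
    assert(ctl->magic == DEMO_MAGIC);
}
```

Calling demo_open() on a control block whose magic field has been
overwritten would stop the program at the assert() line, which is the same
sort of report as the pqsurf message you quoted.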

When the computer reboots while the LDM is running, data may be only
partially written to the queue. Since data is almost always coming in,
if the LDM was not shut down first, you can expect that a write to the
queue was in progress. The product queue stores header information for
a product, followed by the data. Usually an assertion failure means that
the header says data will be at a certain location, and either it is not
there or it is incomplete.
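
The header/data mismatch can be pictured with a small sketch in C. This is
an assumption-laden illustration, not the actual queue layout: struct
demo_header, its offset and length fields, and demo_product_is_complete()
are hypothetical, and bytes_written simply stands for how much of the queue
file was really written before the interruption.

```c
#include <stddef.h>

/* Hypothetical product header: where the data starts and how long it is. */
struct demo_header {
    size_t offset;   /* byte position where the product data should begin */
    size_t length;   /* number of data bytes the header promises */
};

/*
 * Returns 1 if the region the header describes lies entirely within the
 * bytes that were actually written, 0 if the data is missing or cut short.
 */
static int demo_product_is_complete(const struct demo_header *h,
                                    size_t bytes_written)
{
    return h->offset <= bytes_written &&
           h->length <= bytes_written - h->offset;
}
```

With a header of { 16, 100 }, the check passes once 116 bytes are present
and fails if the write stopped at 60 bytes (data incomplete) or at 8 bytes
(data not there at all).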

The most important thing to do is shut down the LDM before rebooting.
Of course, power failures and unexpected reboots cannot always be avoided.

The unknown message type is not associated with the corrupt queue;
instead, it means that the products being passed to the pqsurf program
are unrecognized.

Steve Chiswell