
Re: 20001002: pqsurf in ldm5.1.2



>To: address@hidden
>cc: address@hidden
>From: David Knight <address@hidden>
>Subject: pqsurf in ldm5.1.2
>Organization: UCAR/Unidata
>Keywords: 200010021701.e92H1Ub04645

Hi David,

> I'm finally getting around to installing ldm5.1.2
> 
> Everything seems to have gone OK, but, I can't seem
> to keep pqsurf running. After the ldm has been up for about a minute
> I get an entry like the following in the ldmd.log,
> and pqsurf dies and never comes back.
> 
> Oct 02 16:32:59 redwood pqsurf[15481]: child 15488 terminated by signal 10
> Oct 02 16:32:59 redwood pqsurf[15481]: Exiting
> Oct 02 16:32:59 redwood pqsurf[15481]:   Queue usage (bytes):   60704
> Oct 02 16:32:59 redwood pqsurf[15481]:            (nregions):     371
> Oct 02 16:32:59 redwood pqsurf[15481]: Number of products 89
> Oct 02 16:32:59 redwood pqsurf[15481]: Number of observations 385
> Oct 02 16:32:59 redwood pqsurf[15481]: Number of dups 7
> 
> I can start it up again manually with extra logging
> 
> pqsurf -xv -l - -d /data1 -q /ldmpq/ldm.pq -Q /ldmpq/pqsurf.pq
> 
> Again, it runs for a time, then stops by signal 10
> Here is the closest I can see to an error message in the
> more verbose logs
> 
>         surf: End of Queue
>         to 20001002155703.503
>         skip a6818ced261609cc1d3dd98c57b1e98b      104 20001002160439.236 
> IDS|DDPLUS 58001  metar PAGA 021600 RRC
>         max_latency 3158.941
>         diff 3614.674
>         heuristic depth break
> 
> 
> I recreated all product queues when I switched
> from using 5.0.8 to 5.1.2
> 
> I'm running Solaris 2.6
> 
> Any ideas, or debugging suggestions?

I haven't seen this before, though we don't regularly run pqsurf here.
We tested it and didn't see any problems, however, and others are using
it successfully with 5.1.2.

After pqsurf crashes, I think you should explicitly recreate pqsurf.pq
again, because it's always possible this queue was left in an
inconsistent state since it was never closed.

When you create pqsurf.pq with pqcreate (or ldmadmin mksurfqueue?),
you may need to explicitly tell it to create more product slots (-S
numprods) in the pqsurf queue, since the computation for the number of
product slots has changed to use a presumed average product size
of 4096 bytes instead of the previous 2048.  Both values are probably
too large for the small text products pqsurf deals with, so the queue
may run out of slots well before it runs out of space.  You can see how
many product slots have been allocated in pqsurf.pq and whether it ran
out by using

 pqmon -q /ldmpq/pqsurf.pq

and looking under the columns labeled

 nprods:        current number of products
 nfree:         number of free (allocated but unused) product slots
 nempty:        number of empty (unallocated) product slots
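
To get a feel for why the default slot count might fall short, here is a
rough sketch of the arithmetic.  It assumes the slot count is simply the
queue size divided by the presumed average product size; the actual
computation in pqcreate may differ.  The queue size and the "small
product" size below are made-up illustrative numbers, not values taken
from your system.

```python
# Rough sketch only: assumes slots = queue size // presumed average size.
# The real pqcreate computation may differ.

def estimated_slots(queue_bytes, avg_product_bytes):
    """Number of product slots if sized as capacity / average product size."""
    return queue_bytes // avg_product_bytes

QUEUE_BYTES = 2_000_000      # hypothetical pqsurf.pq size
SMALL_PRODUCT_BYTES = 200    # hypothetical size of a short text product

old_default = estimated_slots(QUEUE_BYTES, 2048)   # previous presumed average
new_default = estimated_slots(QUEUE_BYTES, 4096)   # presumed average in 5.1.2
needed = estimated_slots(QUEUE_BYTES, SMALL_PRODUCT_BYTES)

print("slots with 2048-byte average:", old_default)
print("slots with 4096-byte average:", new_default)
print("slots needed to fill the queue with small products:", needed)
```

If the real numbers look anything like this, the queue would run out of
slots long before it ran out of bytes, and a larger value passed via
-S numprods would be the thing to try.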

You can also run pqsurf independently from the LDM with error messages going
to standard output to see if that reveals what is happening.  It might
be useful to run pqmon in another window updating every second to see
if it shows anything unusual about what is happening in the pqsurf.pq
file.

Good luck ...

--Russ