
Re: 20040920: Possible pqact issue in LDM?



Steven,

>Date: Mon, 20 Sep 2004 12:12:36 -0500
>From: "Steven Danz" <address@hidden>
>Organization: Aviation Weather Center
>To: Steve Emmerson <address@hidden>
>Subject: Re: 20040920: Possible pqact issue in LDM?
>Keywords: 200409091803.i89I3pnJ023109

The above message contained the following:

> The Nagios notification stuff doesn't flag an error until 10 minutes
> have passed, so I would guess it should have run by then.  By the time
> I notice the problem, get on line and grab the queue its usually 15
> minutes or so.  That, and when I dump the queue, notices before and
> after the missing one are listed.
> 
> (Is there a signal for pqact to re-open the log file?  I'd like to set
> it up with -v for a long period into a file, but I don't want to fill
> the disk... thought maybe there was a signal to close/reopen the log
> file)

The command "ldmadmin newlog" can be used at any time to start a new
logfile and remove logfiles that are too old (see "ldmadmin config").

Sending a SIGUSR2 to the pqact(1) process will cause it to rotate
through the logging levels in the order (NOTICE -> INFO -> DEBUG ->
NOTICE ...).
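
As a rough illustration of that rotation, here is a minimal Python sketch.
It is not part of the LDM; the helper names (level_after, signals_needed,
bump_pqact_level) are hypothetical, and it assumes pqact(1) starts at
NOTICE unless told otherwise and that you have already looked up its
process ID yourself.

```python
import os
import signal

# Rotation order described above: NOTICE -> INFO -> DEBUG -> NOTICE ...
LEVELS = ("NOTICE", "INFO", "DEBUG")


def level_after(current, signals_sent):
    """Return the logging level reached after `signals_sent` SIGUSR2 signals."""
    start = LEVELS.index(current)
    return LEVELS[(start + signals_sent) % len(LEVELS)]


def signals_needed(current, target):
    """Return how many SIGUSR2 signals move the level from `current` to `target`."""
    return (LEVELS.index(target) - LEVELS.index(current)) % len(LEVELS)


def bump_pqact_level(pid):
    """Send one SIGUSR2 to `pid`, which the caller asserts is a pqact process."""
    os.kill(pid, signal.SIGUSR2)
```

For example, signals_needed("NOTICE", "DEBUG") is 2, so two signals would
take a process at NOTICE to DEBUG, and one more would bring it back to
NOTICE.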

> Yes and yes.  I grabbed a copy of the queue on one of the periods when
> this happened over the weekend, and if I ran pqact -o <big_number> it
> picked up everything from the first pass and everything that it missed
> as well.

So, pqact(1) doesn't miss data-products when run manually on a saved
product-queue but does miss them in the context of an executing LDM
system.  Is this true?

> Well, that pqcat you listed includes the product, which is GRIB, but
> just looking at the product ID strings they are just ASCII < 127 here.
> That and of course the fact that the second pass works fine.

Regards,
Steve Emmerson