
[LDM #QUM-298362]: primary/secondary



Karen,

> I'm seeing some errors in the ldm logs, the upstream shows, but not
> anything I don't expect with the "flapping":
> 
> May 01 16:29:20 benjy juno.protect.nssl(feed)[32271] NOTE: topo:
> juno.protect.nssl {{EXP, (.*)}}
> May 01 16:34:18 benjy juno.protect.nssl(feed)[32681] NOTE: Starting
> Up(6.8.1/6): 20120501163417.193 TS_ENDT {{EXP,  "NSE64"}},
> SIG=084f5966bb83c8feab7a658d86a0798b, Alternate
> May 01 16:34:18 benjy juno.protect.nssl(feed)[32681] NOTE: topo:
> juno.protect.nssl {{EXP, (.*)}}

I must admit, I do not understand why upstream LDM process 32681 would have the 
regular expression "NSE64" in its "Starting Up" message but the regular 
expression "(.*)" in its "topo:" message.

> May 01 16:34:47 benjy juno.protect.nssl(feed)[31450] ERROR: Couldn't flush
> connection; nullproc_6() failure to juno.protect.nssl: RPC: Unable to
> receive; errno = Connection reset by peer
> May 01 16:37:12 benjy juno.protect.nssl(feed)[29672] NOTE: feed or notify
> failure; Error sending BLKDATA: RPC: Unable to send; errno = Broken pipe
> May 01 16:37:13 benjy juno.protect.nssl(feed)[524] NOTE: Starting
> Up(6.8.1/6): 20120501163712.607 TS_ENDT {{EXP,  "satellite/CloudCover/"}},
> SIG=263d47555344cb7fa906f5aebe885056, Primary
> May 01 16:37:13 benjy juno.protect.nssl(feed)[524] NOTE: topo:
> juno.protect.nssl {{EXP, (.*)}}

Hmm... Same regular expression disparity in the log messages from upstream LDM 
process 524.

> May 01 16:39:18 benjy juno.protect.nssl(feed)[32271] NOTE: feed or notify
> failure; COMINGSOON: RPC: Unable to receive; errno = Connection reset by
> peer
> May 01 16:39:18 benjy juno.protect.nssl(feed)[640] NOTE: Starting
> Up(6.8.1/6): 20120501163917.980 TS_ENDT {{EXP,
> "multi/MergedReflectivityQC/"}}, SIG=f2025453ac2c9147c15d2f604c996bb3,
> Primary
> 
> The downstream just shows the flapping between the 2 upstream servers:
> 
> May 01 16:39:17 juno 172.16.5.73[3376] NOTE: LDM-6 desired product-class:
> 20120501163917.908 TS_ENDT {{EXP,  "multi/MergedReflectivityQC/"},{NONE,
> "SIG=f2025453ac2c9147c15d2f604c996bb3"}}
> May 01 16:39:17 juno 172.16.5.73[3376] NOTE: Upstream LDM-6 on 172.16.5.73
> is willing to be an alternate feeder
> May 01 16:39:17 juno 172.16.5.74[3375] NOTE: Switching data-product
> transfer-mode to primary
> May 01 16:39:17 juno 172.16.5.74[3375] NOTE: LDM-6 desired product-class:
> 20120501163917.980 TS_ENDT {{EXP,  "multi/MergedReflectivityQC/"},{NONE,
> "SIG=f2025453ac2c9147c15d2f604c996bb3"}}
> May 01 16:39:17 juno 172.16.5.74[3375] NOTE: Upstream LDM-6 on 172.16.5.74
> is willing to be a primary feeder

The downstream LDM log messages are exactly what I would expect to see. The two 
downstream LDMs switch modes at nearly the same time and specify the same 
signature for the last successfully received data-product.

Are you certain that the two upstream LDM systems are receiving exactly the 
same data-products?
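
One way to check this is to collect the data-product signatures (MD5 
checksums) seen on each upstream over the same interval and compare the two 
sets. The following is only a minimal sketch: the function name, the 
variable names, and the short signature values are hypothetical, and how the 
signatures are gathered is left to you.

```python
def compare_upstreams(sigs_a, sigs_b):
    """Return the signatures that only one of the two upstreams has seen."""
    a, b = set(sigs_a), set(sigs_b)
    return {"only_a": sorted(a - b), "only_b": sorted(b - a)}


# Hypothetical input: the first signature appears in the logs above,
# the short ones are made up purely for illustration.
upstream_73 = ["f2025453ac2c9147c15d2f604c996bb3", "aaaa", "bbbb"]
upstream_74 = ["f2025453ac2c9147c15d2f604c996bb3", "aaaa"]

diff = compare_upstreams(upstream_73, upstream_74)
print(diff)
```

If both lists in the result are empty, the two upstreams saw the same set of 
data-products during that interval.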

> When I had the downstream requesting from both upstreams, I run a notifyme
> on the downstream against all 3 queues and could see all data arriving in
> the 2 upstreams queues, but not arriving in the local queue.
> 
> There are roughly 31 levels (each a separate file) of data for a product
> that arrives every 5 minutes in a burst.  What I was getting was sometimes
> all 31 levels, but at other times only 1 level, sometimes more, like 5 or
> 9... but not all 31.  The regular expression matches all of the levels
> inclusively.
> 
> When I switched to request from only 1 upstream, I am now getting all of
> the data.
> 
> I'm attempting the "trick" now of requesting from both with slightly
> different patterns, so, I will see how that goes, but if I remember
> correctly that doubles the bandwidth used, as the downstream actually gets
> data from both upstreams, and then rejects it from the queue, right?  Not
> exactly an ideal option for me due to the amount of data I'm pushing
> around.

The REQUEST trick causes both connections to be in PRIMARY mode and, 
consequently, increases the bandwidth used to slightly less than double.
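
For illustration only, such a pair of entries in the downstream's ldmd.conf 
might look like the sketch below. The feedtype, pattern, and upstream 
addresses are taken from the log messages above; the choice of wrapping one 
pattern in parentheses to make it syntactically different but equivalent in 
what it matches is an assumption, not necessarily what you have configured.

```
# ldmd.conf on the downstream host (illustrative sketch)
# The two patterns match the same products but are not identical strings.
REQUEST EXP "multi/MergedReflectivityQC/"   172.16.5.73
REQUEST EXP "(multi/MergedReflectivityQC/)" 172.16.5.74
```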

> --
> "Climate is what you expect, weather is what you get."
> -- Robert A. Heinlein
> 
> -------------------------------------------
> address@hidden
> 
> Phone:  405-325-6982
> Cell:   405-834-8559
> INDUS Corporation
> National Severe Storms Laboratory

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: QUM-298362
Department: Support LDM
Priority: Normal
Status: Closed