
[LDM #UYH-624598]: LVS realserver switching loses data



Art,

> Yes.  Here are the log entries:
> 
> Oct 23 15:25:52 iddrs2 ls2.meteo.psu.edu(feed)[8854] NOTE: Starting Up(6.4.5/6): 20061023152504.577 TS_ENDT {{CONDUIT,  ".*"}}, Primary
> Oct 23 15:25:52 iddrs2 ls2.meteo.psu.edu(feed)[8854] NOTE: topo:  ls2.meteo.psu.edu {{CONDUIT, (.*)}}
> Oct 23 17:09:43 iddrs2 ls2.meteo.psu.edu(feed)[8854] NOTE: feed or notify failure; HEREIS: RPC:Unable to send; errno = Broken pipe
> Oct 23 17:09:43 iddrs2 rpc.ldmd[12779] NOTE: child 8854 exited with status 7
> Oct 23 17:09:44 iddrs2 ls2.meteo.psu.edu(feed)[9486] NOTE: Starting Up(6.4.5/6): 20061023170904.844 TS_ENDT {{CONDUIT,  ".*"}}, Primary
> Oct 23 17:09:44 iddrs2 ls2.meteo.psu.edu(feed)[9486] NOTE: topo:  ls2.meteo.psu.edu {{CONDUIT, (.*)}}

The format of the second "Starting Up" message makes me suspect that the LDM on 
ls2.meteo.psu.edu is running a version earlier than 6.4.0.  Those versions had a 
bug that could cause them to miss data during a reconnection under certain 
circumstances.  Can you upgrade the LDM on ls2.meteo.psu.edu?
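
If it helps when checking other reconnections, here is a minimal Python sketch 
(not part of the LDM) that pairs each "feed or notify failure" entry with the 
next "Starting Up" entry.  It assumes each log entry sits on a single line, 
that the first field after "Up(...)" is the start time the downstream asked 
for, and that the syslog year is supplied by the caller; find_reconnects and 
the two patterns are illustrative names, not LDM interfaces.

```python
import re
from datetime import datetime

# Syslog-style prefix as seen in the quoted entries, e.g.
# "Oct 23 17:09:43 iddrs2 ls2.meteo.psu.edu(feed)[8854] NOTE: ..."
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(?P<host>\S+)\s+"
    r"(?P<proc>[^\[\s]+)\[(?P<pid>\d+)\]\s+NOTE:\s+(?P<msg>.*)$"
)
# Assumption: the 14 digits after "Up(...):" are the requested start time.
START_RE = re.compile(r"^Starting Up\([^)]*\):\s+(?P<from>\d{14})(?:\.\d+)?\s")


def parse_stamp(stamp, year):
    """Convert a syslog timestamp (which carries no year) to a datetime."""
    normalized = " ".join(stamp.split())  # collapse padding such as "Oct  3"
    return datetime.strptime(f"{year} {normalized}", "%Y %b %d %H:%M:%S")


def find_reconnects(lines, year=2006):
    """Return (failure_time, restart_time, requested_from) tuples.

    Each tuple pairs a feed failure with the next "Starting Up" entry.
    Comparing requested_from with failure_time is only meaningful if the
    log timestamps and the request time are in the same time zone.
    """
    reconnects = []
    last_failure = None
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # wrapped fragment or unrelated line
        when = parse_stamp(m["stamp"], year)
        msg = m["msg"]
        if "feed or notify failure" in msg:
            last_failure = when
            continue
        s = START_RE.match(msg)
        if s and last_failure is not None:
            requested_from = datetime.strptime(s["from"], "%Y%m%d%H%M%S")
            reconnects.append((last_failure, when, requested_from))
            last_failure = None
    return reconnects
```

Feeding it the entries you sent (rejoined onto single lines) should yield one 
reconnection, which makes it easy to see what start time the downstream 
requested when it came back.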

> There were no log entries on the other two realservers,
> idd-ingest.meteo.psu.edu or iddrs3.meteo.psu.edu, at that time for ls2
> CONDUIT connections.
> 
> FYI, I ran a test this morning pulling CONDUIT data from ldm.meteo.psu.edu
> (our virtual LDM service) during which I halted the network on the test
> machine (at a time when our CONDUIT latencies were high) and restarted it
> to see what would happen... it worked perfectly, restarting with the next
> product that came in.  Tomorrow I will try again with all our feeds coming
> onto the test system to see if that has any effect.
> 
> Art
> 
> Arthur A. Person
> Research Assistant, System Administrator
> Penn State Department of Meteorology
> email:  address@hidden, phone:  814-863-1563

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: UYH-624598
Department: Support LDM
Priority: Normal
Status: On Hold