[ldm-users] LDM TCP error Problem Solved

Everyone-

I posted long ago about our LDM continuously dropping connections, and I was
never able to figure out the problem until the other day. Here is an example
of the errors we were receiving almost every minute, probably driving our
upstream LDM admins crazy for the past two years:

Jan 07 05:00:06 twister pluto.met.fsu.edu[25362] NOTE: Upstream LDM-6 on 
pluto.met.fsu.edu is willing to be a primary feeder 
Jan 07 05:00:22 twister idd.unl.edu[25370] ERROR: readtcp(): select() timeout 
on socket 4 
Jan 07 05:00:22 twister idd.unl.edu[25370] ERROR: one_svc_run(): RPC layer 
closed connection 
Jan 07 05:00:22 twister idd.unl.edu[25370] ERROR: Disconnecting due to LDM 
failure; Connection to upstream LDM closed 
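
For anyone who wants to see how often this is happening on their own machine, here is a rough sketch that counts the select() timeout errors per upstream host. It assumes each entry sits on a single line in the real log file (the wrapping above comes from email) and that the format matches the excerpt. The names LINE_RE and count_timeouts are just illustrative and are not part of the LDM.

```python
import re
from collections import Counter

# Pattern for one log entry, modeled only on the excerpt above:
# "<Mon DD HH:MM:SS> <local host> <upstream host>[<pid>] <LEVEL>: <message>"
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3} \d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<upstream>[^\s\[]+)\[(?P<pid>\d+)\] "
    r"(?P<level>[A-Z]+): (?P<msg>.*)$"
)


def count_timeouts(lines):
    """Count ERROR entries mentioning 'select() timeout', keyed by upstream host."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that does not look like the excerpt
        if match.group("level") == "ERROR" and "select() timeout" in match.group("msg"):
            counts[match.group("upstream")] += 1
    return counts


# Entries copied from the excerpt above, joined back onto single lines.
sample = [
    "Jan 07 05:00:06 twister pluto.met.fsu.edu[25362] NOTE: Upstream LDM-6 on "
    "pluto.met.fsu.edu is willing to be a primary feeder",
    "Jan 07 05:00:22 twister idd.unl.edu[25370] ERROR: readtcp(): select() timeout "
    "on socket 4",
    "Jan 07 05:00:22 twister idd.unl.edu[25370] ERROR: one_svc_run(): RPC layer "
    "closed connection",
]

if __name__ == "__main__":
    for upstream, n in count_timeouts(sample).items():
        print(upstream, n)
```

To run it against a real log, pass an open file object to count_timeouts() instead of the sample list.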

The problem was with our firewall, not the configuration of the LDM. There is
a TCP extension called "TCP window scaling" which lets the receiving end of a
connection advertise a receive window (the amount of data it can accept before
the sender must wait for an acknowledgment) larger than the 64 KB that the
16-bit window field allows. It has been gradually implemented by various
operating systems over the past few years but was not supported by the current
kernel on our BSD firewall, so when machines on our network used it during a
transfer the connection broke. Since installing our new firewall, which
supports the extension, we no longer see timeout errors and dropped data in
our LDM logs.
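
To make the arithmetic concrete, here is a small sketch of how window scaling works. The TCP header carries a 16-bit window value, and both ends agree on a shift count (at most 14) during the handshake; the real window is the advertised value shifted left by that count. The numbers below are hypothetical. The idea that a device which ignores the shift count ends up treating valid data as outside the window is my assumption about the failure mode; all I know for certain is that our old firewall kernel did not support the extension.

```python
MAX_SHIFT = 14  # largest shift count the TCP window scale option allows


def effective_window(advertised, shift):
    """Return the real receive window in bytes for a 16-bit advertised value."""
    if not 0 <= advertised <= 0xFFFF:
        raise ValueError("advertised window must fit in 16 bits")
    if not 0 <= shift <= MAX_SHIFT:
        raise ValueError("shift count must be between 0 and 14")
    return advertised << shift


def fits_in_window(bytes_in_flight, advertised, shift):
    """True if the unacknowledged data fits in the window as this observer sees it."""
    return bytes_in_flight <= effective_window(advertised, shift)


# Hypothetical values: the receiver advertises 8192 with a shift count of 7.
advertised, shift = 8192, 7
in_flight = 100_000  # bytes the sender has sent but not yet had acknowledged

# The two endpoints apply the negotiated shift.
endpoint_view = fits_in_window(in_flight, advertised, shift)

# A middlebox that does not understand the option effectively uses shift 0.
middlebox_view = fits_in_window(in_flight, advertised, 0)

if __name__ == "__main__":
    print("real window:", effective_window(advertised, shift))
    print("endpoints accept the data:", endpoint_view)
    print("scaling-unaware device accepts the data:", middlebox_view)
```

With these numbers the endpoints see a 1 MB window and the data fits, while a device that ignores the shift count sees only 8192 bytes and would consider the same data out of bounds.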

Hope this helps someone down the road.

To watching the LDM in peace,

Phil Birnie
Department of Geography 
The Ohio State University
(614)519-6176


