
[Support #IEY-365872]: Errors using LDM...



Hi Matthew,

> I am contacting you regarding an error that we keep running into
> using LDM. I do not recognize it and I'm not sure what can be done
> about it.
> 
> We are attempting to send some simple products (read: ascii text file
> or GIF or jpg files in these test cases) from our LDM system(s) at
> Wisconsin to our LDM systems at McMurdo Station, Antarctica.  The
> ldmadmin log files shows the following type of errors:
> 
> Jan 28 18:04:37 herbie amrc.ssec.wisc.edu[16669] NOTE: LDM-6 desired
> product-class: 20080128175307.157 TS_ENDT
> {{EXP,"USAP.AMRC.terminal_aero.*"},{NONE,
> "SIG=cd09242ed76f948c506f23d3178ce02f"}}
> Jan 28 18:04:37 herbie amrc.ssec.wisc.edu[16668] ERROR: readtcp():
> EOF on socket 4
> Jan 28 18:04:37 herbie amrc.ssec.wisc.edu[16668] ERROR: one_svc_run
> (): RPC layer closed connection
> Jan 28 18:04:37 herbie amrc.ssec.wisc.edu[16668] ERROR: Disconnecting
> due to LDM failure; Connection to upstream LDM closed

The above indicates that the downstream LDM process on host "herbie"
started up and tried to connect to an upstream LDM process on host
"amrc.ssec.wisc.edu".  Unfortunately, the TCP socket connection
returned an end-of-file condition to the downstream LDM process, so
the connection failed.
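
As a rough illustration of how those log lines break down, here is a
minimal Python sketch that splits one entry into its fields.  The
pattern is inferred only from the lines quoted above, and the names
LOG_LINE, LogEntry and parse_line are made up for this example; they
are not part of the LDM.

```python
import re
from typing import NamedTuple, Optional

# Pattern inferred only from the log lines quoted in this message; other LDM
# versions or syslog configurations may format entries differently.
LOG_LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "  # e.g. "Jan 28 18:04:37"
    r"(?P<host>\S+) "                                   # host that wrote the entry
    r"(?P<ident>[^\[\s]+)\[(?P<pid>\d+)\] "             # process label and PID
    r"(?P<level>[A-Z]+): "                              # NOTE, ERROR, ...
    r"(?P<message>.*)$"
)


class LogEntry(NamedTuple):
    stamp: str
    host: str
    ident: str
    pid: int
    level: str
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Return the fields of one unwrapped log line, or None if it does not match."""
    m = LOG_LINE.match(line.strip())
    if m is None:
        return None
    return LogEntry(m["stamp"], m["host"], m["ident"], int(m["pid"]),
                    m["level"], m["message"])


# One of the quoted entries, with the e-mail line wrapping undone.
entry = parse_line(
    "Jan 28 18:04:37 herbie amrc.ssec.wisc.edu[16668] ERROR: readtcp(): EOF on socket 4"
)
```

Here entry.host is the machine that wrote the log ("herbie") and
entry.ident is the label of the downstream process, which in these
entries is the name of the upstream host it was talking to.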

The reason for the closing of the connection *might* be found in the
LDM log file on the upstream host (amrc.ssec.wisc.edu).  If no reason
is found, then something outside the two LDM processes caused the
connection to close (we've seen this phenomenon before).  Candidates
include operating-system bugs (particularly the TCP/IP stack), firewall
rules, and misbehaving network routers.
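
One quick way to rule out a firewall that blocks the connection
outright is to try opening a TCP connection from the downstream host
to the upstream LDM port.  The Python sketch below assumes the
upstream LDM listens on port 388 (the port registered for the LDM; a
site may be configured differently), and can_connect is a hypothetical
helper name.  A successful connect only shows that the handshake
completes; it says nothing about connections that are closed later.

```python
import socket

LDM_PORT = 388  # assumption: the registered LDM port; your site may use another


def can_connect(host: str, port: int = LDM_PORT, timeout: float = 10.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # The connection is closed again as soon as the "with" block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example call, to be run on the downstream host:
#   can_connect("amrc.ssec.wisc.edu")
```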

Fortunately, the LDM takes such occurrences in stride and will keep
trying to reconnect.  No data will be lost as long as the connection
is re-established before the data-product in the queue that should be
sent first is overwritten.

Cross your fingers and check the log file on amrc.ssec.wisc.edu for 
messages near the time in question.  If nothing's found, then consider
the route that packets take and what might close the connection.
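
If the upstream log is large, a small script can pull out the entries
around the failure.  The following Python sketch assumes each entry
begins with a 15-character syslog-style timestamp such as
"Jan 28 18:04:37" (as in the lines quoted above), that the year is
known, and that month names are in English.  The function name
lines_near and the sample entries are made up for illustration; real
input would be the lines of the upstream log file.  Remember that the
clocks or time zones of the two hosts may differ.

```python
from datetime import datetime, timedelta
from typing import Iterable, List


def lines_near(lines: Iterable[str], when: datetime, window: timedelta) -> List[str]:
    """Return the log lines whose timestamp lies within `window` of `when`."""
    selected = []
    for line in lines:
        try:
            # Syslog-style timestamps omit the year, so borrow it from `when`.
            stamp = datetime.strptime(f"{when.year} {line[:15]}",
                                      "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # wrapped continuation line or unrecognized format
        if abs(stamp - when) <= window:
            selected.append(line.rstrip("\n"))
    return selected


# Placeholder entries, NOT real LDM output; they only exercise the filter.
sample = [
    "Jan 28 17:40:00 upstream-host example[100] NOTE: placeholder earlier entry",
    "Jan 28 18:04:36 upstream-host example[101] NOTE: placeholder entry near the failure",
    "Jan 28 18:30:00 upstream-host example[102] NOTE: placeholder later entry",
]
hits = lines_near(sample, datetime(2008, 1, 28, 18, 4, 37), timedelta(minutes=2))
```

With the placeholder data, only the second entry falls inside the
two-minute window around 18:04:37.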

> So these errors are from the LDM log file of the requesting system in
> Antarctica (herbie.usap.gov).  The source of the data/product (in
> this example, the product is USAP.AMRC.termina_aero.$day.$time.txt)
> is coming off of the LDM running on amrc.ssec.wisc.edu  We have also
> seen this when herbie.usap.gov feed off of ice.ssec.wisc.edu -
> another LDM system we have back here at Wisconsin.  It doesn't appear
> to be doing this consistently, but we've seen it occur off and on
> over the last few days as we increase requests for products on
> herbie.usap.gov  Any advice or suggestions welcome.
> 
> Thank you!
> 
> Matthew
> 
> ------------------------------------------------------------------------
> Matthew Lazzara -Meteorologist- Antarctic Meteorological Research Center
> 947 Atmospheric, Oceanic and Space Sciences    http://amrc.ssec.wisc.edu
> Space Science and Engineering Center         E-mail: address@hidden
> University of Wisconsin-Madison                    Phone: (608) 262-0436
> 1225 West Dayton Street, Madison, WI 53706           Fax: (608) 263-6738
> ------------------------------------------------------------------------

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: IEY-365872
Department: Support LDM
Priority: Normal
Status: Closed