
Re: 20050928: LDM errors



Robert,

>Date: Thu, 29 Sep 2005 08:42:04 -0600
>From: Tom Yoksas <address@hidden>
>Organization: UCAR/Unidata
>To: "Vehorn, Robert CIV SPAWARSYSCEN Charleston SC J672" <address@hidden>
>Subject: 20050928: LDM errors 

The above message contained the following:

> >Another problem with the
> >configuration is that some sites are running behind very strict
> >firewalls, such that incoming LDM connections are not possible.

How strict are the firewall rules?  If they allow, for example, incoming
ssh or sendmail connections, then they should allow incoming LDM
connections as well because the LDM is at least as secure as sendmail.

> >These sites use 'pqsend' to push products downstream.

pqsend(1) is far less efficient in terms of throughput than the LDM
server because it uses a time-consuming synchronous protocol whereas 
the server uses a fast asynchronous protocol.
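As a rough illustration of why that matters, here is a back-of-the-envelope model. It is a sketch, not a measurement: it assumes the synchronous sender waits one network round trip per product before sending the next, while the asynchronous sender streams products back to back and pays the round trip only once. The function names and all of the numbers are hypothetical.

```python
def synchronous_seconds(n_products, rtt_s, size_bytes, bandwidth_bps):
    """Each product costs its transfer time plus one full round trip."""
    return n_products * (rtt_s + size_bytes * 8 / bandwidth_bps)


def asynchronous_seconds(n_products, rtt_s, size_bytes, bandwidth_bps):
    """Products are streamed back to back; the round trip is paid once."""
    return rtt_s + n_products * size_bytes * 8 / bandwidth_bps


# Hypothetical link: 1000 products of 10 kB, 50 ms round trip, 10 Mbit/s.
sync_t = synchronous_seconds(1000, 0.05, 10_000, 10e6)
async_t = asynchronous_seconds(1000, 0.05, 10_000, 10e6)
print(f"synchronous: {sync_t:.2f} s, asynchronous: {async_t:.2f} s")
```

Under these assumptions the round-trip term dominates the synchronous total, so the gap widens as latency grows or as products get smaller.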

> >I have 2 such machines
> >at SPAWAR in Charleston, SC (SSCC), that need to send data to the
> >top-level server at the University of Wisconsin.  Here are the
> >applicable lines from my config:
> 
>   ## exec
>   exec "pqsend -h ice.ssec.wisc.edu -f EXP"
>   ## requests
>   REQUEST  EXP  ^USAP.(SSCC|NZCM) ice.ssec.wisc.edu PRIMARY
>   REQUEST  EXP  ^USAP.NCAR.GRIB.(D1|D2) ice.ssec.wisc.edu PRIMARY

You should quote the periods in the above by preceding each one
with a backslash (i.e., "\.") if the periods are actually in the
product-identifiers.  An unquoted period matches any character.
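To see the difference, here is a small sketch using Python's `re` module as a stand-in for the LDM's own extended-regular-expression matching (the two agree for patterns this simple, but that equivalence is an assumption). The first identifier comes from the log in this message; the second is a made-up one with no period after "USAP".

```python
import re

# The pattern as written in the REQUEST line, and with the period quoted.
unescaped = re.compile(r"^USAP.(SSCC|NZCM)")
escaped = re.compile(r"^USAP\.(SSCC|NZCM)")

real_id = "USAP.SSCC.AWS.Z601.20050928.1814"    # identifier from the log
lookalike = "USAPXSSCC.AWS.Z601.20050928.1814"  # hypothetical identifier

for name, pattern in (("unescaped", unescaped), ("escaped", escaped)):
    # Report whether each pattern selects each identifier.
    print(name, bool(pattern.search(real_id)), bool(pattern.search(lookalike)))
```

The unquoted pattern selects both identifiers; the quoted one selects only the identifier that really has a period after "USAP".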

> >The server at UW has an 'accept' entry for us, and 'pqsend' is able to
> >connect initially.  The errors occur whenever the local LDM (SSCC)
> >tries to send any data to the server at UW.  Here is what the log looks
> >like (process 22153 is 'pqsend'):
> 
>   Sep 28 19:05:28 atslab-ldm rpc.ldmd[22150] NOTE: Starting Up (version: 6.4.1; built: Aug  4 2005 22:47:06)
>   Sep 28 19:05:28 atslab-ldm rpc.ldmd[22150] NOTE: Using local address 0.0.0.0:388
>   Sep 28 19:05:28 atslab-ldm pqact[22151] NOTE: Starting Up
>   Sep 28 19:05:28 atslab-ldm rtstats[22152] NOTE: Starting Up (22150)
>   Sep 28 19:05:28 atslab-ldm ice.ssec.wisc.edu[22153] NOTE: Starting Up (22150)
>   Sep 28 19:05:28 atslab-ldm ice[22156] NOTE: Starting Up(6.4.1): ice.ssec.wisc.edu:388 20050928180528.218 TS_ENDT {{EXP,  "^USAP.NZCM"}}
>   Sep 28 19:05:28 atslab-ldm ice[22156] NOTE: LDM-6 desired product-class: 20050928180528.219 TS_ENDT {{EXP,  "^USAP.NZCM"}}
>   Sep 28 19:05:28 atslab-ldm ice[22154] NOTE: Starting Up(6.4.1): ice.ssec.wisc.edu:388 20050928180528.266 TS_ENDT {{EXP,  "^USAP.NCAR.GRIB.(D1|D2)"}}
>   Sep 28 19:05:28 atslab-ldm ice[22154] NOTE: LDM-6 desired product-class: 20050928180528.268 TS_ENDT {{EXP,  "^USAP.NCAR.GRIB.(D1|D2)"}}
>   Sep 28 19:05:35 atslab-ldm ice[22154] NOTE: Upstream LDM-6 on ice.ssec.wisc.edu is willing to be a primary feeder
>   Sep 28 19:05:35 atslab-ldm ice[22156] NOTE: Upstream LDM-6 on ice.ssec.wisc.edu is willing to be a primary feeder
>   Sep 28 19:05:37 atslab-ldm ice.ssec.wisc.edu[22153] ERROR: ship: RPC: Remote system error:    15780 20050928182001.257     EXP 000  USAP.NCAR.GRIB.D1.2005092812.F018.002M.MIXR
>   Sep 28 19:06:09 atslab-ldm ice.ssec.wisc.edu[22153] ERROR: sign_on(ice.ssec.wisc.edu): can't contact portmapper: RPC: Timed out
>   Sep 28 19:06:25 atslab-ldm ice.ssec.wisc.edu[22153] ERROR: ship: RPC: Remote system error:     89 20050928181601.081     EXP 000  USAP.SSCC.AWS.Z601.20050928.1814
>   Sep 28 19:07:05 atslab-ldm ice.ssec.wisc.edu[22153] ERROR: sign_on(ice.ssec.wisc.edu): can't contact portmapper: RPC: Timed out
>   
> >I've seen the 'sign_on' errors before and assumed they occurred because
> >we were only sending data every 15 minutes and something had timed-out,
> >but the 'Remote system error' just started to occur after the servers
> >at UW were upgraded to version 6.4.1.  Note that the first product that
> >'pqsend' is trying to transfer is a grib file that was just received,
> >and the second product was produced locally.
> >
> >Can anyone shed any light on what is causing these errors?

I would expect this problem if UW were running LDM 6.4.0, which has a
bug in the way it handles data-products sent via the HIYA mechanism.
Are you absolutely positive they're running version 6.4.1?

>   REQUEST  EXP  ^USAP.NCAR.GRIB.(D1|D2) ice.ssec.wisc.edu PRIMARY or
>   REQUEST  EXP  "USAP.NCAR.GRIB.(D1|D2).*" ice.ssec.wisc.edu PRIMARY
> 
> The first regular expression is best.  The second one's inclusion of a
> '.*' at the end is not needed as it is assumed.  Steve calls this type
> of regular expression pathological.

I, too, would use the first extended regular expression -- although I
would quote the periods if they're actually in the product-identifiers.
Pathological regular expressions have a ".*" prefix and are truly evil.
:-)
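For what it's worth, here is a minimal sketch of why neither the trailing nor the leading ".*" changes which products are selected. It uses Python's `re.search` to mimic "the pattern may match anywhere in the identifier", which is an assumption standing in for the LDM's own matching. The `selected` helper, the `leading` pattern, and the third identifier are hypothetical; the first two identifiers come from the log above.

```python
import re

ids = [
    "USAP.NCAR.GRIB.D1.2005092812.F018.002M.MIXR",  # from the log
    "USAP.SSCC.AWS.Z601.20050928.1814",             # from the log
    "USAP.NCAR.GRIB.D3.2005092812.F018",            # hypothetical identifier
]


def selected(pattern, identifiers):
    """Return the identifiers that contain a match for the pattern."""
    rx = re.compile(pattern)
    return [i for i in identifiers if rx.search(i)]


plain = r"^USAP\.NCAR\.GRIB\.(D1|D2)"
trailing = plain + ".*"               # same pattern with a trailing ".*"
leading = r".*NCAR\.GRIB\.(D1|D2)"    # hypothetical pattern with a ".*" prefix
unanchored = r"NCAR\.GRIB\.(D1|D2)"   # the same pattern without the prefix

# A trailing ".*" selects exactly the same identifiers as the plain pattern.
assert selected(plain, ids) == selected(trailing, ids)
# A leading ".*" selects exactly the same identifiers as leaving it off.
assert selected(leading, ids) == selected(unanchored, ids)
print(selected(plain, ids))
```

In both cases the ".*" adds nothing to the selection, so the shorter pattern is the one to use.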

Regards,
Steve Emmerson