Re: 20011008: IDD latencies at PSU (cont.)



"Arthur A. Person" wrote:
> 
> 
> I switched to sunshine.ssec.wisc.edu last evening and things did not
> improve.  Of real interest, though, is that when I switched, it appears
> that our NEXRAD feed tanked.  To me, this means some sort of configuration
> issue on our ldm machine.  It would seem that too many products are coming
> in and being relayed to downstream sites from one source.
> Previously, NEXRAD was the only data received from sunshine and it only
> gets relayed to a couple of machines.  That data has always been on time.
> Our motherlode received data is currently getting relayed to ~13 machines.
> When I switched the motherlode stuff to sunshine, we started suffering the
> slowness on the NEXRAD sunshine data as well because sunshine data of any
> kind (DDPLUS, HDS, etc...) was then being fed to ~13 machines.  This is my
> theory.  I'll leave its explanation or debunking to you folks.  I suppose
> you could still look at the network, but I believe the NEXRAD data is of
> sufficient volume that we should see a problem with it too if there was a
> networking problem.
> 
> > Or, please do a traceroute to all your upstream sites and send them to
> > us.  We'll probably present them to our network administrators.
> 
> I have a couple of logs going which you can look at:
> 
>      ~ldm/logs/all_stats* -> contain twice-daily cat dumps of *.stats files
>      /var/log/trace*  -> contain traceroute logs to various sites
> 
> I will be out of the office part of today but will check my email later in
> the afternoon.  Let me know if you need anything else from me.
> 
>                                 Thanks.
> 
>                                   Art.
> 


Hi Art,

Mike Schmidt, our system administrator, and I logged on to ldm.meteo
today, and everything looked healthy: the network interface showed no
incoming or outgoing transmission failures, the machine load was low,
and memory usage was good, with very little swap space in use.  You
probably know all this already.
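
For reference, here is roughly what we checked - just a sketch, since
the exact commands and flags vary by OS:

    uptime          # machine load average
    netstat -i      # per-interface packet and error counters
    vmstat 5 5      # memory and swap activity over a few samples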

I'm wondering whether your data volume has increased and whether that
could be having an impact on your connectivity.  I see that you are
requesting CONDUIT data, the NMC2 feed, from motherlode.  That is a
large feed whose volume has grown over the past two weeks because the
reliability of getting products onto the feed has improved.  And you
are propagating that feed around PSU.  As an experiment, you might try
(1) not propagating that feed, then (2) turning off that request.  I
know you hypothesized that outbound volume is affecting inbound
latencies.  I don't see how that could occur, but I suppose it's
possible, so we might as well test it.
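
In ldmd.conf terms the experiment would look something like the sketch
below.  The downstream host pattern is a placeholder - substitute the
entries from your actual configuration:

    # (1) stop propagating CONDUIT: comment out its allow line(s)
    #allow   CONDUIT   ^some-downstream\.psu\.edu$

    # (2) then stop requesting it from motherlode
    #request CONDUIT   ".*"   motherlode.ucar.edu

Then restart the LDM (e.g., "ldmadmin restart") so the changes take
effect.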

We are also wondering about your local campus network, and have several
questions for your network engineers.  The first obvious question: was
anything changed three days ago when this started?  We're particularly
curious about the effect of a firewall.  I also wonder about the double
hop, hops 11 and 12, at ldm.meteo in the traceroutes executed from
here; we've never seen that before.  And what is hop 10?  Here's a
sample:

(anne) imogene:/home/anne 96 % traceroute ldm.meteo.psu.edu
traceroute to ldm.meteo.psu.edu (128.118.28.12), 30 hops max, 38 byte packets
 1  flra-n140 (128.117.140.252)  0.319 ms  0.240 ms  0.223 ms
 2  vbnsr-n2.ucar.edu (128.117.2.252)  0.964 ms  0.679 ms  0.669 ms
 3  internetr-n243-104.ucar.edu (128.117.243.106)  1.352 ms  0.864 ms  1.016 ms
 4  denv-abilene.ucar.edu (128.117.243.126)  1.707 ms  1.788 ms  1.938 ms
 5  kscy-dnvr.abilene.ucaid.edu (198.32.8.14)  12.345 ms  12.369 ms  12.776 ms
 6  ipls-kscy.abilene.ucaid.edu (198.32.8.6)  23.108 ms  21.997 ms  21.567 ms
 7  clev-ipls.abilene.ucaid.edu (198.32.8.26)  27.751 ms  27.635 ms  28.075 ms
 8  abilene.psc.net (192.88.115.122)  31.527 ms  31.532 ms  31.613 ms
 9  penn-state.psc.net (198.32.224.66)  35.777 ms  46.238 ms  36.103 ms
10  * * *
11  ldm.meteo.psu.edu (128.118.28.12)  37.888 ms  37.277 ms  36.992 ms
12  ldm.meteo.psu.edu (128.118.28.12)  37.623 ms  36.748 ms  39.593 ms
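
By default traceroute sends UDP probes, so a filter at hop 10 that
drops UDP probes (or their ICMP TIME_EXCEEDED replies) would explain
the "* * *".  If your traceroute build supports the -I option - that's
an assumption about your system - ICMP ECHO probes might get an answer
where the UDP ones don't:

    traceroute -I ldm.meteo.psu.edu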

Also, for the network folks: are you connected to both Abilene and the
commodity internet?  If so, how is routing handled between the two
networks?  Is either of them ever saturated?

While doing a traceroute from your host, we noticed a hop whose name
includes "FastEthernet" - here's an example:

traceroute to motherlode.ucar.edu (128.117.13.119), 30 hops max, 38 byte packets
 1  128.118.28.1 (128.118.28.1)  0.661 ms  0.553 ms  0.538 ms
 2  Willard1-FastEthernet11-1-0.gw.psu.edu (146.186.163.1)  1.518 ms  1.197 ms  1.148 ms
 3  Telecom2-ATM3-0-0.2.gw.psu.edu (172.30.255.78)  3.099 ms  1.970 ms  1.543 ms
 4  198.32.224.253 (198.32.224.253)  5.564 ms  5.749 ms  5.148 ms
 5  abilene-psc.abilene.ucaid.edu (192.88.115.121)  9.447 ms  9.263 ms  9.276 ms
 6  198.32.8.25 (198.32.8.25)  15.232 ms  15.318 ms  14.864 ms
 7  198.32.8.5 (198.32.8.5)  25.160 ms  24.618 ms  24.577 ms
 8  198.32.8.13 (198.32.8.13)  35.213 ms  35.233 ms  39.464 ms
 9  ncar-abilene.ucar.edu (128.117.243.125)  36.770 ms  38.700 ms  36.744 ms
10  vbnsr-n243-104.ucar.edu (128.117.243.105)  37.167 ms  39.236 ms  36.478 ms
11  mlrb-n2.ucar.edu (128.117.2.254)  37.192 ms  36.713 ms  36.375 ms
12  motherlode.ucar.edu (128.117.13.119)  36.996 ms  37.238 ms  36.649 ms

This caused Mike to wonder whether other ethernet connections along the
path might be only 10 Mb/s.  Although that seems unlikely, you might
ask your network engineers.  Also, are any campus links saturated?
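
If it would help, one way to check a host's negotiated speed and duplex
on Linux is mii-tool (tool availability varies by OS, and "eth0" is
just a placeholder for the actual interface name):

    mii-tool -v eth0    # negotiated link speed and duplex (needs root)
    ifconfig eth0       # a growing collisions counter can indicate a
                        # duplex mismatch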

I wish I could provide you with a more conclusive response!

Anne
-- 
***************************************************
Anne Wilson                     UCAR Unidata Program            
address@hidden                 P.O. Box 3000
                                  Boulder, CO  80307
----------------------------------------------------
Unidata WWW server       http://www.unidata.ucar.edu/
****************************************************