
20030826: IDD latencies and UNIWISC feed not available to rossby



>From: Mike Voss <address@hidden>
>Organization: SJSU
>Keywords: 200308251458.h7PEwJLd022746 IDD packet shaping UNIWISC

Mike,

>We have finally fixed our network problem, although the exact cause
>is still not known. Basically, every router between me and the campus
>firewall was rebooted. It's not traffic shaping (in fact I
>have 5 Mbps guaranteed through the firewall on port 388), and not really
>congestion related, just gummed-up switches and/or routers.
>(Of course this is the first day of school!!)

I am glad to hear that your feed is back to almost normal.  It would
have been nice to know that you are guaranteed 5 Mbps for port 388
traffic, however; this is the first I have heard of it.  The symptoms
of your problem (rtstats time series traces of latencies) were virtually
identical to those in cases where packet shaping was the culprit.

>The other problem I had all day, and one I should have asked about
>earlier, is that I can't get the UNIWISC feed for some reason. I tried
>all sorts of things: rebuilding queues, altering my request line, etc.
>When I do a "notifyme" on all my feed types I see data available, but
>when I check UNIWISC I get:
>rossby:~/logs>notifyme -vl - -h thelma.ucar.edu -f UNIWISC
>Aug 26 09:36:01 notifyme[17953]: Starting Up: thelma.ucar.edu:
>20030826093601.608 TS_ENDT {{UNIWISC,  ".*"}}
>Aug 26 09:36:01 notifyme[17953]: Connected to upstream LDM-5
>Aug 26 09:36:01 notifyme[17953]: NOTIFYME(thelma.ucar.edu): OK
>------
>...but then nothing.
>My log shows that the request was good:
>----snip-----
>Aug 26 09:16:58 rossby thelma[17747]: Desired product class:
>20030826081653.511 
>TS_ENDT {{UNIWISC,  "^pnga2area"}}
>Aug 26 09:16:58 rossby thelma[17747]: Connected to upstream LDM-6
>Aug 26 09:16:58 rossby thelma[17747]: Upstream LDM is willing to feed
>-----snip----
>
>It's almost 3 am by now so I'm probably missing something obvious. Any
>ideas on why I can't seem to get UNIWISC from thelma?
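
The two log excerpts above carry different product-class specifications: the notifyme run asks for every UNIWISC product (".*"), while the feed request asks only for products whose identifier starts with "pnga2area". As a minimal sketch, assuming the `{{FEEDTYPE,  "pattern"}}` layout shown in those excerpts, the specification can be pulled out of a log line and compared against a product identifier. The function names and the sample identifiers are hypothetical, and Python's `re` module is only an approximation of the regular-expression flavor the LDM itself uses.

```python
import re

# Matches the class specification printed in the log excerpts above,
# e.g.  TS_ENDT {{UNIWISC,  "^pnga2area"}}
CLASS_SPEC = re.compile(
    r'\{\{\s*(?P<feedtype>[A-Za-z0-9_|]+)\s*,\s*"(?P<pattern>[^"]*)"\s*\}\}'
)


def parse_class_spec(text):
    """Return (feedtype, pattern) found in a log excerpt, or None if absent."""
    match = CLASS_SPEC.search(text)
    if match is None:
        return None
    return match.group("feedtype"), match.group("pattern")


def pattern_matches(pattern, product_id):
    """Check a product identifier against a request pattern (Python regex)."""
    return re.search(pattern, product_id) is not None


if __name__ == "__main__":
    notifyme_line = '20030826093601.608 TS_ENDT {{UNIWISC,  ".*"}}'
    request_line = 'TS_ENDT {{UNIWISC,  "^pnga2area"}}'
    print(parse_class_spec(notifyme_line))
    print(parse_class_spec(request_line))
    # Hypothetical product identifier, used only to exercise the pattern.
    print(pattern_matches("^pnga2area", "pnga2area example-product"))
```

Since ".*" matches any identifier, a notifyme run that reports OK and then prints nothing suggests that no UNIWISC products were reaching the upstream host, rather than that the pattern was filtering them out.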

There was a problem elsewhere.  Realtime stats volume plots for the
UNIWISC feed for thelma and atm.geo.nsf.gov (another top level IDD
relay node) show no UNIWISC data, even though the plot for the top
level relay for the feed, unidata2.ssec.wisc.edu, shows that the feed
was available:

UNIWISC on thelma:
http://www.unidata.ucar.edu/cgi-bin/rtstats/iddstats_vol_nc?UNIWISC+thelma.ucar.edu

UNIWISC on atm:
http://www.unidata.ucar.edu/cgi-bin/rtstats/iddstats_vol_nc?UNIWISC+atm.geo.nsf.gov

UNIWISC on unidata2:
http://www.unidata.ucar.edu/cgi-bin/rtstats/iddstats_vol_nc?UNIWISC+unidata2.ssec.wisc.edu

Investigations as to why neither of these top level IDD relay nodes was
getting the data are now underway.

Tom
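
The three volume-plot links in the message above share one pattern: the rtstats CGI path, then the feed type, a "+", and the host name. A minimal sketch, assuming only that pattern as it appears in the links, builds the same URLs for a list of hosts so the top level relays can be compared side by side. The function name is hypothetical, and the URLs are reproduced as written in 2003; they may no longer resolve.

```python
# CGI path copied from the links quoted in the message above.
RTSTATS_VOLUME_CGI = "http://www.unidata.ucar.edu/cgi-bin/rtstats/iddstats_vol_nc"


def volume_plot_url(feedtype, host):
    """Build a volume-plot URL in the form used by the quoted links."""
    return f"{RTSTATS_VOLUME_CGI}?{feedtype}+{host}"


if __name__ == "__main__":
    # Hosts named in the message above.
    hosts = ["thelma.ucar.edu", "atm.geo.nsf.gov", "unidata2.ssec.wisc.edu"]
    for host in hosts:
        print(volume_plot_url("UNIWISC", host))
```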

>From address@hidden Mon Aug 25 17:38:10 2003

>While this might be associated with packet shaping, the latest round of 
>worms and viruses has brought Texas A&M to a near halt.  We're seeing 
>frequent network drops, and lots of latency.  Congestion is the key cause.

>One of the network symptoms of what we're seeing is, literally, hundreds 
>of millions of ping (ICMP echo) packets being sent out at line-rate by 
>XP systems.  We're taking draconian steps to get our network back under 
>control.  I suspect if we're seeing it, others are, too.  This could 
>well be the problem.

>I expect the next couple of days are going to be pretty rough for all of 
>us; the network folks have their hands full and we're going to see really 
>ugly network performance.

>For what it's worth, wherever TAMU has seen high levels of ICMP, we've 
>turned off the managed ports for those machines.  For the places with 
>shared hubs instead of really managed ports, we've had to cut department 
>connections and a few building connections.

>Our classes don't start for a week, but the network folks are trying 
>desperately to get things under control before then.  Right now, we 
>merely have dorm move-in, which is what coincided with our burst in 
>congestion.  Since we didn't see such a burst last year and our 
>network's got more capacity now, I expect what we're seeing is the 
>effect of sheltered student machines coming on-campus, being exposed to 
>the wilds of the internet, and becoming infected.  Oh... an interesting 
>metric.  The half-life of a Windows XP machine not backed up by a decent 
>firewall is about 30 seconds here...

>Gerry

>From address@hidden Tue Aug 26 09:51:59 2003

Tom et al,

Thanks for your assistance. We have fixed our network problem, which
turned out to be a "gummed up" router, for lack of a better explanation.
I basically ran two tests to convince our network folks that "my"
problem was "our" problem: 1) I ran ping statistics to show that
dropped packets did not happen until I got to the router, and 2) I
showed them ftp download times from machines before and after the
router, which indicated download speeds of ~1000 Kbytes/sec before it
as opposed to the terrible ~35 Kbytes/sec I was getting after it (this
is why I was asking people to ftp to motherlode, so I would have some
benchmarks). Anyhow, we're back in business, thanks again,

-Mike
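
The ftp comparison in the message above comes down to simple arithmetic on the two reported rates. A minimal sketch, using the approximate figures quoted there (~1000 Kbytes/sec before the router, ~35 Kbytes/sec after it) and a hypothetical 10,000 Kbyte test file, shows the size of the gap. The function names and the file size are illustrative assumptions, not part of the original tests.

```python
def slowdown_factor(good_kbytes_per_s, bad_kbytes_per_s):
    """How many times slower the degraded path is than the healthy one."""
    return good_kbytes_per_s / bad_kbytes_per_s


def transfer_seconds(size_kbytes, rate_kbytes_per_s):
    """Time to move a file of the given size at a steady rate."""
    return size_kbytes / rate_kbytes_per_s


if __name__ == "__main__":
    before_router = 1000.0  # Kbytes/sec, approximate figure from the message
    after_router = 35.0     # Kbytes/sec, approximate figure from the message
    test_file = 10_000.0    # Kbytes, hypothetical file size for illustration

    print(f"slowdown: {slowdown_factor(before_router, after_router):.1f}x")
    print(f"before router: {transfer_seconds(test_file, before_router):.0f} s")
    print(f"after router:  {transfer_seconds(test_file, after_router):.0f} s")
```

At these rates the path through the router is roughly 29 times slower, which is the kind of before-and-after contrast that points at one device rather than at the campus network as a whole.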