
20050217: Need help to resolve long latency issue on IDD/LDM



>From: Shing Yoh <address@hidden>
>Organization: Kean University
>Keywords: 200502172036.j1HKaUv2005158 IDD packet shaping

Hi Shing,

>Since a few months ago when packet shaper software was installed on our 
>campus network, we encountered serious issue on ingesting model data 
>(missing data and high latency).

The classic signature of packet shaping is high latencies for feeds
that contain a lot of data (e.g., HDS), and low latencies for feeds
that do not (e.g., IDS|DDPLUS).
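A rough way to see why a rate cap produces this signature: a feed
whose data arrives faster than the cap builds a backlog, and a product
at the back of that backlog is late by the time needed to drain what is
ahead of it.  A feed that stays under the cap never builds a backlog.
The sketch below is a simple steady-flow model, not a measurement; the
cap and the two feed rates are hypothetical numbers chosen only for
illustration.

```python
def backlog_latency_s(arrival_bytes_per_s, cap_bytes_per_s, elapsed_s):
    """Estimate the latency of the newest product after elapsed_s seconds.

    Assumes data arrives at a constant rate and the shaper drains the
    connection at a constant cap (a simplification of real traffic).
    """
    # Only the portion of the arrival rate above the cap piles up.
    excess = max(0.0, arrival_bytes_per_s - cap_bytes_per_s)
    backlog_bytes = excess * elapsed_s
    # The newest product waits for the whole backlog to drain at the cap.
    return backlog_bytes / cap_bytes_per_s


CAP = 100_000  # hypothetical per-connection cap, in bytes per second

# Hypothetical feed rates: one well under the cap, one above it.
low_volume = backlog_latency_s(5_000, CAP, elapsed_s=3600)
high_volume = backlog_latency_s(150_000, CAP, elapsed_s=3600)

print(f"low-volume feed latency after 1 h:  {low_volume:.0f} s")
print(f"high-volume feed latency after 1 h: {high_volume:.0f} s")
```

In this model the low-volume feed stays at zero added latency while the
high-volume feed falls further behind the longer the excess persists.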

>From the information that I can gather 
>so far from the Unidata web pages and email archive, I had talked to the 
>IT people on the campus and had already taken the following steps :
>
>(1) I have changed the product queue size in ldmadmin $pq_size=1000000000 
>and restart the ldm.  The file size of ldm.pq is running at 1015898112 
>byte.

The LDM queue size will have no effect on the latencies you see.  The
only thing it might do is help keep older data in the queue longer.
I am not sure that this would be important for your use, but it might
be.

>(2) I have talked to the IT people and I was told that we are currently 
>set at 800kps for bandwidth to our server "hurri.kean.edu".  The setting 
>is for the all ports and traffic to our server since the software can not 
>be set for just ldm IP port 388.

This is the first time that I have heard of a packet shaper that cannot
be turned off for activity on a certain port.  I would argue that if
they can turn off packet shaping for an individual port, then they
should do so for port 388 since it is only used for the LDM and the
LDM is moving scientific data needed for education and research.  It
is not moving MP3s or movie streams.

>(3) The above rule for traffic to our server has priority 10, the highest.

This is good, but not sufficient.

>Unfortunately, with all these changes, our latency for HDS model data
>is still at 4000s (with a much reduced set of model data then what we used 
>to ingest).

Three things that I need to mention here:

1) the clock on your system is not being maintained.  Please take a look
   at the latency plot for your IDS|DDPLUS ingestion:

http://my.unidata.ucar.edu/cgi-bin/rtstats/iddstats_nc?IDS|DDPLUS+hurri.kean.edu

   The stair step in baseline latency is an indication that your clock is
   off by about 50 seconds, and that it is drifting at a rate of about 1 second
   every 12 hours.  I strongly recommend that you set up ntpd on your machine
   so that the clock is maintained correctly.  An accurate clock is essential
   to receiving all data in the IDD, especially when the LDM is stopped
   and restarted for some reason.

2) your system _is_ showing the classic packet shaping signs.  The latency
   for IDS|DDPLUS is low, and the latency for HDS is high.  This means
   that the input to hurri is not limited to 800 Kbps in total, but,
   rather, each connection is limited individually.  This makes me believe
   that your IT folks _could_ turn off the packet shaping for all traffic
   on port 388.  I also have the sneaking suspicion that they could
   change the limit for port 388 traffic individually, but I can't guarantee
   that would be the case.

3) given that the packet shaping is being applied on a connection-by-connection
   basis, you could split up your HDS data request into a number of
   requests each one of which would be asking for a smaller portion of
   the whole feed.  Before doing this, however, I urge you to continue
   to fight the good fight with your IT folks and press for there being
   no packet shaping for port 388 traffic.

   If you cannot win the war with your IT folks, the first thing you
   should do is determine which of the grids in the HDS feed are the
   most important to you.  If there are any grids that you absolutely
   do not use, then eliminate them from your data request to cut the
   volume.  Then take the remaining set of data you want, and try
   to break it up into a number of requests each of which will be
   small enough to fly under the packet shaping radar.
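The splitting idea in item 3 can be sketched as follows.  This is only
an illustration, assuming the usual ldmd.conf form of one REQUEST line
per feedtype, pattern, and upstream host.  The pattern strings and the
host name below are hypothetical placeholders; they must be replaced
with regular expressions that match the grids you actually want and
with the name of your real upstream.

```python
def build_requests(feedtype, patterns, upstream_host):
    """Return one ldmd.conf REQUEST line per pattern.

    Each pattern should select a disjoint portion of the feed so that
    no single request carries the whole volume.
    """
    return [
        f'REQUEST {feedtype} "{pattern}" {upstream_host}'
        for pattern in patterns
    ]


# Hypothetical placeholders, not real product-ID patterns.
patterns = [
    "<pattern-for-subset-1>",
    "<pattern-for-subset-2>",
    "<pattern-for-subset-3>",
]

# Hypothetical upstream host name.
for line in build_requests("HDS", patterns, "upstream.example.edu"):
    print(line)
```

The generated lines would replace the single HDS request in ldmd.conf.
Whether each one ends up small enough depends on how evenly the
patterns divide the data, so check the latency plots after the change.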

>I am running out of ideas to try or suggestions to IT people.  Are there 
>any suggestions from Unidata that I or our IT people should try or 
>investigate in order to resolve this long latency issue ?

Again, I am not convinced that your IT people are aware of their ability
to alter the packet shaping profile by port.  We ran into a packet shaping
situation at LSU, and had a conference call between the SRCC folks (the
users of the LDM), the campus IT folks, and several of the UPC technical
staff.  After the IT staff were made aware of the kind of data that
is being ingested, and its use in research and education, they bored a
hole in their packet shaper that allows transfer rates of 20 Mbps.  The
same thing could be done for Kean.
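To put the two rates side by side, the arithmetic below converts a
limit in bits per second into the most data that can pass through it
in an hour.  It is a back-of-the-envelope sketch that assumes "Kbps"
and "Mbps" mean 1000 and 1000000 bits per second, that the limit is
sustained for the full hour, and that protocol overhead is ignored.

```python
def megabytes_per_hour(rate_bits_per_s):
    """Upper bound on data volume per hour at a sustained bit rate."""
    bytes_per_s = rate_bits_per_s / 8
    return bytes_per_s * 3600 / 1e6  # 1 MB taken as 10**6 bytes


kean_limit = megabytes_per_hour(800_000)      # the 800 Kbps setting
lsu_limit = megabytes_per_hour(20_000_000)    # the 20 Mbps hole at LSU

print(f"800 Kbps: {kean_limit:.0f} MB per hour")
print(f"20 Mbps:  {lsu_limit:.0f} MB per hour")
print(f"ratio:    {lsu_limit / kean_limit:.0f}x")
```

Any hour in which the requested data exceeds the first figure will
leave a backlog on an 800 Kbps connection.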

>By the way, the packet shaper software that they are using is the 
>NetEnforcer by Allot Communications, Version 5.1.   Any help will be 
>appreciated.

Thanks for passing along this information.  Our experience to date
has been with sites using a system from Packeteer.  It might well
be that NetEnforcer does not have as many controls as the system
from Packeteer.  I suspect that this is not the case, however.
It may boil down to your IT folks not knowing all that they can
do with NetEnforcer.

>Dr. Shing Yoh 
>Dept. Geology & Meteorology            K     K EEEEEEE    A    N     N
>Kean University                        K    K  E         A A   NN    N
>1000 Morris Avenue                     K   K   E        A   A  N N   N
>Union, New Jersey 07083                KKKK    EEEEEE  A     A N  N  N
>Voice : 908 737 3692                   K  K    E       AAAAAAA N   N N
>Fax   : 908 737 3699                   K   K   E       A     A N    NN
>Email : address@hidden                  K    K  E       A     A N     N
>Web   : http://hurri.kean.edu/~yoh     k     K EEEEEEE A     A N     N

Cheers,

Tom Yoksas
--
NOTE: All email exchanges with Unidata User Support are recorded in the
Unidata inquiry tracking system and then made publicly available
through the web.  If you do not want to have your interactions made
available in this way, you must let us know in each email you send to us.

>From address@hidden  Fri Feb 18 08:49:31 2005

Tom and Steve,
        (1) I have restarted "ntpd" to see if it will fix our system
clock problem.
        (2) I will forward this email to our IT people and start another
round of discussion with them.  Might be the suggestion is to leave
the whole IP port 388 open for the whole network rather than just our 
server.  Anyway, hope that our IT people will be in better position to 
discuss this issue with their software support people also.

Will keep you informed on our progress.  Any more suggestions will be
appreciated and thanks again.

Shing