[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[IDD #RKL-664278]: Latency/Bandwidth issues from idd.unidata ?



Hi Daryl,

> They are currently:
> 
> request NEXRAD2 "(.*)"  idd.aos.wisc.edu
> request WMO     ".*"    idd.aos.wisc.edu

Good. No overlap.

> during the course of Pete's provided logs though they have been changing as I 
> attempt to isolate some working/stable config.  So show an example from 
> Pete's log that highlights my impression that `ldmadmin stop` is not being 
> properly seen.

Pete's LDM log file shows that his LDM didn't have anything to send to yours 
when you restarted your LDM -- so it didn't see your LDM stop (a sender only 
notices a closed connection when it attempts to write). Consequently, Pete's 
LDM correctly terminated the previous upstream LDM when your second request 
came in.

> We find my initial startup request, process 39401
> 
> 20220308T171443.401322Z dc-fw-low-cals-0.nat.iastate.edu(feed)[39401] 
> uldb.c:sm_vetUpstreamLdm:1530 NOTE Terminated redundant upstream LDM 
> (addr=129.186.244.7, pid=38582, vers=6, type=feeder, mode=primary, 
> sub=(20220308154938.829333 TS_ENDT {{NEXRAD2, "(.*)"}}))
> 20220308T171444.411734Z dc-fw-low-cals-0.nat.iastate.edu(feed)[39401] 
> up6.c:up6_run:445 NOTE  Starting Up(6.13.16/6): 20220308161443.234042 TS_ENDT 
> {{NEXRAD2, "(.*)"}}, SIG=018016cb0895eb95fb49258f0c4a7a18, Primary
> 20220308T171444.411802Z dc-fw-low-cals-0.nat.iastate.edu(feed)[39401] 
> up6.c:up6_run:448 NOTE  topo:  dc-fw-low-cals-0.nat.iastate.edu {{NEXRAD2, 
> (.*)}}
> 
> I then stop my LDM some two hours later
> 
> 20220308T192332.618449Z dc-fw-low-cals-0.nat.iastate.edu(feed)[39401] 
> error.c:err_log:237 NOTE Couldn't flush connection; flushConnection() failure 
> to dc-fw-low-cals-0.nat.iastate.edu: RPC: Unable to receive; errno = 
> Interrupted system call

The above log message comes after the following one.
 
> and start a new connection, process 11876
> 
> 20220308T192332.618241Z dc-fw-low-cals-0.nat.iastate.edu(feed)[11876] 
> uldb.c:sm_vetUpstreamLdm:1530 NOTE Terminated redundant upstream LDM 
> (addr=129.186.244.7, pid=39401, vers=6, type=feeder, mode=primary, 
> sub=(20220308161443.234042 TS_ENDT {{NEXRAD2, "(.*)"}}))
> 
> my old one is now dead
> 
> 20220308T192332.937905Z dc-fw-low-cals-0.nat.iastate.edu(feed)[39401] 
> ldmd.c:cleanup:197 NOTE Exiting

This is exactly correct: the second request comes in, the previous upstream LDM 
process is sent a SIGTERM, and the signal interrupts a system call causing the 
previous upstream LDM process to terminate.

> and my new one is up:
> 
> 20220308T192333.627634Z dc-fw-low-cals-0.nat.iastate.edu(feed)[11876] 
> up6.c:up6_run:445 NOTE  Starting Up(6.13.16/6): 20220308182332.545706 TS_ENDT 
> {{NEXRAD2, "(.*)"}}, SIG=59f84a8b26fac5ed9761b24bbe4ea00f, Primary
> 20220308T192333.627696Z dc-fw-low-cals-0.nat.iastate.edu(feed)[11876] 
> up6.c:up6_run:448 NOTE  topo:  dc-fw-low-cals-0.nat.iastate.edu {{NEXRAD2, 
> (.*)}}
> 
> the logs from my end during this time:
> 
> 20220308T171443.233757Z idd.aos.wisc.edu[81149]     
> LdmConfFile.c:requester_exec:740    NOTE  Starting Up(6.13.15): 
> idd.aos.wisc.edu:388 20220308161443.233727 TS_ENDT {{NEXRAD2, "(.*)"}}
> 20220308T171443.234071Z idd.aos.wisc.edu[81149]     requester6.c:req6_new:496 
>           NOTE  LDM-6 desired product-class: 20220308161443.234042 TS_ENDT 
> {{NEXRAD2, "(.*)"},{NONE, "SIG=018016cb0895eb95fb49258f0c4a7a18"}}
> 20220308T171443.409326Z idd.aos.wisc.edu[81149]     
> requester6.c:make_request:222       NOTE  Upstream LDM-6 on idd.aos.wisc.edu 
> is willing to be a primary feeder
> 20220308T192312.081315Z ldmd[81133]                 ldmd.c:cleanup:256        
>           NOTE  Terminating process group
> 20220308T192312.082001Z idd.aos.wisc.edu[81149]     svc_tcp.c:readtcp:366     
>           ERROR Bad file descriptor
> 20220308T192312.082046Z idd.aos.wisc.edu[81149]     svc_tcp.c:readtcp:366     
>           ERROR read() error on socket 3
> 20220308T192312.082063Z idd.aos.wisc.edu[81149]     svc_tcp.c:writetcp:388    
>           ERROR Bad file descriptor
> 20220308T192312.082075Z idd.aos.wisc.edu[81149]     svc_tcp.c:writetcp:388    
>           ERROR writetcp(): write() error on socket 3
> 20220308T192312.082407Z idd.aos.wisc.edu[81149]     ldmd.c:cleanup:197        
>           NOTE  Exiting
> 
> I do notice this log example is with 6.13.15 on my end.  I just stopped and 
> started on my end now with 6.13.16 and did not get those same logs on my end 
> for whatever reason.

Yeah. I pruned the messages because they were redundant.

> My last latency blip was just after 22z
> 
> https://rtstats.unidata.ucar.edu/cgi-bin/rtstats/iddstats_nc?IDS|DDPLUS+idd-dc.agron.iastate.edu

I think it's highly unlikely that the restarting of your LDM and the subsequent 
termination of the upstream LDM process of Pete's LDM has anything to do with 
your latency blip.

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: RKL-664278
Department: Support IDD
Priority: Normal
Status: Closed
===================
NOTE: All email exchanges with Unidata User Support are recorded in the Unidata 
inquiry tracking system and then made publicly available through the web.  If 
you do not want to have your interactions made available in this way, you must 
let us know in each email you send to us.