tl;dr -- An LDM setup that worked fine until last week now will not
send/receive any files larger than 1292 bytes.
Full story:
We get data via LDM from another system. This setup/connection was
working fine until last week. As far as we know, no changes were made --
but obviously something has changed, because now we can't get data.
Both ends are running ldm-6.13.11, which is recent and has been working
well (except for pqact issues, which don't apply here).
I see connectivity at both ends, and I have restarted and rebuilt the
queues on both ends multiple times during troubleshooting.
I have enabled traffic both ways, and can ldmping and run notifyme against
the other machine's queue(s).
Interestingly enough, the issue seems to have something to do with
file size. In my testing I used ldmsend to send files to the
downstream server. I have an "accept" line there, and I *AM* able to send
files *IF* they are <1293 bytes. The downstream server receives data from
many other servers, and many of the files it receives are larger than 1293
bytes.
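To pin down the cutoff, test files of exact sizes on either side of it can
be generated and sent one at a time. Below is a minimal Python sketch. The
helper names, the "sizetest" file prefix, and the chosen sizes are
hypothetical, and the only ldmsend option used is the -h form from the
command shown later in this message. The script only prints the commands;
it does not run them.

```python
import os
import tempfile


def make_test_files(directory, sizes, prefix="sizetest"):
    """Write one file per requested size and return {size: path}."""
    paths = {}
    for size in sizes:
        path = os.path.join(directory, f"{prefix}_{size}")
        with open(path, "wb") as fh:
            # Random bytes, so repeated runs give distinct content
            # (assumption: identical products might be treated as duplicates).
            fh.write(os.urandom(size))
        paths[size] = path
    return paths


def ldmsend_command(host, path):
    """Build an ldmsend command line of the form: ldmsend -h HOST FILE."""
    return ["ldmsend", "-h", host, path]


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    files = make_test_files(workdir, [1274, 1292, 1293, 1400])
    for size, path in sorted(files.items()):
        print(size, " ".join(ldmsend_command("dontpanic.nssl.noaa.gov", path)))
```

Running each printed command in turn and noting which sizes arrive should
show whether the cutoff really sits between 1292 and 1293 bytes.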
Smaller files do make it through, but they take a surprisingly long
time. For instance, a file of 1274 bytes can take more than a minute.
When I try to send a larger file, nothing appears in the downstream
logs, but the upstream logs show:
20210216T163901.154847Z dontpanic.nssl.noaa.gov(feed)[20925] up6.c:up6_run:445NOTE Starting Up(6.13.11/6): 20210216162900.110949 TS_ENDT {{EXP, "/home/operator/ALAtest"}}, SIG=d40ffc815fd74a96c2d7c726dc7012d3, Primary
20210216T163901.154950Z dontpanic.nssl.noaa.gov(feed)[20925] up6.c:up6_run:448NOTE topo: dontpanic.nssl.noaa.gov {{EXP, (.*)}}
20210216T164000.271093Z 140.172.25.37[20982]ldmd.c:cleanup:192NOTE Exiting
20210216T164001.213937Z dontpanic.nssl.noaa.gov(feed)[20925] ldmd.c:cleanup:192NOTE Exiting
I tried setting up a second downstream system, but had the same results.
Sending with ldmsend behaves the same way there: the small files
make it through, but larger files fail. In verbose mode, ldmsend
shows:
ldmsend -xxx -h dontpanic.nssl.noaa.gov ALAtestfile7
20210216T164634.300292Z ldmsend[21540] error.c:err_log:236 INFO Resolving dontpanic.nssl.noaa.gov to 140.172.25.37 took 0.000755 seconds
20210216T164634.329557Z ldmsend[21540] ldmsend.c:main:437 DEBUG version 6
20210216T164634.359151Z ldmsend[21540] ldmsend.c:ldmsend:281 INFO Sending ALAtestfile7, 1293 bytes
20210216T164634.359234Z ldmsend[21540] LdmProxy.c:my_hereis_6:549 DEBUG Sending file via HEREIS_6
20210216T164734.361874Z ldmsend[21540] LdmProxy.c:getStatus:68 ERROR NULLPROC_6 failure to host "dontpanic.nssl.noaa.gov": RPC: Unable to receive; errno = Connection reset by peer
20210216T164734.361940Z ldmsend[21540] ldmsend.c:ldmsend:309 ERROR Couldn't flush connection
20210216T164734.362006Z ldmsend[21540] ldmsend.c:cleanup:82 ERROR Message-queue isn't empty
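
For reference, the delay between the HEREIS_6 send and the NULLPROC_6
failure can be worked out from the two timestamps in the ldmsend output.
A minimal Python sketch; the function name is hypothetical, and the
timestamp layout is inferred from the log lines in this message:

```python
from datetime import datetime


def parse_ldm_time(stamp):
    """Parse a log timestamp such as 20210216T164634.359234Z."""
    return datetime.strptime(stamp, "%Y%m%dT%H%M%S.%fZ")


sent = parse_ldm_time("20210216T164634.359234Z")    # "Sending file via HEREIS_6"
failed = parse_ldm_time("20210216T164734.361874Z")  # "NULLPROC_6 failure"
gap = (failed - sent).total_seconds()
print(f"{gap:.6f} seconds")
```

The gap comes out at just over 60 seconds, which might point to a
timeout rather than an immediate rejection, though that is only a guess.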
--
-------------------------------------------
Karen.Cooper@xxxxxxxx
Phone#: 405-325-6456
Cell: 405-834-8559
National Severe Storms Laboratory