
[LDM #UOJ-259840]: Late data not getting past relay



Art,

> I don't specify the "-o" option because I thought it defaulted to the "-m"
> value, which is what the documentation seems to indicate.  But reading the
> documentation again, it appears that the "default" statement is a little
> misleading since after that, it says that "initial requests will be for
> the most recent data in the queue which is younger than max_latency" when
> not using a new queue, which is probably true in most cases.  That
> behaviour would agree with what I was seeing: the CONDUIT data was much
> older than the other streams in the queue, so when I restarted the LDM,
> the most recent data came from the streams that were current, and the
> LDM used that time to decide where to resume receiving CONDUIT data,
> which means all the backlogged CONDUIT data was lost.
> 
> This whole approach, however, creates a hole for data loss unless both the
> "-m" and "-o" options are always specified with values large enough on a
> restart to reach back to the oldest data not yet received (if I'm
> understanding all this correctly).  But I'm also guessing that this would
> potentially force a lot of data already received to be re-requested,
> correct?  Without them, though, I can't stop/start the LDM without
> varying degrees of data loss, depending on what the most recent and
> oldest data are in any of my streams.
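
(As a concrete illustration of the restart scenario described above: the
sketch below assumes the ldmd "-m" (maximum latency) and "-o" (offset)
options mentioned in the quoted text, and it also assumes that your
version of ldmadmin accepts these options and passes them through to
ldmd; check "ldmadmin usage" on your system.  The six-hour value is
purely illustrative.)

    # Reach back ~6 hours (21600 s) for both the maximum accepted
    # latency and the initial request offset, so that backlogged
    # products are still requested after the restart.
    ldmadmin stop
    ldmadmin start -m 21600 -o 21600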

You've put your finger on a problem with the LDM (and a reason I'm working on a
replacement).  I take it that CONDUIT data-products are being delivered as a
subset of a larger request for data (e.g., ANY).  In that case, reception of
CONDUIT data-products is "held hostage", so to speak, to the latest data-product
that matches ANY.  If so, then my advice is to break the request for CONDUIT
data-products out into its own REQUEST entry, as in the example below.
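
For example, assuming your current request is for ANY and using
your.upstream.host as a placeholder for your actual upstream (adjust the
feed-type expression to whatever feed set you really request; the
ANY-CONDUIT subtraction is shown only as an illustration):

    # ldmd.conf (before): one catch-all request
    #REQUEST ANY          ".*"  your.upstream.host
    #
    # (after): CONDUIT gets its own REQUEST entry, and hence its own
    # connection, so its reception is no longer tied to the most
    # recent product of the other feeds
    REQUEST  ANY-CONDUIT  ".*"  your.upstream.host
    REQUEST  CONDUIT      ".*"  your.upstream.host

Each REQUEST entry should get its own connection to the upstream host,
so the "most recent product received" bookkeeping for CONDUIT is then
kept separately from that of the other feeds.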

> I remember pointing out a similar problem a while ago for the
> decoders, which are also prone to losing data when the LDM is
> stopped/started, depending on how far backlogged in the queue they are
> when stopped.  I think the problem was described then as the LDM not
> being a stateful package, meaning it's unable to keep track of where
> it left off in the decoding process.

Data-product decoding can have a gap due to an LDM restart if the decoders are 
backlogged.

> For LDM data collection itself, I'm guessing
> that the LDM doesn't keep track of where it left off collecting particular
> data streams either.  Has any further thought been given to making the LDM
> a stateful package that can be stopped/started seamlessly, avoiding loss of
> data?

I've given a lot of thought to the shortcomings of the LDM, which is why I'm
working on a replacement.  Unfortunately, it won't be out until sometime next
year.

In the meantime, separate your requests into multiple REQUEST entries to work
around this problem.  Please tell me if my hypothesis above is incorrect.

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: UOJ-259840
Department: Support LDM
Priority: Normal
Status: Closed