
Re: 19990429: Product Duplicates/Checksum calculation



> >To: address@hidden
> >cc: Paul Hamer <address@hidden>
> >From: Paul Hamer <address@hidden>
> >Subject: Product Duplicates
> >Organization: .
> >Keywords: 199904292059.OAA29086
> 
> 
> Hi there,
> 
> We're running LDM on our platforms here at FSL and have just set up 
> a configuration that has highlighted a small problem for us.
> 
> We receive a feed from an external agency that we filter internally,
> re-identifying a subset of these data with a new feedtype and ident.
> The MD5 checksum is computed over only the data part of the product,
> which means the signature remains the same. So now we can't insert the
> newly identified data back into the queue for distribution to someone
> else, because LDM flags it as a duplicate.
> 
> What would seem sensible would be to calc the checksum over the
> entire product structure, but since that's not done, do you have
> any suggestions?
> 
> Thanks,
> 
> Paul.
> 
> 
> -- 
> Paul Hamer
> Email: address@hidden
> Phone: 303.497.6342

Paul:

I think I would run the LDM which gets data from the
"external agency" in isolation. (You are probably doing this
already.) I would modify 'pqsend' to do the subsetting and
re-identifying, and send the re-identified subset to
another machine. The main idea is to avoid putting the re-identified
product back into the same queue.
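A rough sketch of that subset-and-re-identify step might look like the
following. This is illustrative only, not the LDM or pqsend API: the
`Product` record, the pattern argument, and the new feedtype/ident prefix
are all assumed names. The key point is that only the metadata changes;
the data (cargo) is untouched, so its MD5 signature would stay the same.

```python
import re
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Product:
    """Hypothetical stand-in for an LDM data product."""
    feedtype: str
    ident: str
    data: bytes

def reidentify(products, pattern, new_feedtype, prefix):
    """Yield products matching `pattern`, re-labelled for the downstream host."""
    rx = re.compile(pattern)
    for p in products:
        if rx.search(p.ident):
            # Only feedtype/ident change; the cargo is byte-for-byte the same.
            yield replace(p, feedtype=new_feedtype, ident=prefix + p.ident)

inbound = [Product("EXT", "SAO123", b"obs..."),
           Product("EXT", "GRIB7", b"grid...")]
subset = list(reidentify(inbound, r"^SAO", "FSL", "fsl/"))
```

Sending `subset` to a second machine's queue sidesteps the duplicate test
entirely, since the re-identified copy never meets the original.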


> What would seem sensible would be to calc the checksum over the
> entire product structure

It is a matter of semantics. It is supposed to be a checksum of
the cargo. I think I would leave it that way and change the
"equality" test from "cargo the same" to "cargo and ..."
The reason we do it this way is historical. Now that we have
NOAAPort/AWIPS, we have reliable sequence numbers so we can
do a lot of things differently.
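The collision is easy to demonstrate. In this minimal sketch (assumed
names, not the actual LDM product-queue code), a signature computed over
the cargo alone is identical for the original and the re-identified copy,
so a signature-only duplicate test rejects the copy; widening equality to
"cargo and ident" tells them apart.

```python
import hashlib

def cargo_signature(data: bytes) -> str:
    """MD5 over the cargo only, as LDM computes it."""
    return hashlib.md5(data).hexdigest()

original   = {"ident": "SAO123",     "feedtype": "EXT", "data": b"obs..."}
relabelled = {"ident": "fsl/SAO123", "feedtype": "FSL", "data": b"obs..."}

# Cargo-only signatures collide: same bytes, same MD5.
same_sig = cargo_signature(original["data"]) == cargo_signature(relabelled["data"])

def is_duplicate(a, b):
    """'Cargo and ident' equality: hash plus product identifier."""
    return (cargo_signature(a["data"]), a["ident"]) == \
           (cargo_signature(b["data"]), b["ident"])

dup = is_duplicate(original, relabelled)  # False: idents differ
```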

Others have been bitten by this as well.

-glenn