
20051110: theoretical ldm behavior / adding a feedtype (cont.)



>From: Rob Cermak <address@hidden>
>Organization: UAF
>Keywords: 200511100150.jAA1oK7s020348 LDM

Hi Rob,

>Yes.  Doing some testing now; the checksum works quite well with LDM 6.4+
>to prevent duplicates from being sent.

Just so you know, duplicate product detection and rejection have been
in the LDM for over a decade :-)

>pqinsert is also smart enough to block
>duplicate inserts... interesting!

Yup, 'pqinsert' creates an MD5 signature for the product to be inserted
and then goes through the same queue module when inserting, so a product
whose signature is already in the queue is rejected.
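
For example, a minimal sketch (the file name and product ID here are
made up):

  # the first insert succeeds; the second should be rejected because a
  # product with the same MD5 signature is already in the queue
  pqinsert -f EXP -p "OOS AOOS sst_20051110" sst_20051110.nc
  pqinsert -f EXP -p "OOS AOOS sst_20051110" sst_20051110.nc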

>Mantra: "Use the EXP feedtype"...Got it.

Excellent ;-)

>Now let's see if I follow the rest of this:

re: EXP HDS original header

>To achieve this I would need the corresponding pqinsert command?
>
>pqinsert -f EXP -p "HDS original header"

Yes, that is what I had in mind.  This allows the creation of
"virtual" streams inside the EXP feed type.

>It looks like a general pattern I can follow may be?:
>
>pqinsert -f EXP -p "OOS AOOS other header1 header2"
>
>Overlay an OOS feedtype on EXP, then define an OOS source, like AOOS, and
>then use unique product IDs?

Exactly.
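
Spelled out with a second, invented site name for illustration, the
inserts might look like:

  pqinsert -f EXP -p "OOS AOOS sst_grid 20051110" aoos_sst.nc
  pqinsert -f EXP -p "OOS GOMOOS wave_model 20051110" gomoos_wave.nc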

>Am I on the right track?

Yup.

>Then nodes participating in the EXP OOS datafeed could tap into the whole
>feed and/or select specific participating OOS sites:
>
>> request      EXP     "OOS"   hostname
>> request      EXP     "OOS AOOS"      hostname

Exactly!  
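
A handy way to verify the patterns before committing to 'request'
lines is 'notifyme' (hostname again a placeholder):

  # list matching products on the upstream host without transferring data
  notifyme -v -f EXP -p "OOS AOOS" -h upstream.host.edu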

I have been encouraging, and continue to encourage, the SURA SCOOP folks
to structure their headers to look like:

EXP SCOOP TAMU WRF ...
      ^    ^    ^_____ type of data
      |    |__________ originating site
      |_______________ project

If everyone did this, requesting data would be nice and simple:

To get all data from the SCOOP project:

request EXP     SCOOP   host

To get all SCOOP data from TAMU _only_:

request EXP     "SCOOP TAMU"    host

And so on.
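
The same structure keeps the receiving side simple.  A pqact.conf
sketch (fields tab-separated; the output directory is made up):

  # file each TAMU WRF product locally, keyed by the rest of its ID
  EXP	^SCOOP TAMU WRF (.*)
  	FILE	data/scoop/tamu/wrf/\1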

The only constraint is that the header can be no longer than 255
characters.

>This was the first step to understand how the LDM could be leveraged for
>use in the community.  The IOOS DMAC has it defined as an initial
>protocol to use for data transport, but with no real specifics.  I may
>have to write up our small conversation here for the data transport
>expert team.

Sounds good.

>I guess the other tricky part is wiring the thing up.  Not all sites may
>request/send everything.  Essentially, the OOS feed is an inverse IDD.
>The IDD has a few top-level nodes and spreads out from there.  We need to
>have the various OOS feeds aggregated at a top-level node at some point
>to ensure the entire network is getting everyone's data.

This is _exactly_ the same as SCOOP.
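
And the aggregation itself is just ldmd.conf plumbing on whichever
machine ends up being the top-level node.  A sketch with invented
hostnames:

  # top-level OOS node: pull each site's products and let any
  # participating site request the aggregate back
  request EXP     "^OOS AOOS"     ldm.aoos.example.edu
  request EXP     "^OOS GOMOOS"   ldm.gomoos.example.edu
  allow   EXP     \.example\.edu$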

>That is a problem right now; we
>don't have a top-level node to point at...

TAMU runs the machine that gets all of the data.  Their machine's
hardware was chosen based on lessons we learned here at the UPC while
building our top-level IDD relay node cluster, idd.unidata.ucar.edu.  We
chose Sun V20Zs (dual Opteron), loaded them with lots of RAM (12 GB),
and (currently) run Fedora Core 3 64-bit Linux.

Cheers,

Tom
--
NOTE: All email exchanges with Unidata User Support are recorded in the
Unidata inquiry tracking system and then made publicly available
through the web.  If you do not want to have your interactions made
available in this way, you must let us know in each email you send to us.