
20050309: LDM running in a clustered environment



Aloha Angelo,

> To: address@hidden
> From: "angelo alvarez" <address@hidden>
> Subject: LDM - LDM running in a clustered environment
> Organization: UCAR/Unidata
> Keywords: 200503092027.j29KRStV022855

The above message contained the following:

> Institution: Naval Pacific Meteorology and Oceanography Center Joint Typhoon 
> Warning Center
> Package Version: ldm-6.0.15
> Operating System: Solaris 8
> Hardware Information: Sunfire V240
> Inquiry: Aloha.  We have 2 Sun Servers which share a raid module and
> are connected to an Alteon switch and share the same virtual IP.  This
> configuration is designed to provide failover should either system fail.
> The Alteon will send data to whichever system is "online" at the time.
> The data resides on the shared cluster, and we use the Veritas service
> manager to control which services are started when a system becomes the
> primary.  Here is my question: Is it possible for ldm to reside on the
> shared cluster so that either system can run ldm, access the same queue,
> and connect to the upstream LDM when they become the primary server?  My
> concern is that if the backup system becomes the primary and the queue
> does not match that of its predecessor, then it will retrieve data (from
> the upstream LDM) which was already retrieved.  Based on what I have
> described, is there a better way to do this?

The system you describe seems perfectly doable if the following are
observed:

    1.  The LDM configuration-files should be identical.

    2.  Only one LDM should be active at a time.  This will avoid
        concurrent access to the product-queue by the two LDM-s and
        consequent dependence on correct operation of file-locking
        between the two systems (which can be problematic).

    3.  The management software, when it notices that the primary LDM
        system is down, should ensure that the LDM is truly down by
        killing it.

    4.  The management software should then migrate the logical IP
        address to the secondary system and then start the LDM on that
        system.
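
The ordering above can be sketched roughly as follows.  This is only an
illustration in Python, not anything from the LDM or Veritas: the Node
class, the fail_over() function and the server names are all
hypothetical, and a real setup would carry out each action with your
cluster manager and the usual LDM start/stop commands.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for one of the two servers."""
    name: str
    ldm_running: bool = False
    has_virtual_ip: bool = False


def fail_over(failed: Node, standby: Node) -> List[str]:
    """Return the ordered actions taken to move the LDM to the standby."""
    actions: List[str] = []

    # Make sure the old LDM is really down, even if it already looks dead.
    failed.ldm_running = False
    actions.append(f"kill LDM on {failed.name}")

    # Move the logical (virtual) IP address to the standby system.
    failed.has_virtual_ip = False
    standby.has_virtual_ip = True
    actions.append(f"move virtual IP to {standby.name}")

    # Start the standby LDM last, so two LDM-s never share the queue.
    standby.ldm_running = True
    actions.append(f"start LDM on {standby.name}")
    return actions
```

The point of the ordering is that the second LDM is started only after
the first is known to be dead, so the shared product-queue never has
two writers.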

When the LDM on the secondary system starts, it will automatically
search backwards through the product-queue for each REQUEST entry in
the LDM's configuration file and note the creation-time of the first
matching data-product that it finds (i.e., the most recent one).  It
will use this time in the data-product selection-criteria that it sends
to the upstream LDM; consequently, there will be few, if any, duplicate
products sent from the upstream LDM.
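
To make the idea concrete, here is a minimal sketch of that backward
search.  It is not the LDM's actual code: the Product class, the
resume_time() function and the sample feedtype and pattern are
hypothetical, and the queue is assumed to be a simple list ordered from
oldest to newest.

```python
import re
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Product:
    """Hypothetical, simplified view of one entry in the product-queue."""
    feedtype: str
    ident: str
    created: float  # creation-time, in seconds since the epoch


def resume_time(queue: Sequence[Product], feedtype: str,
                pattern: str) -> Optional[float]:
    """Creation-time of the newest product matching one REQUEST entry.

    Returns None when nothing in the queue matches the request.
    """
    regex = re.compile(pattern)
    # Walk from the newest product back towards the oldest.
    for product in reversed(queue):
        if product.feedtype == feedtype and regex.search(product.ident):
            return product.created
    return None
```

For a request whose feedtype is "IDS" and whose pattern is "^SA", the
function returns the creation-time of the newest matching product
already in the shared queue.  Asking the upstream LDM only for products
newer than that time is what keeps duplicates to a minimum.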

The regional HQ-s of the National Weather Service use a system very
similar to this.  They don't use a RAID, however, so each LDM has its
own product-queue and is usually active and receiving data.

I can probably put you in touch with a person who set this up for
them, if you wish.

Regards,
Steve Emmerson

NOTE: All email exchanges with Unidata User Support are recorded in the
Unidata inquiry tracking system and then made publicly available
through the web.  If you do not want to have your interactions made
available in this way, you must let us know in each email you send to us.