Re: [LinuxFailSafe] startup of resource groups with one node down

Padmanabhan Sreenivasan paddy@sgi.com
Tue, 07 May 2002 12:07:43 -0700


Martin Bene wrote:

> > From: Padmanabhan Sreenivasan [mailto:paddy@sgi.com], 7 May 2002 01:20
>
> > The FailSafe notion of a tie-breaker is different. The tiebreaker node
> > gets the first chance to reset the other node in a two-node cluster
> > in case of a network partition.
> > If only one node in the cluster is operational, start HA services only on that
> > node. When the other node becomes available, you can start HA services on that node. It
> > should rejoin the cluster.
>
> Thanks for that hint. I didn't realize that the results of starting HA services for the cluster and starting them for just one node are quite different if only one node is available at startup:
>
> start ha services for cluster:
>         * membership only comes up if tiebreaker node is available,

This is not true. Membership can form even if the tiebreaker node is not available. In
a two-node cluster with a network partition, if the tiebreaker node is not able to
reset the non-tiebreaker node and the non-tiebreaker node can successfully reset
the tiebreaker node, a membership of one node (the non-tiebreaker node)
can be formed.
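
The rule above can be written down as a small toy model in Python. This is only a sketch of the ordering described here, not FailSafe code: the function name, the parameters and the "nobody wins" outcome when both resets fail are assumptions made for illustration.

```python
from typing import Optional


def surviving_member(tiebreaker: str, other: str,
                     tiebreaker_reset_ok: bool,
                     other_reset_ok: bool) -> Optional[str]:
    """Toy model of a two-node network partition.

    Returns the name of the node that forms the one-node membership,
    or None if neither node manages to reset its peer (what happens
    then is not covered by this model).
    """
    # The tiebreaker node gets the first chance to reset the other node.
    if tiebreaker_reset_ok:
        return tiebreaker
    # If that reset fails, the non-tiebreaker node may still succeed in
    # resetting the tiebreaker and form a membership on its own.
    if other_reset_ok:
        return other
    # Assumption: with both resets failing, no membership is modelled.
    return None
```

For example, surviving_member("node1", "node2", False, True) gives "node2": the tiebreaker was unavailable, but a membership still formed.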

>
>         * unavailable node gets reset
>         * node status is 1x UP, 1x DOWN
>         * bringing up resource groups fails.
>
> start up ha services just for the available node:
>         * membership comes up regardless of tiebreaker node
>         * unavailable node doesn't get reset
>         * node status is 1x UP, 1x inactive
>         * bringing up resource groups works.
>
> I feel much better now that I have a known-to-work recipe for bringing up services even from "both-nodes-down, one node kaputt" status.
>
> That said, I still don't understand where the first case differs from starting up with both nodes present and then losing one :-)

The key difference is that in the second case you are assuming that no
HA resources are running on the node (which is hopefully down).

Paddy

>
>
> Thanks again, Martin