[Linux-HA] migration/fence after fail-count > X
Andrew Beekhof
beekhof at gmail.com
Tue Nov 13 05:39:55 MST 2007
On Nov 13, 2007, at 1:02 PM, Sebastian Reitenbach wrote:
> Hi,
>
> I read in the v2 FAQ the following:
>
> What happens when monitor detects the resource down?
> The node will try to restart the resource, but if this fails, it will fail
> over to another node.
> A feature that allows failover after N failures in a given period of time is
> planned.
>
> Is that feature still planned?
That's how it works already, sort of.
There is a layer of indirection with resource-failcount-stickiness,
but basically once the failcount hits a threshold, the resource moves.
Knowing what to set resource-failcount-stickiness to can be tricky.
One of the easiest "I can turn my brain off" ways is to:
1) start the cluster and make sure everything is running
2) figure out the current score (see conversations regarding the
getscores.sh script that has been posted here)
3) divide said score by X (the failcount threshold) and add 1
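
The arithmetic in step 3 can be sketched roughly as follows. This is only an
illustration, not part of Heartbeat: the function names and the numbers are
hypothetical, and it assumes the resource moves once failcount times the
per-failure value exceeds the score found in step 2. With which sign the
value is entered in the configuration is not covered here.

```python
def failure_penalty(current_score: int, max_failures: int) -> int:
    """Step 3: divide the current score by X and add 1."""
    if max_failures < 1:
        raise ValueError("max_failures (X) must be at least 1")
    # Integer division, then +1 so that X failures outweigh the score.
    return current_score // max_failures + 1


def failures_until_move(current_score: int, penalty: int) -> int:
    """Smallest failcount whose accumulated penalty exceeds the score.

    Assumes current_score >= 0 and penalty > 0.
    """
    return current_score // penalty + 1


# Hypothetical numbers: a score of 100 from step 2 and X = 3.
penalty = failure_penalty(100, 3)
print(penalty)                            # 34
print(failures_until_move(100, penalty))  # 3
```

With these made-up numbers, two failures accumulate 68, which is still below
100, and the third pushes the total to 102, past the score.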
> Could it also, instead of failing over, fence the node when
> failcount > X?
No, at least not yet.
Interesting idea though.
> Or is that working already, and the FAQ is not updated?
> At least when I see this:
> http://www.linux-ha.org/v2/faq/forced_failover
> It seems to work already, but only in combination with moving a resource to
> another location, but not to be used to fence a node after a critical
> fail-count is reached.
> I've seen the fail_count utility, and tried to find examples on the webpage,
> but that search was not too exhaustive.
>
> Also, can the fail-counts of different resources be summed up to make a
> decision in combination with fencing? E.g. resources A, B, C...
> The failcount of A=3, + B=4 = SUM=7 > 6, then fence the node where that
> limit is reached.
As above: not at the moment.