[Linux-HA] Fencing prevents resource from failing over
abhishek.bagchi at wipro.com
Mon Nov 26 01:56:07 MST 2007
Thanks Andrew,
My comments are inline...
-----Original Message-----
From: linux-ha-bounces at lists.linux-ha.org
[mailto:linux-ha-bounces at lists.linux-ha.org] On Behalf Of Andrew Beekhof
Sent: Monday, November 26, 2007 1:44 PM
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] Fencing prevents resource from failing over
On Nov 26, 2007, at 6:25 AM, <abhishek.bagchi at wipro.com> wrote:
>
> Hi,
> I have a 2-node active/passive cluster (active node => active, passive
> node => standby) using heartbeat 2.0.8. I recently enabled stonith. The
> stonith device is an rsh device that tries to restart the cluster node.
> However, something that used to work with stonith disabled has now
> stopped working: node failover on network cable disconnection. I believe
> that since the stonith device uses the network, the stonith fails and
> hence the resource is left wherever it was running.
Correct. The cluster will not start anything until it can verify that the
node is truly dead (with a successful stonith operation). This is how a
stonith-enabled cluster is supposed to work, and it is why IP-based stonith
modules are not a great idea.
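
The rule described above can be sketched in a few lines of Python. This is only an illustration of the decision logic, not heartbeat code: the names `NodeState`, `may_start_resources` and `handle_peer_loss` are hypothetical, and the three states are a simplification assumed for the example.

```python
from enum import Enum


class NodeState(Enum):
    ONLINE = "online"
    UNCLEAN = "unclean"  # peer unreachable, real state unknown
    FENCED = "fenced"    # a stonith operation confirmed the peer is down


def may_start_resources(peer_state: NodeState, stonith_enabled: bool) -> bool:
    """Return True if the surviving node may take over the peer's resources."""
    if peer_state is NodeState.ONLINE:
        return False  # peer is healthy and still owns its resources
    if not stonith_enabled:
        return True   # with stonith disabled, recovery does not wait for fencing
    # With stonith enabled, only a confirmed fence unblocks recovery.
    return peer_state is NodeState.FENCED


def handle_peer_loss(stonith_succeeded: bool, stonith_enabled: bool = True) -> bool:
    """Model losing contact with the peer, then attempting to fence it."""
    state = NodeState.FENCED if stonith_succeeded else NodeState.UNCLEAN
    return may_start_resources(state, stonith_enabled)
```

With the cable unplugged, a network-based stonith attempt fails, the peer stays "unclean", and `handle_peer_loss(False)` returns `False`, which matches the behaviour reported in this thread.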
> Can anyone please help resolve this problem (this is probably not a
> problem, and this is how stonith is expected to work)? I would like to
> know if there's any way to tell the passive (currently active) node to
> give up trying to stonith and then start the resource.
By design, no.
> I've attached my
> cib file and the logs from the passive node when the cable is disconnected.
> I have no problem with both nodes running the resource, as the active node
> is cut off from the network anyway and can't do any damage.
If that's truly the case, then you may not need stonith.
ABHI: But if the active node comes online again, it's a very bad thing for
both nodes to be running the resources. Can we configure two stonith
devices and make the node think stonith is successful if either of the
stonith operations returns success? Is there some kind of resource
constraint that I can use in this case?
1. Online stonith device: uses IP to reset the other node.
2. Offline stonith device: just a dummy that always returns success on
reset.
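
The "success if either device succeeds" idea in the two items above can be sketched as follows. This is a hypothetical illustration in Python, not a heartbeat configuration: `ip_reset`, `dummy_reset` and `fence` are made-up names, and `ip_reset` is hard-coded to fail in order to model the unplugged cable.

```python
from typing import Callable, List, Optional, Tuple

# A device is a (name, reset function) pair; reset returns True on success.
StonithDevice = Tuple[str, Callable[[str], bool]]


def ip_reset(node: str) -> bool:
    """Stand-in for the network-based reset; assumed to fail (cable unplugged)."""
    return False


def dummy_reset(node: str) -> bool:
    """Stand-in for the proposed dummy device: always reports success."""
    return True


def fence(node: str, devices: List[StonithDevice]) -> Optional[str]:
    """Try each device in order; return the name of the first that succeeds."""
    for name, reset in devices:
        if reset(node):
            return name
    return None  # no device succeeded, so the node stays unverified
```

Here `fence("active", [("ip", ip_reset), ("dummy", dummy_reset)])` returns `"dummy"`. Because the dummy reports success without touching the node, a successful result in this arrangement no longer shows that the other node is actually down.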
> The standby log seems to
> say it has quorum
2-node clusters always have quorum, so the value is meaningless...
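
One way to see why the value carries no information: under a strict majority rule a lone node in a two-node cluster holds one of two votes, which is not more than half, so a two-node cluster would never recover from a node loss unless quorum is simply granted. A minimal sketch, assuming a majority rule with a two-node special case; `has_quorum` is a hypothetical helper, not heartbeat's implementation.

```python
def has_quorum(total_nodes: int, visible_nodes: int) -> bool:
    """Majority quorum with an assumed special case for two-node clusters."""
    if total_nodes == 2:
        # Assumption mirroring "2-node clusters always have quorum":
        # any surviving partition is treated as quorate.
        return visible_nodes >= 1
    # General case: a partition needs strictly more than half the nodes.
    return visible_nodes > total_nodes / 2
```

In this sketch both halves of a split two-node cluster report quorum at the same time, so the flag cannot tell either node that it is safe to proceed.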
> but it makes me wonder why it doesn't start the resources, in spite of
> the following being evident from the logs.
>
> 1. Standby marks active unclean
> 2. Standby has quorum
> 3. Standby tries to move resources back to standby
>
>
> Thanks in advance,
> Abhi.
>
> The information contained in this electronic message and any
> attachments to this message are intended for the exclusive use of the
> addressee(s) and may contain proprietary, confidential or privileged
> information. If you are not the intended recipient, you should not
> disseminate, distribute or copy this e-mail. Please notify the sender
> immediately and destroy all copies of this message and any
> attachments.
>
> WARNING: Computer viruses can be transmitted via email. The recipient
> should check this email and any attachments for the presence of
> viruses. The company accepts no liability for any damage caused by any
> virus transmitted by this email.
>
> www.wipro.com
> <ha-log-standby.txt> <cib.xml>
> _______________________________________________
> Linux-HA mailing list
> Linux-HA at lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems