[Linux-HA] Fencing prevents resource from failing over

Andrew Beekhof beekhof at gmail.com
Mon Nov 26 01:13:37 MST 2007


On Nov 26, 2007, at 6:25 AM, <abhishek.bagchi at wipro.com> wrote:

>
> Hi,
> I have a 2-node active/passive cluster (active node => active, passive
> node => standby) using heartbeat 2.0.8. I recently enabled stonith. The
> stonith device is an rsh device that tries to restart the cluster node.
> However, something that worked with stonith disabled has stopped
> working: node failover on network cable disconnection. I believe that
> since the stonith device uses the network, the stonith operation fails
> and hence the resource is left wherever it was running.

Correct. The cluster will not start anything until it can verify that the
node is truly dead (with a successful stonith operation).
This is how a stonith-enabled cluster is supposed to work, and it is why
IP-based stonith modules are not a great idea.
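
The rule can be sketched in a few lines of Python. This is only an
illustrative model, not Heartbeat's actual code: the names plan_recovery,
fence_node and rsh_fence_over_dead_network are hypothetical, and the
"resources" are plain strings.

```python
from typing import Callable, List


def plan_recovery(
    failed_node: str,
    resources: List[str],
    stonith_enabled: bool,
    fence_node: Callable[[str], bool],
) -> List[str]:
    """Return the resources that may be started on the surviving node.

    Hypothetical model: with stonith enabled, nothing is recovered until
    fence_node() reports that the failed node was successfully fenced.
    """
    if not stonith_enabled:
        # Without stonith the cluster simply assumes the silent node is gone.
        return list(resources)
    if fence_node(failed_node):
        # Fencing confirmed: the node is known to be dead, recovery is safe.
        return list(resources)
    # Fencing failed: the node's state is unknown, so start nothing.
    return []


def rsh_fence_over_dead_network(node: str) -> bool:
    """Stand-in for an IP-based stonith device while the cable is unplugged."""
    return False  # the reset command cannot reach the node
```

With the cable pulled, plan_recovery("active", ["ip", "fs"], True,
rsh_fence_over_dead_network) returns an empty list, which matches the
behaviour described in the original report: the resources stay where they
were because the fencing attempt can never succeed.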

> Can anyone please help resolve this problem (it is probably not a
> problem, just how stonith is expected to work)? I would like to
> know if there's any way to tell the passive (currently active) node to
> give up trying to stonith and then start the resource.

By design, no.

> I've attached my
> cib file and logs from the passive node when the cable is disconnected.
> I have no problem with both nodes running the resource, as the active
> node is cut off from the network anyway and can't do any damage.

If that's truly the case, then you may not need stonith.

> The standby log seems to
> say it has quorum

2-node clusters always have quorum, so the value is meaningless...
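
A simplified model of why the flag says nothing here, again in Python. The
function has_quorum is hypothetical, and the strict-majority rule for larger
clusters is an assumption made for contrast, not something taken from
Heartbeat's source.

```python
def has_quorum(total_nodes: int, live_nodes: int) -> bool:
    """Simplified quorum model (an assumption, not Heartbeat's code)."""
    if total_nodes == 2:
        # Special case described above: a 2-node cluster always has quorum.
        return True
    # Assumed rule for larger clusters: a strict majority of configured nodes.
    return live_nodes > total_nodes / 2
```

After a split in a 2-node cluster each side sees one live node, and
has_quorum(2, 1) is true on both sides, so quorum cannot tell the survivor
whether its peer is actually dead. That is the job left to stonith.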

> but it makes me wonder why it doesn't start the
> resources, in spite of the following being evident from the logs:
>
> 1. Standby marks active unclean
> 2. Standby has quorum
> 3. Standby tries to move resources back to standby
>
>
> Thanks in advance,
> Abhi.