[Linux-HA] Re: Linux-HA Digest, Vol 48, Issue 69
Frank
frank at si.ct.upc.edu
Wed Nov 21 00:34:18 MST 2007
> Date: Tue, 20 Nov 2007 12:25:28 +0100
> From: Andrew Beekhof <beekhof at gmail.com>
> Subject: Re: [Linux-HA] confused about setting drac5 stonith
> To: General Linux-HA mailing list <linux-ha at lists.linux-ha.org>
> Message-ID: <5C2B001C-A9A0-4314-AB17-19FB7B9C7829 at gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed; delsp=yes
>
>
> On Nov 20, 2007, at 11:48 AM, Frank wrote:
>
>
>> Hi,
>> I've seen some discussions about this, but I'm still confused.
>> I'm using the drac5 stonith plugin from Thomas Paschy (thanks
>> Thomas) but I don't know how to configure it to work properly.
>>
>> We have a cluster with 2 nodes, each one with its public address,
>> and each one with a drac5 device with a private address;
>> let's call them node1, node2, node1_drac and node2_drac. So node1
>> can reset node2 by connecting to node2_drac,
>> and node2 can reset node1 by connecting to node1_drac.
>>
>> So we have created two stonith resources called stonith_node1_drac
>> (with node1_drac address) and stonith_node2_drac (with
>> node2_drac address); stonith_node1_drac needs to run on node node2
>> because it is node2 which can reboot node1, and stonith_node2_drac
>> needs to run on node node1 because it is node1 which can reboot node2
>>
>> When we start them this way, they start OK. But when we forced a
>> stonith condition on node2 (by killing heartbeat), node1 was not able
>> to reset node2; we got this in the logs:
>>
>> pengine[8955]: 2007/11/20_11:05:05 WARN: stage6: Scheduling Node
>> kripton for STONITH
>> pengine[8955]: 2007/11/20_11:05:05 info: native_stop_constraints:
>> drac_argon_stop_0 is implicit after kripton is fenced
>> pengine[8955]: 2007/11/20_11:05:05 WARN: process_pe_message:
>> Transition 80: WARNINGs found during PE processing. PEngine Input
>> stored in: /var/lib/heartbeat/pengine/pe-warn-75.bz2
>> pengine[8955]: 2007/11/20_11:05:05 info: process_pe_message:
>> Configuration WARNINGs found during PE processing. Please run
>> "crm_verify -L" to identify issues.
>> stonithd[3978]: 2007/11/20_11:05:35 ERROR: Failed to STONITH the
>> node kripton: optype=RESET, op_result=TIMEOUT
>> tengine[8954]: 2007/11/20_11:05:35 info: tengine_stonith_callback:
>> call=-43, optype=1, node_name=kripton, result=2, node_list=,
>> action=8:80:84a50e41-ffda-4c9a-959a-76a61919413a
>> tengine[8954]: 2007/11/20_11:05:35 ERROR: tengine_stonith_callback:
>> Stonith of kripton failed (2)... aborting transition.
>>
>> (kripton is node2 and argon is node1)
>> It seems that something is mixed up with the addresses.
>> Can anyone help?
>>
>
> not without logs, configurations and the version you're using
>
>
We are using heartbeat 2.1.2.

I think I have understood the problem. When node1 wants to stonith node2,
it asks its stonith device which hosts it can fence. The device answers
that it can stonith node2_drac, but not node2, so node1 doesn't know
how to stonith node2.

I fixed the problem by adding an extra "hostname" parameter to the drac5
plugin: the plugin returns hostname when asked which host it fences, and
uses ipaddr to reach the host's DRAC (the original drac5 plugin has only
ipaddr, login and password parameters). And it works!

But I suppose there is a similar problem with the drac3 plugin
included in heartbeat, so how do people use it? Does anybody know?
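
To make the mismatch concrete, here is a small Python sketch of the lookup as I understand it. It is only a toy model, not the real stonith plugin interface: the DracDevice class, the hosts() method and the find_device() helper are names I made up for illustration, and node2 / node2_drac are the placeholder names used earlier in this thread.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DracDevice:
    """Toy model of one stonith resource (hypothetical, not the plugin API)."""
    ipaddr: str                      # address used to reach the DRAC card
    hostname: Optional[str] = None   # the extra parameter added in the fix

    def hosts(self) -> List[str]:
        # Without `hostname`, the device can only report its own address.
        return [self.hostname if self.hostname else self.ipaddr]


def find_device(devices: List[DracDevice], target: str) -> Optional[DracDevice]:
    """Return the first device that claims it can fence `target`, else None."""
    for dev in devices:
        if target in dev.hosts():
            return dev
    return None


# Original plugin: only ipaddr is known, so the device reports "node2_drac".
original = [DracDevice(ipaddr="node2_drac")]
# Patched plugin: hostname names the node, ipaddr is still used for access.
fixed = [DracDevice(ipaddr="node2_drac", hostname="node2")]

if __name__ == "__main__":
    print(find_device(original, "node2"))
    print(find_device(fixed, "node2"))
```

With the original parameters nothing claims node2, which would match the TIMEOUT we saw; with the extra hostname the lookup finds the device and can still use ipaddr to connect.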
Frank