[Linux-ha-dev] STONITH RAs - Requirements from the CRMd side

Lars Marowsky-Bree lmb at suse.de
Fri Jul 2 04:37:52 MDT 2004


On 2004-07-02T11:42:26,
   Andrew <lists at beekhof.homeip.net> said:

> The CRMd will start the RA on one of its allowed nodes.  Depending on 
> the STONITH controller type it might:
> 		- test connection to the STONITH device and exit
> 		- reserve access to the STONITH device
> 		- not do much at all
> 		- do something I haven't thought of

You have not read the NodeFencing and
http://wiki.trick.ca/linux-ha/LocalResourceManager_2fStonithAgents
pages. On start, the STONITH agent retrieves the list of nodes it can
fence (which implies testing the connection to the device, and
reserving access to it if possible).

> The RA needs to send the CRMd a list of the nodes it can shoot.
> 	- Whether this is a separate operation or a result of the "start" 
> 	can be negotiated

The start operation is the obvious one.

> 	- Exactly how this information is returned to the CRMd I don't know 
> yet.  What would be the easiest for the LRM?

The STONITH agent writes the list to stdout, which is relayed back to
us anyway.
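
A rough sketch of that exchange, assuming one node name per line on
stdout. The output format is not settled in this thread, and the names
start, parse_node_list and FENCEABLE_NODES are made up for
illustration:

```python
import io
import sys

# Hypothetical host list; a real agent would query the STONITH device.
FENCEABLE_NODES = ["node1", "node2"]


def start(out=sys.stdout):
    """Write one fenceable node name per line; return 0 on success."""
    for name in FENCEABLE_NODES:
        out.write(name + "\n")
    return 0


def parse_node_list(text):
    """Turn the relayed stdout text back into a list of node names."""
    return [line.strip() for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    captured = io.StringIO()
    start(captured)
    print(parse_node_list(captured.getvalue()))
```

The receiving side only needs the captured text, so nothing beyond
stdout relaying is required for this to work.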

> The RA will then be sent a "shoot" operation and the node to shoot will 
> be supplied as the parameter "target".  An additional parameter will 
> specify what the value of "target" is (uname, UUID, MAC Address, 
> whatever).  I haven't used STONITH much but this seems to be a worthy 
> addition.

It's called "fence", not "shoot" ;-)

And no: since fencing happens at the node level, there is no need to
further qualify the parameter; it's simply the nodename.

> The RA needs to indicate success, internal failure or failure to 
> complete the stonith

Right.
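
To make the three outcomes concrete, here is a minimal sketch of a
"fence" operation that takes a plain nodename. The numeric codes, the
FenceResult name and the power_off callback are assumptions for
illustration only, not an agreed convention:

```python
from enum import IntEnum


class FenceResult(IntEnum):
    # Placeholder values; the real return codes are not defined here.
    SUCCESS = 0
    INTERNAL_FAILURE = 1
    FENCE_FAILED = 2


def fence(target, fenceable, power_off):
    """Fence the node named `target`.

    fenceable: node names the agent reported at start
    power_off: callable(nodename) -> bool, standing in for the device driver
    """
    if target not in fenceable:
        # The device cannot reach this node, so the fence cannot complete.
        return FenceResult.FENCE_FAILED
    try:
        ok = power_off(target)
    except Exception:
        # The agent itself broke, as opposed to the device refusing.
        return FenceResult.INTERNAL_FAILURE
    return FenceResult.SUCCESS if ok else FenceResult.FENCE_FAILED
```

The target is looked up by nodename alone, with no extra parameter
describing what kind of identifier it is.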

> Eventually the CRMd will stop the RA, again depending on the STONITH 
> controller type it might:
> 		- release access to the STONITH device and exit
> 		- not do much at all
> 		- do something I haven't thought of

Right.

> That's it :)

You missed the "monitor" operation, which lets us continually verify
that the device is still reachable and operational, and detect whether
the list of nodes it can fence has changed.
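
As a sketch of what such a check could report, assuming a hypothetical
probe callback that returns the current node list and raises OSError
when the device cannot be reached:

```python
def monitor(probe, known_nodes):
    """Check the device and compare its node list with the last known one.

    probe: callable() -> iterable of node names; raises OSError if the
           device is unreachable (an assumption made for this sketch)
    Returns (reachable, list_changed, current_nodes).
    """
    known = set(known_nodes)
    try:
        current = set(probe())
    except OSError:
        # Device is gone; keep the last known list and report no change.
        return (False, False, known)
    return (True, current != known, current)
```

A caller would run this periodically and refresh its view of who can
be fenced whenever list_changed is true.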


Sincerely,
    Lars Marowsky-Brée <lmb at suse.de>

-- 
High Availability & Clustering	    \ ever tried. ever failed. no matter.
SUSE Labs, Research and Development | try again. fail again. fail better.
SUSE LINUX AG - A Novell company    \ 	-- Samuel Beckett


