[Linux-ha-dev] [Pacemaker] emc symmetrix ocf ra

Florian Haas florian.haas at linbit.com
Mon Jul 13 00:29:26 MDT 2009


Daniel,

Moving this over to linux-ha-dev, per Lars' suggestion.

On 07/10/2009 02:23 PM, daniel peess wrote:
> hi ml,
>
> i've attached a emc symmetrix ocf ra symsrdf we've written in a project.
> it switches the uni-directional replication of the mirror on startup.
>
> two hacks are still in there:
> - to inform the DMMPIO of SLES10sp2 that the block device is now active
>   i had to restart the daemon.
> - to shut down the DMMPIO before the switch a had to use the volume group
>   name to get to the dynamic chosen multi-path device name.
>
> otherwise this RA works quite nice in production already.
>
> bye,
> Daniel Peess

Had a look at this. Comments inline. I'm a complete EMC nitwit, so
please forgive any idiocies due to lack of Symmetrix knowledge. Feel
free to educate me.

> #!/bin/bash
> ## vim: set ts=4 sw=4 sts=0 noet foldmethod=indent:
> ## purpose: emc symmetrix srdf fail-over replication agent
> ## copyright (c): B1 Systems GmbH <info at b1-systems.de>, 2009.
> ## author: daniel peess <peess at b1-systems.de>, 2009.
> ## license: GPLv3+, http://www.gnu.org/licenses/gpl-3.0.html
> ## version: 1.0
> 
> ## TODO: switch from vgdisplay to pvs to get device mapper name.
> ## TODO: strip the whole multipath handling from this agent?
> ## TODO: volume group name + symmetrix group as comma separated values?
> 
> . ${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs
> 
> PATH="${PATH}:${OCF_RESKEY_sympath}";

You're occasionally parsing command line output from binaries found in
$OCF_RESKEY_sympath. Unless you're certain that the output from those
isn't localized, you may want to set LANG or LC_ALL for your RA.
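
Something like the following near the top of the RA would do, as a
minimal sketch. It assumes the symcli tools honor the usual locale
variables at all, which I haven't verified.

```shell
# Force the untranslated "C" locale so that grep/awk patterns applied to
# command output keep matching regardless of the caller's environment.
LC_ALL=C
LANG=C
export LC_ALL LANG
```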

> mymeta() {
> 	cat <<EOF
> <?xml version="1.0"?>
> <!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
> <resource-agent name="MyDummy">

Replace with the real RA name please.

> <version>1.0</version>
> <longdesc lang="en">
> This is a Resource Agent to switch EMC Symmetrix SRDF Replication for Mirror Groups.
> </longdesc>
> <shortdesc lang="en">EMC Symmetrix SRDF Switch Replication</shortdesc>
> <parameters>
> 	<parameter name="volgroup" required="1" unique="1">
> 	<longdesc lang="en">Name of LVM Volume Group</longdesc>
> 	<shortdesc lang="en">volume group</shortdesc>
> 	<content type="string" default=""/>
> 	</parameter>
> 	<parameter name="symgroup" required="1" unique="1">
> 	<longdesc lang="en">Name of Symmetrix Replication Group</longdesc>
> 	<shortdesc lang="en">symmetrix group</shortdesc>
> 	<content type="string" default=""/>
> 	</parameter>
> 	<parameter name="sympath" required="1" unique="1">
> 	<longdesc lang="en">Binary Path of Symmetrix CLI Tools</longdesc>
> 	<shortdesc lang="en">symcli binary path</shortdesc>
> 	<content type="string" default="/opt/emc/SYMCLI/V6.5.1/bin"/>
> 	</parameter>

Declaring required="1" and at the same time specifying a default is
self-contradictory. You probably want to remove that default anyway if
it must include a version number. unique="1" for sympath is definitely
wrong, as far as I can tell.

> </parameters>
> <actions>
> <action name="start"   timeout="240s" />

Wow. It can really take up to 4 minutes to start this?

> <action name="stop"    timeout="30s" />
> <action name="monitor" timeout="60s" interval="60s" />

And a monitor action can take up to a minute to complete?

> <action name="meta-data" timeout="5" />
> <action name="validate-all" timeout="30" />
> </actions>
> </resource-agent>
> EOF
> echo;
> }
> 
> myusage() {

You may want to rename your functions to something sensible too, but
that's a technicality.

> 	cat <<EOF
> usage: $0 {start|stop|monitor|validate-all|meta-data}
> 
> Expects to have a fully populated OCF RA-compliant environment set.
> EOF
> echo;
> }
> 
> mystart() {
> 	echo 'Starting SRDF replication switch for' "${OCF_RESKEY_symgroup}" 'on this node...';

Any particular reason for all these echos, instead of using ocf_log?

> 	vgdisplay -v "${OCF_RESKEY_volgroup}" 2>&1 | grep -q 'Volume group.*not found';
> 	if [ $? -ne 0 ]; then
> 		echo 'Volume group' "${OCF_RESKEY_volgroup}" 'already active.';
> 		symrdf -g "$OCF_RESKEY_symgroup" query | grep -q '^DG.*Type.*RDF1';
> 		if [ $? -eq 0 ]; then
> 			echo 'and symmetrix mirror underneath in replication read-write mode.';
> 			exit $OCF_SUCCESS;
> 		fi;
> 		echo 'but symmetrix mirror underneath only in *replicated* read-only mode.';
> 	fi;

I can only agree with Lars here: kick all this LVM stuff out, and have
the LVM RA take care of that.

> 	## recover from previous optional crash
> 	symcfg discover;
> 
> 	## check if already primary
> 	symrdf -g "$OCF_RESKEY_symgroup" query >/dev/null;
> 	if [ $? -ne 0 ]; then
> 		echo 'symmetrix group' "${OCF_RESKEY_symgroup}" 'non existant';
> 		exit $OCF_ERR_ARGS;

That should probably be $OCF_ERR_CONFIGURED. And shouldn't this check go
into validate?
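
Roughly what I have in mind, as a sketch only. The function name is made
up, and the numeric fallbacks are just there so the fragment runs
without .ocf-shellfuncs.

```shell
# Fallbacks so the fragment runs outside a cluster; inside an RA these
# values come from .ocf-shellfuncs.
: "${OCF_SUCCESS:=0}"
: "${OCF_ERR_CONFIGURED:=6}"

# Hypothetical helper name. Returns OCF_ERR_CONFIGURED if the device
# group named in OCF_RESKEY_symgroup cannot be queried.
symsrdf_validate_group() {
	if ! symrdf -g "$OCF_RESKEY_symgroup" query >/dev/null 2>&1; then
		echo "Symmetrix group ${OCF_RESKEY_symgroup} does not exist" >&2
		return "$OCF_ERR_CONFIGURED"
	fi
	return "$OCF_SUCCESS"
}
```

The validate action would call this once, and start could then assume
the group exists.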

> 	fi;
> 
> 	symrdf -g "$OCF_RESKEY_symgroup" query | grep -q '^DG.*Type.*RDF1';

Doing the exact same call here twice, first discarding its output and
then analyzing it, seems odd.
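
A sketch of running the query once and reusing its output. The helper
name and the three-way return convention are my invention; the grep
pattern is the one from your code.

```shell
# Hypothetical helper: run the query once, keep its output, and decide
# from the saved text whether this side is RDF1.
# Returns 0 if RDF1, 1 if not RDF1, 2 if the query itself failed.
symsrdf_is_rdf1() {
	query_output=$(symrdf -g "$OCF_RESKEY_symgroup" query 2>&1) || return 2
	printf '%s\n' "$query_output" | grep -q '^DG.*Type.*RDF1'
}
```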

> 	if [ $? -ne 0 ]; then
> 		## initiating 3 steps: failover, swap, establish.
> 		symrdf -noprompt -g "$OCF_RESKEY_symgroup" failover;
> 		if [ $? -ne 0 ]; then
> 			echo;
> 			echo 'Unable to failover symmetrix group ' "${OCF_RESKEY_symgroup}" '!';
> 			exit $OCF_ERR_GENERIC;
> 		fi;
> 
> 		symrdf -noprompt -g "$OCF_RESKEY_symgroup" swap;
> 		if [ $? -ne 0 ]; then
> 			echo;
> 			echo 'Unable to swap symmetrix group ' "${OCF_RESKEY_symgroup}" '!';
> 			exit $OCF_ERR_GENERIC;
> 		fi;
> 
> 		symrdf -noprompt -g "$OCF_RESKEY_symgroup" establish;
> 		if [ $? -ne 0 ]; then
> 			echo;
> 			echo 'Unable to establish symmetrix group ' "${OCF_RESKEY_symgroup}" '!';
> 			exit $OCF_ERR_GENERIC;
> 		fi;
> 
> 		## query if fail-over really succeeded.
> 		symrdf -g "$OCF_RESKEY_symgroup" query | grep -q '^DG.*Type.*RDF1';
> 		if [ $? -ne 0 ]; then
> 			echo;
> 			echo 'Symmetrix group' "${OCF_RESKEY_symgroup}" 'still not active.';
> 			exit $OCF_ERR_GENERIC;
> 		fi;
> 	else echo 'Symmetrix Group' "${OCF_RESKEY_symgroup}" 'already active.';
> 	fi;

Hmm. Most RAs do a monitor first. Then only if they detect that the
resource isn't active yet, they start it. Any particular reason why
instead of invoking monitor, you're duplicating the code?
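
To show the pattern, here is a rough sketch with made-up function names.
It keeps your three symrdf steps and your grep pattern, and leaves out
everything LVM and multipath related; whether that is sufficient on a
real Symmetrix I can't judge.

```shell
# Fallbacks so the fragment runs without .ocf-shellfuncs.
: "${OCF_SUCCESS:=0}"
: "${OCF_ERR_GENERIC:=1}"
: "${OCF_NOT_RUNNING:=7}"

# Hypothetical function names. Monitor is the single place that decides
# whether this side of the device group is RDF1.
symsrdf_monitor() {
	if symrdf -g "$OCF_RESKEY_symgroup" query 2>/dev/null \
		| grep -q '^DG.*Type.*RDF1'; then
		return "$OCF_SUCCESS"
	fi
	return "$OCF_NOT_RUNNING"
}

symsrdf_start() {
	# Nothing to do if monitor already reports the resource as active.
	symsrdf_monitor && return "$OCF_SUCCESS"

	# Same three steps as before: failover, swap, establish.
	for step in failover swap establish; do
		if ! symrdf -noprompt -g "$OCF_RESKEY_symgroup" "$step"; then
			echo "Unable to $step symmetrix group ${OCF_RESKEY_symgroup}" >&2
			return "$OCF_ERR_GENERIC"
		fi
	done

	# Confirm the switch through the same monitor code path.
	symsrdf_monitor || return "$OCF_ERR_GENERIC"
	return "$OCF_SUCCESS"
}
```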

> 	## detect block device status change
> 	for dev in `syminq|awk '$2~/R1/{print $1}'|cut -d/ -f3`; do
> 		echo 1 > /sys/block/${dev}/device/rescan;
> 	done;
> 
> 	## restart multipath daemon to detect new block devices.
> 	## TODO: switch to reload or specific device names?
> 	/etc/init.d/multipathd restart;
> 	if [ $? -ne 0 ]; then
> 		echo 'Unable to restart multipath' "${OCF_RESKEY_symgroup}" 'still not active.';
> 		exit $OCF_ERR_GENERIC;
> 	fi;

Again, you're touching upon something that's outside the realm of your
RA. Are you absolutely certain there is no other way to do this? Because
if there is, you'd best kick this multipath handling out and leave it to
users to deal with multipathd separately. If that is at all applicable
-- what if users choose not to use dm-multipath and instead want to go with
some built-in multipathing their HBAs may support? Or, for
whatever reason, not use any multipathing at all?

> 	sleep 2;

Are you absolutely certain you need this unconditional "sleep" here? I
presume you're doing it in order to wait for udev events to complete
that were triggered after your multipathd restart. If that is so, you
should really check for some udev path you are expecting to appear, and
sleep only if your expectation isn't matched. That being said, again, I
think you ought to ditch all this multipath handling, at which point
this "sleep" is probably moot as well.
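
If you do end up keeping a wait, a conditional one could look roughly
like this. The helper name and the default of 10 tries are arbitrary
choices of mine.

```shell
# Hypothetical helper: poll for a path, sleeping only while it is still
# missing. $1 is the path, $2 the maximum number of one-second tries.
# Returns 0 once the path exists, 1 on timeout.
wait_for_path() {
	path="$1"
	tries="${2:-10}"
	while [ "$tries" -gt 0 ]; do
		[ -e "$path" ] && return 0
		sleep 1
		tries=$((tries - 1))
	done
	[ -e "$path" ]
}
```

The caller would pass whatever device node it expects to appear and
treat a non-zero return as a failed start.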

> 	return $OCF_SUCCESS;
> }
> 
> mystop() {
> 	echo "Stopping MPIO Device to prevent access..."
> 
> 	## lvm check is just for getting our mpio device name.
> 	MPIODEVICE=`vgdisplay -v "${OCF_RESKEY_volgroup}" 2>/dev/null 2>&1`;
> 	sleep 1;
> 	MPIODEVICE=`vgdisplay -v "${OCF_RESKEY_volgroup}" 2>/dev/null | awk '/PV Name/{print $3}'`;
> 	if [ "${MPIODEVICE}DUMMY" = "DUMMY" ]; then
> 		echo 'MPIO device that belonged to volume group' "${OCF_RESKEY_volgroup}" 'already stopped.';
> 		return $OCF_SUCCESS;
> 	fi;
> 	echo $MPIODEVICE;
> 
> 	MPIODEVICE=`echo "${MPIODEVICE}" | sed 's!^/dev/!!'`;
> 	if [ "${MPIODEVICE}DUMMY" = "DUMMY" ]; then
> 		echo 'Unable to trim path from MPIO device' "${MPIODEVICE}" '!';
> 		exit $OCF_ERR_GENERIC;
> 	fi;
> 	echo $MPIODEVICE;
> 
> 	MPIONAME=`multipath -ll | awk '/'${MPIODEVICE}'/{print $1}'`;
> 	if [ "${MPIONAME}DUMMY" = "DUMMY" ]; then
> 		echo 'Unable to determine MPIO name that belongs to MPIO device ' "${MPIODEVICE}" '!';
> 		exit $OCF_ERR_GENERIC;
> 	fi;
> 	echo $MPIONAME;
> 
> 	## WARNING: multipath doesn't provide sane return codes, checking by another query.
> 	multipath -f "${MPIONAME}";
> 	multipath -ll | grep -q "${MPIONAME}.*${MPIODEVICE}";
> 	if [ $? -eq 0 ]; then
> 		echo 'Unable to remove MPIO device ' "${MPIONAME}" ':' "${MPIODEVICE}" '!';
> 		exit $OCF_ERR_GENERIC;
> 	fi;
> 
> 	return $OCF_SUCCESS;
> }

Now that's a lot of LVM and multipath handling that probably needs to
go. Curiously, I'm not seeing any invocations of symrdf. What are you
stopping?

> mymonitor() {
> 	## WARNING: the stop operation itself is NOT switching back from SRDF1 to SRDF2!
> 	## this is ONLY done by the start operation on the active node.

Ah, I guess this answers my question above. So "stop" is a no-op in fact?

> 	## therefore stopped status means the VOLUME GROUP is NOT active in this node.

And checking for that is already implemented in the LVM RA. If that RA
doesn't do what you want it to do, please fix it; you may help others as
well (me included, probably, as I use it often). Don't duplicate its
functionality elsewhere.

> 	MPIODEVICE=`vgdisplay -v "${OCF_RESKEY_volgroup}" 2>/dev/null 2>&1`;
> 	sleep 1;
> 	MPIODEVICE=`vgdisplay -v "${OCF_RESKEY_volgroup}" 2>/dev/null | awk '/PV Name/{print $3}'`;

Do something, forget about it, sleep one second, then do the same thing
again? Interesting. Well, looks like duplication of LVM RA functionality
anyway. Can probably go.

> 	if [ "${MPIODEVICE}DUMMY" = "DUMMY" ]; then
> 		echo 'MPIO device that belonged to volume group' "${OCF_RESKEY_volgroup}" 'already stopped.';
> 		return $OCF_NOT_RUNNING;
> 	fi;
> 
> 	symrdf -g "$OCF_RESKEY_symgroup" query >/dev/null;
> 	if [ $? -ne 0 ]; then
> 		echo 'symmetrix group' "${OCF_RESKEY_symgroup}" 'non existant.';
> 		exit $OCF_NOT_RUNNING;
> 	fi;
> 
> 	symrdf -g "$OCF_RESKEY_symgroup" query | grep -q '^DG.*Type.*RDF1';
> 	if [ $? -ne 0 ]; then
> 		echo 'symmetrix group' "${OCF_RESKEY_symgroup}" 'not active.';
> 		exit $OCF_NOT_RUNNING;
> 	fi;

Do I assume correctly that this last one is the check that reliably
queries the status of replication? Then leave that in and remove the
rest. Including the test immediately before, which IIUC belongs in
validate, not monitor.

> 
> 	return $OCF_SUCCESS;
> }
> 
> myvalidate() {
> 	## check if all variables are there.
> 	if [ "${OCF_RESKEY_volgroup}DUMMY" = "DUMMY" ]; then
> 		myusage;
> 		echo;
> 		echo 'Missing mandatory parameter $OCF_RESKEY_volgroup';
> 		exit $OCF_ERR_ARGS;
> 	fi
> 	if [ "${OCF_RESKEY_symgroup}DUMMY" = "DUMMY" ]; then
> 		myusage;
> 		echo;
> 		echo 'Missing mandatory parameter $OCF_RESKEY_symgroup';
> 		exit $OCF_ERR_ARGS;
> 	fi
> 	if [ "${OCF_RESKEY_sympath}DUMMY" = "DUMMY" ]; then
> 		myusage;
> 		echo;
> 		echo 'Missing mandatory parameter $OCF_RESKEY_sympath';
> 		exit $OCF_ERR_ARGS;
> 	fi

You're already declaring this to be a bash agent; there are much simpler
ways to do this in bash. If however you want this RA to run in non-bash
shells (which is laudable), I'd suggest declaring it as #!/bin/sh and
running it through your favorite bashism remover.
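
As one possible shape, here is a sketch that works in bash and plain sh
alike: a single loop with a plain -z test instead of three copies of the
DUMMY comparison. The function name is made up, and I kept your
OCF_ERR_ARGS return code.

```shell
# Fallbacks so the fragment runs without .ocf-shellfuncs.
: "${OCF_SUCCESS:=0}"
: "${OCF_ERR_ARGS:=2}"

# Hypothetical helper. The eval only ever sees the fixed names from the
# list below, never user input.
symsrdf_check_params() {
	for name in volgroup symgroup sympath; do
		eval "value=\${OCF_RESKEY_${name}:-}"
		if [ -z "$value" ]; then
			echo "Missing mandatory parameter OCF_RESKEY_${name}" >&2
			return "$OCF_ERR_ARGS"
		fi
	done
	return "$OCF_SUCCESS"
}
```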

> 	## check if all cli tools are there.
> 	for TOOL in awk multipath vgdisplay symrdf symcfg grep; do
> 		which "${TOOL}" 2>&1 | grep -q '^which: no ';
> 		if [ $? -eq 0 ]; then
> 			myusage;
> 			echo;
> 			echo 'The command |'"${TOOL}"'| was not found in the current path.';
> 			exit $OCF_ERR_ARGS;
> 		fi;
> 	done;

Any reason for not using check_binary?
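
A sketch of what that loop could shrink to. The stand-in reflects my
understanding of check_binary (exit with OCF_ERR_INSTALLED when the
program is missing) and is only defined when the real one from
.ocf-shellfuncs is absent. The tool list assumes the multipath and LVM
handling is gone.

```shell
: "${OCF_ERR_INSTALLED:=5}"

# Stand-in for check_binary, used only outside a cluster node.
if ! command -v check_binary >/dev/null 2>&1; then
	check_binary() {
		if ! command -v "$1" >/dev/null 2>&1; then
			echo "Setup problem: couldn't find command: $1" >&2
			exit "$OCF_ERR_INSTALLED"
		fi
	}
fi

# Hypothetical helper name; exits via check_binary on the first
# missing tool.
symsrdf_check_tools() {
	for tool in awk grep symrdf symcfg; do
		check_binary "$tool"
	done
}
```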

And (again, EMC amateur behind the keyboard), shouldn't there be some
check in validate to figure out whether the Device Group has Dynamic RDF
enabled? Or does that requirement (for being able to do "symrdf swap")
no longer exist?

Cheers,
Florian



