[Linux-ha-dev] Re: "failover is too fast"
Alan Robertson
alanr@suse.com
Tue, 14 Nov 2000 09:59:00 -0700
Dan Yocum wrote:
>
> Alan,
>
> It looks like a controlled failover (i.e., 'heartbeat stop') is mostly
> working in 0.4.8g - the second node doesn't start taking over the
> services until the last service is told to stop. However, what I'm
> observing is that hb issues the '<service> stop' command, and
> immediately dies, itself, without waiting/verifying that the last
> service has actually died, or do I have this thing
Here's what I get with a "sleep" resource that sleeps 45 seconds before
finishing:
Nov 14 10:50:07 sgi1 heartbeat[460]: info: Heartbeat shutdown in progress.
Nov 14 10:50:07 sgi1 heartbeat[587]: info: Giving up all HA resources.
Nov 14 10:50:07 sgi1 heartbeat: info: Releasing resource group: sgi1 Sleep
Nov 14 10:50:07 sgi1 heartbeat: info: Running /etc/ha.d/resource.d/Sleep stop
Nov 14 10:50:07 sgi1 heartbeat: debug: Starting /etc/ha.d/resource.d/Sleep stop
Nov 14 10:50:07 sgi1 heartbeat: info: /etc/ha.d/resource.d/Sleep: Shutting down
{and it sleeps for 45 seconds with no effects on sgi2}
Nov 14 10:50:52 sgi1 heartbeat: info: /etc/ha.d/resource.d/Sleep: Shutdown complete.
Nov 14 10:50:52 sgi1 heartbeat: debug: /etc/ha.d/resource.d/Sleep stop done. RC=0
Nov 14 10:50:52 sgi1 heartbeat[587]: info: All HA resources relinquished.
Nov 14 10:50:52 sgi2 heartbeat: info: Running /etc/ha.d/rc.d/shutdone shutdone
{this message mostly ignored by sgi2 - it marks sgi1 as in transition}
Nov 14 10:50:53 sgi1 heartbeat[460]: info: Heartbeat shutdown complete.
Nov 14 10:50:56 sgi2 heartbeat[813]: WARN: node sgi1: is dead
{only now does sgi2 notice anything about sgi1}
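
The timings in the log above can be checked mechanically. The following is only a sketch: the four lines are copied from the log with wrapped lines rejoined, the year 2000 is taken from the message date (syslog timestamps omit it), and the helper names `stamp` and `seconds_between` are made up for illustration.

```python
from datetime import datetime

# Lines copied from the log above, with wrapped lines rejoined.
LOG = [
    "Nov 14 10:50:07 sgi1 heartbeat: info: /etc/ha.d/resource.d/Sleep: Shutting down",
    "Nov 14 10:50:52 sgi1 heartbeat: info: /etc/ha.d/resource.d/Sleep: Shutdown complete.",
    "Nov 14 10:50:52 sgi1 heartbeat[587]: info: All HA resources relinquished.",
    "Nov 14 10:50:56 sgi2 heartbeat[813]: WARN: node sgi1: is dead",
]


def stamp(line, year=2000):
    """Parse the leading 15-character syslog timestamp (year is an assumption)."""
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def seconds_between(first, second):
    """Whole seconds elapsed from the first log line to the second."""
    return int((stamp(second) - stamp(first)).total_seconds())


# How long the resource's "stop" took on sgi1.
stop_duration = seconds_between(LOG[0], LOG[1])
# How long after sgi1 relinquished its resources sgi2 declared it dead.
takeover_gap = seconds_between(LOG[2], LOG[3])

print(stop_duration, takeover_gap)  # prints "45 4"
```

With these lines, the stop takes the full 45 seconds, and sgi2 logs "is dead" 4 seconds after sgi1 reports that all resources were relinquished, not before.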
So, it looks like it works to me...
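
For anyone who wants to repeat the test, a slow-stopping resource along the lines of the "sleep" resource described above might look roughly like this. It is a hypothetical stand-in written in Python, not the actual /etc/ha.d/resource.d/Sleep script; the `start` branch, the function names, and the use of `print` for logging are all assumptions. Only the 45-second delay and the two stop messages come from the log.

```python
import sys
import time

STOP_DELAY_SECONDS = 45  # delay quoted in the message above


def run(action, delay=STOP_DELAY_SECONDS, sleep=time.sleep, log=print):
    """Handle one resource action and return a shell-style exit code.

    `sleep` and `log` are injectable so the delay can be skipped in tests.
    """
    if action == "start":
        # Assumed behaviour: nothing to start, just report it.
        log("Starting")
        return 0
    if action == "stop":
        log("Shutting down")
        sleep(delay)  # simulate a service that is slow to stop
        log("Shutdown complete.")
        return 0
    log("usage: Sleep {start|stop}")
    return 1


def main(argv):
    """Entry point: the action is the first command-line argument."""
    return run(argv[1] if len(argv) > 1 else "")
```

To use it as a script, call `sys.exit(main(sys.argv))` at the bottom of the file. The point of the delay is that heartbeat should not report "All HA resources relinquished" until `stop` has returned.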
-- Alan Robertson
alanr@suse.com