[Linux-ha-dev] tracking resource groups in heartbeat
alanr at suse.com
Wed Mar 29 06:46:24 MST 2000
> On Tue, Mar 28, 2000 at 09:35:08PM -0700, Alan Robertson wrote:
> > The conversation below is taken from email off the list. It seemed
> > generally interesting though...
> > Horms wrote:
> > > ... I have however noticed that as heartbeat keeps state of nodes, and
> > > not resource allocations it is possible to get into a state where no
> > > nodes/more than one node have a resource. In particular if there is a
> > > communication medium failure, or if heartbeat is started up on more
> > > than one node simultaneously. I have been thinking of some fairly
> > > simple mechanisms to resolve this, vis a vis nodes requesting ownership
> > > of a resource. I am wondering what your thoughts are. I am most
> > > concerned about the (simple) two-node case, though something that
> > > extends beyond that would be nice.
> > The folks from Conectiva are doing something in a related area. In the
> > current code, the assumption is that if the master for a resource is up,
> > it has control of the resources it is listed as master for. They break
> > that assumption with a new feature (nice_failover?). It would be good to
> > add your thoughts and observations to that, and think about the right way
> > of thinking about this stuff. Once one has the right mental model, the
> > code is easy :-)
> It seems to me that the existing code will take control of a resource if
> the master specified in haresources fails, but not necessarily give it up
> when the master comes back up again.
Without nice_failback (which isn't in the current code), this should not
happen. When the master comes back up, it asks the other node to give up
its resources, and if it gets no response it takes them anyway.
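To make the "ask, then take anyway" rule concrete, here is a minimal sketch in Python. It is not heartbeat's code: `reclaim_resources` and the `ask_peer_to_release` callback are hypothetical names, and the callback is assumed to return True when the peer answers and False when no reply arrives in time.

```python
from typing import Callable


def reclaim_resources(
    resources: list[str],
    ask_peer_to_release: Callable[[list[str]], bool],
) -> tuple[list[str], str]:
    """Return the resources the returning master acquires, and how.

    ask_peer_to_release is a hypothetical callback: True means the peer
    replied and released the resources, False means no reply arrived.
    """
    peer_replied = ask_peer_to_release(resources)
    if not peer_replied:
        # No response from the other node: take the resources anyway.
        return list(resources), "taken without reply"
    return list(resources), "released by peer"
```

Either way the master ends up with its resources; the only difference is whether the other node agreed first.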
> Again, in the case of a media failure,
> or nodes coming up at the same time both nodes may take ownership of a
> resource and neither will give it up.
I agree in the case of media failure. Have you observed it in the case
of both nodes coming up at the same time?
The current bringup sequence is: Start your own heartbeat. Wait until
you've heard someone else's heartbeat or about 10 seconds have passed. Begin the
resource takeover sequence for those resources you master. If you've
heard someone else's heartbeat, then communications with the other end
are working. I think the problem right now is that there is no database
indicating the state of either resources or nodes. Everything depends
on the resource scripts to indicate resource status.
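The bringup sequence above can be sketched roughly as follows. This is an illustration in Python, not the heartbeat source: `bringup`, `heard_peer_after`, and `wait_limit` are assumed names, and the 10-second default only mirrors the "about 10 seconds" mentioned above.

```python
from typing import Optional


def bringup(
    my_resources: list[str],
    heard_peer_after: Optional[float],
    wait_limit: float = 10.0,
) -> tuple[bool, list[str]]:
    """Walk through the bringup steps and return (peer_reachable, steps).

    heard_peer_after is the number of seconds until the first heartbeat
    from another node was heard, or None if none was ever heard.
    """
    steps = ["start own heartbeat"]

    # Wait until a peer heartbeat is heard or the wait limit expires.
    if heard_peer_after is not None and heard_peer_after <= wait_limit:
        peer_reachable = True
        steps.append("heard peer heartbeat: communications are working")
    else:
        peer_reachable = False
        steps.append("wait limit expired without hearing a peer")

    # Either way, begin taking over the resources this node masters.
    for resource in my_resources:
        steps.append(f"begin takeover of {resource}")

    return peer_reachable, steps
```

Note that nothing here records which node actually holds a resource, which is the missing piece described above.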
Here's what I think the race condition might be: Node A is master and
is down. Node B is also down, but is the slave. Node B comes up, and
just about the time that it times out on Node A being down, Node A
begins to come up. Node B times out on the resources A is primary on,
and begins the process of taking them over. Node A comes up, and seeing
B's heartbeat, immediately requests its resources. Node B has started
the takeover scripts, but they aren't done, so it thinks it doesn't own
them, so it doesn't give them up. Node A then takes them over, while
Node B's scripts are in the process of doing the same.
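A toy model of that race, in Python, may make the ordering easier to follow. The `Node` class and its `owned` and `starting` sets are assumptions for illustration; heartbeat itself keeps no such state and relies on the resource scripts.

```python
class Node:
    def __init__(self, name: str) -> None:
        self.name = name
        self.owned: set[str] = set()     # takeover scripts have finished
        self.starting: set[str] = set()  # takeover scripts still running

    def begin_takeover(self, resource: str) -> None:
        self.starting.add(resource)

    def finish_takeover(self, resource: str) -> None:
        self.starting.discard(resource)
        self.owned.add(resource)

    def handle_release_request(self, resource: str) -> bool:
        """Give up a resource only if this node believes it owns it."""
        if resource in self.owned:
            self.owned.discard(resource)
            return True
        # A takeover still in progress is not seen as ownership.
        return False


def simulate_race(resource: str = "ip") -> list[str]:
    a, b = Node("A"), Node("B")
    b.begin_takeover(resource)          # B times out on A, starts takeover
    b.handle_release_request(resource)  # A comes up and asks; B owns nothing yet
    a.begin_takeover(resource)          # A takes the resource over anyway
    a.finish_takeover(resource)
    b.finish_takeover(resource)         # B's scripts complete as well
    return sorted(n.name for n in (a, b) if resource in n.owned)
```

In this model `simulate_race()` returns `['A', 'B']`: both nodes end up holding the resource.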
> I have attached a patch that I believe will fix this problem. If
> nice_failover is in operation then this patch will cause both nodes to drop
> the resource, which is bad, but they would both keep it otherwise so it is
> problematic in either case. Also if a resource has more than one master -
> then this patch results in resources being dropped by all nodes or no nodes,
> depending on your haresources file. This isn't very good either but if a
> resource has a master and a slave then it works.
My guess is that we need to design a "good" bringup algorithm that has
the right kinds of sequencing and status changes such that it doesn't
have any race conditions. This is moderately complex, but is probably
the better approach. I started to write one here, but found it too hard
to write inline in email.
-- Alan Robertson
alanr at suse.com