[Linux-ha-dev] Heartbeat 0.45 experiences
Thomas Hepper
th@ant.han.de
Mon, 18 Oct 1999 22:51:24 +0200
Hi,
On Mon, Oct 18, 1999 at 01:28:18PM -0700, Steve Beattie wrote:
>
> 2. If heartbeat has failed over to the backup machine, and then the
> heartbeat on the backup machine is cleanly stopped, it keeps the
> resource even though it claims to have relinquished it (i.e. it
> still has the IP address it took over from the original host).
Same here. I am still looking for some more debug output.
>
> 4. Situation: two machines, "good" and "bad". Bad has a failing disk, which
> corrupts a few files on its filesystem. Good is the primary, bad is
> the backup. On startup, good does not successfully grab the resource.
> However, killing the heartbeat on good causes bad to successfully
> take over. Restarting the heartbeat on good causes bad to relinquish,
> but again good unsuccessfully attempts to take the resource.
>
> Here's the typical sort of log on good:
>
> heartbeat: 1999/10/14_15:24:53 info: ***********************
> heartbeat: 1999/10/14_15:24:53 info: Configuration validated. Starting heartbeat.
> heartbeat: 1999/10/14_15:24:53 notice: UDP heartbeat started on port 1001 interface eth0
> heartbeat: 1999/10/14_15:24:53 error: Cannot open /proc/ha/.control: No such file or directory
> heartbeat: 1999/10/14_15:24:59 warn: node bad.int.wirex.com: is dead
> heartbeat: 1999/10/14_15:24:59 INFO: Running /etc/ha.d/rc.d/status status
>
> and then nothing.
Yup, the same here ...
This happens here after a longer uptime. Right after starting both nodes,
killing and restarting heartbeat on the master works as expected. The
next day, the slave will take the resources when heartbeat is stopped on the
master, but it will not release them after the master is started again. The
result then is that both nodes have the IP address.
So the question is: how to debug this?
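One possible starting point, only as a rough sketch: compare the heartbeat logs of both nodes and see where each one stops. The small Python script below assumes the log line format seen in the excerpt quoted above ("heartbeat: DATE_TIME level: message"). The names parse_line, summarize and SAMPLE are made up for this illustration and are not part of heartbeat.

```python
import re
from datetime import datetime

# Assumed line format, taken from the quoted log excerpt:
#   heartbeat: 1999/10/14_15:24:53 info: some message
LINE_RE = re.compile(
    r"^heartbeat: (\d{4}/\d{2}/\d{2}_\d{2}:\d{2}:\d{2}) (\w+): (.*)$"
)


def parse_line(line):
    """Return (timestamp, level, message), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    stamp, level, message = match.groups()
    # The level appears as both "info" and "INFO" in the excerpt, so normalize it.
    return datetime.strptime(stamp, "%Y/%m/%d_%H:%M:%S"), level.lower(), message


def summarize(lines):
    """Return (problem entries, last entry) for a sequence of log lines."""
    entries = [e for e in map(parse_line, lines) if e is not None]
    problems = [e for e in entries if e[1] in ("error", "warn")]
    last = entries[-1] if entries else None
    return problems, last


# Sample input: the log lines quoted above, without the "> " quote prefix.
SAMPLE = [
    "heartbeat: 1999/10/14_15:24:53 info: ***********************",
    "heartbeat: 1999/10/14_15:24:53 info: Configuration validated. Starting heartbeat.",
    "heartbeat: 1999/10/14_15:24:53 notice: UDP heartbeat started on port 1001 interface eth0",
    "heartbeat: 1999/10/14_15:24:53 error: Cannot open /proc/ha/.control: No such file or directory",
    "heartbeat: 1999/10/14_15:24:59 warn: node bad.int.wirex.com: is dead",
    "heartbeat: 1999/10/14_15:24:59 INFO: Running /etc/ha.d/rc.d/status status",
]

if __name__ == "__main__":
    problems, last = summarize(SAMPLE)
    for stamp, level, message in problems:
        print(f"{stamp} {level}: {message}")
    if last is not None:
        print("last entry:", last[0], last[2])
```

Running this over the logs of both nodes shows the error and warn lines plus the last thing each heartbeat logged, which may help to see at which step the master stops while the slave still holds the address.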
Thomas
--
-----------------------------------------------
| Thomas Hepper th@ant.han.de |
| ( If the above address fails, try ) |
| ( thomas.hepper@planet-interkom.de) |
-----------------------------------------------