[Linux-HA] Strange behaviour in Heartbeat ?
Fabrice Durand
durand.fabrice at gmail.com
Thu Aug 11 09:22:23 MDT 2005
Hello all,
First, thank you for your answers about the "split-brain" situation.
Here is a new problem. I have tested a two-node active/passive cluster with
Heartbeat_1.2.3-1woody.
The haresources file is:
EEPCLU1 EEPSERV1 apache MailTo::toto at toto.fr::ServiceApache
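To make the layout of that line explicit: the first field is the preferred
node and the remaining fields are resources, with "::" separating a resource
script name from its arguments. As far as I understand it, Heartbeat acquires
the resources left to right and releases them right to left. Below is a
minimal Python sketch of that reading. parse_haresources_line is a
hypothetical helper of mine, not part of Heartbeat, and admin@example.com is
a placeholder address.

```python
def parse_haresources_line(line):
    """Split one haresources line into (node, [(script, [args...]), ...]).

    Hypothetical helper for illustration only; it assumes fields are
    separated by whitespace and arguments by "::".
    """
    fields = line.split()
    if len(fields) < 2:
        raise ValueError("expected a node name followed by at least one resource")
    node, raw_resources = fields[0], fields[1:]
    resources = []
    for raw in raw_resources:
        # "MailTo::addr::subject" -> script "MailTo", args ["addr", "subject"]
        script, *args = raw.split("::")
        resources.append((script, args))
    return node, resources


# Placeholder mail address; the node and resource names are from my setup.
line = "EEPCLU1 EEPSERV1 apache MailTo::admin@example.com::ServiceApache"
node, resources = parse_haresources_line(line)

# Assumed ordering: start left to right, stop right to left.
start_order = [script for script, _ in resources]
stop_order = list(reversed(start_order))

print("preferred node:", node)
print("start order:", start_order)
print("stop order:", stop_order)
```

If that ordering is right, MailTo is the last resource started, which is why
I would expect the mail only after Apache has come up.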
The 2 nodes broadcast heartbeats over 2 interfaces: eth0 (the "external"
LAN) and eth1 (a direct connection between the 2 nodes).
To check the connection of each node to the external LAN,
ipfail is active with a ping towards a third machine, as
specified in the ha.cf file:
respawn hacluster /usr/lib/heartbeat/ipfail
ping theThirdMachine
Before performing the tests, the initial situation is:
- Heartbeat is running on both nodes,
- one service (apache) is running on node 1,
- no service is running on node 2,
as specified in the haresources file:
node1 addrIpServ1 apache
Now I simply cut the connection between node 1 and the third machine.
Then there is some strange behaviour:
- First, node 2 wants to go standby, although it has no service running
and is now supposed to acquire the resources of node 1.
- Node 1 tries to start Apache, although Apache is already running on node 1
and node 1 is supposed to shut Apache down.
Luckily, this attempt to start Apache fails.
- Even though starting Apache on node 1 has failed, node 1 successfully
sends a mail saying that Apache has just started on node 1!
- At the same time, node 2 tries to shut down Apache, although no Apache
process is running on node 2, so no process is killed.
- A few seconds later, the expected behaviour happens: node 1 stops Apache
and releases the associated logical IP address EEPSERV1, and then node 2
takes over the logical IP address and starts Apache.
Do you know whether this behaviour is correct, and whether there should be
a transitional phase before the failover?
And is there a way to prevent Heartbeat from sending an email saying that
Apache has started when starting Apache has in fact failed?
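To illustrate the behaviour I am hoping for, here is a minimal Python sketch
of "notify only when the start succeeded". This is not how Heartbeat or its
MailTo resource is implemented; start_then_notify is a hypothetical helper,
and the two commands below are stand-ins for a start script that succeeds
and one that fails.

```python
import subprocess
import sys


def start_then_notify(start_cmd, notify):
    """Run start_cmd and call notify() only if it exits with status 0.

    Hypothetical helper for illustration; returns True when the command
    succeeded and the notification was sent, False otherwise.
    """
    result = subprocess.run(start_cmd)
    if result.returncode == 0:
        notify()
        return True
    return False


# Stand-in commands: one exits with status 0, the other with status 1.
succeeding_start = [sys.executable, "-c", "raise SystemExit(0)"]
failing_start = [sys.executable, "-c", "raise SystemExit(1)"]

sent = []  # records each notification instead of sending real mail
ok = start_then_notify(succeeding_start, lambda: sent.append("started"))
failed = start_then_notify(failing_start, lambda: sent.append("started"))

print("notifications sent:", len(sent))
```

In my test the equivalent of the second case still produced a mail, which is
the part I would like to avoid.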
Thanks a lot,
Regards,
Fabrice