[Linux-HA] Is this heartbeat behaviour correct ?
Alan Robertson
alanr at unix.sh
Fri Aug 5 22:37:35 MDT 2005
Boris Berger wrote:
> Thanks for your answer. Are these particular settings in the ha.cf file
> correct ?
>
> # Both nodes broadcast on both network cards (here : eth0 and eth1)
> bcast eth1,eth0
>
> ## communication port : 694 : the nodes broadcast towards the whole network !
> udpport 694
> # Use port 694 for bcast or ucast communications . This is the default
> # port and the official one registered at the IANA, organisation responsible
> # for assigning new IP addresses
>
> Is there something to change in there or elsewhere ? Here is our complete
> ha.cf file :
>
> bcast eth1,eth0
> debugfile /var/log/ha-debug
> logfile /var/log/ha-log
> logfacility local0
>
> keepalive 2
> deadtime 10
> warntime 6
> initdead 60
>
> udpport 694
>
> node EEPCLU1
> node EEPCLU2
>
> auto_failback on
>
> respawn hacluster /usr/lib/heartbeat/ipfail
> ping EEPNFS
>
>
> Thanks
>
> ---------- Initial Header -----------
>
> From : linux-ha-bounces at lists.linux-ha.org
> To : "General Linux-HA mailing list" linux-ha at lists.linux-ha.org
> Cc :
> Date : Fri, 05 Aug 2005 08:17:53 -0600
> Subject : Re: [Linux-HA] Is this heartbeat behaviour correct ?
>
> Boris Berger wrote:
>>Hello all,
>>
>>I have tested a 2 node active/passive Heartbeat cluster.
>>To check the connection of each node in the external network,
>>ipfail is active with a ping towards a third machine, as
>>specified in ha.cf file :
>>respawn hacluster /usr/lib/heartbeat/ipfail
>>ping theThirdMachine
>>
>>Before performing the tests, we have the initial situation :
>>- Heartbeat is running on both nodes,
>>- one service (apache) is running on node 1,
>>- no service is running on node 2,
>>as specified in the haresources file :
>>node1 addrIpServ1 apache
>>
>>Now I cut simultaneously :
>>- the direct connection between the 2 nodes
>>- the connection between node 1 and the third machine
>>- the connection between node 2 and the third machine
>>
>>Then, one can notice in the log that :
>>- Apache does not stop on node 1
>>- Apache starts on node 2.
>>So Apache is now running on both nodes.
>>
>>Now, if I reestablish :
>>- EITHER the connection between node 1 and the third machine ONLY
>>- OR the connection between node 2 and the third machine ONLY
>>then nothing special is happening, so Apache is still running on both nodes.
>>Do you know if this is normal behaviour ? And how can this be explained ?
>
> It can most probably be explained as a multiple failure you haven't
> configured heartbeat to deal with. In other words, a configuration error.
>
> When you restore the direct connection (the only one you are
> heartbeating over, I strongly suspect), it will restart heartbeat on
> both sides.
>
> If you want that to work, you need to tell heartbeat to send heartbeats
> over all (both?) interfaces - not just the direct connection.
I actually didn't think that "," (comma) was valid as a separator on the
bcast line. If it didn't give you an error, then I guess it must be OK.
But maybe it's not...
Could you try it again with a space or tab instead of the comma?
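
A minimal sketch of that change, using the interface names from the
quoted ha.cf and assuming the rest of the file stays exactly as posted:

    # interfaces separated by whitespace instead of a comma
    bcast eth1 eth0

After restarting heartbeat on both nodes, it is worth checking
/var/log/ha-log to see which interfaces heartbeat reports it is
actually broadcasting on.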
--
Alan Robertson <alanr at unix.sh>
"Openness is the foundation and preservative of friendship... Let me
claim from you at all times your undisguised opinions." - William
Wilberforce