[Linux-HA] Is this heartbeat behaviour correct ?

Alan Robertson alanr at unix.sh
Fri Aug 5 22:37:35 MDT 2005


Boris Berger wrote:
> Thanks for your answer. Are these particular settings in the ha.cf file
> correct ?
> 
> # Both nodes broadcast on both network cards (here : eth0 and eth1)
> bcast		eth1,eth0
> 
> ## communication port  : 694 : the nodes broadcast towards the whole network !
> udpport	694
> # Use port 694 for bcast or ucast communications . This is the default
> # port and the official one registered at the IANA, organisation responsible
> # for assigning new IP addresses
> 
> Is there something to change in there or elsewhere ? Here is our complete
> ha.cf file :
> 
> bcast		eth1,eth0
> debugfile	/var/log/ha-debug
> logfile	/var/log/ha-log
> logfacility	local0
> 
> keepalive	2
> deadtime	10
> warntime	6
> initdead	60
> 
> udpport	694
> 
> node		EEPCLU1
> node		EEPCLU2
> 
> auto_failback	 on
> 
> respawn		hacluster	/usr/lib/heartbeat/ipfail
> ping		EEPNFS
> 
> 
> Thanks
> 
> ---------- Initial Header -----------
> 
> From      : linux-ha-bounces at lists.linux-ha.org
> To          : "General Linux-HA mailing list"
> linux-ha at lists.linux-ha.org
> Cc          :
> Date      : Fri, 05 Aug 2005 08:17:53 -0600
> Subject : Re: [Linux-HA] Is this heartbeat behaviour correct ?
> 
> Boris Berger wrote:
>>Hello all,
>>
>>I have tested a 2 node active/passive Heartbeat cluster.
>>To check the connection of each node in the external network,
>>ipfail is active with a ping towards a third machine, as
>>specified in ha.cf file :
>>respawn hacluster /usr/lib/heartbeat/ipfail
>>ping theThirdMachine
>>
>>Before performing the tests, we have the initial situation :
>>- Heartbeat is running on both nodes,
>>- one service (apache) is running on node 1,
>>- no service is running on node 2,
>>as specified in the haresources file :
>>node1 addrIpServ1 apache
>>
>>Now I cut simultaneously :
>>- the direct connection between the 2 nodes
>>- the connection between node 1 and the third machine
>>- the connection between node 2 and the third machine
>>
>>Then, one can notice in the log that :
>>- Apache does not stop on node 1
>>- Apache starts on node 2.
>>So Apache is now running on both nodes.
>>
>>Now, if I reestablish :
>>- EITHER the connection between node 1 and the third machine ONLY
>>- OR the connection between node 2 and the third machine ONLY
>>then nothing special is happening, so Apache is still running on both
>>nodes.
>>Do you know if this is a normal behaviour ? And how can this be explained ?
> 
> It can most probably be explained as a multiple failure you haven't
> configured heartbeat to deal with.  In other words, a configuration error.
> 
> When you restore the direct connection (the only one you are
> heartbeating over, I strongly suspect), it will restart heartbeat on
> both sides.
> 
> If you want that to work, you need to tell heartbeat to send heartbeats
> over all (both?) interfaces - not just the direct connection.


I actually didn't think that "," (comma) was valid as a separator in the 
bcast directive.  If it didn't give you an error, then I guess it must be 
OK.  But maybe it's not...

Could you try it again with a space or tab instead of the "," (comma)?
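
For illustration, here is a minimal sketch of the whitespace-separated 
form, assuming the same two interfaces (eth0 and eth1) and the same port 
as in your quoted ha.cf.  Only these two lines are shown; the rest of the 
file would stay as it is.

```
# Broadcast heartbeats on both interfaces.  The interface names are
# separated by whitespace rather than a comma (names taken from your file).
bcast	eth0 eth1
udpport	694
```

If the comma form was being read as a single, nonexistent interface name, 
heartbeats might not have been going out on the interfaces you expected, 
which could be part of what you saw.  That is only a guess until you have 
tried it.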


-- 
     Alan Robertson <alanr at unix.sh>

"Openness is the foundation and preservative of friendship...  Let me 
claim from you at all times your undisguised opinions." - William 
Wilberforce

