[Linux-HA] heartbeat 1.2.3,
ipfail will not lead to a failover if the ping node is not reachable
Patrick Roßbach
patrick.rossbach at alcatel.de
Fri Aug 19 09:33:53 MDT 2005
Hi all high availables,
I have a question about ipfail's ping:
I thought I could use the 'ping' option to monitor the availability of a
connection to a node that heartbeat is not running on, so that if the active
server can no longer reach that node but the passive one can, a
failover happens. I believe I have seen this behavior in the past,
but when I tried it today (and yesterday) there was no failover.
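The expected behavior described above can be written down as a small decision rule. This is only a sketch of that expectation, not ipfail's actual code: the function name, its parameters, and the returned strings are hypothetical and chosen for illustration.

```python
def expected_action(local_ping_count: int, peer_ping_count: int,
                    holds_resources: bool) -> str:
    """Return what the poster expects a node to do, given ping node counts.

    Hypothetical illustration only; not taken from the ipfail source.
    """
    if holds_resources and peer_ping_count > local_ping_count:
        # The peer reaches more ping nodes than we do: hand the resources over.
        return "failover"
    # Either we are passive, or the peer is no better connected than we are.
    return "stay"


if __name__ == "__main__":
    # Active node lost its only ping node, passive node still reaches it.
    print(expected_action(local_ping_count=0, peer_ping_count=1, holds_resources=True))
    # Both nodes lost the ping node: moving the resources would not help.
    print(expected_action(local_ping_count=0, peer_ping_count=0, holds_resources=True))
```

The first call models the test in this post (active node cut off, passive node still connected); the second shows why a comparison of both sides is needed rather than a purely local check.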
I tried unplugging the cable from the NIC of my active server and also blocking
the interface (eth2) using 'iptables' (for 20 secs). The servers are
still able to exchange heartbeats (over ttyS1 and eth3). In the logs I
can see that ipfail detects that the node (rbc_ce0_sw0) is dead, but I
don't know what is missing to get the failover I expect.
Thanks for any help.
Regards,
Patrick
By the way: after all this, heartbeat cannot be stopped in the normal way
(/etc/init.d/heartbeat stop) without first running 'killall ipfail'.
Here is the relevant part of the syslog (the ERRORs are caused by the iptables DROP rule):
------------------------------------
Aug 19 17:02:39 oms1 heartbeat: [7157]: ERROR: glib: Error sending
packet: Operation not permitted
Aug 19 17:02:39 oms1 heartbeat: [7157]: ERROR: write failure on ping
rbc_ce0_sw0.: Operation not permitted
Aug 19 17:02:40 oms1 heartbeat: [7157]: ERROR: glib: Error sending
packet: Operation not permitted
Aug 19 17:02:40 oms1 heartbeat: [7157]: ERROR: write failure on ping
rbc_ce0_sw0.: Operation not permitted
Aug 19 17:02:41 oms1 heartbeat: [7157]: ERROR: glib: Error sending
packet: Operation not permitted
Aug 19 17:02:41 oms1 heartbeat: [7157]: ERROR: write failure on ping
rbc_ce0_sw0.: Operation not permitted
Aug 19 17:02:42 oms1 heartbeat: [7157]: ERROR: glib: Error sending
packet: Operation not permitted
Aug 19 17:02:42 oms1 heartbeat: [7157]: ERROR: write failure on ping
rbc_ce0_sw0.: Operation not permitted
Aug 19 17:02:43 oms1 heartbeat: [7157]: ERROR: glib: Error sending
packet: Operation not permitted
Aug 19 17:02:43 oms1 heartbeat: [7157]: ERROR: write failure on ping
rbc_ce0_sw0.: Operation not permitted
Aug 19 17:02:44 oms1 heartbeat: [7148]: WARN: node rbc_ce0_sw0: is dead
Aug 19 17:02:44 oms1 heartbeat: [7148]: info: Link
rbc_ce0_sw0:rbc_ce0_sw0 dead.
Aug 19 17:02:44 oms1 ipfail: [7160]: info: Status update: Node
rbc_ce0_sw0 now has status dead
Aug 19 17:02:44 oms1 heartbeat: [16237]: debug: notify_world: setting
SIGCHLD Handler to SIG_DFL
Aug 19 17:02:44 oms1 heartbeat: info: Running /etc/ha.d/rc.d/status status
Aug 19 17:02:44 oms1 heartbeat: [7157]: ERROR: glib: Error sending
packet: Operation not permitted
Aug 19 17:02:44 oms1 heartbeat: [7157]: ERROR: write failure on ping
rbc_ce0_sw0.: Operation not permitted
Aug 19 17:02:45 oms1 heartbeat: [7157]: ERROR: glib: Error sending
packet: Operation not permitted
Aug 19 17:02:45 oms1 heartbeat: [7157]: ERROR: write failure on ping
rbc_ce0_sw0.: Operation not permitted
Aug 19 17:02:46 oms1 heartbeat: [7157]: ERROR: glib: Error sending
packet: Operation not permitted
Aug 19 17:02:46 oms1 heartbeat: [7157]: ERROR: write failure on ping
rbc_ce0_sw0.: Operation not permitted
Aug 19 17:02:47 oms1 heartbeat: [7157]: ERROR: glib: Error sending
packet: Operation not permitted
Aug 19 17:02:47 oms1 heartbeat: [7157]: ERROR: write failure on ping
rbc_ce0_sw0.: Operation not permitted
Aug 19 17:02:48 oms1 heartbeat: [7157]: ERROR: glib: Error sending
packet: Operation not permitted
Aug 19 17:02:48 oms1 heartbeat: [7157]: ERROR: write failure on ping
rbc_ce0_sw0.: Operation not permitted
Aug 19 17:02:49 oms1 heartbeat: [7157]: ERROR: glib: Error sending
packet: Operation not permitted
Aug 19 17:02:49 oms1 heartbeat: [7157]: ERROR: write failure on ping
rbc_ce0_sw0.: Operation not permitted
Aug 19 17:02:50 oms1 heartbeat: [7148]: info: all clients are now resumed
Aug 19 17:02:50 oms1 heartbeat: [7157]: ERROR: glib: Error sending
packet: Operation not permitted
Aug 19 17:02:50 oms1 heartbeat: [7157]: ERROR: write failure on ping
rbc_ce0_sw0.: Operation not permitted
Aug 19 17:02:50 oms1 ipfail: [7160]: info: NS: We are dead. :<
Aug 19 17:02:50 oms1 ipfail: [7160]: info: Link Status update: Link
rbc_ce0_sw0/rbc_ce0_sw0 now has status dead
Aug 19 17:02:50 oms1 ipfail: [7160]: info: We are dead. :<
Aug 19 17:02:50 oms1 ipfail: [7160]: info: Asking other side for ping
node count.
Aug 19 17:02:50 oms1 ipfail: [7160]: debug: Message [num_ping] sent.
Aug 19 17:02:51 oms1 heartbeat: [7157]: ERROR: glib: Error sending
packet: Operation not permitted
Aug 19 17:02:51 oms1 heartbeat: [7157]: ERROR: write failure on ping
rbc_ce0_sw0.: Operation not permitted
Aug 19 17:02:52 oms1 heartbeat: [7157]: ERROR: glib: Error sending
packet: Operation not permitted
Aug 19 17:02:52 oms1 heartbeat: [7157]: ERROR: write failure on ping
rbc_ce0_sw0.: Operation not permitted
Aug 19 17:02:53 oms1 heartbeat: [7157]: ERROR: glib: Error sending
packet: Operation not permitted
Aug 19 17:02:53 oms1 heartbeat: [7157]: ERROR: write failure on ping
rbc_ce0_sw0.: Operation not permitted
Aug 19 17:02:54 oms1 heartbeat: [7157]: ERROR: glib: Error sending
packet: Operation not permitted
Aug 19 17:02:54 oms1 heartbeat: [7157]: ERROR: write failure on ping
rbc_ce0_sw0.: Operation not permitted
Aug 19 17:02:55 oms1 heartbeat: [7157]: ERROR: glib: Error sending
packet: Operation not permitted
Aug 19 17:02:55 oms1 heartbeat: [7157]: ERROR: write failure on ping
rbc_ce0_sw0.: Operation not permitted
Aug 19 17:02:56 oms1 heartbeat: [7157]: ERROR: glib: Error sending
packet: Operation not permitted
Aug 19 17:02:56 oms1 heartbeat: [7157]: ERROR: write failure on ping
rbc_ce0_sw0.: Operation not permitted
Aug 19 17:02:57 oms1 heartbeat: [7157]: ERROR: glib: Error sending
packet: Operation not permitted
Aug 19 17:02:57 oms1 heartbeat: [7157]: ERROR: write failure on ping
rbc_ce0_sw0.: Operation not permitted
Aug 19 17:02:58 oms1 heartbeat: [7157]: ERROR: glib: Error sending
packet: Operation not permitted
Aug 19 17:02:58 oms1 heartbeat: [7157]: ERROR: write failure on ping
rbc_ce0_sw0.: Operation not permitted
Aug 19 17:02:59 oms1 heartbeat: [7148]: info: Link
rbc_ce0_sw0:rbc_ce0_sw0 up.
Aug 19 17:02:59 oms1 heartbeat: [7148]: WARN: Late heartbeat: Node
rbc_ce0_sw0: interval 21030 ms
Aug 19 17:02:59 oms1 heartbeat: [7148]: info: Status update for node
rbc_ce0_sw0: status ping
Aug 19 17:02:59 oms1 ipfail: [7160]: info: Link Status update: Link
rbc_ce0_sw0/rbc_ce0_sw0 now has status up
Aug 19 17:02:59 oms1 ipfail: [7160]: info: Status update: Node
rbc_ce0_sw0 now has status ping
Aug 19 17:02:59 oms1 ipfail: [7160]: info: A ping node just came up.
Aug 19 17:02:59 oms1 ipfail: [7160]: debug: Found ping node rbc_ce0_sw0!
Aug 19 17:02:59 oms1 ipfail: [7160]: info: Asking other side for ping
node count.
Aug 19 17:02:59 oms1 ipfail: [7160]: debug: Message [num_ping] sent.
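Because the repeated ERROR entries bury the few state changes, a short filter makes the excerpt easier to read. The following is a minimal sketch, assuming the log is available as plain text in the wrapped form shown above; the helper names (`unwrap`, `state_changes`, `timestamp`) are made up for this example, and the sample contains only a few entries copied from the excerpt.

```python
import re
from datetime import datetime

# A syslog entry starts with something like "Aug 19 17:02:44 oms1 ".
ENTRY_START = re.compile(r"^[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} \S+ ")


def unwrap(text: str) -> list[str]:
    """Re-join entries that were wrapped onto a second line in the mail."""
    entries: list[str] = []
    for line in text.splitlines():
        if ENTRY_START.match(line) or not entries:
            entries.append(line.strip())
        else:
            entries[-1] += " " + line.strip()
    return entries


def state_changes(text: str) -> list[str]:
    """Keep every entry except the repeated ERROR entries."""
    return [entry for entry in unwrap(text) if " ERROR: " not in entry]


def timestamp(entry: str) -> datetime:
    """Parse the time of day only; syslog entries carry no year."""
    return datetime.strptime(entry[7:15], "%H:%M:%S")


SAMPLE = """\
Aug 19 17:02:44 oms1 heartbeat: [7148]: WARN: node rbc_ce0_sw0: is dead
Aug 19 17:02:44 oms1 heartbeat: [7157]: ERROR: glib: Error sending
packet: Operation not permitted
Aug 19 17:02:50 oms1 ipfail: [7160]: info: Asking other side for ping
node count.
Aug 19 17:02:59 oms1 heartbeat: [7148]: info: Link
rbc_ce0_sw0:rbc_ce0_sw0 up.
"""

events = state_changes(SAMPLE)
for entry in events:
    print(entry)

# Time between "is dead" and "Link ... up." in the sample.
dead_for = timestamp(events[-1]) - timestamp(events[0])
print(dead_for.total_seconds())  # 15.0
```

Applied to the full excerpt, the same filter leaves the dead/up transitions and the two "Asking other side for ping node count." requests; no reply from the other side appears in the part of the log shown here.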
ha.cf:
------
logfacility local0
keepalive 1
deadtime 5
warntime 3
initdead 10
udpport 694
baud 19200
serial /dev/ttyS1
bcast eth3
auto_failback off
node oms0
node oms1
ping rbc_ce0_sw0
respawn hacluster /usr/lib/heartbeat/ipfail
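As a quick sanity check of the configuration above, the directives can be read into a dictionary and tested for the two settings this post relies on: a `ping` node and ipfail started through `respawn`. This is a rough sketch, not a full ha.cf parser; the function names are hypothetical, and it assumes one directive per line with `#` starting a comment.

```python
from collections import defaultdict

# Shortened copy of the ha.cf shown above.
HA_CF = """\
keepalive 1
deadtime 5
node oms0
node oms1
ping rbc_ce0_sw0
respawn hacluster /usr/lib/heartbeat/ipfail
"""


def parse_ha_cf(text: str) -> dict[str, list[str]]:
    """Collect directives as {name: [values...]}; repeated names are kept."""
    directives: dict[str, list[str]] = defaultdict(list)
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        parts = line.split(None, 1)
        directives[parts[0]].append(parts[1].strip() if len(parts) > 1 else "")
    return dict(directives)


def ipfail_prerequisites(directives: dict[str, list[str]]) -> list[str]:
    """Return a list of problems; an empty list means both settings are present."""
    problems = []
    if not directives.get("ping"):
        problems.append("no 'ping' node configured")
    if not any(v.endswith("/ipfail") for v in directives.get("respawn", [])):
        problems.append("ipfail is not started via 'respawn'")
    return problems


config = parse_ha_cf(HA_CF)
print(config["node"])
print(ipfail_prerequisites(config))
```

For the configuration in this post both checks pass, so the missing failover is not explained by either of these two directives being absent.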
haresources:
------------
oms0 IPaddr::192.168.192.1/24 myService mon
--
Patrick Rossbach +-------V-------+ mailto:patrick.rossbach at alcatel.de
Alcatel SEL AG (ext) | A L C A T E L | Phone : +49 30 7002 4742
Colditzstr. 34-36 +---------------+ Fax : +49 30 7002 3669
D-12099 Berlin S E L