[Linux-HA] Regarding split brain
Preeti Jain
Preeti_8644 at yahoo.com
Thu Dec 9 20:36:05 MST 2010
Hello list,
I am testing a network failure by unplugging the NIC cable on one node, and the
outcome is not what I want: the whole cluster is disturbed, and the resource
moves between different nodes until it settles on one node. It also results in
failback.
For example, if I remove the NIC cable from node 1, failover happens, although
it takes some time for the resource to move to node 2. But when I plug the cable
back in on node 1, a kind of split brain happens, and the resource takes some
time to settle on node 1. That is a failback, which is not desired: the resource
should stay on node 2.
Every node reports that the other cluster nodes are returning after a partition.
Part of the log file on node 1 after plugging the NIC cable back in:
heartbeat[2521]: 2010/12/08_16:50:02 CRIT: Cluster node Node2 returning after partition.
heartbeat[2521]: 2010/12/08_16:50:02 info: For information on cluster partitions, See URL: http://linux-ha.org/SplitBrain
heartbeat[2521]: 2010/12/08_16:50:02 WARN: Deadtime value may be too small.
heartbeat[2521]: 2010/12/08_16:50:02 info: See FAQ for information on tuning deadtime.
heartbeat[2521]: 2010/12/08_16:50:02 info: URL: http://linux-ha.org/FAQ#heavy_load
heartbeat[2521]: 2010/12/08_16:50:02 info: Link Node2:eth0 up.
heartbeat[2521]: 2010/12/08_16:50:02 WARN: Late heartbeat: Node Node2: interval 781870 ms
heartbeat[2521]: 2010/12/08_16:50:02 info: Status update for node Node2: status active
heartbeat[2521]: 2010/12/08_16:50:03 info: Link Node3:eth0 up.
heartbeat[2521]: 2010/12/08_16:50:03 info: Link Node4:eth0 up.
heartbeat[2521]: 2010/12/08_16:50:03 CRIT: Cluster node Node4 returning after partition.
heartbeat[2521]: 2010/12/08_16:50:03 info: For information on cluster partitions, See URL: http://linux-ha.org/SplitBrain
heartbeat[2521]: 2010/12/08_16:50:03 WARN: Deadtime value may be too small.
heartbeat[2521]: 2010/12/08_16:50:03 info: See FAQ for information on tuning deadtime.
heartbeat[2521]: 2010/12/08_16:50:03 info: URL: http://linux-ha.org/FAQ#heavy_load
heartbeat[2521]: 2010/12/08_16:50:03 WARN: Late heartbeat: node Node4: interval 782200 ms
heartbeat[2521]: 2010/12/08_16:50:03 info: Status update for node Node4: status active
heartbeat[2521]: 2010/12/08_16:50:03 info: Link Node5:eth0 up.
heartbeat[2521]: 2010/12/08_16:50:04 CRIT: Cluster node Node2 returning after partition.
heartbeat[2521]: 2010/12/08_16:50:04 info: For information on cluster partitions, See URL: http://linux-ha.org/SplitBrain
heartbeat[2521]: 2010/12/08_16:50:04 WARN: Deadtime value may be too small.
heartbeat[2521]: 2010/12/08_16:50:04 info: See FAQ for information on tuning deadtime.
heartbeat[2521]: 2010/12/08_16:50:04 info: URL: http://linux-ha.org/FAQ#heavy_load
heartbeat[2521]: 2010/12/08_16:50:04 WARN: Late heartbeat: node Node2: interval 784380 ms
heartbeat[2521]: 2010/12/08_16:50:04 info: Status update for node Node2: status active
heartbeat[2521]: 2010/12/08_16:50:04 CRIT: Cluster node Node5 returning after partition.
heartbeat[2521]: 2010/12/08_16:50:04 info: For information on cluster partitions, See URL: http://linux-ha.org/SplitBrain
heartbeat[2521]: 2010/12/08_16:50:04 WARN: Deadtime value may be too small.
heartbeat[2521]: 2010/12/08_16:50:04 info: See FAQ for information on tuning deadtime.
heartbeat[2521]: 2010/12/08_16:50:04 info: URL: http://linux-ha.org/FAQ#heavy_load
heartbeat[2521]: 2010/12/08_16:50:04 WARN: Late heartbeat: node Node5: interval 784390 ms
heartbeat[2521]: 2010/12/08_16:50:04 info: Status update for node Node5: status active
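
To make the "Late heartbeat" intervals in the log above easier to read, here is
a minimal Python sketch that pulls them out and converts them to minutes. The
function name late_heartbeats and the regular expression are my own
illustration, not part of Heartbeat; the sample text is copied from the log
above. The intervals come out at roughly 13 minutes, which probably just
reflects how long the cable was unplugged.

```python
import re

# Matches the "Late heartbeat" warnings; the log shows both "Node" and "node".
LATE_RE = re.compile(r"Late heartbeat: [Nn]ode (\S+): interval (\d+) ms")

def late_heartbeats(log_text):
    """Return a list of (node_name, interval_ms) for each late-heartbeat warning."""
    # Mail clients may wrap long log lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    return [(node, int(ms)) for node, ms in LATE_RE.findall(flat)]

# Two entries copied from the log above, wrapped as they were in the mail.
sample = """\
heartbeat[2521]: 2010/12/08_16:50:02 WARN: Late heartbeat: Node Node2: interval
781870 ms
heartbeat[2521]: 2010/12/08_16:50:03 WARN: Late heartbeat: node Node4: interval
782200 ms
"""

for node, ms in late_heartbeats(sample):
    print(f"{node}: {ms} ms is about {ms / 60000:.1f} minutes")
```
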
Is there any solution for this problem?
Regards
Preeti