[Linux-ha-dev] cluster communication on OpenBSD

Sebastian Reitenbach sebastia at l00-bugdead-prods.de
Sun Aug 26 11:30:20 MDT 2007


Hi,

With the heartbeat 2.1.2 port to OpenBSD, I was able to set up a cluster 
between two i386 OpenBSD machines using unicast communication. It also 
worked well between an i386 and a sparc machine. But when I added a 
second ucast statement to the ha.cf file, the cluster refused to start 
up:

Aug 26 19:16:50 heartbeat heartbeat: [20236]: ERROR: glib: ucast: error 
binding socket. Retrying: Address already in use
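
For reference, "Address already in use" (EADDRINUSE) is the error a bind 
call returns when another socket already holds the same address and port. 
The sketch below reproduces that in isolation with two plain UDP sockets. 
It is only an illustration of the error itself: whether heartbeat's ucast 
code really binds one socket per ucast line to the same port is an 
assumption, not something the log confirms. The function name 
second_bind_error is made up for this example.

```python
import errno
import socket


def second_bind_error(host="127.0.0.1"):
    """Bind two UDP sockets to the same address and port.

    Returns the errno raised by the second bind, or None if it succeeded.
    No SO_REUSEADDR / SO_REUSEPORT is set on either socket.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        first.bind((host, 0))            # port 0: let the OS pick a free port
        port = first.getsockname()[1]    # the port the first socket now owns
        try:
            second.bind((host, port))    # same address and port again
        except OSError as exc:
            return exc.errno
        return None
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    err = second_bind_error()
    # Print the symbolic name of the error, if there was one.
    print(errno.errorcode.get(err, err))
```

If something like this is what happens, the second ucast line would fail 
regardless of which peer address it names.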

When I switch to multicast communication, heartbeat refuses to work as 
well:
Aug 26 15:19:21 defiant heartbeat: [23219]: ERROR: write failure on mcast 
fxp0.: Host is down
Aug 26 15:19:23 defiant heartbeat: [23219]: ERROR: glib: Unable to send 
mcast packet [-1]: Host is down

When I switch to broadcast, I can get the three nodes to work together 
for a short time. But when I, for example, put a node into standby and then 
make it active again, and relocate some resources, I start seeing messages 
of this sort:

Aug 26 15:32:46 defiant heartbeat: [10054]: ERROR: write failure on bcast 
fxp0.: Message too long
Aug 26 15:32:46 defiant heartbeat: [10054]: ERROR: glib: Unable to send 
bcast [-1] packet(len=1695): Message too long
Aug 26 15:32:46 defiant heartbeat: [10054]: ERROR: MSG: Dumping message with 
24 fields
Aug 26 15:32:46 defiant heartbeat: [10054]: ERROR: MSG[0] : [t=cib]
Aug 26 15:32:46 defiant heartbeat: [10054]: ERROR: MSG[1] : 
[cib_clientid=8c8cc7ff-b425-447a-af46-cf8ade3f4566]
Aug 26 15:32:46 defiant heartbeat: [10054]: ERROR: MSG[2] : 
[cib_callopt=1048576]
Aug 26 15:32:46 defiant heartbeat: [10054]: ERROR: MSG[3] : [cib_callid=16]
Aug 26 15:32:46 defiant heartbeat: [10054]: ERROR: MSG[4] : 
[cib_op=cib_apply_diff]
Aug 26 15:32:46 defiant heartbeat: [10054]: ERROR: MSG[5] : 
[cib_section=status]
...

Message too long for broadcast, but unicast works? My configuration is 
below; I already use bz2 compression. 
Does anyone have an idea why I have these cluster communication problems?
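
One thing that can be checked with simple arithmetic is whether the 
reported packet (len=1695 in the log above) fits into a single Ethernet 
frame. The sketch below assumes a standard Ethernet MTU of 1500 bytes and 
an IPv4 header without options; neither value was read from the actual 
interface. If the stack declines to fragment broadcast datagrams (an 
assumption about the cause, not a confirmed diagnosis), anything above the 
resulting limit would be rejected with "Message too long", while unicast 
datagrams of the same size could still be fragmented and sent.

```python
ETHERNET_MTU = 1500  # assumed: standard Ethernet MTU, not read from the NIC
IPV4_HEADER = 20     # IPv4 header without options
UDP_HEADER = 8       # fixed UDP header size


def max_udp_payload(mtu=ETHERNET_MTU):
    """Largest UDP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - UDP_HEADER


def fits_unfragmented(payload_len, mtu=ETHERNET_MTU):
    """True if a UDP payload of payload_len bytes needs no fragmentation."""
    return payload_len <= max_udp_payload(mtu)


if __name__ == "__main__":
    print(max_udp_payload())        # 1472
    print(fits_unfragmented(1695))  # False: 1695 is the length from the log
```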


autojoin any
crm yes
compression bz2
use_logd on
deadtime 15
initdead 40
keepalive 2
node defiant.ds9 heartbeat.ds9 warbird.ds9
#node defiant.ds9 heartbeat.ds9
#mcast rl0 224.0.0.1 702 1 0
bcast rl0
#ucast rl0 warbird.ds9
#ucast rl0 defiant.ds9
ping 10.0.0.1 10.11.0.1
debug true

kind regards
Sebastian


