[Linux-HA] Newbie Questions on Heartbeat Startup
dejanmm at fastmail.fm
Thu Dec 6 04:39:05 MST 2007
I'm sure that I replied to this one, but...
On Thu, Dec 06, 2007 at 09:32:20AM +0100, Andrew Beekhof wrote:
> On Nov 30, 2007, at 7:39 PM, Art Age Software wrote:
>> I'm setting up my first heartbeat cluster. (I have managed one in the
>> past, but never set one up from scratch before.) It is going well, but
>> I have a few questions:
>> 1) In the log, the following sometimes appears during initial
>> heartbeat startup, and I have no idea what it means:
>> heartbeat: : ERROR: ha_msg_addraw_ll: illegal field
>> heartbeat: : ERROR: ha_msg_addraw(): ha_msg_addraw_ll failed
>> heartbeat: : ERROR: NV failure (string2msg_ll):
>> heartbeat: : ERROR: Input string: [>>> t=NS_ackmsg >>> t=status
>> st=up dt=7d00 protocol=1 src=db1 (1)srcuuid=+yf5W+NTRWi9QYzh4ZzsPg==
>> seq=5 hg=474f3bee ts=475050f3 ld=0.59 0.15 0.05 2/148 4958 ttl=4
>> auth=1 <<< ]
>> heartbeat: : ERROR: sp=>>> t=status st=up dt=7d00 protocol=1
>> src=db1 (1)srcuuid=+yf5W+NTRWi9QYzh4ZzsPg== seq=5 hg=474f3bee
>> ts=475050f3 ld=0.59 0.15 0.05 2/148 4958 ttl=4 auth=1 <<<
>> heartbeat: : ERROR: depth=0
>> heartbeat: : ERROR: MSG: Dumping message with 1 fields
>> heartbeat: : ERROR: MSG : [t=NS_ackmsg]
> that doesn't look good at all
> what version are you running?
> if it's a recent one, i'd recommend reporting a bug
Recently somebody had the same problem, which turned out to be a
ha.cf setting (setting message format to netstring). Anyway, it
is a communication problem: channel not clear (serial?) or
similar. Please post ha.cf.
>> 2) In the log, the broadcast port appears to be opened and then
>> immediately closed. Does this mean the port was not initialized?
>> heartbeat: : info: glib: UDP Broadcast heartbeat started on port
>> 694 (694) interface
>> heartbeat: : info: glib: UDP Broadcast heartbeat closed on port
>> 694 interface - Status: 1
> don't know, sorry
No, this is OK.
>> 3) I have defined a ping_group with 2 ping nodes using ipfail. If the
>> active cluster node can only see one of the ping nodes, and the
>> backup cluster node can see both ping nodes, then heartbeat initiates
>> a failover to the backup node. Is this correct behavior? According to
>> the docs, "The ability to communicate with any of the group members
>> means that the group-name member is reachable." I interpreted this to
>> mean that as long as one ping node in the group is active, the cluster
>> would be considered stable. But in fact, heartbeat seems to favor the
>> node with "better connectivity."
> i'm not familiar with how ipfail works
ipfail is v1. Further down you mention constraints. Which config
style do you run: v1 or v2?
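For reference, the rule quoted from the docs can be sketched as below. This is a minimal Python model and not ipfail's code: the function names, the addresses, and the idea that nodes are compared by the number of reachable ping entries are all assumptions made for illustration.

```python
def group_reachable(member_status):
    """A ping_group counts as reachable if any one member answers
    (the rule quoted from the docs)."""
    return any(member_status.values())


def reachable_entries(ping_groups):
    """Count reachable ping entries as one cluster node sees them.
    Comparing nodes by this count is an assumption, not ipfail's code."""
    return sum(1 for members in ping_groups.values() if group_reachable(members))


# Hypothetical addresses: the active node sees one member, the backup sees both.
active_view = {"group1": {"10.0.0.1": True, "10.0.0.2": False}}
backup_view = {"group1": {"10.0.0.1": True, "10.0.0.2": True}}

print(reachable_entries(active_view), reachable_entries(backup_view))
```

Under that reading both views count one reachable entry, so the group rule alone would not explain a failover.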
>> 4) Is there a way to make a resource run on one and only one node (and
>> not failover if the node goes down)? I want to set up constraints such that:
>> (i) Resource "A" favors node "1" but can run on node "2" if necessary.
>> (ii) Resource "B" can only run on node "2"
>> (iii) Resource "A" and "B" may **not** run on the same node, and
>> resource "A" has priority. So, if node "1" goes down, resource "B"
>> will be stopped and resource "A" will migrate to node "2".
>> Any way to accomplish that?
> as far as i know, you need version 2 with the crm enabled to do that
Right, v1 won't do. With v2 it's:
- assign higher scores to A compared to B
- colocate A and B with -INFINITY
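To see why those two settings give the behaviour asked for, here is a toy model in Python. It is only a sketch of the reasoning, not the CRM's policy engine: the `place` function, the node names, and the score values (100, 50) are hypothetical, and the "one resource per node" rule stands in for the -INFINITY colocation.

```python
NEG_INF = float("-inf")


def place(online, prefs, order):
    """Place resources in priority order; each node holds at most one of them.

    online: set of node names that are up
    prefs:  resource -> {node: score}; a node missing from the dict is -INFINITY
    order:  resources from highest to lowest priority
    """
    placement = {}
    taken = set()  # nodes already used; models the -INFINITY colocation
    for rsc in order:
        candidates = {
            node: score
            for node, score in prefs[rsc].items()
            if node in online and node not in taken and score > NEG_INF
        }
        if candidates:
            best = max(candidates, key=candidates.get)
            placement[rsc] = best
            taken.add(best)
        else:
            placement[rsc] = None  # nowhere to run: the resource stays stopped
    return placement


prefs = {
    "A": {"node1": 100, "node2": 50},  # A prefers node1 but may run on node2
    "B": {"node2": 100},               # B may only run on node2
}

print(place({"node1", "node2"}, prefs, ["A", "B"]))  # both nodes up
print(place({"node2"}, prefs, ["A", "B"]))           # node1 down
```

With both nodes up, A lands on node1 and B on node2. With node1 down, A takes node2 and B is left stopped, which is case (iii) from the question.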
>> Thanks much in advance for any help.
>> Linux-HA mailing list
>> Linux-HA at lists.linux-ha.org
>> See also: http://linux-ha.org/ReportingProblems