[Linux-HA] Some Newbie Questions
dejanmm at fastmail.fm
Tue Dec 4 16:04:43 MST 2007
On Tue, Dec 04, 2007 at 10:33:45AM -0800, Art Age Software wrote:
> I posted some questions to the list earlier, but I'm not sure it was
> received, as I haven't seen any responses.
> So I am posting again. My apologies if this is a duplicate posting.
> Just trying to find some assistance...
> I'm setting up my first heartbeat cluster. (I have managed one in the
> past, but never set one up from scratch before.) It is going well, but
> I have a few questions:
> 1) In the log, the following sometimes appears during initial
> heartbeat startup, and I have no idea what it means:
> heartbeat: : ERROR: ha_msg_addraw_ll: illegal field
> heartbeat: : ERROR: ha_msg_addraw(): ha_msg_addraw_ll failed
> heartbeat: : ERROR: NV failure (string2msg_ll):
> heartbeat: : ERROR: Input string: [>>> t=NS_ackmsg >>> t=status
> st=up dt=7d00 protocol=1 src=db1 (1)srcuuid=+yf5W+NTRWi9QYzh4ZzsPg==
> seq=5 hg=474f3bee ts=475050f3 ld=0.59 0.15 0.05 2/148 4958 ttl=4
> auth=1 <<< ]
> heartbeat: : ERROR: sp=>>> t=status st=up dt=7d00 protocol=1
> src=db1 (1)srcuuid=+yf5W+NTRWi9QYzh4ZzsPg== seq=5 hg=474f3bee
> ts=475050f3 ld=0.59 0.15 0.05 2/148 4958 ttl=4 auth=1 <<<
> heartbeat: : ERROR: depth=0
> heartbeat: : ERROR: MSG: Dumping message with 1 fields
> heartbeat: : ERROR: MSG : [t=NS_ackmsg]
This looks like a communication problem. Can you post your ha.cf?
> 2) In the log, the broadcast port appears to be opened and then
> immediately closed. Does this mean the port was not initialized?
> heartbeat: : info: glib: UDP Broadcast heartbeat started on port
> 694 (694) interface
> heartbeat: : info: glib: UDP Broadcast heartbeat closed on port
> 694 interface - Status: 1
> 3) I have defined a ping_group with 2 ping nodes using ipfail. If the
> active cluster node can only see one of the ping nodes, and the
> backup cluster node can see both ping nodes, then heartbeat initiates
> a failover to the backup node. Is this correct behavior? According to
> the docs, "The ability to communicate with any of the group members
> means that the group-name member is reachable." I interpreted this to
> mean that as long as one ping node in the group is active, the cluster
> would be considered stable. But in fact, heartbeat seems to favor the
> node with "better connectivity."
A ping_group should behave like a single entity. Do you have logs?
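
A toy model of that reading of the docs, as a sketch only: this is not
heartbeat or ipfail code, and the function name and addresses are made up
for illustration. It assumes a group counts as one ping node that is
reachable whenever any member answers, and that connectivity is compared
by counting reachable ping nodes.

```python
# Toy model of ping_group semantics; NOT heartbeat's or ipfail's actual code.
# Assumption: a group counts as a single ping node, up if any member answers.

def reachable_ping_nodes(ping_groups, standalone_pings, can_reach):
    """Count the ping nodes one cluster node can see.

    ping_groups: dict mapping group name -> list of member addresses
    standalone_pings: list of addresses configured as individual ping nodes
    can_reach: set of addresses this cluster node can currently ping
    """
    count = sum(1 for addr in standalone_pings if addr in can_reach)
    # A group counts once, and only if at least one member answers.
    count += sum(
        1 for members in ping_groups.values()
        if any(addr in can_reach for addr in members)
    )
    return count


groups = {"group1": ["10.0.0.1", "10.0.0.2"]}  # hypothetical addresses

# Active node sees one member, backup node sees both.
active = reachable_ping_nodes(groups, [], {"10.0.0.1"})
backup = reachable_ping_nodes(groups, [], {"10.0.0.1", "10.0.0.2"})

# Under the group reading, both nodes count the same number of ping nodes.
print(active, backup)
```

Under this model the two cluster nodes tie, so a failover would not be
expected from the group alone. If the same two addresses were configured
as separate ping nodes instead, the counts would differ, which would match
the "better connectivity" behavior described above.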
> 4) Is there a way to make a resource run on one and only one node (and
> not fail over if the node goes down)? I want to set up constraints such that:
> (i) Resource "A" favors node "1" but can run on node "2" if necessary.
> (ii) Resource "B" can only run on node "2"
> (iii) Resource "A" and "B" may **not** run on the same node, and
> resource "A" has priority. So, if node "1" goes down, resource "B"
> will be stopped and resource "A" will migrate to node "2".
> Any way to accomplish that?
Give A a higher score than B and create a colocation constraint
with a score of -INFINITY, which says that A and B can't run on
the same node.
BTW, constraints are a v2 feature and ipfail belongs to v1. Which do you use?
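
How those scores could interact can be sketched with a small simulation.
This is a simplified model under stated assumptions, not the CRM's actual
placement algorithm: the `place` function, the numeric priorities, the
score values and the node names are all hypothetical. It assumes resources
are placed in priority order and that a -INFINITY score bans a node.

```python
# Simplified placement model; NOT the real CRM policy engine.
# Assumptions: higher priority is placed first, -INFINITY bans a node,
# and a missing location score counts as 0.

NEG_INF = float("-inf")


def place(resources, online_nodes, anti_colocated):
    """Place resources in descending priority order.

    resources: list of (name, priority, {node: location score})
    online_nodes: set of node names currently up
    anti_colocated: set of frozenset({a, b}) pairs with a -INFINITY colocation
    Returns {resource name: node name, or None if the resource is stopped}.
    """
    placement = {}
    for name, _prio, scores in sorted(resources, key=lambda r: -r[1]):
        best_node, best_score = None, NEG_INF
        for node in sorted(online_nodes):
            score = scores.get(node, 0)
            # A -INFINITY colocation bans any node hosting the other resource.
            for other, other_node in placement.items():
                if other_node == node and frozenset({name, other}) in anti_colocated:
                    score = NEG_INF
            if score > best_score:
                best_node, best_score = node, score
        placement[name] = best_node
    return placement


resources = [
    ("A", 10, {"node1": 100, "node2": 0}),      # (i) prefers node1, may use node2
    ("B", 5, {"node1": NEG_INF, "node2": 100}),  # (ii) may only run on node2
]
never_together = {frozenset({"A", "B"})}         # (iii) -INFINITY colocation

both_up = place(resources, {"node1", "node2"}, never_together)
node1_down = place(resources, {"node2"}, never_together)
print(both_up)
print(node1_down)
```

With both nodes up, the model puts A on node1 and B on node2. With node1
down, A takes node2 and B ends up with no node, that is, stopped, which is
the behavior asked for in (iii).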
> Thanks much in advance for any help.