[Linux-HA] Groups vs colocations.... etc
Andreas Kurz
akurz at sms.at
Thu Dec 7 10:30:25 MST 2006
Andrew Beekhof wrote:
> On 12/7/06, Andrew Beekhof <beekhof at gmail.com> wrote:
>> On 12/6/06, Andreas Kurz <akurz at sms.at> wrote:
>> > Andrew Beekhof wrote:
>> > > On 11/28/06, Andreas Kurz <akurz at sms.at> wrote:
>> > >> Serge Dubrouski wrote:
>> > >> > Most clusterware products, at least those that I've worked with
>> > >> > (Veritas VCS, RedHat ClusterSuite, HP ServiceGuard, etc.), consider
>> > >> > resources in a group dependent on each other. Upper resources depend
>> > >> > on lower ones, like a DB depends on the Filesystem with its data files.
>> > >> > That means that if the Filesystem fails, the DB has to be restarted. And
>> > >> > Heartbeat works exactly like this if you have a group with the
>> > >> > collocated property set to "true". Per my understanding that's
>> > >> > completely right. If you don't want that dependency, exclude your NFS
>> > >> > filesystem from the group but add a colocation constraint between that
>> > >> > group and the separate NFS resource. That might help.
>> > >> >
>> > >> > As for stickiness I personally don't like how it's implemented in
>> > >> > Heartbeat, I'd prefer having a simple property
>> > >> > "number_of_fails_before_failover".
>> > >
>> > > which doesn't in any way affect this scenario (groups) because you've
>> > > still got one part of the group trying to stay where it is and the
>> > > other trying to move. at least with the scoring the CRM gets some
>> > > hint as to which part of the group it should take the most notice of.
>> > >
>> > > there is a comment further down which talks about one resource being
>> > > "buggy"... expecting cluster software to magically compensate for
>> > > inherently broken resources is unrealistic.
>> >
>> > You are right, of course! I only wanted to produce some errors for the
>> > test-scenario ;-)
>> >
>> >
>> > >> eg:
>> > >>
>> > >> a group with 5 resources, 2 nodes
>> > >> location constraint: score 1 for node1, score 10 for node2
>> > >> resource stickiness: 10
>> > >> failure stickiness: 5
>> > >>
>> > >> resource failed over to node1 because of an unexpected server hang of
>> > >> node2, node2 up again (I assume the location scores are working
>> > >> correctly ;-) )
>> > >>
>> > >> node1: 5*(1) + 5(10) = 55
>> > >> node2: 5*(10) = 50
>> > >>
>> > >> ok ... resource stays on node1
>> > >>
>> > >> one resource is buggy, heartbeat starts to stop/start it
>> > >>
>> > >> restart1:
>> > >>
>> > >> node1: 5*1 + 5*10 - 1*5 = 50
>> > >> node2: 5*10 = 50
>> > >>
>> > >> ok ... resource stays on node1
>> >
>> > So with equal scores the group is moved away because of the "lower load"
>> > of node2? Is this computed from the number of resources running on each
>> > node?
>>
>> right
>>
>> >
>> > >>
>> > >> restart2:
>> > >> node1: 5*1 + 5*10 - 2*5 = 45
>> > >> node2: 5*10 = 50
>> > >>
>> > >> takeover, after 1 local restart, am I right?
>> > >
>> > > you tell me - try ptest and see what it does.
>> >
>> > OK. The group is moved away when either the combined score of node1 is
>> > lower than node2 or if the score for one resource is negative.
>> >
>> > >
>> > >> resource group is on node2,
>> > >
>> > >> failcount reset on node1:
>> > >
>> > > the failcount is never reset automatically
>> >
>> > I did it manually ;-)
>> >
>> > >> node1: 5*1 = 5
>> > >> node2: 5*10 + 5*10 = 100
>> > >>
>> > >> hmm ... thats a problem, or have I missed something?
>> > >
>> > > why is this a problem?
>> > >
>> > >> that would lead to about 20 local restarts before a failover to
>> node1
>> > >> happens ....
>> >
>> > Not so many, but more than on the other node with the lower scores. The
>> > group fails over when the local score for the failing resource is
>> > negative.
>> >
>> > >
>> > > so choose different values
>> > > or don't apply the rsc_location preference to every member of the group
>> >
>> > I tried to configure instance_attributes for the group with different
>> > resource_failure_stickiness values but without success, the rule never
>> > matches:
>> >
>> > ptest[27606]: 2006/12/06_17:30:43 debug: debug2: test_rule:rules.c
>> > Testing rule higher_failure_stickiness_rule
>> > ptest[27606]: 2006/12/06_17:30:43 debug: debug2:
>> test_expression:rules.c
>> > Expression test failed on all ndoes
>> > ptest[27606]: 2006/12/06_17:30:43 debug: debug3: test_rule:rules.c
>> > Expression higher_failure_stickiness_rule/test failed
>> > ptest[27606]: 2006/12/06_17:30:43 debug: debug3:
>> unpack_attr_set:rules.c
>> > Adding attributes from lower_failure_stickiness_inst
>> >
>> >
>> > <instance_attributes id="higher_failure_stickiness_inst" score="100">
>> > <rule id="higher_failure_stickiness_rule" boolean_op="and">
>> > <expression attribute="#uname" operation="eq"
>> > value="sms-nfs-02" id="test"/>
>> > </rule>
>> > <attributes>
>> > <nvpair id="higher_failure_stickiness_id"
>> > name="resource_failure_stickiness" value="-10"/>
>> > </attributes>
>> > </instance_attributes>
>> > <instance_attributes id="lower_failure_stickiness_inst" score="10">
>> > <attributes>
>> > <nvpair id="lower_failure_stickiness_id"
>> > name="resource_failure_stickiness" value="-1"/>
>> > </attributes>
>> > </instance_attributes>
>> >
>> > Andrew, do you have a hint why this is not working? The group is
>> > currently running on the node sms-nfs-02. I tried the same with a time
>> > based rule and it worked.
>>
>> i dont think what you want to do is possible (yet anyway)
>> the mechanism was intended for setting RA properties _after_ we've
>> decided to place it somewhere (ie. on nodeX use NIC=eth1, otherwise
>> use NIC=eth0)
>>
>> so trying to set some variables based on the current location is
>> somewhat more problematic - though i seem to remember it working in
>> the past so maybe i broke something.
>>
>> let me get back to you...
>
> as of this version it should work:
> http://hg.beekhof.net/lha/crm-stable/rev/1045cec0d37d
Thanks! I have done some tests with ptest, and it works ... but only
for primitives, not for a group. Is there a special reason for that?
These were my test instance_attributes:
<instance_attributes id="higher_failure_stickiness_inst" score="100">
<rule id="higher_failure_stickiness_rule" boolean_op="and">
<expression attribute="#uname" operation="eq"
value="sms-nfs-02" id="test"/>
</rule>
<attributes>
<nvpair id="higher_failure_stickiness_id"
name="resource_failure_stickiness" value="-5"/>
</attributes>
</instance_attributes>
<instance_attributes id="lower_failure_stickiness_inst" score="10">
<attributes>
<nvpair id="lower_failure_stickiness_id"
name="resource_failure_stickiness" value="-10"/>
</attributes>
</instance_attributes>
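To make clear what I expect from these two attribute sets, here is a small
Python sketch. It is only my reading of the configuration, not the CRM code:
the function name and the data layout are made up for illustration, and I am
assuming that the set with the highest score whose rule matches the node wins.

```python
# Hypothetical model of the two instance_attributes sets above.
# Assumption: the highest-scored set whose rule matches the node supplies
# resource_failure_stickiness. This is NOT taken from the CRM sources.
ATTRIBUTE_SETS = [
    {"id": "higher_failure_stickiness_inst", "score": 100,
     "uname": "sms-nfs-02", "value": -5},
    {"id": "lower_failure_stickiness_inst", "score": 10,
     "uname": None, "value": -10},  # no rule, so it matches every node
]


def failure_stickiness_for(node):
    """Return the value from the highest-scored set that matches `node`."""
    ordered = sorted(ATTRIBUTE_SETS, key=lambda s: s["score"], reverse=True)
    for attr_set in ordered:
        if attr_set["uname"] is None or attr_set["uname"] == node:
            return attr_set["value"]
    return 0  # no set matched at all


print(failure_stickiness_for("sms-nfs-02"))
print(failure_stickiness_for("sms-nfs-01"))
```

Under that assumption a resource gets -5 per failure on sms-nfs-02 and -10
per failure on any other node.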
I added these instance_attributes for node-based failure stickiness to
the primitives in my group and noticed the new way the weight is
calculated: the score for a group is no longer the sum of the scores of
all its resources.
But something looks strange to me. I have a group and these
location constraints:
<rsc_location rsc="gr_HANFS_01" id="run_HANFS_01">
<rule id="pref_run_HANFS_01" score="10">
<expression attribute="#uname" operation="eq"
value="sms-nfs-01" id="bf7e040f-041e-4971-aaf7-76cc56048be8"/>
</rule>
<rule id="pref_failover_HANFS_01" score="1">
<expression attribute="#uname" operation="eq"
value="sms-nfs-02" id="f3ecf8a0-596b-4aaa-9092-35b6393f8d2b"/>
</rule>
</rsc_location>
The group is running on sms-nfs-02 and
default-resource-stickiness=15. I set fail-count=4 for one resource
(NOT the first in the group) and ran ptest:
ptest[22366]: 2006/12/07_17:51:29 info: group_print: Resource Group:
gr_HANFS_01
ptest[22366]: 2006/12/07_17:51:29 info: native_print:
rsc_DRBD_drbd1_uploads_sms (heartbeat:drbddisk): Started sms-nfs-02
ptest[22366]: 2006/12/07_17:51:29 info: native_print:
rsc_DRBD_drbd2_uploads_portal (heartbeat:drbddisk): Started sms-nfs-02
ptest[22366]: 2006/12/07_17:51:29 info: native_print:
rsc_FS_HANFS01-sms_at (heartbeat::ocf:Filesystem): Started sms-nfs-02
ptest[22366]: 2006/12/07_17:51:29 info: native_print:
rsc_FS_HANFS01-portal (heartbeat::ocf:Filesystem): Started sms-nfs-02
ptest[22366]: 2006/12/07_17:51:29 info: native_print: rsc_nfslock_01
(lsb:HA_nfslock_01): Started sms-nfs-02
ptest[22366]: 2006/12/07_17:51:29 info: native_print: rsc_nfs_01
(heartbeat:HA_nfs): Started sms-nfs-02
ptest[22366]: 2006/12/07_17:51:29 info: native_print: rsc_IP_HA1
(heartbeat::ocf:IPaddr): Started sms-nfs-02
ptest[22366]: 2006/12/07_17:51:29 info: native_print:
rsc_MailAlarm_HANFS_01 (heartbeat::ocf:MailTo): Started
sms-nfs-02
ptest[22366]: 2006/12/07_17:51:29 debug: group_rsc_location: Processing
rsc_location pref_run_HANFS_01 for gr_HANFS_01
ptest[22366]: 2006/12/07_17:51:29 debug: debug2: native_rsc_location:
Applying pref_run_HANFS_01 (Unknown) to rsc_DRBD_drbd1_uploads_sms
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
rsc_DRBD_drbd1_uploads_sms + sms-nfs-01 : 10
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
rsc_DRBD_drbd1_uploads_sms + sms-nfs-02 : 15
ptest[22366]: 2006/12/07_17:51:29 debug: debug2: native_rsc_location:
Applying pref_run_HANFS_01 (Unknown) to rsc_DRBD_drbd2_uploads_portal
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
rsc_DRBD_drbd2_uploads_portal + sms-nfs-01 : 0
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
rsc_DRBD_drbd2_uploads_portal + sms-nfs-02 : 15
ptest[22366]: 2006/12/07_17:51:29 debug: debug2: native_rsc_location:
Applying pref_run_HANFS_01 (Unknown) to rsc_FS_HANFS01-sms_at
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
rsc_FS_HANFS01-sms_at + sms-nfs-01 : 0
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
rsc_FS_HANFS01-sms_at + sms-nfs-02 : 15
ptest[22366]: 2006/12/07_17:51:29 debug: debug2: native_rsc_location:
Applying pref_run_HANFS_01 (Unknown) to rsc_FS_HANFS01-portal
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
rsc_FS_HANFS01-portal + sms-nfs-01 : 0
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
rsc_FS_HANFS01-portal + sms-nfs-02 : 15
ptest[22366]: 2006/12/07_17:51:29 debug: debug2: native_rsc_location:
Applying pref_run_HANFS_01 (Unknown) to rsc_nfslock_01
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
rsc_nfslock_01 + sms-nfs-01 : 0
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
rsc_nfslock_01 + sms-nfs-02 : 15
ptest[22366]: 2006/12/07_17:51:29 debug: debug2: native_rsc_location:
Applying pref_run_HANFS_01 (Unknown) to rsc_nfs_01
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
rsc_nfs_01 + sms-nfs-01 : 0
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
rsc_nfs_01 + sms-nfs-02 : 15
ptest[22366]: 2006/12/07_17:51:29 debug: debug2: native_rsc_location:
Applying pref_run_HANFS_01 (Unknown) to rsc_IP_HA1
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
rsc_IP_HA1 + sms-nfs-01 : 0
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
rsc_IP_HA1 + sms-nfs-02 : -5
ptest[22366]: 2006/12/07_17:51:29 debug: debug2: native_rsc_location:
Applying pref_run_HANFS_01 (Unknown) to rsc_MailAlarm_HANFS_01
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
rsc_MailAlarm_HANFS_01 + sms-nfs-01 : 0
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
rsc_MailAlarm_HANFS_01 + sms-nfs-02 : 15
ptest[22366]: 2006/12/07_17:51:29 debug: group_rsc_location: Processing
rsc_location pref_failover_HANFS_01 for gr_HANFS_01
ptest[22366]: 2006/12/07_17:51:29 debug: debug2: native_rsc_location:
Applying pref_failover_HANFS_01 (Unknown) to rsc_DRBD_drbd1_uploads_sms
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
rsc_DRBD_drbd1_uploads_sms + sms-nfs-01 : 10
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
rsc_DRBD_drbd1_uploads_sms + sms-nfs-02 : 16
ptest[22366]: 2006/12/07_17:51:29 debug: debug2: native_rsc_location:
Applying pref_failover_HANFS_01 (Unknown) to rsc_DRBD_drbd2_uploads_portal
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
rsc_DRBD_drbd2_uploads_portal + sms-nfs-01 : 0
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
rsc_DRBD_drbd2_uploads_portal + sms-nfs-02 : 15
ptest[22366]: 2006/12/07_17:51:29 debug: debug2: native_rsc_location:
Applying pref_failover_HANFS_01 (Unknown) to rsc_FS_HANFS01-sms_at
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
rsc_FS_HANFS01-sms_at + sms-nfs-01 : 0
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
rsc_FS_HANFS01-sms_at + sms-nfs-02 : 15
ptest[22366]: 2006/12/07_17:51:29 debug: debug2: native_rsc_location:
Applying pref_failover_HANFS_01 (Unknown) to rsc_FS_HANFS01-portal
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
rsc_FS_HANFS01-portal + sms-nfs-01 : 0
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
rsc_FS_HANFS01-portal + sms-nfs-02 : 15
ptest[22366]: 2006/12/07_17:51:29 debug: debug2: native_rsc_location:
Applying pref_failover_HANFS_01 (Unknown) to rsc_nfslock_01
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
rsc_nfslock_01 + sms-nfs-01 : 0
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
rsc_nfslock_01 + sms-nfs-02 : 15
ptest[22366]: 2006/12/07_17:51:29 debug: debug2: native_rsc_location:
Applying pref_failover_HANFS_01 (Unknown) to rsc_nfs_01
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
rsc_nfs_01 + sms-nfs-01 : 0
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
rsc_nfs_01 + sms-nfs-02 : 15
ptest[22366]: 2006/12/07_17:51:29 debug: debug2: native_rsc_location:
Applying pref_failover_HANFS_01 (Unknown) to rsc_IP_HA1
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
rsc_IP_HA1 + sms-nfs-01 : 0
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
rsc_IP_HA1 + sms-nfs-02 : -5
ptest[22366]: 2006/12/07_17:51:29 debug: debug2: native_rsc_location:
Applying pref_failover_HANFS_01 (Unknown) to rsc_MailAlarm_HANFS_01
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
rsc_MailAlarm_HANFS_01 + sms-nfs-01 : 0
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: native_rsc_location:
rsc_MailAlarm_HANFS_01 + sms-nfs-02 : 15
....
ptest[22366]: 2006/12/07_17:51:29 debug: debug3: sort_node_weight:
sms-nfs-01 (10) < sms-nfs-02 (16) : weight
....
Hmmm ... so the score from the location constraint is only added to the
first resource in the group. The default-resource-stickiness is added to
every resource in the group. The node-specific failure stickiness is
subtracted from the resources with fail-counts. But the resulting
group score only includes the values from the first resource in the
group, and resources with a negative score do not initiate a failover of
the group as they did in 2.0.7. So at the moment the failure-stickiness
feature works only for the first resource in a group, is that correct?
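To check that I read the log correctly, I rebuilt the numbers in a small
Python sketch. It only models what the ptest output above seems to show (the
function names and the data layout are my own, and I am assuming the
fail-count is counted on sms-nfs-02, where the group runs); it is not how the
CRM is implemented.

```python
# Hypothetical reconstruction of the per-resource scores from the ptest log.
RESOURCES = [
    "rsc_DRBD_drbd1_uploads_sms", "rsc_DRBD_drbd2_uploads_portal",
    "rsc_FS_HANFS01-sms_at", "rsc_FS_HANFS01-portal",
    "rsc_nfslock_01", "rsc_nfs_01", "rsc_IP_HA1", "rsc_MailAlarm_HANFS_01",
]
LOCATION = {"sms-nfs-01": 10, "sms-nfs-02": 1}            # rsc_location rules
STICKINESS = 15                                           # default-resource-stickiness
FAILURE_STICKINESS = {"sms-nfs-01": -10, "sms-nfs-02": -5}
FAIL_COUNT = {("rsc_IP_HA1", "sms-nfs-02"): 4}            # assumed to be per node
RUNNING_ON = "sms-nfs-02"


def resource_score(index, name, node):
    """Score of one group member on `node`, as the log appears to compute it."""
    score = 0
    if index == 0:                      # location score: first member only
        score += LOCATION[node]
    if node == RUNNING_ON:              # stickiness: every member, current node
        score += STICKINESS
    score += FAIL_COUNT.get((name, node), 0) * FAILURE_STICKINESS[node]
    return score


def group_weight(node):
    """Only the first member's score seems to end up in sort_node_weight."""
    return resource_score(0, RESOURCES[0], node)


for node in ("sms-nfs-01", "sms-nfs-02"):
    print(node, group_weight(node))
print("rsc_IP_HA1 on sms-nfs-02:", resource_score(6, "rsc_IP_HA1", "sms-nfs-02"))
```

This gives 10 for sms-nfs-01 and 16 for sms-nfs-02, matching the
sort_node_weight line, while rsc_IP_HA1 sits at -5 on sms-nfs-02 without
affecting the group weight.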
Regards,
Andi
>
>>
>> >
>> > Regards,
>> > Andi
>> >
>> > >
>> > >>
>> > >> If I am completely wrong please correct me!
>> > >>
>> > >> Regards,
>> > >> Andreas
>> > >>
>> > >> >
>> > >> > On 11/28/06, Andre van der Vlies <andre at vandervlies.xs4all.nl>
>> wrote:
>> > >> >>
>> > >> >> Andreas Kurz wrote:
>> > >> >> > Andre van der Vlies wrote:
>> > >> >> >> Andrew Beekhof wrote:
>> > >> >> >>>> So, given:
>> > >> >> >>>> IPaddr_1
>> > >> >> >>>> IPaddr_2
>> > >> >> >>>> NFS_1
>> > >> >> >>>> NFS_2
>> > >> >> >>>> PG
>> > >> >> >>>>
>> > >> >> >>>> there's no way I can prevent NFS_2 and PG from being
>> stopped and
>> > >> >> >>>> started
>> > >> >> >>>> if NFS_1 fails, make NFS_1 retry 5 times and if it doesn't
>> > >> >> succeed the
>> > >> >> >>>> whole group needs to failover... :-/
>> > >> >> >>>
>> > >> >> >>> not in a group.
>> > >> >> >>> but groups are only a syntactic shortcut for a bunch of
>> colocation
>> > >> >> and
>> > >> >> >>> ordering constraints.
>> > >> >> >>>
>> > >> >> >>> so dont use a group and dont make NFS_2 depend on NFS_1
>> > >> >> >>>
>> > >> >> >>
>> > >> >> >> Sorry, I still don't get it...
>> > >> >> >>
>> > >> >> >> I've got 5 resources.
>> > >> >> >> I make constraints to start them in the right order (1, 2,
>> 3, 4, 5)
>> > >> >> >> I make constraints to get them start on the same node...
>> > >> >> >
>> > >> >> > That's what a group implies, you don't need to make them 'by
>> hand'
>> > >> >> or if
>> > >> >> > you prefer it that way you can disable all constraints from the
>> > >> group.
>> > >> >> > Then your group is only a naming convention for your
>> convenience.
>> > >> >> >
>> > >> >> >>
>> > >> >> >> As a bonus I can do stuff with the stickiness of a
>> resource. For
>> > >> >> >> instance
>> > >> >> >> resource 3 fails and is retried 5 times before it fails
>> over to
>> > >> >> >> another
>> > >> >> >> node; which makes all the other resources migrate...
>> > >> >> >>
>> > >> >> >
>> > >> >> > Yes, because of the colocation constraints.
>> > >> >> >
>> > >> >> >> But....
>> > >> >> >> If I put those 5 resources in a group (colocation, order),
>> I can
>> > >> only
>> > >> >> >> use
>> > >> >> >> the stickiness of the last resource in the group. None of the
>> > >> others
>> > >> >> >> seems
>> > >> >> >> to have any vote in the matter. And if a 'midlist' resource
>> > >> fails all
>> > >> >> >> lower resources are stopped and started....
>> > >> >> >
>> > >> >> > The stickiness, no matter if it's the
>> > >> 'resource_failure_stickiness' or
>> > >> >> > the 'resource_stickiness', is bound to a resource
>> independent from
>> > >> >> where
>> > >> >> > the resource is defined in the group.
>> > >> >> >
>> > >> >>
>> > >> >> Okay.
>> > >> >>
>> > >> >> > All resources in a group are bound together by the colocation
>> > >> >> > constraints so a failing resource has influence on the whole
>> > >> group and
>> > >> >> > the score of the group. The sum of all scores of all
>> resources in a
>> > >> >> > group decides on which node the whole group has to run. So
>> if you
>> > >> >> define
>> > >> >> > a failure stickiness every failing resource lowers the group
>> score.
>> > >> >> >
>> > >> >>
>> > >> >> That has been my reasoning too... My experience tells me
>> otherwise
>> > >> >>
>> > >> >> > Because the ordering constraints are per default symmetric they
>> > >> imply
>> > >> >> > also a stop_before and not only the defined start_before
>> constraint,
>> > >> >> and
>> > >> >> > I think it makes sense most of the time ... but it can also be
>> > >> >> disabled.
>> > >> >> >
>> > >> >>
>> > >> >> Hmmm.... How do I do this exactly?
>> > >> >>
>> > >> >> > Hope that helps ;-)
>> > >> >> >
>> > >> >>
>> > >> >> A bit. I have been reasoning along the same path. The behaviour of my
>> > >> >> cluster is (very) different from what I expected...
>> > >> >>
>> >
>> > _______________________________________________
>> > Linux-HA mailing list
>> > Linux-HA at lists.linux-ha.org
>> > http://lists.linux-ha.org/mailman/listinfo/linux-ha
>> > See also: http://linux-ha.org/ReportingProblems
>> >
>>