[Linux-ha-dev] serializing actions of resource scripts

Sebastian Reitenbach sebastia at l00-bugdead-prods.de
Tue Nov 20 02:38:18 MST 2007


Hi,
Andrew Beekhof <beekhof at gmail.com> wrote: 
> 
> On Nov 20, 2007, at 9:41 AM, Sebastian Reitenbach wrote:
> 
> > Hi,
> > Andrew Beekhof <beekhof at gmail.com> wrote:
> >>
> >> On Nov 19, 2007, at 6:40 PM, Sebastian Reitenbach wrote:
> >>
> >>> Hi,
> >>>
> >>> I am trying to add a bit of memory management and monitoring of
> >>> services within the Xen resource, as I am usually interested in the
> >>> services that the Xen resource provides and not in whether Xen itself
> >>> runs. I already got a lot of valuable feedback from Dejan; the state
> >>> can be tracked here:
> >>> http://developerbugs.linux-foundation.org/show_bug.cgi?id=1778
> >>>
> >>> The problems right now are:
> >>> 1) how to handle paused domU's
> >>> 2) how to handle memory management
> >>>
> >>> The first is a bit of work, but easy: before every action, check which
> >>> state the domU is in (e.g. running/stopped/paused/...) and then handle
> >>> that situation accordingly.
> >>>
> >>> The second one is a bit trickier.
> >>> Right now I have the Xen script working this way on start/stop of the
> >>> domU:
> >>> 1. check how much memory is available
> >>> 2. subtract the memory reserved for the dom0
> >>> 3. check how many domains will be running after that action, and divide
> >>>    the result of step two by that number
> >>> 4. use xm mem-set to set the memory for all virtual machines
> >>>
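The arithmetic in the four steps quoted above can be sketched as follows. This is only an illustration of the even split, not the code from the Xen resource script; the function name and the example figures (8192 MB host, 1024 MB for the dom0, four domUs) are made up for the example. The resulting value is what would then be handed to xm mem-set for each domU.

```python
def equal_share_mb(total_mb, dom0_reserved_mb, domu_count):
    """Return the memory in MB that each domU gets under an even split.

    Hypothetical helper: mirrors steps 1-3 of the list above.
    """
    if domu_count <= 0:
        raise ValueError("domu_count must be positive")
    # Step 2: take the dom0 reservation off the total.
    available = total_mb - dom0_reserved_mb
    if available <= 0:
        raise ValueError("nothing left after the dom0 reservation")
    # Step 3: divide by the number of domUs running after the action.
    return available // domu_count


# Assumed example figures, not taken from a real host.
print(equal_share_mb(8192, 1024, 4))  # 1792
```

Because the domain count in step 3 is read at the time of the action, two resources starting at once can both compute their share from a stale count, which is the race described in the next quoted paragraph.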
> >>> This is a very basic way to give all virtual machines more or less the
> >>> same amount of memory. Nevertheless, in case more than one Xen resource
> >>> is starting or stopping, this behavior is prone to race conditions, and
> >>> I have already seen it fail.
> >>> My workaround was to create order constraints, with symmetrical=no and
> >>> score=0, for all Xen resources, so that only one can start at a time.
> >>
> >> clones have an ordered=true option
> >>
> >> it hasn't been used much so it may not be perfect - but if you find
> >> any problems you can be sure i'll fix them promptly
> > I am not sure how to configure that.
> 
> same way you set clone-node-max
> 
> >
> > I have four different Xen domUs, but I think the clone would try to run
> > the same domU on different hosts, trashing my filesystem?
> 
> oh, its just a regular resource
> never mind then...  yes, rsc_order is the correct way to do this.
> 
> i was thinking that you had a clone which started a different instance
> depending on the clone number
Ah, but even if I change the resource script to start different domains
based on the clone instance number, I am still unable to assign location
constraints to the individual clone instances. I just tried that in the GUI,
and I was only able to select the whole clone set, not a single clone
instance. I want to define different preferred locations for each Xen domU.

As I have seen that the additional order constraint needed for multiple Xen 
resources can be easily forgotten, I'd like to have a different way without 
the need of such an extra constraint.

> 
> >
> > I can configure four clone sets (clone-node-max ==
> > clone-max == 1) but that would not make more sense than four ordinary
> > resources with an order constraint configured.
> > When I configure groups, I cannot put cloned resources in them.
> >
> > Also, when I create a group with ordered=yes collocated=no, I don't think
> > that will work, because afaik I then cannot define location constraints
> > on the Xen resources in the group, but only on the group itself.
> >
> > Can you give me any more hints on how you thought it should work?
> >
> > thanks
> > Sebastian
> >>
> >>>
> >>> Dejan suggested adding locking to the Xen resource script, but I fear
> >>> that this will lead to new errors. E.g. assume
> >>> default-action-timeout=30s and you have 4 Xen resources that all start
> >>> up at the same time: the first will acquire the lock and the rest will
> >>> wait. Maybe everything will work for the second Xen resource too. But I
> >>> assume the startup of 3 and 4 will fail, because the
> >>> default-action-timeout was hit.
> >>>
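The timeout concern quoted above can be made concrete with a little arithmetic. The sketch below assumes that all start actions are dispatched at the same moment, that the action timer runs while a resource waits for the lock, and that one domU start takes 12 seconds. The 12 seconds and the function name are assumptions for illustration only; the 30 seconds and the four resources come from the example above.

```python
def starts_that_time_out(resource_count, start_seconds, timeout_seconds):
    """Return the 1-based lock-queue positions whose start exceeds the timeout.

    Hypothetical model: every start is serialized behind one lock, and
    the timeout clock for each resource starts at dispatch time.
    """
    failed = []
    for position in range(1, resource_count + 1):
        # Time spent waiting for all earlier starts plus its own start.
        elapsed = position * start_seconds
        if elapsed > timeout_seconds:
            failed.append(position)
    return failed


# 4 Xen resources, assumed 12 s per start, default-action-timeout=30s.
print(starts_that_time_out(4, 12.0, 30.0))  # [3, 4]
```

Under these assumptions the first two starts finish within the timeout and the third and fourth do not, even though nothing is actually wrong with them.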
> >>> Is it possible to serialize actions of a given type of resource?
> >>
> >> as above, yes :-)
> >> at least for starts and stops
> >>
> >> groups, clones and master/slave all support the ordered option and  
> >> the
> >> lrmd takes care of primitives.
> >>
> >>>
> >>> e.g. the Xen resource could be marked as needing serialization, meaning
> >>> that if there are multiple Xen resources in a cluster, actions on these
> >>> resources are not allowed to happen at the same time.
> >>> Then the Xen resource script would not be responsible for locking and
> >>> for allowing/disallowing actions on itself; the CRMD would be.
> >>> This could be made more fine-grained: you could say that actions on Xen
> >>> resources are not allowed at the same time on the same node, but that
> >>> multiple actions on Xen resources are allowed across the cluster.
> >>
> >> that we cant do - at the crm level its cluster-wide or not at all
Anyway, a cluster-wide solution at the crm level would work for me.
Then the RA script can be marked as needing cluster-wide serialized actions.
Then there would be no need for an additional order constraint.
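
To illustrate the behavior I have in mind, here is a rough sketch of cluster-wide serialization by resource type. It is not how the CRM schedules actions today; the function, the resource ids and the idea of passing in a set of "serialized" types are all hypothetical. Actions are grouped into rounds so that no round contains two actions for the same serialized type, while other resource types are not held back.

```python
from collections import deque


def plan_rounds(actions, serialized_types):
    """Group (resource_id, resource_type) actions into execution rounds.

    Within a round, at most one action belongs to any serialized type.
    Actions of types that are not serialized all run in the first round.
    """
    pending = deque(actions)
    rounds = []
    while pending:
        busy_types = set()
        current = []
        deferred = deque()
        for resource_id, resource_type in pending:
            if resource_type in serialized_types and resource_type in busy_types:
                # Another action of this serialized type already runs this round.
                deferred.append((resource_id, resource_type))
                continue
            busy_types.add(resource_type)
            current.append(resource_id)
        rounds.append(current)
        pending = deferred
    return rounds


# Hypothetical resource ids; only the Xen type is marked as serialized.
actions = [("xen1", "Xen"), ("ip1", "IPaddr"), ("xen2", "Xen"), ("xen3", "Xen")]
print(plan_rounds(actions, {"Xen"}))
# [['xen1', 'ip1'], ['xen2'], ['xen3']]
```

The point of doing this outside the resource script is that the action timeout for xen2 and xen3 would only start once their round begins, so waiting for the earlier starts cannot cause a timeout.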

If that makes sense to anyone else but me, I'd go create an enhancement 
request in bugzilla.

kind regards
Sebastian


