[Linux-ha-dev] Ordering of OCF Start, Stop and Monitor actions

Doug Knight dknight at wsi.com
Fri Mar 23 09:28:29 MDT 2007


On Fri, 2007-03-23 at 09:25 -0600, Alan Robertson wrote:
> Doug Knight wrote:
> > Current 2.0.8 tarball from 1/18/07. Process in top looks like:
> > 
> >   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM  TIME+   COMMAND
> > 24591 root  18   0 1663m 1.5g 1028 R   83 77.8  1:19.42 /usr/sbin/crm_master -v 100
> > 
> > It dies and restarts about every 120 seconds, which happens to be the
> > timeout I have specified for the stop and start methods.
> > 
> > Doug
> > 
> > On Fri, 2007-03-23 at 08:20 -0600, Alan Robertson wrote:
> >> Doug Knight wrote:
> >> > Hi Alan,
> >> > I've started testing my OCF script, and I'm seeing something unusual
> >> > during initial startup. I've placed a crm_master call in my
> >> > stateful_start function, after the function has determined that it is
> >> > running on what should be the master, and postgresql has successfully
> >> > started:
> >> > 
> >> > crm_master -v 100
> >> > 
> >> > When this command gets executed, it starts using nearly 100% CPU, memory
> >> > usage continuously increases up to about 68%, then it dies (killed via
> >> > timeout?), followed by a second attempt to go master (with the same
> >> > characteristics, after the function timeout is exceeded), then a demote is
> >> > sent (again, after timeout) and it switches to try to become the slave
> >> > (crm_master -v 10 is what I use, though I'm not sure this is the correct
> >> > usage to say "I want to change to a slave"). Eventually, I wind up with
> >> > the resource in failed mode.
> >> > 
> >> > First question, any idea why the straight line running of a crm_master
> >> > -v 100 (not within any loops in my script) would spin up to 100%?
> >>
> >> Bugs maybe?  What version of heartbeat are you running?  Which processes
> >> are running up to 100%?  For how long?
> >>
> >> > Second question, is using the crm_master -v with different values the
> >> > way to say on which node I prefer the master to run (higher number =
> >> > preferred node)?
> >>
> >> Yes.  I believe that these are added into the values that come from
> >> other constraints in your configuration file to come up with a best
> >> configuration.
> 
> Good info.
> 
> Could you provide a few hundred lines of strace output to show us what
> it's doing?
> 

Do you mean the last few hundred lines from ha.log? Just the primary
where I'm trying to start?

> 	Thanks!
> 
> 
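
For reference, the crm_master usage discussed in this thread (a higher
value marks the preferred master node) could be wrapped roughly as shown
here. This is only a sketch under assumptions: set_master_preference and
the CRM_MASTER override variable are hypothetical names that do not come
from heartbeat, and the values 100 and 10 are simply the ones quoted in
the thread.

```shell
#!/bin/sh
# Sketch only. CRM_MASTER is a hypothetical override so the logic can be
# dry-run (for example CRM_MASTER=echo) on a host without heartbeat.
CRM_MASTER="${CRM_MASTER:-/usr/sbin/crm_master}"

# set_master_preference is a hypothetical helper, not part of heartbeat.
# It calls crm_master -v with the values used in this thread:
# 100 when this node should be master, 10 when it should be slave.
set_master_preference() {
    case "$1" in
        master) "$CRM_MASTER" -v 100 ;;
        slave)  "$CRM_MASTER" -v 10 ;;
        *)      echo "unknown role: $1" >&2; return 1 ;;
    esac
}
```

A start function might call "set_master_preference master" only after
postgresql has started successfully, as described earlier in the thread.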