[Linux-HA] ipfail did not trigger failover

Pavol Gono pavol_gono at yahoo.com
Thu Aug 4 03:53:16 MDT 2005


Hi

I have looked at the logs once again. In the second error situation
the cause is clear: our faulty resource script was blocked, and that
is why failover was waiting. You can see
"Aug  3 18:47:45 Hig50v3Ps heartbeat: debug: Starting
/etc/ha.d/resource.d/start_hiq30  start"
but there was no "...start_hiq30 start done".
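
A minimal sketch of one way to keep a hung resource script from stalling
failover indefinitely: run it under a timeout and kill its whole process
group if it does not finish. This is not a heartbeat feature. The function
name run_with_timeout, the exit code 124, and the 60-second limit in the
usage comment are assumptions made for illustration only, and the sketch
assumes a POSIX system.

```python
import os
import signal
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return its exit code, or 124 if it exceeds timeout_s seconds.

    Hypothetical helper: the name and the 124 convention are illustrative.
    """
    # start_new_session=True puts the child in its own session and process
    # group, so helpers it spawns (for example an ssh) share that group.
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Kill the whole group, not just the script, then reap the child.
        os.killpg(proc.pid, signal.SIGKILL)
        proc.wait()
        return 124


# Hypothetical usage with the script from the log above:
# run_with_timeout(["/etc/ha.d/resource.d/start_hiq30", "start"], 60)
```

Killing the process group rather than only the script matters when the
script is itself waiting on a child process, since killing the script alone
would leave that child running.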

But please look once again at the first error situation (Aug  3
17:52:27 Hig50v3Pm). There is no earlier unfinished resource
script, and I fear there is another, bigger problem.
Unfortunately, I don't have the process state from that
situation; the m-machine was restarted at Aug  3 18:25:32. What
could be the reason for failover hanging?

Pavol


--- Alan Robertson <alanr at unix.sh> wrote:

> Pavol Gono wrote:

> > After killing this ssh, failover was immediately in progress.
> > 
> > Is it possible that blocking resource processes can block the
> > whole failover?
> 
> It is not merely possible.  It is certain.
> 
> Some things take a long time to stop.  Heartbeat is patient. :-)
> 
> 
> -- 
>      Alan Robertson <alanr at unix.sh>



		
 

