[LinuxFailSafe] FailSafe does not failover resource group

Padmanabhan Sreenivasan paddy@sgi.com
Tue, 25 Feb 2003 09:36:04 -0800


Erehwin Ureta wrote:
> 
> Need help.
> 
> I have managed to install FailSafe 1.0.4 on Red Hat 7.2.
> I configured two nodes with IP and Apache resources. They are part of one
> resource group. I am testing the failover process by switching off the
> machine that initially hosts the resource group, but it does not seem to
> fail over to the other node. I can, however, do an "admin move" of the group.
> "haStatus -a" reports that everything is working fine.

You need to define a system controller. For testing, you can use "null"
as your system controller and keep the system controller enabled.

> 
> The only thing I could point to is that I am not using a system controller.
> I don't think it is the cause, though. From what I understand (and from the
> list archives), its use is to reset the other node to protect data in a
> shared storage configuration. I don't use any shared storage, so I do not
> think I should use sysctl.
> 
> I am not sure which /var/log/failsafe log to look at either.

Look at the *cmsd* logs in that directory. The cluster membership
process attempts a reset when a node goes down, and the reset fails
if no system controller is configured.
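
If it helps, here is a minimal Python sketch for pulling reset-related
lines out of those logs. It assumes only what is stated above: the logs
live in /var/log/failsafe and the relevant files have "cmsd" in their
names. The function name find_reset_lines and the "reset" keyword are
my own choices for illustration; the exact wording of the log messages
is not something I am quoting, so adjust the keyword to what you see.

```python
import glob
import os


def find_reset_lines(log_dir="/var/log/failsafe", keyword="reset"):
    """Return (file name, line) pairs from *cmsd* logs that mention keyword.

    The keyword match is case-insensitive. Both the default keyword and
    the function name are illustrative assumptions, not FailSafe APIs.
    """
    hits = []
    # Only files whose names contain "cmsd" are scanned.
    for path in sorted(glob.glob(os.path.join(log_dir, "*cmsd*"))):
        if not os.path.isfile(path):
            continue
        # errors="replace" keeps the scan going if a log has odd bytes.
        with open(path, errors="replace") as fh:
            for line in fh:
                if keyword.lower() in line.lower():
                    hits.append((os.path.basename(path), line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    for name, line in find_reset_lines():
        print(f"{name}: {line}")
```

Run it on the surviving node right after you power off the other one,
so the most recent matches correspond to your test.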

Paddy
> 
> TIA
> 
> Here's the result of haStatus -a:
> Cluster fscluster:
>         Cluster state is ACTIVE.
> Node fsafe2:
>         State of machine is UP.
>         Logical Machine Name: fsafe2
>         Hostname: fsafe2.localdomain
>         Is FailSafe: true
>         Nodeid: 2
>         Reset type: powerCycle
>         ControlNet Ipaddr: 192.168.170.152
>         ControlNet HB: true
>         ControlNet Control: true
>         ControlNet Priority: 1
>         ControlNet Ipaddr: 10.1.1.2
>         ControlNet HB: true
>         ControlNet Control: true
>         ControlNet Priority: 2
> Node fsafe1:
>         State of machine is UP.
>         Logical Machine Name: fsafe1
>         Hostname: fsafe1.localdomain
>         Is FailSafe: true
>         Nodeid: 1
>         Reset type: powerCycle
>         ControlNet Ipaddr: 192.168.170.151
>         ControlNet HB: true
>         ControlNet Control: true
>         ControlNet Priority: 1
>         ControlNet Ipaddr: 10.1.1.1
>         ControlNet HB: true
>         ControlNet Control: true
>         ControlNet Priority: 2
> Resource_group ipgroup:
>         State: Online
>         Error: No error
>         Owner: fsafe1
>         Failover Policy: fsafe1-fsafe2
>                 Version: 1
>                 Script: ordered
>                 Attributes: Auto_Recovery Auto_Failback
>                 Initial AFD: fsafe1 fsafe2
>          Resources:
>                 192.168.170.150 (type: IP_address)
> Resource_group webgroup:
>         State: Online
>         Error: No error
>         Owner: fsafe2
>         Failover Policy: fsafe2-fsafe1
>                 Version: 1
>                 Script: ordered
>                 Attributes: Auto_Recovery Auto_Failback
>                 Initial AFD: fsafe2 fsafe1
>          Resources:
>                 web     (type: Apache)
>                 192.168.170.170 (type: IP_address)
> Resource web (type Apache):
>         State: Online
>         Error: None
>         Owner: fsafe2
>         Flags: Resource is not locally monitored
>         port-number: 80
>         monitor-level: 2
>         default-page-location: /var/www/html/index.html
>         web-ipaddr: 192.168.170.170
>         server-root: /etc/httpd
>         Resource dependencies
>         IP_address 192.168.170.170
> Resource 192.168.170.170 (type IP_address):
>         State: Online
>         Error: None
>         Owner: fsafe2
>         Flags: Resource is not locally monitored
>         BroadcastAddress: 192.168.170.255
>         interfaces: eth0,eth1
>         NetworkMask: 255.255.255.0
>         No resource dependencies
> Resource 192.168.170.150 (type IP_address):
>         State: Online
>         Error: None
>         Owner: fsafe1
>         Flags: Resource is monitored locally
>         BroadcastAddress: 192.168.170.255
>         interfaces: eth0,eth1
>         NetworkMask: 255.255.255.0
>         No resource dependencies
> Failover_policy fsafe1-fsafe2:
>         Version: 1
>         Script: ordered
>         Attributes: Auto_Recovery Auto_Failback
>         Initial AFD: fsafe1 fsafe2
> Failover_policy fsafe2-fsafe1:
>         Version: 1
>         Script: ordered
>         Attributes: Auto_Recovery Auto_Failback
>         Initial AFD: fsafe2 fsafe1