[LinuxFailSafe] 'srmd executable error'? Has it been resolved?

Gerard Hynes ghynes-no-spam-o@colltech.com
20 May 2002 13:20:23 -0500



Just got back to working with FailSafe after a couple of weeks on
another project.

Anyway - I grabbed the latest CVS from ftp.suse.com - nice work there,
folks!  It compiles cleanly on RH-7.2/GLIBC-2.2.4 and RH-7.3/GLIBC-2.2.5.

However - I am now faced with a small dilemma:

I don't use the GUI, as I'd like to script as much of this as possible.
I can do an RPM install, initialize the database, and run a canned build
script which defines 2 machines - each with a public and private
interface, a shared IP address and a shared Apache resource.  So far
so good.  Currently (during this test phase) I am using SSH/STONITH
and that seems to work (albeit in a _really_ brutal fashion).

However - upon doing a haActivate - I get errors in the log file about
the IP_address script failing and an "srmd executable error" when doing
a haStatus.

I noted a similar thread recently on the mailing list.  Has there been
a resolution to this?

Here's a snippet from haStatus -a:

----- Cut Here -----

Cluster HA-CLUSTER:

        Cluster state is ACTIVE.

        Cluster Notify Cmd: "/bin/mail"
        Cluster Notify Address: "fsafe_admin@localhost"


Node machine-b:

        State of machine is UP.

        Logical Machine Name: machine-b
        Hostname: machine-b
        Is FailSafe: true
        Nodeid: 2
        Reset type: powerCycle
        System Controller: stonith
        System Controller status: enabled
        System Controller owner: machine-a
        System Controller owner device: ssh
        System Controller owner type: tty
        ControlNet Ipaddr: 192.168.0.2
        ControlNet HB: true
        ControlNet Control: true
        ControlNet Priority: 1
        ControlNet Ipaddr: 10.8.0.66
        ControlNet HB: true
        ControlNet Control: true
        ControlNet Priority: 2


Node machine-a:

        State of machine is UP.

        Logical Machine Name: machine-a
        Hostname: machine-a
        Is FailSafe: true
        Nodeid: 1
        Reset type: powerCycle
        System Controller: stonith
        System Controller status: enabled
        System Controller owner: machine-b
        System Controller owner device: ssh
        System Controller owner type: tty
        ControlNet Ipaddr: 192.168.0.1
        ControlNet HB: true
        ControlNet Control: true
        ControlNet Priority: 1
        ControlNet Ipaddr: 10.8.0.65
        ControlNet HB: true

Resource_group RG1:

        State: Online
        Error: srmd executable error
        Owner: machine-b

        Failover Policy: IP-FAIL
                Version: 1
                Script: ordered
                Attributes: Inplace_Recovery InPlace_Recovery Controlled_Failback
                Initial AFD: machine-a machine-b

        Resources:
                WebServer       (type: Apache)
                10.8.0.165      (type: IP_address)


Resource WebServer (type Apache):

        State: Offline
        Error: None
        Owner: none
        Flags: Resource is not locally monitored

        port-number: 80
        monitor-level: 1
        default-page-location: /var/www/html/index.html
        web-ipaddr: 10.8.0.165
        server-root: /etc/httpd/conf
        Resource dependencies
        IP_address 10.8.0.165


Resource 10.8.0.165 (type IP_address):

        State: Offline
        Error: None
        Owner: none
        Flags: Resource is not locally monitored

        BroadcastAddress: 10.8.0.255
        interfaces: eth0:0
        NetworkMask: 0xffffff00
        No resource dependencies

Failover_policy IP-FAIL:

        Version: 1
        Script: ordered
        Attributes: Inplace_Recovery InPlace_Recovery Controlled_Failback
        Initial AFD: machine-a machine-b


----- Cut here -----
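
Since the goal is to script as much of this as possible, here is a
minimal Python sketch that scans saved haStatus -a text for any section
whose "Error:" field is something other than "None".  It assumes the
layout is exactly what the snippet above shows (unindented section
headers ending in a colon, indented "Error:" lines); nothing here comes
from FailSafe documentation, and find_errors and SAMPLE are
hypothetical names used only for illustration.

```python
import re

# Section headers as they appear in the snippet above (assumed, not documented).
HEADER = re.compile(
    r"^(Cluster|Node|Resource_group|Resource|Failover_policy)\s+(.+?):$"
)


def find_errors(status_text):
    """Return (kind, name, error) for each section whose Error is not 'None'."""
    problems = []
    section = None
    for raw in status_text.splitlines():
        line = raw.strip()
        # Only unindented lines are treated as section headers.
        if raw and not raw[0].isspace():
            match = HEADER.match(line)
            if match:
                section = (match.group(1), match.group(2))
                continue
        if section is not None and line.startswith("Error:"):
            error = line.split(":", 1)[1].strip()
            if error != "None":
                problems.append((section[0], section[1], error))
    return problems


# A shortened copy of the output above, used as sample input.
SAMPLE = """Resource_group RG1:

        State: Online
        Error: srmd executable error
        Owner: machine-b

Resource WebServer (type Apache):

        State: Offline
        Error: None
        Owner: none
"""

if __name__ == "__main__":
    for kind, name, error in find_errors(SAMPLE):
        print(f"{kind} {name}: {error}")
```

In practice the text would come from capturing the command's output to
a file or pipe; the sketch deliberately does not invoke haStatus itself.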


Ideas anyone?  I'll pull out the log files momentarily and dig
deeper - but the IP_address script seems to be failing badly.
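
One thing that can be ruled out cheaply is an arithmetic mismatch in
the IP_address resource definition.  The sketch below only checks that
the configured BroadcastAddress agrees with the address and NetworkMask
shown in the output above, and that the public node addresses sit in
the same subnet; it says nothing about why the IP_address script itself
fails.  The function names are hypothetical.

```python
import ipaddress


def network_for(ip, hex_mask):
    """Build the IPv4 network containing ip, given a mask such as '0xffffff00'."""
    mask = ipaddress.IPv4Address(int(hex_mask, 16))
    # strict=False lets us pass a host address rather than the network address.
    return ipaddress.IPv4Network(f"{ip}/{mask}", strict=False)


def expected_broadcast(ip, hex_mask):
    """Return the broadcast address implied by ip and hex_mask, as a string."""
    return str(network_for(ip, hex_mask).broadcast_address)


def same_subnet(ip, other_ip, hex_mask):
    """Return True if other_ip lies in the subnet defined by ip and hex_mask."""
    return ipaddress.IPv4Address(other_ip) in network_for(ip, hex_mask)


if __name__ == "__main__":
    # Values copied from the haStatus output above.
    print(expected_broadcast("10.8.0.165", "0xffffff00"))
    print(same_subnet("10.8.0.165", "10.8.0.65", "0xffffff00"))
    print(same_subnet("10.8.0.165", "10.8.0.66", "0xffffff00"))
```

With the values from the output, the computed broadcast address matches
the configured 10.8.0.255, so the definition is at least internally
consistent.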

Thanks in advance,

--
=[gh]=
ghynes-no-spam-o@colltech.com < remove the no-spam-o >
GPG Fingerprint = E944 5617 4FE6 C950 407C 0A72 187F E01A 1FC5 0A08
