[LinuxFailSafe] problems getting started
Oliver Jehle
oliver.jehle@monex.li
13 Jun 2002 06:50:29 +0200
Hi Scott,

It looks like cdbd has a problem or is not running. Do you have a
cdbd log?

Please use the latest version from CVS or the version available at
ftp://ftp.suse.com/pub/projects/
Both include fixes for running cdbd under glibc 2.2.4 and higher.

If you grab the CVS sources, do a make rpms; you will then find a
script named fsinstall. Use it: it will reinit the database, as
you have done with cdbreinit.

After installing, you should have in /etc/init.d ... or below it a
script named fs_cluster. Start and stop with it and it should work.
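
To see at a glance which errors repeat in a syslog excerpt like the one
quoted below, a small script can tally the CI_* codes per daemon and the
signals that killed processes. This is only a sketch: the regular
expressions are assumptions based on the lines in this thread, not a
documented FailSafe log format, and the names summarize and SAMPLE are
made up for illustration.

```python
import re
from collections import Counter

# Assumed syslog layout, taken from the lines quoted in this thread:
#   "Jun 12 21:00:22 kai crsd[6846]: <message>"
LINE_RE = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]+\s+\S+\s+(?P<daemon>\w+)\[(?P<pid>\d+)\]:\s+(?P<msg>.*)$"
)
# Error codes such as CI_ERR_HDL_STALE or CI_CONFERR_NOTFOUND.
CODE_RE = re.compile(r"\b(CI_[A-Z_]+)\b")
# cmond reports a crashed child as "... received signal 11."
SIGNAL_RE = re.compile(r"received signal (\d+)")


def summarize(lines):
    """Count (daemon, CI_* code) pairs and fatal signal numbers."""
    codes = Counter()
    signals = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not a syslog line in the assumed layout
        msg = match.group("msg")
        for code in CODE_RE.findall(msg):
            codes[(match.group("daemon"), code)] += 1
        sig = SIGNAL_RE.search(msg)
        if sig:
            signals[int(sig.group(1))] += 1
    return codes, signals


# Hypothetical sample input: four lines copied from the quoted log.
SAMPLE = [
    "Jun 12 21:00:22 kai crsd[6846]: <<CI> E crs 0> CI_ERR_HDL_STALE, Database root node not found.",
    "Jun 12 21:00:26 kai crsd[6846]: <<CI> E log 0> CI_CONFERR_NOTFOUND, Could not access root node.",
    "Jun 12 21:00:24 kai cmond[6753]: <cmond_cdb.c:816> Stale CDB handle.",
    "Jun 12 21:00:44 kai cmond[6753]: <cmond_pg.c:687> Process cad:7602 of group cluster_admin exited, status = 0, received signal 11.",
]

if __name__ == "__main__":
    codes, signals = summarize(SAMPLE)
    for (daemon, code), count in sorted(codes.items()):
        print(daemon, code, count)
    for signum, count in sorted(signals.items()):
        print("signal", signum, count)
```

Run against the full log, repeated signal 11 entries for cad alongside
the CI_ERR_HDL_STALE errors would fit a database problem rather than a
one-off crash.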
On Thu, 2002-06-13 at 00:41, Scott Fagg wrote:
>
> Is there a quick howto somewhere on the basics of getting a failsafe
> system up and running? I only want to try some simple setups, such as
> a single node using failsafe to restart a crashed app, or two nodes
> with a single service where failsafe moves the app from node1 to
> node2 in the event of app or OS failure on node1.
>
> I've tried the 1.0.2 RPMs, the 1.0.3 source RPMs and 1.0.4 from CVS.
> I've grabbed the sysadm_* RPMs from SourceForge. The logs below are
> from 1.0.3.
>
> I've tried cdbreinit, but it does not appear to have any effect. If I
> try to define a node or cluster using the CLI, it allows me to enter
> the properties, but it cannot save the definition.
>
> Jun 12 21:00:17 kai crsd[6846]: <<CI> E crs 0> CI_ERR_HDL_STALE, Database root node not found.
> Jun 12 21:00:22 kai crsd[6846]: <<CI> N log 0> Additional crsd logs can be found in /var/log/failsafe/crsd_kai.
> Jun 12 21:00:22 kai crsd[6846]: <<CI> E crs 0> CI_ERR_HDL_STALE, Database root node not found.
> Jun 12 21:00:24 kai cmond[6753]: <cmond_cdb.c:816> Stale CDB handle.
> Jun 12 21:00:26 kai crsd[6846]: <<CI> E log 0> CI_CONFERR_NOTFOUND, Could not access root node.
> Jun 12 21:00:26 kai crsd[6846]: <<CI> E crs 0> CI_ERR_HDL_STALE, Database root node not found.
> Jun 12 21:00:29 kai crsd[6846]: <<CI> N log 0> Additional crsd logs can be found in /var/log/failsafe/crsd_kai.
> Jun 12 21:00:29 kai crsd[6846]: <<CI> E crs 0> CI_ERR_HDL_STALE, Database root node not found.
> Jun 12 21:00:33 kai crsd[6846]: <<CI> N log 0> Additional crsd logs can be found in /var/log/failsafe/crsd_kai.
> Jun 12 21:00:34 kai crsd[6846]: <<CI> E crs 0> CI_ERR_HDL_STALE, Database root node not found.
> Jun 12 21:00:34 kai cmond[6753]: <cmond_cdb.c:816> Stale CDB handle.
> Jun 12 21:00:38 kai crsd[6846]: <<CI> E log 0> CI_CONFERR_NOTFOUND, Could not access root node.
> Jun 12 21:00:38 kai crsd[6846]: <<CI> E crs 0> CI_ERR_HDL_STALE, Database root node not found.
> Jun 12 21:00:41 kai crsd[6846]: <<CI> N log 0> Additional crsd logs can be found in /var/log/failsafe/crsd_kai.
> Jun 12 21:00:41 kai crsd[6846]: <<CI> E crs 0> CI_ERR_HDL_STALE, Database root node not found.
> Jun 12 21:00:44 kai cmond[6753]: <cmond_sig.c:270> Process with pid 7602 has exited with status 11
> Jun 12 21:00:44 kai cmond[6753]: <cmond_pg.c:687> Process cad:7602 of group cluster_admin exited, status = 0, received signal 11.
> Jun 12 21:00:44 kai cmond[6753]: <cmond_pg.c:701> Initiating recovery for process group cluster_admin.
> Jun 12 21:00:45 kai crsd[6846]: <<CI> E log 0> CI_CONFERR_NOTFOUND, Could not access root node.
> Jun 12 21:00:46 kai crsd[6846]: <<CI> E crs 0> CI_ERR_HDL_STALE, Database root node not found.
> Jun 12 21:00:46 kai cmond[6753]: <cmond_proc.c:140> New process cad pid 7817
> Jun 12 21:00:46 kai cmond[6753]: <cmond_pg.c:767> Recovery for process group cluster_admin complete.
> Jun 12 21:00:46 kai cmond[6753]: <cmond_cdb.c:816> Stale CDB handle.
> Jun 12 21:00:50 kai crsd[6846]: <<CI> E log 0> CI_CONFERR_NOTFOUND, Could not access root node.
> Jun 12 21:00:50 kai crsd[6846]: <<CI> E crs 0> CI_ERR_HDL_STALE, Database root node not found.
> Jun 12 21:00:53 kai crsd[6846]: <<CI> N log 0> Additional crsd logs can be found in /var/log/failsafe/crsd_kai.
> Jun 12 21:00:53 kai crsd[6846]: <<CI> E crs 0> CI_ERR_HDL_STALE, Database root node not found.
> Jun 12 21:00:56 kai cmond[6753]: <cmond_cdb.c:816> Stale CDB handle.
> Jun 12 21:00:56 kai cmond[6753]: <cmond_sig.c:270> Process with pid 7817 has exited with status 11
> Jun 12 21:00:56 kai cmond[6753]: <cmond_pg.c:687> Process cad:7817 of group cluster_admin exited, status = 0, received signal 11.
> Jun 12 21:00:56 kai cmond[6753]: <cmond_pg.c:701> Initiating recovery for process group cluster_admin.
> Jun 12 21:00:57 kai crsd[6846]: <<CI> E log 0> CI_CONFERR_NOTFOUND, Could not access root node.
> Jun 12 21:00:58 kai crsd[6846]: <<CI> E crs 0> CI_ERR_HDL_STALE, Database root node not found.
> Jun 12 21:00:58 kai cmond[6753]: <cmond_proc.c:140> New process cad pid 7923
> Jun 12 21:00:58 kai cmond[6753]: <cmond_pg.c:767> Recovery for process group cluster_admin complete.
> Jun 12 21:00:58 kai cmond[6753]: <cmond_cdb.c:816> Stale CDB handle.
> Jun 12 21:01:02 kai crsd[6846]: <<CI> E log 0> CI_CONFERR_NOTFOUND, Could not access root node.
> Jun 12 21:01:02 kai crsd[6846]: <<CI> E crs 0> CI_ERR_HDL_STALE, Database root node not found.
> Jun 12 21:01:06 kai crsd[6846]: <<CI> N log 0> Additional crsd logs can be found in /var/log/failsafe/crsd_kai.
> Jun 12 21:01:06 kai crsd[6846]: <<CI> E crs 0> CI_ERR_HDL_STALE, Database root node not found.
> Jun 12 21:01:08 kai cmond[6753]: <cmond_cdb.c:816> Stale CDB handle.
> Jun 12 21:01:09 kai crsd[6846]: <<CI> E log 0> CI_CONFERR_NOTFOUND, Could not access root node.
> Jun 12 21:01:09 kai crsd[6846]: <<CI> E crs 0> CI_ERR_HDL_STALE, Database
>
>
> Scott Fagg <scott.fagg@arup.com.au>
> Arup Brisbane
> (07) 3023 6000
--
Oliver Jehle
Monex AG
Föhrenweg 18
FL-9496 Balzers
Tel: +423 388 1988
Fax: +423 388 1980
----
I've not lost my mind. It's backed up on tape somewhere.
----