[LinuxFailSafe] problems getting started
Scott Fagg
scott.fagg@arup.com
Sun, 16 Jun 2002 16:19:48 +1000
Things are getting weirder.
show node A returns 'Machine (A) is not defined.'
BUT
define node A fails with 'Machine (A) exists.'
Is there some good documentation somewhere on how to get started and create a trivial setup?
I'm also having trouble with fstask 1.0.3. Some commands work while others just seem to never respond. I tried compiling FailSafe-mgr 1.0.4 from CVS, but it hangs forever when it gets to the first javac statement (using IBM's Java 1.1.8). If I use IBM Java2 1.3, the compile fails.
I do have a copy of fstask from 1.0.3 which will run with Java2 1.3 but not with Java 1.1.8!
To top it off, my log files are filling up with:
==> cmond_log <==
Sun Jun 16 16:02:36.208 <cmond 18790:1024> <cmond_proc.c:178> Starting process crsd.
Sun Jun 16 16:02:36.208 <cmond 18790:1024> <cmond_proc.c:97> Going to fork/exec new process "crsd -l ".
Sun Jun 16 16:02:36.209 <cmond 18790:1024> <cmond_proc.c:140> New process crsd pid 19376
Sun Jun 16 16:02:36.209 <cmond 18790:1024> <cmond_pg.c:767> Recovery for process group cluster_control complete.
Sun Jun 16 16:02:36.545 <cmond 18790:1024> <cmond_sig.c:270> Process with pid 19376 has exited with status 11
Sun Jun 16 16:02:36.545 <cmond 18790:1024> <cmond_sig.c:274> 1 processes have exited.
Sun Jun 16 16:02:36.545 <cmond 18790:1024> <cmond_pg.c:687> Process crsd:19376 of group cluster_control exited, status = 0, received signal 11.
Sun Jun 16 16:02:36.545 <cmond 18790:1024> <cmond_pg.c:701> Initiating recovery for process group cluster_control.
I'm on the verge of giving up. Is it meant to be this difficult ?!
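A note on the cmond_log excerpt above: signal 11 on Linux is SIGSEGV, so crsd appears to be segfaulting right after cmond restarts it. The following is only a rough Python sketch for pulling such events out of a log. The regular expression is inferred solely from the lines quoted in this message, and the name find_signal_exits is hypothetical, not part of FailSafe.

```python
import re
import signal

# Pattern inferred from the cmond_log lines quoted in this message; other
# FailSafe versions may word this log entry differently (assumption).
CRASH_RE = re.compile(
    r"Process\s+(?P<name>\w+):(?P<pid>\d+) of group (?P<group>\w+) exited, "
    r"status = (?P<status>\d+), received signal\s+(?P<sig>\d+)"
)


def find_signal_exits(log_text):
    """Return (process, pid, group, signal name) for each signal-terminated process."""
    events = []
    for m in CRASH_RE.finditer(log_text):
        signum = int(m.group("sig"))
        try:
            # Map the number to a symbolic name for the current platform.
            signame = signal.Signals(signum).name
        except ValueError:
            signame = "signal %d" % signum
        events.append(
            (m.group("name"), int(m.group("pid")), m.group("group"), signame)
        )
    return events


if __name__ == "__main__":
    sample = (
        "Sun Jun 16 16:02:36.545 <cmond 18790:1024> <cmond_pg.c:687> Process "
        "crsd:19376 of group cluster_control exited, status = 0, "
        "received signal 11."
    )
    for event in find_signal_exits(sample):
        print(event)
```

Counting how often the same process name shows up in the result gives a quick idea of whether cmond is stuck in a restart loop.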
<<< "Scott Fagg" <scott.fagg@arup.com> 6/16 2:53p >>>
I think the problem occurred because portmap didn't start after the reboot.
Now that I've got portmap running, the error doesn't occur. Is that the mechanism by which parts of FS get at the configuration information? I'm guessing there's a daemon running that all components talk to to get config info, and that RPC is used to communicate with the daemon?
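Whether the FailSafe daemons really depend on RPC is not confirmed anywhere in this thread, but checking that portmap is accepting connections after a reboot is cheap. A minimal Python sketch, assuming portmap listens on its well-known TCP port 111; the helper name tcp_port_open is hypothetical:

```python
import socket

PORTMAP_PORT = 111  # well-known portmapper (rpcbind) port


def tcp_port_open(host, port=PORTMAP_PORT, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    if tcp_port_open("localhost"):
        print("portmap is accepting TCP connections")
    else:
        print("portmap does not appear to be running")
```

This only shows that something is listening on the port. It does not prove that the cluster daemons have registered with portmap.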
<<< Lars Marowsky-Bree <lmb@suse.de> 6/15 10:35p >>>
On 2002-06-15T19:52:32,
Scott Fagg <scott.fagg@arup.com> said:
> Sat Jun 15 19:23:43.871 <E anonymous log 3801:0 ci_log_cdb.c:207> CI_CONFERR_NOTFOUND, Logging configuration error: could not read cluster database /var/lib/failsafe/cdb/cdb.db, cdb error = 3.
> Could not open database /var/lib/failsafe/cdb/cdb.db
>
>
> ... and /var/log/messages contains this:
>
> Could not open configuration database.
>
> Any thoughts ?
No. Not enough data(tm); what does cdbd_log say on the two nodes?
Sincerely,
Lars Marowsky-Brée <lmb@suse.de>
--
Immortality is an adequate definition of high availability for me.
--- Gregory F. Pfister
_______________________________________________
LinuxFailSafe mailing list
LinuxFailSafe@lists.community.tummy.com
http://lists.community.tummy.com/mailman/listinfo/linuxfailsafe