[LinuxFailSafe] general problems during setup

Leo Kirch ochi_crack@hotmail.com
Thu, 24 Oct 2002 12:04:51 +0000


hello & sorry, this will be a long text

i'm trying to set up a running fs_environment with failsafe.
what i have done so far:
1. got the latest sources via cvs (failsafe,cluster,sysadm,stonith..)
2. compiled without error
3. installed the following packages:
	failsafe-1.0.4-1
	cluster_admin-1.0.4-1
	cluster_services-1.0.4-1
	sysadm_base-tcpmux-1.3.7-1
	sysadm_base-lib-1.3.7-1
	sysadm_base-server-1.3.7-1
	sysadm_base-client-1.3.7-1
	sysadm_failsafe-server-0.9.1a-5
	sysadm_failsafe-client-0.9.1a-5
	heartbeat-stonith-0.4.9.2-1
4. services
	sgi-cmsd        17000/udp
	sgi-crsd        17001/udp
	sgi-gcd         17002/udp
	sgi-cad         17003/tcp
	tcpmux          1/tcp
	tcpmux          1/udp
5. portmap running, xinetd running, tcpmux+fam running
6. killed all fs processes and ran cdbreinit
7. defined the netdevices, services... on two really "clean" machines
8. defined the nodes hwnode01 and hwnode02, defined a cluster, configured
   stonith with my apcmaster, installed (really insecure) rsh to verify
   network connectivity, got a scsi director ... okidoki
9. everything seems to be configured correctly so far, but...

cad_log:
   cfs_fs_connect: fs_cam_register failed with error FailSafe is not
   ready to accept admin requests
cmond_log:
   <cmond_pg.c:702> Initiating recovery for process group cluster_hainfra.
   <cmond_proc.c:178> Starting process ha_gcd.
   <cmond_proc.c:98> Going to fork/exec new process "ha_gcd -l ".
   <cmond_proc.c:141> New process ha_gcd pid 4110
   <cmond_pg.c:768> Recovery for process group cluster_hainfra complete.
   <cmond_sig.c:271> Process with pid 4102 has exited with status 256
   <cmond_sig.c:275> 1 processes have exited.
   <cmond_pg.c:687> Process ha_cmsd:4102 of group cluster_hainfra exited, status = 1.
ifd_hwnode01:
   <I0 ha_ifd ifd 2797:0 ifd_net.c:413> CI_FAILURE, Marking eth0 as configured DOWN
   <I0 ha_ifd ifd 2797:0 ifd_net.c:443> interface eth0 has been restored
gcd_hwnode01:
   <I0 ha_gcd gcd 4262:0 gcd_options.c:721> Value of gcd_incno = 128.
   <N ha_gcd gcd 4262:0 gcd_init.c:206> My node name = hwnode01.
   <E ha_gcd gcd 4262:0 gcd_cms.c:140> CI_IPCERR_NOCONN, cms_register() failed.
   <E ha_gcd gcd 4262:0 gcd_init.c:237> CI_IPCERR_NOCONN, No IPC connection to CMSD. Cleaning up and restarting. Bye for now!
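
To pick the error lines out of the ha_* daemon logs quoted above, something like the following Python sketch might help. The field names (level, daemon, component, pid, thread, source, line) are only guesses from the layout of the excerpts, and reading "E" as "error" is an assumption, not something taken from the FailSafe documentation. The cmond_log lines use a shorter prefix and are deliberately not matched.

```python
import re

# Layout guessed from the excerpts above, e.g.
#   <E ha_gcd gcd 4262:0 gcd_cms.c:140> CI_IPCERR_NOCONN, cms_register() failed.
# All group names are hypothetical labels, not official FailSafe terms.
LOG_RE = re.compile(
    r"<(?P<level>[A-Z]\d?)\s+(?P<daemon>\S+)\s+(?P<component>\S+)\s+"
    r"(?P<pid>\d+):(?P<thread>\d+)\s+(?P<source>[\w.]+):(?P<line>\d+)>\s*"
    r"(?P<message>.*)"
)


def parse_line(text):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_RE.search(text)
    if match is None:
        return None
    record = match.groupdict()
    record["pid"] = int(record["pid"])
    record["line"] = int(record["line"])
    return record


def errors_only(lines):
    """Yield parsed records whose level starts with 'E' (assumed to mean error)."""
    for text in lines:
        record = parse_line(text)
        if record and record["level"].startswith("E"):
            yield record


# Three lines copied from the gcd_hwnode01 excerpt.
SAMPLE = [
    "<I0 ha_gcd gcd 4262:0 gcd_options.c:721> Value of gcd_incno = 128.",
    "<N ha_gcd gcd 4262:0 gcd_init.c:206> My node name = hwnode01.",
    "<E ha_gcd gcd 4262:0 gcd_cms.c:140> CI_IPCERR_NOCONN, cms_register() failed.",
]

if __name__ == "__main__":
    for rec in errors_only(SAMPLE):
        print(f'{rec["daemon"]} {rec["source"]}:{rec["line"]} {rec["message"]}')
```

Lines that wrapped in the mail have to be joined back into one line before they are fed to parse_line.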

10. when i start up the failsafe services on both nodes, node1 is inactive,
    then status unknown; node2 is always unknown
11. node02 gets the complete cdb database
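
The services entries from step 4 could be double-checked on each node with a small Python sketch like this one. It assumes the entries live in a services file in the usual "name port/proto" format (presumably /etc/services, which the post does not state explicitly). The helper names parse_services and missing_entries are made up for this example, and alias columns are ignored.

```python
# Entries listed under step 4 of the post: name -> set of "port/proto".
EXPECTED = {
    "sgi-cmsd": {"17000/udp"},
    "sgi-crsd": {"17001/udp"},
    "sgi-gcd": {"17002/udp"},
    "sgi-cad": {"17003/tcp"},
    "tcpmux": {"1/tcp", "1/udp"},
}


def parse_services(text):
    """Parse services-file text into {name: {"port/proto", ...}}."""
    found = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        fields = line.split()
        if len(fields) < 2:
            continue  # blank or malformed line
        name, port_proto = fields[0], fields[1]
        found.setdefault(name, set()).add(port_proto)
    return found


def missing_entries(text, expected=EXPECTED):
    """Return the expected 'name port/proto' entries that are absent from text."""
    found = parse_services(text)
    missing = []
    for name, wanted in expected.items():
        for port_proto in sorted(wanted - found.get(name, set())):
            missing.append(f"{name} {port_proto}")
    return missing
```

Calling missing_entries with the contents of the services file on hwnode01 and on hwnode02 should return an empty list on both nodes if the entries match the list in step 4.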

that's it. thanks a lot for helping me out of this mess,
yours sebastian
