[LinuxFailSafe] general problems during setup
Leo Kirch
ochi_crack@hotmail.com
Thu, 24 Oct 2002 12:04:51 +0000
hello & sorry, this will be a long text
i'm trying to create a running fs_environment with failsafe.
what i did so far:
1. got the latest sources via cvs (failsafe,cluster,sysadm,stonith..)
2. compiled without error
3. installed everything, namely:
failsafe-1.0.4-1
cluster_admin-1.0.4-1
cluster_services-1.0.4-1
sysadm_base-tcpmux-1.3.7-1
sysadm_base-lib-1.3.7-1
sysadm_base-server-1.3.7-1
sysadm_base-client-1.3.7-1
sysadm_failsafe-server-0.9.1a-5
sysadm_failsafe-client-0.9.1a-5
heartbeat-stonith-0.4.9.2-1
4. service entries:
sgi-cmsd 17000/udp
sgi-crsd 17001/udp
sgi-gcd 17002/udp
sgi-cad 17003/tcp
tcpmux 1/tcp
tcpmux 1/udp
5. portmap running, xinetd running, tcpmux+fam running
6. killed all fs processes / cdbreinit
7. defined netdevices, services... on two really "clean" machines
8. hwnode01, hwnode02, defined a cluster, configured stonith
with my apcmaster, installed (really insecure) rsh to verify
network connectivity, got a scsi director ... okidoki
9. everything seems well configured so far, but..
cad_log:
cfs_fs_connect: fs_cam_register failed with error FailSafe is not ready to accept admin requests
cmond_log:
<cmond_pg.c:702> Initiating recovery for process group cluster_hainfra.
<cmond_proc.c:178> Starting process ha_gcd.
<cmond_proc.c:98> Going to fork/exec new process "ha_gcd -l ".
<cmond_proc.c:141> New process ha_gcd pid 4110
<cmond_pg.c:768> Recovery for process group cluster_hainfra complete.
<cmond_sig.c:271> Process with pid 4102 has exited with status 256
<cmond_sig.c:275> 1 processes have exited.
<cmond_pg.c:687> Process ha_cmsd:4102 of group cluster_hainfra exited, status = 1.
ifd_hwnode01:
<I0 ha_ifd ifd 2797:0 ifd_net.c:413> CI_FAILURE, Marking eth0 as configured DOWN
<I0 ha_ifd ifd 2797:0 ifd_net.c:443> interface eth0 has been restored
gcd_hwnode01:
<I0 ha_gcd gcd 4262:0 gcd_options.c:721> Value of gcd_incno = 128.
<N ha_gcd gcd 4262:0 gcd_init.c:206> My node name = hwnode01.
<E ha_gcd gcd 4262:0 gcd_cms.c:140> CI_IPCERR_NOCONN, cms_register() failed.
<E ha_gcd gcd 4262:0 gcd_init.c:237> CI_IPCERR_NOCONN, No IPC connection to CMSD. Cleaning up and restarting. Bye for now!
10. when i start up the failsafe services on both nodes, node1 is inactive,
then status unknown; node2 is always unknown
11. node02 gets the complete cdb database
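
a side note on the cmond_log lines in step 9: "status 256" and "status = 1" most likely describe the same exit of ha_cmsd. assuming the 256 is a raw wait() status in the usual linux encoding (terminating signal in the low 7 bits, exit code in bits 8-15), it decodes to exit code 1. a minimal python sketch of that arithmetic follows; the function name decode_wait_status is made up for illustration and is not part of failsafe.

```python
def decode_wait_status(status):
    """Split a raw wait() status into (exit_code, signal).

    Assumption: conventional Linux layout, where the low 7 bits hold
    the terminating signal (0 for a normal exit) and bits 8-15 hold
    the exit code. Stopped/continued states are ignored in this sketch.
    """
    signal = status & 0x7F
    if signal != 0:
        # Killed by a signal: there is no meaningful exit code.
        return None, signal
    return (status >> 8) & 0xFF, None


# 256 is the value cmond_log reports for pid 4102 (ha_cmsd)
exit_code, signal = decode_wait_status(256)
print("exit code:", exit_code, "signal:", signal)
```

if that assumption holds, ha_cmsd is exiting on its own with a non-zero code rather than being killed, which would fit ha_gcd then reporting no ipc connection to cmsd.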
that's it. thanks a lot for helping me out of this mess,
yours sebastian
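
to rule out a typo in the entries from step 4 on either node, a small check like the one below could be run against the text of the services file. this is only a sketch: it assumes the entries live in a file in the usual "name port/protocol" format, and the names REQUIRED, parse_services and missing_services are hypothetical helpers, not failsafe tools.

```python
# Entries copied from step 4 of this mail.
REQUIRED = {
    ("sgi-cmsd", 17000, "udp"),
    ("sgi-crsd", 17001, "udp"),
    ("sgi-gcd", 17002, "udp"),
    ("sgi-cad", 17003, "tcp"),
    ("tcpmux", 1, "tcp"),
    ("tcpmux", 1, "udp"),
}


def parse_services(text):
    """Return a set of (name, port, protocol) from services-style text."""
    entries = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        fields = line.split()
        if len(fields) < 2 or "/" not in fields[1]:
            continue  # blank or malformed line
        port, proto = fields[1].split("/", 1)
        if port.isdigit():
            entries.add((fields[0], int(port), proto))
    return entries


def missing_services(text, required=REQUIRED):
    """Return the required entries that are absent from the text, sorted."""
    return sorted(required - parse_services(text))


# Sample input; on a real node, pass the contents of the services file.
SAMPLE = """\
sgi-cmsd 17000/udp
sgi-crsd 17001/udp
sgi-gcd 17002/udp
sgi-cad 17003/tcp   # cluster admin daemon
tcpmux 1/tcp
tcpmux 1/udp
"""

for name, port, proto in missing_services(SAMPLE):
    print("missing: %s %d/%s" % (name, port, proto))
```

running the same check on hwnode01 and hwnode02 would show whether both nodes agree on the ports.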