[Linux-ha-dev] hb report - troubles on 4 node cluster
Andreas Mather1
andreas.mather at at.ibm.com
Sun Feb 10 13:38:45 MST 2008
Hi all,
Please find attached a hb_report for a problem I experienced when
implementing heartbeat.
The environment:
It's an asymmetric 4-node cluster running heartbeat 2.1.3. All nodes share
a couple of filesystems, all GPFS formatted. Services include WebSphere
(modified RA), DB2 (modified RA), vsftpd (xinetd), samba, nfs, MCS
(self-written RA), and IHS, and are organized into 4 groups (filesvc, mcs,
was, db). Dejan is also familiar with the setup.
OS: SLES 9.3 (x86_64)
heartbeat: built via ./ConfigureMe package
The Problem:
In general, everything works fine (crm_standby works for every node, etc.),
but when I simulate a power loss of one node (via IBM RSA)*, a cluster
split occurs when this node rejoins. Suddenly, on every node, crm_mon shows
the node it is running on as 'online' while reporting the other nodes as
'OFFLINE'. After 1 - 2 min. the cluster is fully operational again (all
nodes have found each other again), but it seems as if every resource gets
restarted.
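To make the symptom easier to capture while it lasts, the crm_mon views
from the individual nodes could be compared programmatically. The following
is only a minimal sketch in Python: it assumes crm_mon prints one line per
node of the form "Node: <name> (<uuid>): <state>", which I have not
verified against the 2.1.3 output, and the function names and the node
names in the usage example are hypothetical.

```python
import re

# Assumed line shape (an assumption, not checked against heartbeat 2.1.3):
#   Node: <name> (<uuid>): <state>
NODE_LINE = re.compile(r"^Node:\s+(\S+)\s+\(([^)]*)\):\s+(\S+)\s*$")


def parse_node_states(crm_mon_text):
    """Return {node_name: lowercased_state} for every node line found."""
    states = {}
    for line in crm_mon_text.splitlines():
        match = NODE_LINE.match(line.strip())
        if match:
            states[match.group(1)] = match.group(3).lower()
    return states


def looks_split(views):
    """Check for the symptom described above.

    `views` maps the reporting node to the states parsed from its own
    crm_mon output. Returns True only if every reporting node sees itself
    as online and every other node as offline.
    """
    if len(views) < 2:
        return False
    for reporter, states in views.items():
        if states.get(reporter) != "online":
            return False
        for node, state in states.items():
            expected = "online" if node == reporter else "offline"
            if state != expected:
                return False
    return True
```

Usage would be to collect the crm_mon text on each node, for example:
looks_split({"nodea": parse_node_states(text_a), "nodeb": parse_node_states(text_b)}),
where text_a and text_b are the outputs captured on the hypothetical nodes
nodea and nodeb.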
Please let me know if I can provide further information.
Thanks,
Andreas
* Sorry, I forgot to test what happens when I just stop and start
heartbeat on that node - that would be useful too, I think... :(
(See attached file: report_1.tar.gz)
Best regards
Andreas MATHER
ESLT - Enterprise Services for Linux Technologies
IBM Austria, Obere Donaustrasse 95, 1020 Vienna
Phone : +43-1-21145/4799
Fax: +43-1-21145/8888
e-mail: andreas.mather at at.ibm.com
IBM Österreich Internationale Büromaschinen Gesellschaft m.b.H.
Registered office: Vienna
Commercial register court: Handelsgericht Wien, FN 80000y
-------------- next part --------------
A non-text attachment was scrubbed...
Name: report_1.tar.gz
Type: application/octet-stream
Size: 146630 bytes
Desc: not available
Url : http://lists.community.tummy.com/pipermail/linux-ha-dev/attachments/20080210/f949917d/report_1.tar-0001.obj