[Linux-HA] RE: CCM causes Segfault !
Tony Scott
tony-j-scott at hotmail.com
Thu Mar 2 12:25:07 MST 2006
This is what I've found out so far, but now I need expert help, please :o)
init_membership() in hbagent.c calls saClmInitialize in the ccm library
(ccmlib_clm.c) ...
saClmInitialize uses oc_ev_set_callback to register its function ccm_events
as a callback function.
The ccm_events callback is the function that sets the __ccm_data pointer
to a non-NULL value.
Next, in init_membership(), the ccmlib_clm.c function
saClmClusterTrackStart is called.
It does the following:
const oc_ev_membership_t *oc;
...
...
oc = __ccm_data;
itemnum = oc->m_n_member;
However, the callback function "ccm_events" is never called, so
__ccm_data remains a NULL pointer,
and "itemnum = oc->m_n_member;" causes a segfault (normal for
dereferencing a NULL pointer :o))
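
To make the failure mode concrete, here is a minimal sketch of the same pattern with a guard added. It is not the real ccmlib_clm.c code: membership_t, ccm_data, on_membership_event and member_count are hypothetical stand-ins for oc_ev_membership_t, __ccm_data, ccm_events and the read inside saClmClusterTrackStart, and the -1 return value is an assumption made only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for oc_ev_membership_t; only the field
 * mentioned in the post is modelled. */
typedef struct {
    int m_n_member;
} membership_t;

/* Plays the role of __ccm_data: stays NULL until the event
 * callback has run at least once. */
static const membership_t *ccm_data = NULL;

/* Plays the role of the ccm_events callback: remembers the most
 * recent membership it was handed. */
void on_membership_event(const membership_t *m)
{
    assert(m != NULL);
    ccm_data = m;
}

/* Guarded version of "itemnum = oc->m_n_member;".
 * Returns -1 (an assumed error value) when no membership event
 * has been delivered yet, instead of dereferencing NULL. */
int member_count(void)
{
    const membership_t *oc = ccm_data;

    if (oc == NULL) {
        return -1;
    }
    return oc->m_n_member;
}
```

With a check like this, calling member_count() before any event has arrived reports an error rather than crashing. It does not explain why the callback never fires; it would only turn the segfault into a diagnosable failure.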
Does _anybody_ know why ccm would not call the callback function
"ccm_events"?
How can I get ccm to fix this problem?
I am just running the heartbeat 1.2.4 software "out of the box".
>From: "Tony Scott" <tony-j-scott at hotmail.com>
>To: linux-ha at lists.linux-ha.org
>CC: tony-j-scott at hotmail.com
>Subject: RE: CCM causes Segfault !
>Date: Thu, 02 Mar 2006 11:39:44 -0500
>
>When I run hbagent, the following is output to /var/log/messages :
>
>
>Mar 2 16:21:02 cluster1 lha-snmpagent[29940]: info: node 1: cluster2,
>type: normal, status: active
>Mar 2 16:21:02 cluster1 lha-snmpagent[29940]: info: node 2: cluster1,
>type: normal, status: active
>Mar 2 16:21:02 cluster1 lha-snmpagent[29940]: info: node: cluster2,
>interface: /dev/ttyS1, status: dead
>Mar 2 16:21:02 cluster1 lha-snmpagent[29940]: info: node: cluster2,
>interface: bond0, status: up
>Mar 2 16:21:02 cluster1 lha-snmpagent[29940]: info: node: cluster1,
>interface: /dev/ttyS1, status: dead
>Mar 2 16:21:02 cluster1 lha-snmpagent[29940]: info: node: cluster1,
>interface: bond0, status: up
>Mar 2 16:21:02 cluster1 lha-snmpagent[29940]: info: g_hash_table_insert hd
>= [0x84c58c8]
>Mar 2 16:21:02 cluster1 ccm[29853]: WARN: ipc channel blocked
>Mar 2 16:21:02 cluster1 last message repeated 2 times
>Mar 2 16:21:02 cluster1 ccm[29853]: info: dispatch:received HUP
>Mar 2 16:21:02 cluster1 ccm[29853]: info: clntCh_input_destroy:received
>HUP
>
>
>
>>From: "Tony Scott" <tony-j-scott at hotmail.com>
>>To: linux-ha at lists.linux-ha.org
>>CC: tony-j-scott at hotmail.com
>>Subject: CCM causes Segfault !
>>Date: Thu, 02 Mar 2006 06:28:29 -0500
>>
>>Hi all,
>>
>>I'm using heartbeat 1.2.4
>>
>>I get the following segfault when I start the "hbagent"
>>
>>(gdb) core core.12571
>>Core was generated by `./hbagent -d'.
>>Program terminated with signal 11, Segmentation fault.
>>Reading symbols from /usr/lib/libnetsnmpagent.so.9...done.
>>Loaded symbols for /usr/lib/libnetsnmpagent.so.9
>>Reading symbols from /usr/lib/libnetsnmpmibs.so.9...done.
>>Loaded symbols for /usr/lib/libnetsnmpmibs.so.9
>>Reading symbols from /usr/lib/libnetsnmphelpers.so.9...done.
>>Loaded symbols for /usr/lib/libnetsnmphelpers.so.9
>>Reading symbols from /usr/lib/libnetsnmp.so.9...done.
>>Loaded symbols for /usr/lib/libnetsnmp.so.9
>>Reading symbols from /usr/lib/libsensors.so.3...done.
>>Loaded symbols for /usr/lib/libsensors.so.3
>>Reading symbols from /usr/lib/librpm-4.3.so...done.
>>Loaded symbols for /usr/lib/librpm-4.3.so
>>Reading symbols from /usr/lib/librpmdb-4.3.so...done.
>>Loaded symbols for /usr/lib/librpmdb-4.3.so
>>Reading symbols from /lib/libselinux.so.1...done.
>>Loaded symbols for /lib/libselinux.so.1
>>Reading symbols from /usr/lib/librpmio-4.3.so...done.
>>Loaded symbols for /usr/lib/librpmio-4.3.so
>>Reading symbols from /usr/lib/libbeecrypt.so.6...done.
>>Loaded symbols for /usr/lib/libbeecrypt.so.6
>>Reading symbols from /lib/tls/libpthread.so.0...done.
>>Loaded symbols for /lib/tls/libpthread.so.0
>>Reading symbols from /usr/lib/libpopt.so.0...done.
>>Loaded symbols for /usr/lib/libpopt.so.0
>>Reading symbols from /usr/lib/libbz2.so.1...done.
>>Loaded symbols for /usr/lib/libbz2.so.1
>>Reading symbols from /usr/lib/libz.so.1...done.
>>Loaded symbols for /usr/lib/libz.so.1
>>Reading symbols from /lib/libcrypto.so.4...done.
>>Loaded symbols for /lib/libcrypto.so.4
>>Reading symbols from /usr/lib/libelf.so.1...done.
>>Loaded symbols for /usr/lib/libelf.so.1
>>Reading symbols from /lib/tls/libm.so.6...done.
>>Loaded symbols for /lib/tls/libm.so.6
>>Reading symbols from /usr/lib/libwrap.so.0...done.
>>Loaded symbols for /usr/lib/libwrap.so.0
>>Reading symbols from /usr/lib/libplumb.so.0...done.
>>Loaded symbols for /usr/lib/libplumb.so.0
>>Reading symbols from /usr/lib/libhbclient.so.0...done.
>>Loaded symbols for /usr/lib/libhbclient.so.0
>>Reading symbols from /usr/lib/libccmclient.so.0...done.
>>Loaded symbols for /usr/lib/libccmclient.so.0
>>Reading symbols from /usr/lib/libclm.so.0...done.
>>Loaded symbols for /usr/lib/libclm.so.0
>>Reading symbols from /usr/lib/libglib-1.2.so.0...done.
>>Loaded symbols for /usr/lib/libglib-1.2.so.0
>>Reading symbols from /lib/tls/libc.so.6...done.
>>Loaded symbols for /lib/tls/libc.so.6
>>Reading symbols from /lib/libuuid.so.1...done.
>>Loaded symbols for /lib/libuuid.so.1
>>Reading symbols from /lib/tls/librt.so.1...done.
>>Loaded symbols for /lib/tls/librt.so.1
>>Reading symbols from /lib/libdl.so.2...done.
>>Loaded symbols for /lib/libdl.so.2
>>Reading symbols from /lib/ld-linux.so.2...done.
>>Loaded symbols for /lib/ld-linux.so.2
>>Reading symbols from /usr/lib/libgssapi_krb5.so.2...done.
>>Loaded symbols for /usr/lib/libgssapi_krb5.so.2
>>Reading symbols from /usr/lib/libkrb5.so.3...done.
>>Loaded symbols for /usr/lib/libkrb5.so.3
>>Reading symbols from /lib/libcom_err.so.2...done.
>>Loaded symbols for /lib/libcom_err.so.2
>>Reading symbols from /usr/lib/libk5crypto.so.3...done.
>>Loaded symbols for /usr/lib/libk5crypto.so.3
>>Reading symbols from /lib/libresolv.so.2...done.
>>Loaded symbols for /lib/libresolv.so.2
>>Reading symbols from /lib/libnsl.so.1...done.
>>Loaded symbols for /lib/libnsl.so.1
>>Reading symbols from /lib/libnss_files.so.2...done.
>>Loaded symbols for /lib/libnss_files.so.2
>>#0 saClmClusterTrackStart (clmHandle=0xbfe73544, trackFlags=1 '\001',
>>notificationBuffer=0x8ef28fc, numberOfItems=2) at ccmlib_clm.c:468
>>468 ccmlib_clm.c: No such file or directory.
>> in ccmlib_clm.c
>>(gdb) where
>>#0 saClmClusterTrackStart (clmHandle=0xbfe73544, trackFlags=1 '\001',
>>notificationBuffer=0x8ef28fc, numberOfItems=2) at ccmlib_clm.c:468
>>#1 0x0804a1ac in init_membership ()
>>#2 0x0804abfd in main ()
>>(gdb) frame 0
>>#0 saClmClusterTrackStart (clmHandle=0xbfe73544, trackFlags=1 '\001',
>>notificationBuffer=0x8ef28fc, numberOfItems=2) at ccmlib_clm.c:468
>>468 in ccmlib_clm.c
>>(gdb) print oc
>>$1 = (const oc_ev_membership_t *) 0x0
>>(gdb) print __ccm_data
>>$2 = (const oc_ev_membership_t *) 0x0
>>(gdb)
>>
>>this line of code in saClmClusterTrackStart is referencing a NULL pointer
>>( the NULL pointer being oc )
>>itemnum = oc->m_n_member;
>>
>>#######################################################
>>
>>Am I using ccm and hbagent correctly ?
>>
>>in /etc/ha.d/ha.cf, I have the following for ccm:
>>apiauth ccm gid=haclient
>>respawn hacluster /usr/lib/heartbeat/ccm
>>
>>so, ccm is started when heartbeat is started.
>>
>>I then start hbagent myself at the command line:
>>./hbagent -d &
>>
>>and get the following debug from it:
>>[root at cluster1 heartbeat]# ./hbagent -d &
>>[1] 13879
>>[root at cluster1 heartbeat]# lha-snmpagent: 2006/03/02_11:16:11 debug:
>>PID=13879
>>lha-snmpagent: 2006/03/02_11:16:11 debug: Signing in with heartbeat
>>lha-snmpagent: 2006/03/02_11:16:11 info: node 1: cluster2, type: normal,
>>status: active
>>lha-snmpagent: 2006/03/02_11:16:11 info: node 2:
>>cluster1.ct.uk.videonetworks.com, type: normal, status: active
>>lha-snmpagent: 2006/03/02_11:16:11 info: node: cluster2, interface:
>>/dev/ttyS1, status: dead
>>lha-snmpagent: 2006/03/02_11:16:11 info: node: cluster2, interface: bond0,
>>status: up
>>lha-snmpagent: 2006/03/02_11:16:11 info: node: cluster1, interface:
>>/dev/ttyS1, status: dead
>>lha-snmpagent: 2006/03/02_11:16:11 info: node: cluster1, interface: bond0,
>>status: up
>>lha-snmpagent: 2006/03/02_11:16:11 info: g_hash_table_insert hd =
>>[0x9f1e8c8]
>>
>>[1]+ Segmentation fault (core dumped) ./hbagent -d
>>
>>(That is the segfault described above.)
>>
>>Can anybody tell me the correct initialization sequence for ccm and
>>hbagent ?
>>Is ccm supposed to be started as I have started it using respawn in ha.cf
>>??
>>Is hbagent supposed to be started manually from the command line as I have
>>done ??
>>
>>Any help would be much appreciated.
>>Thanks,
>>Tony
>>