[Linux-ha-dev] [RFC] heartbeat-2.1.4

Dejan Muhamedagic dejanmm at fastmail.fm
Tue Apr 15 03:40:34 MDT 2008


Hi Yamauchi-san,

On Tue, Apr 15, 2008 at 05:41:46PM +0900, HIDEO YAMAUCHI wrote:
> Hi,
> 
> I used Heartbeat-STABLE-2-1-932f11969945.
> 
> I did a simple test, but one problem occurred.
> 
> 1. crmd causes SIGSEGV.
>  It happened after I caused a monitor error on the Master resource of the Master/Slave set.
> 
> ----------------------
> Master/Slave Set: ms-sf
>     Resource Group: ms-sf_group:0
>         master_slave_Stateful1:0        (ocf::heartbeat:Stateful1):     Master dl380g5c
>         master_slave_Stateful2:0        (ocf::heartbeat:Stateful2):     Master dl380g5c
>     Resource Group: ms-sf_group:1
>         master_slave_Stateful1:1        (ocf::heartbeat:Stateful1):     Started dl380g5d
>         master_slave_Stateful2:1        (ocf::heartbeat:Stateful2):     Started dl380g5d
> 
> ---------------------
> Apr 15 14:58:04 dl380g5d crmd: [26372]: info: send_direct_ack: ACK'ing resource op
> master_slave_Stateful1:1_monitor_11000 from 2:4:2cfefa63-808b-43fb-ae29-4dd284933c3f:
> lrm_invoke-lrmd-1208239084-21
> Apr 15 14:58:04 dl380g5d tengine: [27788]: info: process_te_message: Processing (N)ACK
> lrm_invoke-lrmd-1208239084-21 from dl380g5d
> Apr 15 14:58:04 dl380g5d tengine: [27788]: info: match_graph_event: Action
> master_slave_Stateful1:1_monitor_11000 (2) confirmed on dl380g5d (rc=0)
> Apr 15 14:58:04 dl380g5d tengine: [27788]: info: send_rsc_command: Initiating action 24:
> master_slave_Stateful1:1_promote_0 on dl380g5d
> Apr 15 14:58:04 dl380g5d crmd: [26372]: info: send_direct_ack: ACK'ing resource op
> master_slave_Stateful2:1_monitor_11000 from 1:4:2cfefa63-808b-43fb-ae29-4dd284933c3f:
> lrm_invoke-lrmd-1208239084-22
> Apr 15 14:58:05 dl380g5d tengine: [27788]: info: process_te_message: Processing (N)ACK
> lrm_invoke-lrmd-1208239084-22 from dl380g5d
> Apr 15 14:58:05 dl380g5d tengine: [27788]: info: match_graph_event: Action
> master_slave_Stateful2:1_monitor_11000 (1) confirmed on dl380g5d (rc=0)
> Apr 15 14:58:05 dl380g5d tengine: [27788]: info: send_rsc_command: Initiating action 29:
> master_slave_Stateful2:1_promote_0 on dl380g5d
> Apr 15 14:58:05 dl380g5d crmd: [26372]: info: do_lrm_rsc_op: Performing
> op=master_slave_Stateful1:1_promote_0 key=24:4:2cfefa63-808b-43fb-ae29-4dd284933c3f)
> Apr 15 14:58:05 dl380g5d lrmd: [26369]: info: rsc:master_slave_Stateful1:1: promote
> Apr 15 14:58:05 dl380g5d pengine: [27789]: WARN: process_pe_message: Transition 4: WARNINGs found
> during PE processing. PEngine Input stored in: /var/lib/heartbeat/pengine/pe-warn-317.bz2
> Apr 15 14:58:05 dl380g5d pengine: [27789]: info: process_pe_message: Configuration WARNINGs found
> during PE processing.  Please run "crm_verify -L" to identify issues.
> Apr 15 14:58:05 dl380g5d heartbeat: [26356]: WARN: Managed /usr/lib64/heartbeat/crmd process 26372
> killed by signal 11 [SIGSEGV - Segmentation violation].
> Apr 15 14:58:05 dl380g5d ccm: [26367]: info: client (pid=26372) removed from ccm
> Apr 15 14:58:05 dl380g5d pengine: [27789]: ERROR: subsystem_msg_dispatch: The server 26372 has left
> us: Shutting down...NOW
> Apr 15 14:58:05 dl380g5d tengine: [27788]: ERROR: subsystem_msg_dispatch: The server 26372 has left
> us: Shutting down...NOW
> Apr 15 14:58:05 dl380g5d heartbeat: [26356]: ERROR: Managed /usr/lib64/heartbeat/crmd process 26372
> dumped core
> Apr 15 14:58:05 dl380g5d heartbeat: [26356]: EMERG: Rebooting system.  Reason:
> /usr/lib64/heartbeat/crmd
> ---------------------
> 
> I attach the log.

If you use hb_report, it will also create backtraces of all core
dumps.
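
To find the time of the crash before collecting a report, one could
scan the log for the "killed by signal" message that heartbeat prints
for a managed process. What follows is only a minimal sketch: the
function name find_crashes and the regular expression are my own
illustration, not part of heartbeat, and it assumes each log entry is
on a single line as in the original log file (the excerpt quoted above
was wrapped by the mailer).

```python
import re

# Pattern for heartbeat's "Managed <binary> process <pid> killed by signal <n>"
# message. This is an assumption derived from the quoted log excerpt,
# not an official log format specification.
CRASH_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<node>\S+)\s+heartbeat:.*"
    r"Managed\s+(?P<path>\S+)\s+process\s+(?P<pid>\d+)\s+"
    r"killed by signal\s+(?P<sig>\d+)"
)


def find_crashes(lines):
    """Return (timestamp, node, binary, pid, signal) for every managed
    process that heartbeat reports as killed by a signal."""
    crashes = []
    for line in lines:
        m = CRASH_RE.search(line)
        if m:
            crashes.append(
                (m["stamp"], m["node"], m["path"], int(m["pid"]), int(m["sig"]))
            )
    return crashes


if __name__ == "__main__":
    # Two unwrapped lines taken from the excerpt above.
    sample = [
        "Apr 15 14:58:05 dl380g5d lrmd: [26369]: info: "
        "rsc:master_slave_Stateful1:1: promote",
        "Apr 15 14:58:05 dl380g5d heartbeat: [26356]: WARN: Managed "
        "/usr/lib64/heartbeat/crmd process 26372 killed by signal 11 "
        "[SIGSEGV - Segmentation violation].",
    ]
    for crash in find_crashes(sample):
        print(crash)
```

The timestamp it returns can then be used to pick the period that the
report should cover.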

Thanks,

Dejan

> Regards,
> Hideo Yamauchi.


> _______________________________________________________
> Linux-HA-Dev: Linux-HA-Dev at lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> Home Page: http://linux-ha.org/


-- 
Dejan

