[Linux-ha-dev] Hb-2.08/stable: crmd crashes on startup under solaris10/i386

Andrew Beekhof beekhof at gmail.com
Mon Jun 4 03:15:10 MDT 2007


On 6/4/07, Otte, Joerg <joerg.otte at nsn.com> wrote:
> Got the gdb running, fsa_our_uname is NULL.

OK, so the patch I supplied was correct: it removes fsa_our_uname from
that crm_info() call.

Thanks for verifying that!
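
For anyone hitting the same thing elsewhere: passing NULL for a %s
argument is undefined behaviour in C. glibc happens to print "(null)",
but the backtrace below shows Solaris libc calling strlen() on the
pointer and crashing. A minimal sketch of a defensive wrapper follows.
safe_str() and format_transition() are hypothetical names for
illustration only, not functions from the Heartbeat tree.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of Heartbeat): substitute a placeholder
 * for a NULL string before it reaches a printf-style function. */
static const char *safe_str(const char *s)
{
    return s != NULL ? s : "<null>";
}

/* Hypothetical example: format a state-transition line into buf.
 * Returns whatever snprintf() returns. */
static int format_transition(char *buf, size_t len, const char *uname,
                             const char *from, const char *to)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%s: State transition %s -> %s",
                    safe_str(uname), safe_str(from), safe_str(to));
}
```

With a guard like this the message could keep the node name when it is
set and still be safe when it is not. The patch takes the simpler route
of dropping the argument.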

>
>
> #0  0xfead4c7c in strlen () from /lib/libc.so.1
> (gdb) where
> #0  0xfead4c7c in strlen () from /lib/libc.so.1
> #1  0xfeb2a296 in _ndoprnt () from /lib/libc.so.1
> #2  0xfeb2d3cb in vsnprintf () from /lib/libc.so.1
> #3  0xfef885b8 in cl_log (priority=6,
>     fmt=0x8072548 "%s: %s: State transition %s -> %s [ input=%s cause=%s origin=%s ]")
>     at cl_log.c:523
> #4  0x080571af in do_state_transition (actions=16777216, cur_state=S_STARTING,
>     next_state=S_RECOVERY, msg_data=0x8091b10) at fsa.c:602
> #5  0x080579d2 in s_crmd_fsa (cause=C_FSA_INTERNAL) at fsa.c:301
> #6  0x0805fc3d in crm_fsa_trigger (user_data=0x0) at callbacks.c:661
> #7  0xfef93857 in G_TRIG_dispatch (source=0x8090c58, callback=0, user_data=0x0) at GSource.c:1349
> #8  0xfedbc77f in g_main_context_dispatch () from /usr/local/lib/libglib-2.0.so.0
> #9  0xfedbe065 in g_main_context_iterate () from /usr/local/lib/libglib-2.0.so.0
> #10 0xfedbe2c0 in g_main_loop_run () from /usr/local/lib/libglib-2.0.so.0
> #11 0x080554d7 in crmd_init () at main.c:155
> #12 0x08055748 in main (argc=1, argv=0x80478d0) at main.c:122
> (gdb) up
> #1  0xfeb2a296 in _ndoprnt () from /lib/libc.so.1
> (gdb) up
> #2  0xfeb2d3cb in vsnprintf () from /lib/libc.so.1
> (gdb) up
> #3  0xfef885b8 in cl_log (priority=6,
>     fmt=0x8072548 "%s: %s: State transition %s -> %s [ input=%s cause=%s origin=%s ]")
>     at cl_log.c:523
> 523             nbytes=vsnprintf(buf, sizeof(buf), fmt, ap);
> (gdb) up
> #4  0x080571af in do_state_transition (actions=16777216, cur_state=S_STARTING,
>     next_state=S_RECOVERY, msg_data=0x8091b10) at fsa.c:602
> 602             crm_info("%s: State transition %s -> %s [ input=%s cause=%s origin=%s ]",
> (gdb) l
> 597
> 598             do_dot_log(DOT_PREFIX"\t%s -> %s [ label=%s cause=%s origin=%s ]",
> 599                       state_from, state_to, input, fsa_cause2string(cause),
> 600                       msg_data->origin);
> 601
> 602             crm_info("%s: State transition %s -> %s [ input=%s cause=%s origin=%s ]",
> 603                      fsa_our_uname, state_from, state_to, input,
> 604                      fsa_cause2string(cause), msg_data->origin);
> 605
> 606             /* the last two clauses might cause trouble later */
> (gdb) p fsa_our_uname
> $1 = 0x0
> (gdb) p state_from
> $2 = 0x80713fa "S_STARTING"
> (gdb) p state_to
> $3 = 0x8071412 "S_RECOVERY"
> (gdb)  p input
> $4 = 0x807131e "I_ERROR"
> (gdb) p cause
> $5 = C_FSA_INTERNAL
> (gdb) pmsg_data->
> Undefined command: "pmsg".  Try "help".
> (gdb) p msg_data->
> A syntax error in expression, near `'.
> (gdb) p msg_data
> $6 = (fsa_data_t *) 0x8091b10
> (gdb) p *msg_data
> $7 = {id = 32, fsa_input = I_ERROR, fsa_cause = C_FSA_INTERNAL, actions = 0,
>   origin = 0x806ec58 "do_cib_control", data = 0x0, data_type = fsa_dt_none}
> (gdb)
>
>
>
> -----Original Message-----
> From: linux-ha-dev-bounces at lists.linux-ha.org [mailto:linux-ha-dev-bounces at lists.linux-ha.org] On behalf of ext Otte, Joerg
> Sent: Friday, 1 June 2007 14:51
> To: High-Availability Linux Development List
> Subject: RE: [Linux-ha-dev] Hb-2.08/stable: crmd crashes on startup under solaris10/i386
>
> Thanks for the patch.
>
> >Is this the best stack trace that's available?
> >It'd be nice to know which variable is the problem.
>
> That is what the Solaris stack dump tool displays. Next week I will
> try to install gdb on the machine. I think gdb can show
> more details.
>
>
> -----Original Message-----
> From: linux-ha-dev-bounces at lists.linux-ha.org [mailto:linux-ha-dev-bounces at lists.linux-ha.org] On behalf of ext Andrew Beekhof
> Sent: Friday, 1 June 2007 10:49
> To: High-Availability Linux Development List
> Subject: Re: [Linux-ha-dev] Hb-2.08/stable: crmd crashes on startup under solaris10/i386
>
> On 5/31/07, Otte, Joerg <joerg.otte at nsn.com> wrote:
> >
> > I must report another crash, this time in crmd.
> > After a reboot the following crash occurs every
> > minute, again and again.
> >
> > This crash prevents Heartbeat from starting up.
> >
> > fead4c7c strlen   (807254e, 8047698, 8046230, 0) + c
> > feb2d3cb vsnprintf (8046270, 1400, 8072548, 8047698) + 73
> > fef885b8 cl_log   (6, 8072548, 806e14d, 0, 80713fa, 8071412) + 58
> > 080571af do_state_transition (1000000, 0, 8, 6, 8091b10, 0) + 21b
> > 080579d2 s_crmd_fsa (d, 8090c58, 8047738, 805fc1b) + 20e
> > 0805fc3d crm_fsa_trigger (0, 0, 8047778, fef9380f) + 2d
> > fef93857 G_TRIG_dispatch (8090c58, 0, 0, 0) + a7
> > fedbc77f g_main_context_dispatch (808eb80, ffffff9c, 8089d90, 0) + 1e7
> > fedbe065 g_main_context_iterate (1, 808c2c0, 8047858, fedbe141, 80554d7, 1) + 41d
> > fedbe2c0 g_main_loop_run (8089d78, 806f228, 806ca84, 806f1fd, 0, 0) + 19c
> > 080554d7 crmd_init (80478a0, 80553f1, 8088fcc, 8088fdc, 0, 806ca02) + b3
> > 08055748 main     (1, 80478d0, 80478d8) + f0
> > 080552cc _start   (1, 8047a40, 0, 8047a66, 8047a78, 8047a87) + 80
>
> Is this the best stack trace that's available?
> It'd be nice to know which variable is the problem.
>
> But based on the small offset, I _think_ this patch might help.
> They're the only variables that I can imagine might be NULL - even
> though neither should be.
>
> diff -r b8dc94304c0c crm/crmd/fsa.c
> --- a/crm/crmd/fsa.c    Tue May 29 15:03:42 2007 +0200
> +++ b/crm/crmd/fsa.c    Fri Jun 01 10:49:07 2007 +0200
> @@ -599,9 +599,9 @@ do_state_transition(long long actions,
>                   state_from, state_to, input, fsa_cause2string(cause),
>                   msg_data->origin);
>
> -       crm_info("%s: State transition %s -> %s [ input=%s cause=%s origin=%s ]",
> -                fsa_our_uname, state_from, state_to, input,
> -                fsa_cause2string(cause), msg_data->origin);
> +       crm_info("State transition %s -> %s [ input=%s cause=%s origin=%s ]",
> +                state_from, state_to, input, fsa_cause2string(cause),
> +                msg_data->origin);
>
>         /* the last two clauses might cause trouble later */
>         if(election_timeout != NULL
> diff -r b8dc94304c0c crm/crmd/messages.c
> --- a/crm/crmd/messages.c       Tue May 29 15:03:42 2007 +0200
> +++ b/crm/crmd/messages.c       Fri Jun 01 10:48:16 2007 +0200
> @@ -90,6 +90,7 @@ register_fsa_input_adv(
>         fsa_data_t *fsa_data = NULL;
>
>         last_data_id++;
> +       CRM_CHECK(raised_from != NULL, raised_from = "<unknown>");
>
>         crm_debug("%s %s FSA input %d (%s) (cause=%s) %s data",
>                   raised_from, prepend?"prepended":"appended",last_data_id, fsa_input2string(input),
> _______________________________________________________
> Linux-HA-Dev: Linux-HA-Dev at lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> Home Page: http://linux-ha.org/
>
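
The messages.c hunk above uses a check-and-recover pattern: test a
condition, and if it fails, log it and run a fallback statement instead
of crashing. A rough sketch of that idea follows. CHECK_OR() and
describe_origin() are hypothetical stand-ins written for this mail; the
real CRM_CHECK definition in the CRM headers may well differ.

```c
#include <stdio.h>

/* Hypothetical macro, modelled loosely on how CRM_CHECK is used in the
 * patch: if expr is false, report it and execute failure_action. */
#define CHECK_OR(expr, failure_action)                            \
    do {                                                          \
        if (!(expr)) {                                            \
            fprintf(stderr, "check failed: %s (%s:%d)\n",         \
                    #expr, __FILE__, __LINE__);                   \
            failure_action;                                       \
        }                                                         \
    } while (0)

/* Hypothetical example: never hand a NULL origin to a %s format. */
static const char *describe_origin(const char *raised_from)
{
    CHECK_OR(raised_from != NULL, raised_from = "<unknown>");
    return raised_from;
}
```

The point of the pattern is that the caller's bug still gets reported,
but the daemon keeps running with a harmless placeholder.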

