[Linux-HA] possible bug in hb_resource.c
greg at max-t.com
Thu Dec 14 19:56:44 MST 2006
I'm new to the list, so apologies if this is a silly question, but I'm
seeing what looks like a bug in how a failed STONITH operation is
handled. See the log excerpt below:
heartbeat: 2006/12/14_15:49:44 WARN: node sledgehammer: is dead
heartbeat: 2006/12/14_15:49:44 info: Link sledgehammer:eth1 dead.
heartbeat: 2006/12/14_15:49:44 info: Resetting node sledgehammer with
[ipmilan STONITH device]
heartbeat: 2006/12/14_15:49:54 ERROR: Host sledgehammer not reset!
heartbeat: 2006/12/14_15:49:54 WARN: Exiting STONITH sledgehammer
process 6351 returned rc 1.
heartbeat: 2006/12/14_15:49:54 ERROR: STONITH of sledgehammer failed.
heartbeat: 2006/12/14_15:49:59 info: Resetting node ^R with [ipmilan
heartbeat: 2006/12/14_15:49:59 ERROR: Host ^R not reset!
heartbeat: 2006/12/14_15:49:59 WARN: Exiting STONITH ^R process 6352
returned rc 1.
heartbeat: 2006/12/14_15:49:59 ERROR: STONITH of ^R failed. Retrying...
heartbeat: 2006/12/14_15:50:04 info: Resetting node b556^Q with [ipmilan
heartbeat: 2006/12/14_15:50:04 ERROR: Host b556^Q not reset!
heartbeat: 2006/12/14_15:50:04 WARN: Exiting STONITH b556^Q process 6409
returned rc 1.
heartbeat: 2006/12/14_15:50:04 ERROR: STONITH of b556^Q failed.
heartbeat: 2006/12/14_15:50:09 info: Resetting node b7a8^Q with [ipmilan
heartbeat: 2006/12/14_15:50:09 ERROR: Host b7a8^Q not reset!
This repeats in an endless loop.
As you can see, after the first attempt the name of the host to reset
is corrupted, as if the string had been freed somewhere before the
retry. This is version 1.2.4, but the code seems to be the same in
1.2.5. Unfortunately I have other problems with 2.0.7.
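
To make the suspicion concrete: if the retry path keeps a pointer to a
node name that another part of the code later frees, each retry would
print whatever happens to be in that memory, which would match the
garbage names in the log. The sketch below is not taken from
hb_resource.c. The struct stonith_retry, retry_new(), retry_free(),
retry_reset() and fake_reset() are hypothetical names I made up to show
one way to avoid the problem: give the retry its own copy of the name
and bound the number of attempts.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical retry context; NOT the actual heartbeat data structure. */
struct stonith_retry {
    char *nodename;  /* private copy, owned by this struct */
    int   attempts;  /* number of reset attempts made so far */
};

/* Copy the node name so the retry does not depend on the caller's
 * buffer staying alive. Returns NULL if allocation fails. */
static struct stonith_retry *retry_new(const char *nodename)
{
    assert(nodename != NULL);

    struct stonith_retry *r = malloc(sizeof(*r));
    if (r == NULL)
        return NULL;

    size_t len = strlen(nodename) + 1;
    r->nodename = malloc(len);
    if (r->nodename == NULL) {
        free(r);
        return NULL;
    }
    memcpy(r->nodename, nodename, len);
    r->attempts = 0;
    return r;
}

/* Release the context together with its private copy of the name. */
static void retry_free(struct stonith_retry *r)
{
    if (r != NULL) {
        free(r->nodename);
        free(r);
    }
}

/* Stand-in for the real reset call (assumption): always fails with
 * rc 1 so that the retry path is exercised. */
static int fake_reset(const char *nodename)
{
    printf("Resetting node %s\n", nodename);
    return 1;
}

/* Retry up to max_attempts times instead of looping forever.
 * Returns 0 on success, 1 if every attempt failed. */
static int retry_reset(struct stonith_retry *r, int max_attempts)
{
    while (r->attempts < max_attempts) {
        r->attempts++;
        if (fake_reset(r->nodename) == 0)
            return 0;
    }
    return 1;
}
```

With this layout the caller can free or reuse its own buffer right after
retry_new() returns, and every retry still sees the original name.
Whether the real code needs a copy or just a later free is something
only the source can answer.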
Has anyone seen this before?
just a guy