[Linux-ha-dev] Re: [Linux-ha-cvs] Linux-HA CVS: membership by gshi from
kshaikh at consensys.com
Wed Feb 16 10:25:52 MST 2005
> > There's probably some system call that /sbin/reboot -f uses...
> I'd make some changes: the sysrq trigger "b" (or "o" for a poweroff) is
> even better, and init 6 isn't an option. That one might kill some of our
> processes, but get stuck in the shutdown sequence somewhere.
> "reboot -nf" or "poweroff -nf" is what ought to work.
> > memcpy from /dev/zero onto /dev/kmem would probably also do it :-).
> I know this was meant as a joke, but: Causing corruption is not
> approved. This would hang the machine, but not necessarily reboot/halt
> it. That's the same reason I'm objecting to "init 6" ;-)
I've tried those forceful reboot/poweroff commands, and while they work,
they still run notifier hooks registered with the Linux kernel, which can
hang the system if it's already sick. A better method is to use an onboard
watchdog timer to reboot the machine. This is what I have done in the past:
- start a hardware watchdog timer set for 2 minutes to reboot/power-cycle
- begin an init 6
- if init 6 gets stuck anywhere for more than 2 minutes, you are guaranteed
  the watchdog will kick in
- otherwise init 6 will reboot the box
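
The steps above can be sketched in Python, assuming the standard Linux
watchdog character device at /dev/watchdog and the WDIOC_SETTIMEOUT ioctl
from <linux/watchdog.h>. The function names, the `arm` parameter and the
default command path are illustrative only and are not part of heartbeat or
the stonith code. The ioctl number is computed for the common ioctl encoding
(x86, ARM) and may differ on other architectures. Whether the timer keeps
running once the process holding the device exits depends on the driver
(magic-close support, nowayout), so that needs checking on the target
hardware.

```python
import fcntl
import os
import struct
import subprocess


def _iowr(type_char, nr, size):
    # Generic Linux _IOWR() encoding: read|write direction in the top two
    # bits, then the argument size, the type character and the command number.
    return (3 << 30) | (size << 16) | (ord(type_char) << 8) | nr


# WDIOC_SETTIMEOUT is _IOWR('W', 6, int) in <linux/watchdog.h>.
WDIOC_SETTIMEOUT = _iowr("W", 6, struct.calcsize("i"))


def arm_watchdog(timeout_s=120, device="/dev/watchdog"):
    """Start the watchdog; return (fd, timeout the driver actually set)."""
    fd = os.open(device, os.O_WRONLY)  # opening the device starts the timer
    reply = fcntl.ioctl(fd, WDIOC_SETTIMEOUT, struct.pack("i", timeout_s))
    return fd, struct.unpack("i", reply)[0]


def guarded_reboot(timeout_s=120, command=("/sbin/init", "6"), arm=arm_watchdog):
    """Arm the watchdog first, then start the graceful reboot."""
    fd, granted = arm(timeout_s)
    # The watchdog is never fed from here on, so a shutdown that gets stuck
    # should end in a hardware reset once the timeout expires.
    status = subprocess.call(list(command))
    return fd, granted, status
```

The descriptor is returned rather than closed on purpose: closing it, or
writing the magic character 'V' to it, may disarm the timer on some drivers.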
This at least lets you shut the machine down gracefully, possibly dumping
debug info/system state into a log, before rebooting. The tricky part is
that the stonith code will have to tell the others: "I'm going to die, so
mark me dead, and don't do any special processing other than taking over my
resources if I don't retake them within 5 minutes."
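
One possible reading of that announcement, seen from a peer's side, is
sketched below. Everything here is hypothetical: the `DeathNotice` type, the
`peer_action` function and the action names are made up for illustration and
do not correspond to any existing heartbeat or stonith interface. Only the
5 minute grace period comes from the text above.

```python
from dataclasses import dataclass

GRACE_S = 5 * 60  # the "5 minutes" from the announcement


@dataclass
class DeathNotice:
    node: str            # name of the node that announced its own reboot
    announced_at: float  # time of the announcement, in seconds


def peer_action(notice, now, node_rejoined):
    """Decide what a surviving peer does about an announced reboot."""
    if node_rejoined:
        return "none"        # the node came back and retook its resources
    if now - notice.announced_at >= GRACE_S:
        return "take_over"   # plain takeover, no extra fencing step
    return "wait"            # still inside the grace period
```

For example, `peer_action(DeathNotice("node1", 0.0), 100.0, False)` stays in
the waiting state, and the same call with `now` at 300.0 or later moves to a
plain takeover.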
> Lars Marowsky-Brée <lmb at suse.de>
> High Availability & Clustering
> SUSE Labs, Research and Development
> SUSE LINUX Products GmbH - A Novell Business
> Linux-HA-Dev: Linux-HA-Dev at lists.linux-ha.org
> Home Page: http://linux-ha.org/