[Linux-ha-dev] Root cause for machine lockups in CVS head version
Andrea Arcangeli
andrea at suse.de
Wed Jan 14 06:54:31 MST 2004
On Tue, Jan 13, 2004 at 09:58:08PM -0700, Alan Robertson wrote:
> If I set the priority of keventd higher than that of heartbeat, then the
> hang doesn't happen. It was 100% reproducible before. Now it doesn't seem
> to happen (!). /me bows down to Andrea.
>
> So, it does involve the kernel, it is a bug, and fixing it involves
> diddling with keventd.
yes.
However, it's not clear whether the fix I made so far is strong enough.
I mean, I'm unsure whether you're running a kernel with the fix already
applied; in that case you shouldn't need to tweak the keventd priority.
	keventd_task->policy = SCHED_RR;
	keventd_task->rt_priority = MAX_RT_PRIO-1;
The two lines above should be enough to give keventd the maximum
priority: at worst equal to (not lower than) that of the other SCHED_RR
tasks. Equal priority is fine. Only a lower priority is a problem.
I wonder if I made an off-by-one mistake in the above setting of
rt_priority, or a similar thinko.
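One way to cross-check this from userspace, as a minimal sketch and not
part of the kernel fix: read the real-time priority of keventd and of
the other SCHED_RR task with the standard sched_getscheduler() and
sched_getparam() calls and compare them. The helper names read_sched,
keventd_prio_ok and check_pids are made up for illustration, and the
pids are assumed to be looked up by hand (e.g. with ps). It also
assumes that the rt_priority set in the kernel is what sched_getparam()
reports, and that MAX_RT_PRIO-1 lines up with what
sched_get_priority_max(SCHED_RR) returns on the running kernel.

```c
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Read the scheduling policy and real-time priority of a task.
 * pid 0 means the calling process. Returns 0 on success, -1 on error.
 */
int read_sched(pid_t pid, int *policy, int *prio)
{
	struct sched_param sp;
	int pol = sched_getscheduler(pid);

	if (pol < 0 || sched_getparam(pid, &sp) < 0)
		return -1;
	*policy = pol;
	*prio = sp.sched_priority;
	return 0;
}

/* Equal priority is fine; only a strictly lower keventd priority is a problem. */
int keventd_prio_ok(int keventd_prio, int other_prio)
{
	return keventd_prio >= other_prio;
}

/*
 * Compare two tasks by pid (both pids are supplied by the caller).
 * Returns 1 if keventd's priority is sufficient, 0 if it is lower,
 * -1 if either task could not be queried.
 */
int check_pids(pid_t keventd_pid, pid_t other_pid)
{
	int kpol, kprio, opol, oprio;

	if (read_sched(keventd_pid, &kpol, &kprio) < 0 ||
	    read_sched(other_pid, &opol, &oprio) < 0)
		return -1;

	printf("keventd: policy %d prio %d, other: policy %d prio %d, "
	       "SCHED_RR max %d\n",
	       kpol, kprio, opol, oprio, sched_get_priority_max(SCHED_RR));
	return keventd_prio_ok(kprio, oprio);
}
```

Calling check_pids() with the pid of keventd and the pid of heartbeat
would print both priorities next to the SCHED_RR maximum, so an
off-by-one in the kernel setting should show up as a keventd priority
one below that maximum. The comparison is only meaningful when both
tasks actually run under a real-time policy.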
Can you verify whether your kernel has the two lines above in
kernel/context.c? If you didn't upgrade the kernel post-installation, a
grep in /usr/src/linux/kernel/context.c should do it.
The kernel shouldn't require tweaking of keventd prio to allow the
console to context switch under SCHED_RR load.