[ENBD] fr1 hangs when trying to access raid device..
Peter T. Breuer
enbd@lists.community.tummy.com
Thu, 6 Feb 2003 21:26:11 +0100 (MET)
"A month of sundays ago [Arve Emil Myr_s] wrote:"
> OK, I added some printk's and found the spot where it burns. Still running on a non-SMP kernel with SMP disabled in the BIOS.
Where? (Without SMP it cannot fail - I will be interested to see this ..)
> Feb 6 16:54:19 vserv kernel: saw bh block 0 sector 0 size 1024 state 11d dev f000 rdev f000 on req f79113c0
> Feb 6 16:54:19 vserv kernel: submitting bh sector 0 size 1024 state 1e dev 700 rdev 700 on req f79113c0
> Feb 6 16:54:19 vserv kernel: serviced req f79113c0 on component 0
> Feb 6 16:54:19 vserv kernel: AEM: promote_req just after loop, req= f79113c0 , e= 0
> Feb 6 16:54:19 vserv kernel: AEM: promote_req just after atomic_set_mask, req= f79113c0 , e= 0
> Feb 6 16:54:19 vserv kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000004
> printk (KERN_DEBUG "AEM: promote_req just after atomic_set_mask, req= %x , e= %d\n", req, e->index);
>
> // PTB comes off e
> list_del (&req->queue);
> printk (KERN_DEBUG "AEM: promote_req just after list_del, req= %x , e= %d\n", req, e->index);
Well, it dies in list_del. OK. But the req is not zero. OK. I think I
might understand this. The queue lock is still held too.
The pointers on the req queue must be messed up. I think I noticed that
the kernel people had at some point stopped tidying up the entry's
pointers after a list_del.
/**
 * list_del - deletes entry from list.
 * @entry: the element to delete from the list.
 * Note: list_empty on entry does not return true after this, the
 * entry is in an undefined state.
 */
static __inline__ void list_del(struct list_head *entry)
{
	__list_del(entry->prev, entry->next);
}
and we are missing an INIT_LIST_HEAD(&req->queue); afterwards. Can you
add that after the list_del? Or, equivalently, replace list_del with
list_del_init everywhere that list_del occurs.
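To make the difference concrete, here is a minimal userspace sketch. The
helpers are simplified stand-ins that I wrote to mirror the ones in
<linux/list.h>, not the kernel code itself, and the two demo_* functions
are hypothetical names used only for illustration. It shows that an entry
still points at its old neighbours after a plain list_del, while
list_del_init leaves it as a valid empty list that can safely be
"removed" again.

```c
/* Userspace sketch: simplified stand-ins for helpers in <linux/list.h>. */
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static int list_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Insert entry right after head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink only: entry->next and entry->prev keep their old values. */
static void list_del(struct list_head *entry)
{
	entry->next->prev = entry->prev;
	entry->prev->next = entry->next;
}

/* Unlink, then turn the entry into an empty list of its own. */
static void list_del_init(struct list_head *entry)
{
	list_del(entry);
	INIT_LIST_HEAD(entry);
}

/* Hypothetical demo: does the entry read as empty after plain list_del? */
int demo_empty_after_list_del(void)
{
	struct list_head head, queue;

	INIT_LIST_HEAD(&head);
	list_add(&queue, &head);
	list_del(&queue);
	assert(list_empty(&head));      /* the list itself is fine */
	return list_empty(&queue);      /* the entry still points at head */
}

/* Hypothetical demo: same sequence with list_del_init, removed twice. */
int demo_empty_after_list_del_init(void)
{
	struct list_head head, queue;

	INIT_LIST_HEAD(&head);
	list_add(&queue, &head);
	list_del_init(&queue);
	list_del_init(&queue);          /* harmless: entry points at itself */
	assert(list_empty(&head));
	return list_empty(&queue);
}
```

With these definitions demo_empty_after_list_del() returns 0 and
demo_empty_after_list_del_init() returns 1, which is the "list_empty on
entry does not return true after this" note from the comment above.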
This might be a simple thinko. Maybe we're not on a queue at all.
But try using list_del_init first.
If we're not on a queue at all, it might be wise to test with
	if (!list_empty(&req->queue)) list_del_init(&req->queue);
but see how the list_del_init helps first. I don't see the problem
here, but that must be good fortune. Or a different compiler.
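A rough sketch of that guarded removal, again in userspace with
simplified stand-ins for the <linux/list.h> helpers. struct demo_req and
the demo_* functions are hypothetical and are not the driver's real
types. One assumption matters: the guard only works if the entry was set
up with INIT_LIST_HEAD when the request was created, because list_empty
on a zero-filled entry returns false (next is NULL, not the entry's own
address).

```c
/* Userspace sketch: simplified stand-ins for helpers in <linux/list.h>. */
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static int list_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Insert entry right after head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink the entry and leave it as an empty list of its own. */
static void list_del_init(struct list_head *entry)
{
	entry->next->prev = entry->prev;
	entry->prev->next = entry->next;
	INIT_LIST_HEAD(entry);
}

/* Hypothetical request type: only the embedded list entry matters. */
struct demo_req {
	int id;
	struct list_head queue;
};

/* The entry must be initialised here for the guard below to be valid. */
static void demo_req_init(struct demo_req *req, int id)
{
	req->id = id;
	INIT_LIST_HEAD(&req->queue);
}

/* Returns 1 if the request was unlinked, 0 if it was not on a queue. */
static int demo_req_dequeue(struct demo_req *req)
{
	if (list_empty(&req->queue))
		return 0;
	list_del_init(&req->queue);
	return 1;
}

/* Hypothetical demo: dequeue before queuing, after queuing, and again. */
int demo_guarded_dequeue(void)
{
	struct list_head pending;
	struct demo_req req;
	int never_queued, first, second;

	INIT_LIST_HEAD(&pending);
	demo_req_init(&req, 1);

	never_queued = demo_req_dequeue(&req);  /* not on any queue yet */
	list_add(&req.queue, &pending);
	first = demo_req_dequeue(&req);         /* really unlinks */
	second = demo_req_dequeue(&req);        /* guard skips the removal */

	assert(list_empty(&pending));
	return never_queued == 0 && first == 1 && second == 0;
}
```

If the requests are allocated zero-filled and never pass through
INIT_LIST_HEAD, the guard will not catch the "not on a queue" case, so
the initialisation at allocation time would need checking too.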
Peter