[ENBD] fr1 hangs when trying to access raid device..
Peter T. Breuer
enbd@lists.community.tummy.com
Tue, 4 Feb 2003 13:46:03 +0100 (MET)
"A month of sundays ago [Arve Emil Myr_s] wrote:"
> I'm trying to get the fr1 module working on a dual-athlon system.
Which fr1 is that, 1.0 or 1.1? (I am on 1.3 here, so I need reminding.)
> My kernel is 2.4.20 with the vserver patch version ctx-16 compiled for athlon cpu with smp.
> Everything goes fine until I try to access the raid device.
Doesn't sound so fine!
> I have the fr1 loaded on major 9 and had raid working before I tried this,
You loaded it with major=9.
> using my old raidtab, which looks like:
>
> # autogenerated /etc/raidtab by YaST2
>
>
> raiddev /dev/md0
> raid-level 1
> nr-raid-disks 2
> nr-spare-disks 0
> persistent-superblock 1
Do it without the superblock (persistent-superblock 0). I didn't take any
account of superblocks, because I don't know the format or the semantics.
> chunk-size 4096
> device /dev/sdb1
> raid-disk 0
> device /dev/sdb2
> raid-disk 1
>
> I can do a "mkraid --really-force /dev/md0", and everything looks normal.
Sounds good.
> But when I try to "mke2fs /dev/md0", everything just freezes (well, not quite: keyboard, mouse and X hang; I can still ping the box, but ssh login is impossible). The only way out is a hardware reset.
This is bad. And it's not showing up in the list below either ...
> My syslog looks like this from loading the module until "mke2fs":
>
> Feb 4 12:37:18 vserv kernel: fr1 ioctl 800c0910
> Feb 4 12:37:18 vserv kernel: klogd 1.4.1, ---------- state change ----------
> Feb 4 12:37:18 vserv kernel: Cannot find map file.
> Feb 4 12:37:18 vserv kernel: Loaded 563 symbols from 6 modules.
> Feb 4 12:37:18 vserv kernel: fr1 ioctl 800c0910
> Feb 4 12:37:33 vserv last message repeated 2 times
> Feb 4 12:37:38 vserv kernel: fr1 ioctl 40480923
> Feb 4 12:37:38 vserv kernel: fr1 ioctl 40140921
> Feb 4 12:37:38 vserv kernel: fr1 hotadd component 08:11[0] to device 0
> Feb 4 12:37:38 vserv kernel: fr1 added new device 08:11 to f3b62600 with err 0
> Feb 4 12:37:38 vserv kernel: fr1 ioctl 40140921
> Feb 4 12:37:38 vserv kernel: fr1 hotadd component 08:12[1] to device 0
> Feb 4 12:37:38 vserv kernel: fr1 added new device 08:12 to f3b62600 with err 0
> Feb 4 12:37:38 vserv kernel: fr1 ioctl 400c0930
And then no messages. At this point it has simply set up the device.
Tell me, did you compile the driver with the -D__SMP__ flag? I
am tempted to say that this is an SMP-specific lockup.
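For reference, the ioctl numbers in the log above can be taken apart with the
usual Linux _IOC bit layout. What follows is only a rough sketch: it assumes
the x86 layout (8 bits of command number, 8 bits of type, 14 bits of argument
size, 2 bits of direction), and the command names in the table are what I
recall from linux/raid/md_u.h, not something taken from the fr1 source. The
helper name decode_ioctl is made up for this illustration.

```python
# Decode Linux ioctl request numbers such as those printed by "fr1 ioctl ...".
# Assumes the x86 _IOC layout:
#   bits 0-7 command number, bits 8-15 type, bits 16-29 size, bits 30-31 direction.

# Assumption: names as recalled from linux/raid/md_u.h (type 9 is the md major).
MD_COMMANDS = {
    0x10: "RAID_VERSION",
    0x21: "ADD_NEW_DISK",
    0x23: "SET_ARRAY_INFO",
    0x30: "RUN_ARRAY",
}

DIRECTIONS = {0: "none", 1: "write", 2: "read", 3: "read/write"}


def decode_ioctl(request):
    """Split a 32-bit ioctl request number into its _IOC fields."""
    nr = request & 0xFF
    return {
        "dir": DIRECTIONS[(request >> 30) & 0x3],
        "size": (request >> 16) & 0x3FFF,   # size of the argument struct in bytes
        "type": (request >> 8) & 0xFF,      # 9 for md ioctls
        "nr": nr,
        "name": MD_COMMANDS.get(nr, "unknown"),
    }


if __name__ == "__main__":
    # The four distinct request numbers that appear in the syslog excerpt.
    for text in ("800c0910", "40480923", "40140921", "400c0930"):
        fields = decode_ioctl(int(text, 16))
        print(text, fields)
```

If the names are right, the sequence is a version query, one array-info set,
two disk adds and a run-array, which fits the reading that the log shows
nothing beyond device setup.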
> Hope someone could help me out of this deadlock...
> If you need any more info just ask...
Can you try on a non-SMP machine, or boot with "nosmp", and see if it
makes a difference? Your symptoms match what I would expect if the
kernel iolock had been taken and not released. That would only happen
if the kernel and the driver had mismatched expectations over locking
conventions.
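To make the kind of mismatch I mean concrete, here is a toy model in user
space. It is not kernel or fr1 code: io_lock, driver_request and
kernel_calls_driver are hypothetical names, and a timeout stands in for what
would be an endless spin in a real kernel. The "driver" is written on the
assumption that its caller does not hold the lock; when the caller does hold
it, the driver can never get it.

```python
import threading

# Hypothetical stand-in for a kernel-wide I/O lock (non-reentrant, like a spinlock).
io_lock = threading.Lock()


def driver_request(lock_timeout=0.1):
    """Toy driver entry point that assumes the caller does NOT hold io_lock."""
    if not io_lock.acquire(timeout=lock_timeout):
        # A real spinlock has no timeout: the CPU would spin here forever.
        return "stuck"
    try:
        return "ok"
    finally:
        io_lock.release()


def kernel_calls_driver(holds_lock):
    """Toy caller that either follows or violates the driver's locking convention."""
    if holds_lock:
        with io_lock:              # caller already holds the lock ...
            return driver_request()  # ... so the driver cannot take it again
    return driver_request()


if __name__ == "__main__":
    print("caller without lock:", kernel_calls_driver(False))
    print("caller holding lock:", kernel_calls_driver(True))
```

The same picture applies the other way round: a driver that expects to be
called with the lock held, and releases it on the way out, would corrupt the
lock state for a caller that never took it.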
Peter