[ENBD] fr1 hangs when trying to access raid device..

Arve Emil Myrås enbd@lists.community.tummy.com
Tue, 4 Feb 2003 14:42:59 +0100


"A month of sundays ago [Arve Emil Myrås] wrote:"
>> I'm trying to get the fr1 module working on a dual-athlon system.
>That's which fr1? 1.0 or 1.1 (I am on 1.3 here, so I need reminding).

That would be 1.0 (the one from the FTP site, datestamped 23/1-03).

>> My kernel is 2.4.20 with the vserver patch version ctx-16, compiled for Athlon CPU with SMP.
>> Everything goes fine until I try to access the raid device.
>
>Doesn't sound so fine!
>
>> I have the fr1 loaded on major 9 and had raid working before I tried this,
>
>You loaded it with major=9.

yes..

>> using my old raidtab, which looks like:
>> 
>> # autogenerated /etc/raidtab by YaST2 
>> 
>> 
>> raiddev /dev/md0
>>    raid-level       1
>>    nr-raid-disks    2
>>    nr-spare-disks   0
>>    persistent-superblock 1
>
>Do it without the superblock. I didn't take any account of those,
>because I don't know the format or the semantics.

OK, I have removed the persistent-superblock line. Everything behaves as before.

>> 
>> I can do a "mkraid --really-force /dev/md0", and everything looks normal..
>
>Sounds good.
>
>> But when I try to "mke2fs /dev/md0", everything just freezes (well, not quite: keyboard, mouse & X hang; I can still ping the box, but
>> ssh login is impossible). The only way out is a hardware reset..
>
>This is bad. And it's not showing up in the list below either ...
>
>> My syslog looks like this, from loading the module until "mke2fs":
>> 
..snip..
>> Feb  4 12:37:38 vserv kernel: fr1 added new device 08:12 to f3b62600 with err 0
>> Feb  4 12:37:38 vserv kernel: fr1 ioctl 400c0930
>
>And then no messages. At this point it has simply set up the device.
>Tell me, did you compile the driver with the -D__SMP__ directive? I
>am tempted to say that that is an SMP specific lockup.
Yes, I used -D__SMP__.

>
>> Hope someone could help me out of this deadlock...
>> If you need any more info just ask...

>Can you try on a nonsmp machine, or boot with "nosmp", and see if it
>makes a difference? Your symptoms match what I would expect if the
>kernel iolock had been taken and not released. That would only happen
>if the kernel and the driver had mismatched expectations over locking
>conventions.

Recompiling a non-SMP kernel now; my RAID controllers didn't like the "nosmp" option...

-Arve Emil