[ENBD] 2.4.32 more weirdness
Anders Blomdell
anders.blomdell at control.lth.se
Thu Mar 11 08:23:54 MST 2004
> "Also sprach Anders Blomdell:"
>> Still running at glacial speeds (4 servers, 4 clients on each machine),
>> following options has been used:
>>
>>
>> enbd-client server03:30010 -n 4 -i 1_disk_b -m -t 0 -b 1024 -p 60 /dev/
>> nda
>> enbd-server 30010 /dev/hdb7 -n -i 1_disk_b -b 1024
>
> But why is -n on the server? That will remove VMS buffering. I'm amazed it
> works (thanks for testing!).
It was your suggestion! (O_DIRECT)
>
> What kernel is this?
2.4.25
>
>> when doing "mkfs.ext3 /dev/ndX" (four on each machine, forward progress
>> is
>> at glacial speeds).
>>
>> BUT, this is really weird:
>>
>> "mkfs.ext3 /dev/nda" has done forward progress (superblocks and 22/299
>> inode tables written)
>>
>> /proc/nbdinfo says:
>>
>> Device a: Open
>> [a] State: signed, rw, enabled, validated, plug, check_partitions,
>> acct, last error 0, lives 0, bp 0
>> [a] Queued: +0R/0W curr (check 0R/0W) +4R/0W max
>> [a] Buffersize: 262144 (sectors=512, blocks=256)
>> [a] Blocksize: 1024 (log=10)
>> [a] Size: 39070048KB
>> [a] Blocks: 39070048
>> [a] Sockets: 4 (+) (+) (+) (*)
>> [a] Requested: 5 (4) (0) (0) (1) 5R/0W max 1
>> [a] Despatched: 5 (4) (0) (0) (1) 5R/0W md5 0W (0
>
>
> It has not received any requests. The kernel has not sent it any. It
> dealt with what it was given.
>
> But it is noticable that you are running with merge_requests=0. Turn it
> up to about 64. The 2.6 kernel scheduler is apparently over-affine
> when tasks do i/o in less than their time slice. I wrote a long note on
> this last night.
still 2.4.25
echo 'merge_requests=64' > /proc/nbdinfo ??
>
>
>> eq, 0 ne, 0 dn)
>> ##############################
>> Not much progress here!!!
>> ##############################
>> [a] Errored: 0 (0) (0) (0) (0) 0+0
>> [a] Pending: 0 (0) (0) (0) (0) 0R/0W+0R/0W
>> [a] B/s now: 0 (0R+0W)
>> [a] B/s ave: 0 (0R+0W)
>> [a] B/s max: 0 (0R+0W)
>> [a] Spectrum: 100%1
>> [a] Kthreads: 0 (0 waiting/0 running/1 max)
>> [a] Cthreads: 4 (+) (+) (+) (+)
>> [a] Cpids: 4 (2554) (2555) (2557) (2559)
>
>
> This has never been asked to do anything at all.
>
>> Device b: Open
>> [b] State: signed, rw, enabled, validated, plug, check_partitions,
>> acct, last error 0, lives 0, bp 0
>> [b] Queued: +0R/0W curr (check 0R/0W) +4R/0W max
>> [b] Buffersize: 262144 (sectors=512, blocks=256)
>> [b] Blocksize: 1024 (log=10)
>> [b] Size: 39070048KB
>> [b] Blocks: 39070048
>> [b] Sockets: 4 (+) (+) (+) (*)
>> [b] Requested: 5 (4) (0) (0) (1) 5R/0W max 1
>> [b] Despatched: 5 (4) (0) (0) (1) 5R/0W md5 0W (0
>> eq, 0 ne, 0 dn)
>
> Nothing mush here either.
>
>> [b] Errored: 0 (0) (0) (0) (0) 0+0
>> [b] Pending: 0 (0) (0) (0) (0) 0R/0W+0R/0W
>> [b] B/s now: 0 (0R+0W)
>> [b] B/s ave: 0 (0R+0W)
>> [b] B/s max: 0 (0R+0W)
>> [b] Spectrum: 100%1
>> [b] Kthreads: 0 (0 waiting/0 running/1 max)
>> [b] Cthreads: 4 (+) (+) (+) (+)
>> [b] Cpids: 4 (2556) (2558) (2560) (2561)
>> Device c: Open
>> [c] State: signed, rw, enabled, validated, plug, check_partitions,
>> acct, last error 0, lives 0, bp 0
>> [c] Queued: +0R/0W curr (check 0R/0W) +4R/0W max
>> [c] Buffersize: 262144 (sectors=512, blocks=256)
>> [c] Blocksize: 1024 (log=10)
>> [c] Size: 39070048KB
>> [c] Blocks: 39070048
>> [c] Sockets: 4 (+) (+) (+) (*)
>> [c] Requested: 5 (4) (0) (0) (1) 5R/0W max 1
>> [c] Despatched: 5 (4) (0) (0) (1) 5R/0W md5 0W (0
>> eq, 0 ne, 0 dn)
>
>
> Same kind of thing.
>
>
>> [c] Errored: 0 (0) (0) (0) (0) 0+0
>> [c] Pending: 0 (0) (0) (0) (0) 0R/0W+0R/0W
>> [c] B/s now: 0 (0R+0W)
>> [c] B/s ave: 0 (0R+0W)
>> [c] B/s max: 0 (0R+0W)
>> [c] Spectrum: 100%1
>> [c] Kthreads: 0 (0 waiting/0 running/1 max)
>> [c] Cthreads: 4 (+) (+) (+) (+)
>> [c] Cpids: 4 (2562) (2564) (2566) (2568)
>> Device d: Open
>> [d] State: signed, rw, enabled, validated, plug, md5sum,
>> check_partitions, acct, last error 0, lives 0, bp 0
>> [d] Queued: +0R/1016W curr (check 0R/1001W) +4R/1024W max
>> [d] Buffersize: 262144 (sectors=512, blocks=256)
>> [d] Blocksize: 1024 (log=10)
>> [d] Size: 39070048KB
>> [d] Blocks: 39070048
>> [d] Sockets: 4 (+) (+) (+) (*)
>> [d] Requested: 105.40K (26.2K) (26.0K) (26.0K) (26.0K) 5R/105.3KW
>> max
>> 1
>> [d] Despatched: 104.40K (26.2K) (26.0K) (26.0K) (26.0K) 5R/104.4KW
>> md5
>
> Ah - now this one is going hell for leather perfectly fine.
>
> Yes, this looks a typical 2.6 scheduler affinity problem. It has given
> only one set of tasks work to do.
>
>
> Unless of course, it is simply that all requests are going to the same
> device! Anyone ever tested two devices under 2.6 kernel? Not me.
Still 2.4.25
>
>
>> 88.3KW (88.3K eq, 0 ne, 0 dn)
>> [d] Errored: 0 (0) (0) (0) (0) 0+0
>> [d] Pending: 4 (1) (1) (1) (1) 0R/4W+0R/1001W
>> [d] B/s now: 31.0K (0R+31.0KW)
>> [d] B/s ave: 73.0K (0R+73.0KW)
>> [d] B/s max: 1.78M (0R+1.78MW)
>> [d] Spectrum: 100%1
>
> Oww. Please turn merge_requests up to something larger.
>
>> [d] Kthreads: 0 (0 waiting/0 running/1 max)
>> [d] Cthreads: 0 (-) (-) (-) (-)
>> [d] Cpids: 0 (2565) (2567) (2570) (2571)
>>
>> The four client thread does this:
>>
>> 1079010719.872660 ioctl(4, 0x4004ab0b, 0x40018008) = -1 ETIME (Timer
>> expired)
>
> Well, it just enters the kernel and waits for work to do. After a few
> seconds it gives up and reenters.
>
>>
>> 1079010780.005881 ioctl(4, 0x4004ab0b, 0x40018008) = -1 ETIME (Timer
>> expired)
>>
>>
>> All forward progress seems to be reported on device d, but all devices are
>> (very slowly) moving forward
>
> Kernel problem. Turn merge_requests up to some large value, and figure
> out how to modify scheduler affinity (then tell me).
More information about the ENBD
mailing list