[ENBD] 2.4.32 more weirdness

Peter T. Breuer ptb at it.uc3m.es
Thu Mar 11 07:41:58 MST 2004


"Also sprach Anders Blomdell:"
> Still running at glacial speeds (4 servers, 4 clients on each machine), 
> following options has been used:
> 
> 
>    enbd-client server03:30010 -n 4 -i 1_disk_b -m -t 0 -b 1024 -p 60 /dev/nda
>    enbd-server 30010 /dev/hdb7 -n -i 1_disk_b -b 1024

But why is -n on the server? That will remove VMS buffering. I'm amazed it
works (thanks for testing!).

What kernel is this?

> when doing "mkfs.ext3 /dev/ndX" (four on each machine, forward progress is 
> at glacial speeds).
> 
> BUT, this is really weird:
> 
>    "mkfs.ext3 /dev/nda" has done forward progress (superblocks and 22/299 
> inode tables written)
> 
> /proc/nbdinfo says:
> 
> Device a:       Open
> [a] State:      signed, rw, enabled, validated, plug, check_partitions, 
> acct, last error 0, lives 0, bp 0
> [a] Queued:     +0R/0W curr (check 0R/0W) +4R/0W max
> [a] Buffersize: 262144  (sectors=512, blocks=256)
> [a] Blocksize:  1024    (log=10)
> [a] Size:       39070048KB
> [a] Blocks:     39070048
> [a] Sockets:    4       (+)     (+)     (+)     (*)
> [a] Requested:  5       (4)     (0)     (0)     (1)     5R/0W   max 1
> [a] Despatched: 5       (4)     (0)     (0)     (1)     5R/0W   md5 0W (0 


It has not received any requests. The kernel has not sent it any. It
dealt with what it was given. 

But it is noticable that you are running with merge_requests=0. Turn it
up to about 64. The 2.6 kernel scheduler is apparently over-affine
when tasks do i/o in less than their time slice. I wrote a long note on
this last night.


> eq, 0 ne, 0 dn)
> ##############################
> Not much progress here!!!
> ##############################
> [a] Errored:    0       (0)     (0)     (0)     (0)     0+0
> [a] Pending:    0       (0)     (0)     (0)     (0)     0R/0W+0R/0W
> [a] B/s now:    0       (0R+0W)
> [a] B/s ave:    0       (0R+0W)
> [a] B/s max:    0       (0R+0W)
> [a] Spectrum:   100%1
> [a] Kthreads:   0       (0 waiting/0 running/1 max)
> [a] Cthreads:   4       (+)     (+)     (+)     (+)
> [a] Cpids:      4       (2554)  (2555)  (2557)  (2559)


This has never been asked to do anything at all.

> Device b:       Open
> [b] State:      signed, rw, enabled, validated, plug, check_partitions, 
> acct, last error 0, lives 0, bp 0
> [b] Queued:     +0R/0W curr (check 0R/0W) +4R/0W max
> [b] Buffersize: 262144  (sectors=512, blocks=256)
> [b] Blocksize:  1024    (log=10)
> [b] Size:       39070048KB
> [b] Blocks:     39070048
> [b] Sockets:    4       (+)     (+)     (+)     (*)
> [b] Requested:  5       (4)     (0)     (0)     (1)     5R/0W   max 1
> [b] Despatched: 5       (4)     (0)     (0)     (1)     5R/0W   md5 0W (0 
> eq, 0 ne, 0 dn)

Nothing mush here either.

> [b] Errored:    0       (0)     (0)     (0)     (0)     0+0
> [b] Pending:    0       (0)     (0)     (0)     (0)     0R/0W+0R/0W
> [b] B/s now:    0       (0R+0W)
> [b] B/s ave:    0       (0R+0W)
> [b] B/s max:    0       (0R+0W)
> [b] Spectrum:   100%1
> [b] Kthreads:   0       (0 waiting/0 running/1 max)
> [b] Cthreads:   4       (+)     (+)     (+)     (+)
> [b] Cpids:      4       (2556)  (2558)  (2560)  (2561)
> Device c:       Open
> [c] State:      signed, rw, enabled, validated, plug, check_partitions, 
> acct, last error 0, lives 0, bp 0
> [c] Queued:     +0R/0W curr (check 0R/0W) +4R/0W max
> [c] Buffersize: 262144  (sectors=512, blocks=256)
> [c] Blocksize:  1024    (log=10)
> [c] Size:       39070048KB
> [c] Blocks:     39070048
> [c] Sockets:    4       (+)     (+)     (+)     (*)
> [c] Requested:  5       (4)     (0)     (0)     (1)     5R/0W   max 1
> [c] Despatched: 5       (4)     (0)     (0)     (1)     5R/0W   md5 0W (0 
> eq, 0 ne, 0 dn)


Same kind of thing.


> [c] Errored:    0       (0)     (0)     (0)     (0)     0+0
> [c] Pending:    0       (0)     (0)     (0)     (0)     0R/0W+0R/0W
> [c] B/s now:    0       (0R+0W)
> [c] B/s ave:    0       (0R+0W)
> [c] B/s max:    0       (0R+0W)
> [c] Spectrum:   100%1
> [c] Kthreads:   0       (0 waiting/0 running/1 max)
> [c] Cthreads:   4       (+)     (+)     (+)     (+)
> [c] Cpids:      4       (2562)  (2564)  (2566)  (2568)
> Device d:       Open
> [d] State:      signed, rw, enabled, validated, plug, md5sum, 
> check_partitions, acct, last error 0, lives 0, bp 0
> [d] Queued:     +0R/1016W curr (check 0R/1001W) +4R/1024W max
> [d] Buffersize: 262144  (sectors=512, blocks=256)
> [d] Blocksize:  1024    (log=10)
> [d] Size:       39070048KB
> [d] Blocks:     39070048
> [d] Sockets:    4       (+)     (+)     (+)     (*)
> [d] Requested:  105.40K (26.2K) (26.0K) (26.0K) (26.0K) 5R/105.3KW      max 
> 1
> [d] Despatched: 104.40K (26.2K) (26.0K) (26.0K) (26.0K) 5R/104.4KW      md5 

Ah - now this one is going hell for leather perfectly fine.

Yes, this looks a typical 2.6 scheduler affinity problem. It has given
only one set of tasks work to do. 


Unless of course, it is simply that all requests are going to the same
device! Anyone ever tested two devices under 2.6 kernel? Not me.


> 88.3KW (88.3K eq, 0 ne, 0 dn)
> [d] Errored:    0       (0)     (0)     (0)     (0)     0+0
> [d] Pending:    4       (1)     (1)     (1)     (1)     0R/4W+0R/1001W
> [d] B/s now:    31.0K   (0R+31.0KW)
> [d] B/s ave:    73.0K   (0R+73.0KW)
> [d] B/s max:    1.78M   (0R+1.78MW)
> [d] Spectrum:   100%1

Oww. Please turn merge_requests up to something larger.

> [d] Kthreads:   0       (0 waiting/0 running/1 max)
> [d] Cthreads:   0       (-)     (-)     (-)     (-)
> [d] Cpids:      0       (2565)  (2567)  (2570)  (2571)
> 
> The four client thread does this:
> 
> 1079010719.872660 ioctl(4, 0x4004ab0b, 0x40018008) = -1 ETIME (Timer 
> expired)

Well, it just enters the kernel and waits for work to do. After a few
seconds it gives up and reenters.

> 
> 1079010780.005881 ioctl(4, 0x4004ab0b, 0x40018008) = -1 ETIME (Timer expired)
> 
> 
> All forward progress seems to be reported on device d, but all devices are 
> (very slowly) moving forward

Kernel problem. Turn merge_requests up to some large value, and figure
out how to modify scheduler affinity (then tell me).

Peter


More information about the ENBD mailing list