[ENBD] enbd: help needed

Peter T. Breuer ptb at inv.it.uc3m.es
Wed Oct 17 10:58:03 MDT 2007


OK .. I can read the mail now.

"Also sprach Peter Daum:"
> 
> - compile enbd-2.4.33
> insmod ./enbd.ko

Missing enbd_ioctl.ko. I don't know how or if it runs without it in 2.6
(I've never tried).

> ./enbd-server 666 /dev/raid6_12/nbd # lvm volume

I wouldn't test using something like that. Use a small file of about
4MB.
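
A minimal sketch of making such a test file, in case it helps. The file name and location are placeholders I made up, not anything the enbd tools require.

```python
import os
import tempfile


def make_test_image(path, size_mb=4):
    """Create a zero-filled file of size_mb MiB to serve instead of a real volume."""
    size = size_mb * 1024 * 1024
    with open(path, "wb") as f:
        f.write(b"\0" * size)
    return os.path.getsize(path)


# Hypothetical scratch location; any writable path will do.
image_path = os.path.join(tempfile.gettempdir(), "enbd-test.img")
print(image_path, make_test_image(image_path), "bytes")
```

The resulting file would then take the place of /dev/raid6_12/nbd on the enbd-server command line quoted above.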

> ./enbd-client localhost:666 -n 4 /dev/nda

OK.

> The system is core2duo with linux 2.6.16.54.

What's a "core2duo" and how have you compiled the module? You HAVE 
compiled the kernel source beforehand, no?  It has to be conditioned.
It must match your running kernel exactly.
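
One way to check the match, as a rough sketch: compare the first field of the module's "vermagic" string with the release of the running kernel. This assumes the modinfo tool is installed and that the module sits at ./enbd.ko; the sample vermagic string below is made up for illustration, not taken from your system.

```python
import platform
import subprocess


def vermagic_release(vermagic):
    """Return the kernel release a module was built for (first vermagic field)."""
    fields = vermagic.split()
    return fields[0] if fields else ""


def module_matches_kernel(vermagic, running_release):
    """True if the module's build release equals the running kernel release."""
    return vermagic_release(vermagic) == running_release


def read_vermagic(module_path):
    """Ask modinfo for the vermagic field of a module file (needs modinfo)."""
    result = subprocess.run(
        ["modinfo", "-F", "vermagic", module_path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


# Illustrative string only; on a real system use read_vermagic("./enbd.ko")
# and platform.release() instead of the literals.
sample = "2.6.16.54 SMP mod_unload 686"
print(module_matches_kernel(sample, "2.6.16.54"))
```

On the machine in question the call would be module_matches_kernel(read_vermagic("./enbd.ko"), platform.release()). A matching release is necessary but not sufficient: the kernel configuration and compiler have to agree too.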

> I had tried enbd-2.4.34, too, which failed in a different way ;)

Well, show.


> # cat /proc/nbdinfo
> 
> Device a:       Open [a] State:      verify, rw, enabled, remote invalid, last error 0, lives 0, bp 0
> [a] Queued:     +0R/0W curr (check 0R/0W) +0R/0W max
> [a] Buffersize: 262144  (sectors=512, blocks=64)
> [a] Blocksize:  4096    (log=12)
> [a] Size:       838860800KB
> [a] Blocks:     209715200
> [a] Sockets:    4       (+)     (+)     (+)     (*)
> [a] Requested:  0       (0)     (0)     (0)     (0)     0R/0W   max 0
> [a] Despatched: 0       (0)     (0)     (0)     (0)     0R/0W   md5 0W (0 eq, 0 ne, 0 dn)
> [a] Errored:    306     (0)     (0)     (0)     (0)     0+306
> [a] Pending:    0       (0)     (0)     (0)     (0)     0R/0W+0R/0W
> [a] B/s now:    0       (0R+0W)
> [a] B/s ave:    0       (0R+0W)
> [a] B/s max:    0       (0R+0W)
> [a] Spectrum:
> [a] Kthreads:   0       (0 waiting/0 running/1 max)
> [a] Cthreads:   4       (+)     (+)     (+)     (+)
> [a] Cpids:      4       (5906)  (5907)  (5908)  (5909)
> Device b-p:     Closed

All looks fine, apart from the size, which is a bit large. 800 GB?

But it has certainly talked to the server - or it couldn't get the size.
What did you compile with? The compiler has to match your kernel's
compiler too!

> 2007-10-16 19:59:10 Buffer I/O error on device nda, logical block 17179869057
> 2007-10-16 19:59:10 Buffer I/O error on device nda, logical block 17179869058
> 2007-10-16 19:59:10 Buffer I/O error on device nda, logical block 17179869059
> 2007-10-16 19:59:10 Buffer I/O error on device nda, logical block 17179869056

Looks to me like about

  % expr 17179869056 \* 4
  68719476224

KB (4KB blocks), i.e. around 64TB. Isn't that a bit far to the right to
be talking to it, on a device with only 209715200 blocks?

It should error.
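
The same check as a small sketch, using the block size and block count that /proc/nbdinfo reported above. The function names are mine, for illustration only.

```python
BLOCK_SIZE = 4096          # "[a] Blocksize: 4096" from /proc/nbdinfo
DEVICE_BLOCKS = 209715200  # "[a] Blocks: 209715200" from /proc/nbdinfo


def offset_kb(block, block_size=BLOCK_SIZE):
    """Offset of a logical block from the start of the device, in KB."""
    return block * block_size // 1024


def in_range(block, device_blocks=DEVICE_BLOCKS):
    """True if the logical block lies inside the device."""
    return 0 <= block < device_blocks


# Block numbers taken from the "Buffer I/O error" lines in the quoted log.
for block in (17179869056, 17179869181, 0):
    status = "ok" if in_range(block) else "beyond end of device"
    print(block, offset_kb(block), "KB", status)
```

Any request for a block at or past 209715200 is outside the device, so an error for those is the expected outcome.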

> 2007-10-16 19:59:10 Buffer I/O error on device nda, logical block 17179869057
> 2007-10-16 19:59:10 Buffer I/O error on device nda, logical block 17179869058
> 2007-10-16 19:59:10 Buffer I/O error on device nda, logical block 17179869059
> 2007-10-16 19:59:10 Buffer I/O error on device nda, logical block 17179869180
> 2007-10-16 19:59:10 Buffer I/O error on device nda, logical block 17179869181
> 2007-10-16 19:59:10 ENBD #3045[0]: enbd_set_remote_invalid INVALIDATE drive on ndb
> 2007-10-16 19:59:10 ENBD #3045[1]: enbd_set_remote_invalid INVALIDATE drive on nda
> 2007-10-16 19:59:10 ENBD #3045[2]: enbd_set_remote_invalid INVALIDATE drive on ndc

That's something I've never seen, but it says that the server end has
told it that the resource is not available, I believe. Why all these
different devices?

Oh, I see, somebody is accessing it before startup.

> 2007-10-16 19:59:19 root: start server:
> 2007-10-16 19:59:29 enbd-server: enbd-server: server: will open resources in mode linear 2007-10-16
> 19:59:29 enbd-server: file: looking for blksize of /dev/raid6_12/nbd with fstat... 2007-10-16
> 19:59:29 enbd-server: file: blksize of /dev/raid6_12/nbd is 4096 2007-10-16 19:59:29 enbd-server:
> file: set final blksize of whole resource to 4096 2007-10-16 19:59:29 enbd-server: file: looking for
> size of fd 4 with seek SEEK_END... 2007-10-16 19:59:29 enbd-server: file: set size of fd 4 to
> 858993459200 2007-10-16 19:59:29 enbd-server: enbd-server: server: set blksize to 4096. 2007-10-16
> 19:59:29 enbd-server: enbd-server: size of exported file/device is 858993459200B (209715200 blocks)
> 2007-10-16 19:59:29 enbd-server: enbd-server: server (-2) set new signal handlers 2007-10-16

Seems to have a server. 209 million blocks of 4KB each.
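
A quick cross-check that the numbers from the server log and from /proc/nbdinfo agree with each other, as a sketch (the helper name is made up):

```python
SIZE_BYTES = 858993459200  # "size of exported file/device" in the server log
BLKSIZE = 4096             # block size the server settled on


def describe(size_bytes, blksize=BLKSIZE):
    """Express a device size in blocks, KB (1024 bytes) and GiB."""
    return {
        "blocks": size_bytes // blksize,
        "kb": size_bytes // 1024,
        "gib": size_bytes / 2**30,
    }


info = describe(SIZE_BYTES)
print(info)
# The block and KB figures should equal what /proc/nbdinfo reported.
assert info["blocks"] == 209715200
assert info["kb"] == 838860800
```

So the client and the server do agree on the size; whatever is going wrong, it is not the size negotiation.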

> 19:59:51 root: start client:
> 2007-10-16 20:00:00 enbd-client: enbd-client: client channels is 4 2007-10-16 20:00:00 enbd-client:
> enbd-client: client says target 0 is localhost:666 2007-10-16 20:00:00 enbd-client: enbd-client:
> client (-1) opened device /dev/nda 2007-10-16 20:00:00 enbd-client: enbd-client: client (-1) opened
> NBD device /dev/nda (2b00) 2007-10-16 20:00:00 enbd-client: enbd-client: client (-1) left kernel
> bdflush sync boundary at 80% 2007-10-16 20:00:00 enbd-client: enbd-client: client (-1) left kernel
> bdflush async boundary at 10% 2007-10-16 20:00:00 enbd-client: enbd-client: client (-1) detaches
> from shell 2007-10-16 20:00:00 enbd-client: enbd-client: client (-1) starts introduction sequence on
> localhost:666 2007-10-16 20:00:00 enbd-server: enbd-server: server (-2) opened port 666 (socket 1)
> for client 127.0.0.1 2007-10-16 20:00:00 enbd-server: nbd-shmem: shmem area total size 139264
> 2007-10-16 20:00:00 enbd-server: nbd-shmem: shmem hash area starts at offset 4096 2007-10-16

Well, all normal at open. We know that.

> 20:00:00 enbd-server: nbd-shmem: shmem hash area size 135168 2007-10-16 20:00:00 enbd-server:
> nbd/hash: hash area total size 135168 2007-10-16 20:00:00 enbd-server: nbd/hash: hash buckets 256
> 2007-10-16 20:00:00 enbd-server: nbd/hash: reduce hash area effective size to 134992 2007-10-16
> 20:00:00 enbd-server: nbd/gm: pre-seeding gz heap with unit size 65536 2007-10-16 20:00:00
> enbd-server: nbd/gm: pre-seeding gz heap with unit size 32768 2007-10-16 20:00:00 enbd-server:
> nbd/gm: pre-seeding gz heap with unit size 16384 2007-10-16 20:00:00 enbd-server: nbd/gm:
> pre-seeding gz heap with unit size 8192 2007-10-16 20:00:00 enbd-server: nbd/gm: pre-seeding gz heap
> with unit size 4096 2007-10-16 20:00:00 enbd-server: nbd/gm: pre-seeding gz heap with unit size 2048
> 2007-10-16 20:00:00 enbd-server: nbd/gm: pre-seeding gz heap with unit size 1024 2007-10-16 20:00:00
> enbd-server: nbd/gm: pre-seeding gz heap with unit size 512 2007-10-16 20:00:00 enbd-server: nbd/gm:
> pre-seeding gz heap with unit size 256 2007-10-16 20:00:00 enbd-server: nbd/gm: pre-seeding gz heap
> with unit size 64 2007-10-16 20:00:00 enbd-server: nbd/gm: pre-seeding gz heap with unit size 16
> 2007-10-16 20:00:00 enbd-server: nbd/hash: hash size 4096 header + 130896 data = 134992 2007-10-16
> 20:00:00 enbd-server: nbd/hash: hash entries initial lo/hi limits set at 2198/2443 entries
> 2007-10-16 20:00:00 enbd-server: enbd-server: server (-1) set default signal handlers 2007-10-16
> 20:00:00 enbd-server: enbd-server: server (-1) sent hello ok 2007-10-16 20:00:00 enbd-server:
> enbd-server: server (-1) sent passwd ok 2007-10-16 20:00:00 enbd-server: enbd-server: server (-1)
> got cliserv magic ok 2007-10-16 20:00:00 enbd-server: enbd-server: server (-1) received id device
> 2b00 ok 2007-10-16 20:00:00 enbd-server: enbd-server: server (-1) sent size 858993459200 ok
> 2007-10-16 20:00:00 enbd-server: enbd-server: server (-1) sent sig [gIrCLh] ok 2007-10-16 20:00:00
> enbd-server: enbd-server: server (-1) suggested ro flags 0 ok 2007-10-16 20:00:00 enbd-client:
> enbd-client: client (-1) got size 858993459200 2007-10-16 20:00:00 enbd-client: enbd-client: client
> (-1) got signature [gIrCLh], had [] 2007-10-16 20:00:00 enbd-server: enbd-server: server (-1)
> received blksize 1024 ok 2007-10-16 20:00:00 enbd-server: enbd-server: server (-1) sent/negotiated
> blksize 4096 ok 2007-10-16 20:00:00 enbd-client: enbd-client: client (-1) negotiated blksize 4096
> 2007-10-16 20:00:00 enbd-server: enbd-server: server (-1) received pulse_intvl 10 ok 2007-10-16
> 20:00:00 enbd-server: enbd-server: server (-1) sent/negotiated pulse interval 10 ok 2007-10-16
> 20:00:00 enbd-client: enbd-client: client (-1) negotiated pulse_intvl 10 2007-10-16 20:00:00
> enbd-server: enbd-server: server (-1) agreed 4 channels ok 2007-10-16 20:00:00 enbd-server:
> enbd-server: server (-1) selected free port at 667 2007-10-16 20:00:00 enbd-server: enbd-server:
> server (-1) posted port 667 ok 2007-10-16 20:00:00 enbd-server: enbd-server: server (-1) manager
> started new process group 5901 2007-10-16 20:00:00 enbd-server: enbd-server: server (0) set default
> signal handlers 2007-10-16 20:00:00 enbd-server: enbd-server: server (1) set default signal handlers
> 2007-10-16 20:00:00 enbd-server: enbd-server: server (2) set default signal handlers 2007-10-16
> 20:00:00 enbd-server: enbd-server: server (3) set default signal handlers 2007-10-16 20:00:00
> enbd-server: enbd-server: server (-1) set new signal handlers 2007-10-16 20:00:05 enbd-client:
> enbd-client: client (-1) got session port 667 ok 2007-10-16 20:00:05 enbd-client: enbd-client:
> client (-1) introduction sequence ends ok 2007-10-16 20:00:05 enbd-client: enbd-client: client (-1)
> set sig or passed sigchk OK 2007-10-16 20:00:05 enbd-client: enbd-client: client (-1) set device
> size 858993459200 2007-10-16 20:00:05 enbd-client: enbd-client: client (-1) Warning! changing device

Bit strange.

> blksz from 1024 to 4096 2007-10-16 20:00:05 ENBD #3992[0]: fixup_slot failed to find slot for pid
> 5900 ioctl MY_NBD_SET_SIG arg (user 43724967) in user addr bfb95360
> 2007-10-16 20:00:05 enbd-client: enbd-client: client (-1) sets session slots to 0-3 2007-10-16
> 20:00:05 ENBD #4390[0]: enbd_ioctl cleared show_errs on nda
> 2007-10-16 20:00:05 enbd-client: enbd-client: client (0) opened device /dev/nda 2007-10-16 20:00:05

OK.

> enbd-client: enbd-client: client (0) opened socket (6) to localhost:667 2007-10-16 20:00:05
> enbd-client: enbd-client: client (-1) launched daemon 0 (5906) for localhost:667 2007-10-16 20:00:05
> enbd-client: enbd-client: client (1) opened device /dev/nda 2007-10-16 20:00:05 enbd-client:
> enbd-client: client (1) opened socket (6) to localhost:667 2007-10-16 20:00:05 enbd-client:
> enbd-client: client (-1) launched daemon 1 (5907) for localhost:667 2007-10-16 20:00:05 enbd-server:

Oh no .. we're not going to have to go through FOUR of these?

Please do your testing in a simple way! Use -n 1 and something small.


> 2007-10-16 20:00:05 nda:<4>printk: 4790 messages suppressed.
> 2007-10-16 20:00:05 Buffer I/O error on device nda, logical block 0
> 2007-10-16 20:00:05 Buffer I/O error on device nda, logical block 0

This is a direct error from the buffer layer.

I would guess that you have compiled a module that is not compatible
with your kernel! Only you know. Is it? Do your source and kernel
settings match? 



> 2007-10-16 20:00:05 enbd-client: enbd-client: client (2) opened device /dev/nda 2007-10-16 20:00:05
> enbd-server: enbd-server: server (2) opened port 667 (socket 8) for client 127.0.0.1 2007-10-16
> 20:00:05 enbd-server: enbd-server: server (2) sent hello ok 2007-10-16 20:00:05 enbd-server:
> enbd-server: server (2) sent passwd ok 2007-10-16 20:00:05 enbd-server: enbd-server: server (2) got
> cliserv magic ok 2007-10-16 20:00:05 enbd-server: enbd-server: server (2) sent sig [gIrCLh] ok
> 2007-10-16 20:00:05 enbd-server: enbd-server: server (2) set new signal handlers 2007-10-16 20:00:05
> enbd-client: enbd-client: client (2) opened socket (6) to localhost:667 2007-10-16 20:00:05
> enbd-client: enbd-client: client (2) read passwd ok from localhost:667 2007-10-16 20:00:05
> enbd-client: enbd-client: client (2) got cliserv magic ok from localhost:667 2007-10-16 20:00:05
> enbd-client: enbd-client: client (2) got a signature ok from localhost:667 2007-10-16 20:00:05
> enbd-client: enbd-client: client (2) set sig or passed sigchk OK 2007-10-16 20:00:05 nda:<6>ENBD
> #2755[2]: enbd_get_device increased socket count on nda to 3

To me it looks as though the server is dying and restarting. Too much
mess to know. Look and tell!

And please test in a SIMPLE way.


> 2007-10-16 20:01:39 Buffer I/O error on device nda, logical block 3
> 2007-10-16 20:01:39 Buffer I/O error on device nda, logical block 4
> 2007-10-16 20:01:39 Buffer I/O error on device nda, logical block 5
> 2007-10-16 20:01:39 Buffer I/O error on device nda, logical block 6
> 2007-10-16 20:01:39 Buffer I/O error on device nda, logical block 7
> 2007-10-16 20:01:39 Buffer I/O error on device nda, logical block 0

At least load the enbd_ioctl module. Run with -n 1. Serve from a small
4MB file (i.e. run "make test")! Looks like binary incompatibility.

Peter


