[ENBD] ENBD on 2.4.x
Peter T. Breuer
ptb@it.uc3m.es
Thu, 15 Mar 2001 12:22:15 +0100 (MET)
"A month of sundays ago Jon Arney wrote:"
> I've spent some time with ENBD and the 2.4 kernel and have a couple
> of observations (for what it's worth).
>
> Under 2.4.0-prerelease, I am able to create a network block device
> and connect it to client and server and perform many millions of
> read/write requests to it by opening '/dev/nda'. No problems there.
> I can also run 'mke2fs /dev/nda' and it seems to create a good
> filesystem. When I mount that filesystem and run 'bonnie' under it,
> it blocks indefinitely during the 'Writing Intelligently' phase.
> '/proc/nbdinfo' shows me the following:
Are you mounting the fs sync?
> Device a: Open
> [a] State: verify, rw, enabled, last error 0
> [a] Queued: +0R/0W curr reqs =0R/0W real reqs +1R/128W max reqs
> [a] Buffersize: 86016 (sectors=168)
> [a] Blocksize: 1024 (log=10)
> [a] Size: 32768KB
> [a] Blocks: 32768
> [a] Sockets: 1 (*)
> [a] Requested: 218000 (218000)
> [a] Despatched: 217488 (217488)
> [a] Errored: 0 (0) 0+0
> [a] Pending: 0 (0) 0R/0W+0R/512W
> [a] Kthreads: 0 (0 waiting/0 running/1 max)
> [a] Cthreads: 1 (+)
> [a] Cpids: 1 (30584)
> Device b-p: Closed
>
> Note that under 'pending', there are 512 requested writes which have
> not been dispatched. I have not yet traced the cause of this.
Probably the VFS is full: file bigger than memory, etc. Is the fs mounted
sync? If not, mount it sync. If that doesn't help, load the module with
sync_intvl=1. That will flush the VFS frequently, before it can ever fill
up (well, small RAM and a fast CPU will still let it block, but we hope
not).
> Also, under 2.4.1 and above (recently available), the read/write
I haven't tried it.
> requests directly to the device '/dev/nda' block indefinitely and
> 'mke2fs' blocks indefinitely
Are there any more reports of this? If it is unique to 2.4.1 then
clearly changes in 2.4.1 are implicated, but the change you made
removes the safety brake applied by merge_requests=0.
Can you make sure that merge_requests=0 is set in the module? I don't
mind taking the defaults away, but they were intended to make things
safer by preventing the kernel from making any unwarranted assumptions,
and I'd kind of like to know what is going on!
> (not allowing me to continue with the mount and bonnie test). I believe
> that I have discovered the cause of this problem. It seems that in
> 2.4.1, some changes to the 'elevator' optimizations were made and this
> code seems to be in a state of flux for the moment.
> May I suggest that for the moment, we consider adding a line here at
> approximately line 3490 of 'nbd.c' in the Kernel driver:
>
> #if LINUX_VERSION_CODE >= KERNEL_VERSION(2,3,30)
> // JSA - Add this line because under 2.4.1 and above, the merge optimizations are in flux
> #if LINUX_VERSION_CODE < KERNEL_VERSION(2,4,1)
Hmmm ... you mean don't try making things safer!
> // PTB control merge attempts so we don't overflow our buffers
> ll_merge_requests_fn = (BLK_DEFAULT_QUEUE(MAJOR_NR))->merge_requests_fn;
> ll_front_merge_fn = (BLK_DEFAULT_QUEUE(MAJOR_NR))->front_merge_fn;
> ll_back_merge_fn = (BLK_DEFAULT_QUEUE(MAJOR_NR))->back_merge_fn;
> (BLK_DEFAULT_QUEUE(MAJOR_NR))->merge_requests_fn = &nbd_merge_requests_fn;
> (BLK_DEFAULT_QUEUE(MAJOR_NR))->front_merge_fn = &nbd_front_merge_fn;
> (BLK_DEFAULT_QUEUE(MAJOR_NR))->back_merge_fn = &nbd_back_merge_fn;
You can check, but you'll see that those functions do nothing if
merge_requests=0 is set. But clearly, if things are in flux there, I
need to do nothing until I know better. So I agree with this change.
> #endif
> #endif
>
> Finally, (not to be picky), but 'Dispatch' is spelled with an 'i', not
> an 'e' :)
It can be spelled both ways. Check your dict :-) (not that I have one
here, but I know this. I've tried it both ways at various times in
the evolution of the proc code to see which one I like better. I was
mentioned in despatches!).
Peter