[ENBD] New to ENBD, having troubles
Christopher Eveland
enbd@lists.community.tummy.com
Tue, 22 Jan 2002 20:10:13 -0500
> You SURE you also tried 2.4.26? If you see the symptoms in BOTH 2.4.26
> and 2.4.27 then that rules out something I've done wrong in 2.4.27.
I downloaded nbd-2.4-current.tgz to start, which seems to be the same as the
nbd-2.4.26 that I just downloaded. When I got it, I made a small patch to
get it to compile with the kernel (it checks the version for some macro
definition; I saw it in the archives, and I gather that's a difference between 26
and 26a). I haven't tried 26a yet, but I can try it in the morning.
> > Anyway, after doing the make, I do the make test, and go through the
> > checklist. The module is loaded, I can see the server and client procs on
>
> Can you run mke2fs at that point?
Nope, hangs.
> > I can even do some small things with the device (I'm using /dev/ndf if this
>
> ndf? Hmmm ... I have never tried that. It is quite possible that there
> is a bug with higher device numbers, simply _because_ I have never
> tried that. There was once such a bug, I recall.
OK, I just redid it on /dev/nda and I get identical results.
> > case, since I had set up the others to auto mount on boot, obviously getting
> > ahead of myself... anyway, a-e are all turned off), like use dd to copy 512
> > bytes onto a device (such as /dev/hda3) and then compare to the original 512
> > bytes: they match. But if I try to do something "big", like mkfs, it seems
> > to hang up.
>
> mke2fs is unique in that it tries to write _outside_ a partition, in
> order to do a binary search for its limits.
>
> > For instance, after doing "make test", I try "mke2fs /dev/ndf" as per the
> > online instructions. As soon as I try mke2fs, I get the following on the
> > console:
>
> You are on the server side? And the server dies? I don't understand how
> you can be on the server side ...
I think the issue is that "make test" starts the server over ssh, so I'm
getting both the client and server messages dumped to the same console.
> > interact somewhat with the machine, I can't seem to shut it down nicely, to
> > get the load to go back down, I have to pretty much reset the machine.
>
> echo 0 > /proc/nbdinfo. Isn't this prominent enough in the man page?
Apparently not, but sure enough it works. Thanks, and sorry for not reading
carefully enough.
> > So I'm having trouble interpreting this. If anyone has some suggestions, or
> > can point me to something to look at, I'd appreciate it. Thanks,
>
> It looks like a straightforward mistake of mine in the 2.4.27 server,
> dying instead of returning an error on out-of-range requests. May I ask
> WHAT you are serving? And why is the 2.4.26 server behaving the same way?
> Are you sure it is?
>
> Can you make sure that the device is some small size, like 8MB. If it's
> as straightforward as I believe, then it's a simple case of a one line
> change in a user-level routine. What I'm surprised about is the lack of
I'm using the core files that you set up in the make test; they seem to come to
8MB.
> diagnostics .... say! Are you using 2.4.26a or 2.4.26? Because 2.4.26a
> shares the driver with 2.4.27. So if both behave the same way, then
> the indication is that it is the driver. And I DO recall removing
> a range check from the driver ...
Seems to have been plain old 26.
> ummm, it would be in the code that takes stuff off the kernel queue ...
> do_nbd_request ... yes. It's still there!
>
>
>     PARANOIA_BEGIN;
>     if (lo->magic != NBD_DEV_MAGIC) {
>         NBD_DEBUG (1, "nd%s is not magical!\n", lo->devnam);
>         NBD_FAIL ("nbd[] is not magical!\n");
>     }
>     if (req->nr_sectors > lo->max_sectors) {
>         NBD_FAIL ("oversize request\n");
>     }
>     PARANOIA_END;
>     if (req->sector + req->nr_sectors > lo->sectors) {
>         NBD_FAIL ("overrange request\n");
>     }
>
> Now I am stumped. Would you mind turning on the paranoia just above
> that range check? Then nothing too big can even get out of the client.
> But it should be just fine as it is, because the kernel guarantees
> not to pass anything bigger than the max_sectors we registered, and in
> any case mke2fs only passes overrange requests, not oversize requests.
> Maybe it's an ext2fs underrun?
I moved the PARANOIA_END line down below the last if, if that's what you
mean. No difference that I could tell. ("rmmod nbd ; make ; make test"
still hangs.)
>
> Anyway, I need a bit more data AND I'll have a look at it tomorrow.
>
> You might try 2.4.26 instead of 2.4.26a (or vice versa). Any difference
> would give me a lead.
OK, I'll try out 26a in the morning too. Thanks a lot for your help.
-Chris