[ENBD] raid1 mirroring inside enbd

Peter T. Breuer enbd@lists.community.tummy.com
Fri, 3 Jan 2003 21:44:13 +0100 (MET)


"Peter T. Breuer wrote:"
> > >   enbd-client piano:1044 -n 2 +xilophone:2022 -n 2  /dev/ndb

To keep you informed of the state ... well, it's hard to know about the
stability. I daresay there ARE lots of bugs in the RAID1 support still.
The question is how critical are they. But if I say "there are lots of
bugs", nobody will try it and then nobody will find the bugs. I'm
trying it - I'm sitting atop an nbd raid partition, and it works.

But I don't always go pulling the network cables out just to see if it
will do the right thing when that happens.

For example, I was thinking of releasing it, when a few days ago I
found that on my SMP testbed, the kernel bread() just stops dead and
loads cpu to 100%.  bread() reads the device data, preparatory to doing
a network write in the RAID1 resync.  Some hours later the machine runs
out of memory.  The same /binary/ runs on my portable with no
problems.

The kernel list suggests that getblk() may spin forever because
find_or_create_page() never returns pages of a satisfactory size from
the VM, for some reason, possibly because set_blocksize() (a function I'd
never heard of till now) was not called at device setup.  But it's not
called on the portable either, and everything works fine there.  Adding
a set_blocksize() call as the block size is set does not cure the
condition either.  Maybe it has to be called on every subdevice too!

Today I gave up on the kernel bread() and wrote my own. It seems to
work fine in all circumstances. The only problem is that it doesn't read
from the VM cache - it always calls out to the net to do the read, even
if the answer is cached. Well, that's probably not a big loss in
practice.
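
To illustrate the trade-off, here is a rough user-space sketch, not the
enbd code. Every name in it (cache_slot, net_read_block,
cached_read_block, direct_read_block) is made up for the example, and the
"network" is simulated by a counter. It only shows the difference between
a read that consults a cache first and one that always goes out to the
net.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 1024
#define CACHE_SLOTS 4

/* Hypothetical one-block cache entry; a stand-in for the kernel's cache. */
struct cache_slot {
    int valid;
    unsigned long block;
    unsigned char data[BLOCK_SIZE];
};

struct cache_slot cache[CACHE_SLOTS];
int net_reads; /* counts round trips to the simulated remote server */

/* Simulated network read: fills buf with a pattern derived from the block. */
int net_read_block(unsigned long block, unsigned char *buf)
{
    net_reads++;
    memset(buf, (int)(block & 0xff), BLOCK_SIZE);
    return 0;
}

/* Cache-aware read: serve from the cache on a hit, else fetch and remember. */
int cached_read_block(unsigned long block, unsigned char *buf)
{
    struct cache_slot *slot = &cache[block % CACHE_SLOTS];

    assert(buf != NULL);
    if (slot->valid && slot->block == block) {
        memcpy(buf, slot->data, BLOCK_SIZE);
        return 0;
    }
    slot->valid = 0; /* slot contents are about to be overwritten */
    if (net_read_block(block, slot->data) != 0)
        return -1;
    slot->valid = 1;
    slot->block = block;
    memcpy(buf, slot->data, BLOCK_SIZE);
    return 0;
}

/* Cache-bypassing read: always asks the network, even for cached blocks. */
int direct_read_block(unsigned long block, unsigned char *buf)
{
    assert(buf != NULL);
    return net_read_block(block, buf);
}
```

Reading the same block twice costs one network round trip through
cached_read_block() and two through direct_read_block(). During a resync
each block is normally read only once, so the extra cost is probably
small.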

I worried about the race conditions for requests that are "rolled
back".  What happens if the device group disappears while they're
sitting in the bottle?  And particularly how do the fake requests that I
use to do the resync and remote ioctls mix in with real requests?  The
end_io functions, with luck, ought not to try and put fake requests back
on the kernel queues, and so on, but I haven't made a definitive study
of the possibilities.  I /think/ it's all OK, because I was careful to
write and embed distinct completion and end_io functions in requests and
buffer heads respectively.  And I zeroed all fields I didn't use in the
fake requests, which the kernel code canonically interprets as "don't
touch - I'm not yours".
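
The idea can be sketched in user space roughly as follows. This is an
illustration under assumptions, not the driver source: the struct request
and struct queue here are heavily reduced, hypothetical stand-ins for the
kernel structures, and real_end_io, fake_end_io and init_fake_request are
invented names. It shows a fake request being zeroed, carrying its own
completion function, and therefore never landing back on a queue.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct request;
typedef void (*end_io_fn)(struct request *req, int ok);

/* Hypothetical stand-in for a kernel request queue; just a counter here. */
struct queue {
    int requeued;
};

/* Hypothetical, much reduced request; not the kernel's struct request. */
struct request {
    struct queue *q;      /* owning queue; NULL marks a driver-made fake */
    unsigned long sector;
    int completed;
    end_io_fn end_io;     /* completion callback embedded in the request */
};

/* Completion for real requests: on failure, roll back onto the owning queue. */
void real_end_io(struct request *req, int ok)
{
    if (!ok && req->q != NULL) {
        req->q->requeued++;
        return;
    }
    req->completed = 1;
}

/* Completion for fake (resync or ioctl) requests: never touches a queue. */
void fake_end_io(struct request *req, int ok)
{
    (void)ok; /* the outcome would be reported to the driver, not requeued */
    req->completed = 1;
}

/* Build a fake request: zero everything, then set only the fields in use. */
void init_fake_request(struct request *req, unsigned long sector)
{
    assert(req != NULL);
    memset(req, 0, sizeof(*req));
    req->sector = sector;
    req->end_io = fake_end_io;
}
```

With this layout a failed real request is put back on its queue, while a
failed fake request just completes, because its queue pointer is zero and
its end_io function never looks for one. Whether every kernel path
honours zeroed fields in the same way is exactly the part that has not
been studied definitively.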

So, in summary, I am sure there are bugs, but I don't know if they're
bad or merely nuisances. When I find one, I eliminate it. Things
are working more or less here.

When things have been stable for some time, I'll pull the raid
code out and put it into a separate device.


Peter