[ENBD] Re: NBD 2.2.27
Peter T. Breuer
ptb@it.uc3m.es
Fri, 8 Sep 2000 15:14:46 +0200 (MET DST)
"A month of sundays ago Freaked Personality wrote:"
> OK, i'm not too familiar with how unix uses files exactly. Since it is
> said unix uses everything as a file it might be able to use a partition
> instead of an actual file??
Sure. If you're not sure of this, you should look in a basic text on
unix to pick up the idea! But yes, everything is a file. In particular
a partition supports the ordinary file operations (open, close, read,
write, seek).
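To make that concrete, here is a minimal sketch in Python of reading and
writing one block through the ordinary file calls. The function names and
the 1024-byte block size are my own assumptions for illustration; the same
calls should work whether the path names a regular file or a partition
such as /dev/hda1 (given permission to open it).

```python
import os

def read_block(path, block_no, block_size=1024):
    """Read one block from a regular file or a block device."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Seek to the start of the requested block, then read it.
        os.lseek(fd, block_no * block_size, os.SEEK_SET)
        return os.read(fd, block_size)
    finally:
        os.close(fd)

def write_block(path, block_no, data, block_size=1024):
    """Write data at the start of the given block; returns bytes written."""
    fd = os.open(path, os.O_WRONLY)
    try:
        os.lseek(fd, block_no * block_size, os.SEEK_SET)
        return os.write(fd, data)
    finally:
        os.close(fd)
```

Nothing in the code cares what kind of file the path refers to, which is
the whole point.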
> So on the webservers we have 2 identical disks, both partitioned the same,
> each has a partition that we call bigdisk. Now bigdisk needs to be
> synchronised with each other, iow if a customer uploads its page to
> server 1, we want to automatically duplicate it to the other server. We
> tried doing this using a program called intermezzo, which intercepts
> kernel write calls, however this is unstable, unreliable and doesn't take
> permissions into account which is pretty important. So the question would
> be can I make nbd copy those partitions instead of making a file on nbd as
Just point the nbd-server at /dev/hda1 (say) instead of /usr/foo.
> the docs say. (Like I said, i'm not too familiar with how linux uses its
> files exactly so maybe it's possible to hook a partition to the device
> instead of creating a file on it).
> nbd has crashed or for some reason the network communication is lost
> between the 2 servers. whatever happened the 2 servers can't reach
> each other for nbd usage.
I presume that what you will be doing is real-time RAID mirroring to the
nbd device .. which will pass the read/write stuff over the net via
nbd-client to nbd-server, which is accessing /dev/hda1 (say).
> During that time, customer1 uploads some new homepage files to server1 (1
> will be master) and deletes some old ones. Customer2 logs onto server2 and
> edits homepage files there.
Well, you already have a problem. NBD is a 1-1 connection. You can't
edit the resource on server2 while you are editing the image in
server1. That would be NFS, which NBD is not. What it does is reflect
the writes on server1 over the net to server2.
What you want to do is use server2 as a backup (failover) device.
When server1 goes down you want to fsck server2 and bring it up to
replace server1.
If you want to have both ends of a mirror device editable at the same
time ... well, I don't know of anything that does that except NFS!
Wait .. there is my "yoke" device, which makes two devices into a union
device. Writes are written to all of the devices, and reads are taken
from any one of them.
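A toy model of that behaviour, in Python, might look like the following.
This is only a sketch of the idea (write to all, read from one), not the
yoke driver itself; the class name, method names and the use of seekable
file-like objects as "devices" are assumptions made for illustration.

```python
import io

class Yoke:
    """Toy union device: writes go to every member, reads come from one."""

    def __init__(self, members):
        self.members = list(members)
        if not self.members:
            raise ValueError("need at least one member device")

    def write_at(self, offset, data):
        # Every member receives the same write at the same offset.
        for dev in self.members:
            dev.seek(offset)
            dev.write(data)

    def read_at(self, offset, length, member=0):
        # Any single member can serve the read, since all hold the same data.
        dev = self.members[member]
        dev.seek(offset)
        return dev.read(length)
```

For example, yoking two io.BytesIO(bytes(16)) objects and calling
write_at(4, b"abcd") leaves both with identical contents, and read_at
returns the same bytes whichever member is chosen.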
> OK so the communication between the 2 servers is restored now. What is
> gonna happen?
New writes to the nbd image on server1 will be passed over to server2
again. So you'll get a strange result. Sorry, but you have to
reintegrate the changes made in server2 back to server1 first! You
might as well copy the whole thing back while bringing up server1.
With luck, rsync will send over only changed blocks.
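The idea of sending only changed blocks can be sketched as below. Note
that this is not rsync's actual algorithm (which uses rolling checksums
to cope with shifted data); it is a simplified fixed-offset comparison,
and the function name and block size are assumptions for illustration.

```python
import hashlib

def changed_blocks(old_image, new_image, block_size=1024):
    """Return indices of fixed-size blocks whose MD5 digests differ."""
    longest = max(len(old_image), len(new_image))
    n_blocks = (longest + block_size - 1) // block_size
    changed = []
    for i in range(n_blocks):
        lo, hi = i * block_size, (i + 1) * block_size
        old_sum = hashlib.md5(old_image[lo:hi]).digest()
        new_sum = hashlib.md5(new_image[lo:hi]).digest()
        if old_sum != new_sum:
            changed.append(i)
    return changed
```

Only the blocks named in the returned list would need to cross the net.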
Does RAID1 or RAID5 do that for you already? Is there a background
reintegration mode in them? (btw, SuSE already has talked to me about
this .. as has the author of nrbd (nrdb?), which might already do
that). I'll be putting a timemap of the block structure on the server
side sometime next week, precisely with background reintegration in
mind. About 2-3 weeks to full implementation. But I'd like to know if
rsync works first, or if the raid stuff can make it do that on its own.
> customer1 has added some new files. I'm guessing this will not be a
> problem and nbd will mirror the new files to server2. However he has also
> deleted the new files. I guess this will never be solved by any program
Deletes show up as writes at the block level. They're just writes to
the directory list.
> and i would like to see that nbd deleted those files on server2 also but i
> think nbd will see that server2 has files 1 doesn't have and copies the
NBD does not operate at the file level.
> files from 2 to 1.
> Now for the customer 2. His files on server2 are newer, will they be
> copied to 1 or will nbd copy the old files from 1 and overwrite the newer
> (adjusted) ones on 2 because 1 is the master?
>
> Of course it's also possible that nbd does none of the above because there
> was no connection, and doesn't check the differences on the drives, which
> is not a problem but causes us to write a script to detect these kind of
man rsync.
> splits, and stop nbd from restarting when the split is over, then it has
> to sync the disks and then start nbd again. Right? :-)
Well, that would be the situation now, yes! I'll add a timemap and
background reintegration. What has to happen is that the nbd-server
and -client exchange maps at connect and the client requests those
blocks that the server has that are newer than its own (actually, it
might as well get all that have a different md5sum, but an md5sum
occupies 128 bits, and a timeval only 32).
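The comparison step could be sketched like this in Python. It is only an
illustration of the plan, not existing nbd code: I am assuming each map
is a dictionary from block number to last-write time in integer seconds,
and that a block missing from the client's map counts as never written.

```python
def blocks_to_fetch(client_map, server_map):
    """Return block numbers the server holds in a newer version.

    Both maps are assumed to be {block_number: last_write_seconds}.
    """
    wanted = []
    for block, server_time in server_map.items():
        # A block the client has never seen is treated as time 0.
        if server_time > client_map.get(block, 0):
            wanted.append(block)
    return sorted(wanted)
```

With client_map = {0: 10, 1: 20, 2: 5} and server_map = {0: 10, 1: 25,
2: 3, 3: 7}, the client would request blocks 1 and 3: block 1 is newer
on the server and block 3 is unknown to the client, while block 2 is
newer on the client and is left alone.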
> I've also sent this to the mailing list but nobody has replied yet, and
> I'm guessing you know best about nbd's capabilities :-).
You have sent it? Well, in that case maybe I have somehow not
subscribed! I thought I was moderator!
I'll cc: the list.
Peter