[ENBD] 2.4.32 conclusions
Peter T. Breuer
ptb at it.uc3m.es
Thu Mar 11 11:37:44 MST 2004
"Also sprach Anders Blomdell:"
> > via a second machine, but one can't. The only way out is not to use
> > buffers, i.e. open the resource O_DIRECT or use a raw device as the
> > resource.
> I did get lockups with O_DIRECT (if that is what -n on the server does)
It does. That is very hard to believe. But assuming it is so, can you
go into the enbd-client.c code and comment out the groups of lines that
call lock.down_... and lock.up_...? There should be two lock.downs and
their corresponding lock.ups. You should end up with this:
+ /*
    if (lock.down_write_timeout(&lock, 1000 * self->data_timeout) < 0) {
        err = -ETIME;
        PERR("failed to get write lock on %Ld-%Ld, timeout\n", req->from,
             req->from + req->len);
        goto create_reply;
    } else {
        undo_lock = 1;
    }
+ */
    ...
  create_reply:
+ /*
    if (undo_lock) {
        lock.up_write(&lock);
    }
+ */
    ...
    init_rwlock(&lock, self->dev, req->from, req->len);
+ /*
    if (lock.down_read_timeout(&lock, 1000 * self->data_timeout) < 0) {
        err = -ETIME;
        PERR("failed to get read lock on %Ld-%Ld, timeout\n", req->from,
             req->from + req->len);
        goto create_reply;
    } else {
        undo_lock = 1;
    }
+ */
    DEBUG ("server_read reads for request len %d\n", req->len);
    err = server->read (server, buf, req->len, req->from);
  create_reply:
+ /*
    if (undo_lock) {
        lock.up_read(&lock);
    }
+ */
The reason for my saying so is that on my test platform under 2.6.3 I am
seeing evidence that fcntl locks fail at the 2GB barrier. If the
clients cannot get the lock in order to do their work, they will not do
their work.
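
If you want to check the 2GB suspicion on your own kernel without
touching enbd at all, something like the following might do. It is only
a sketch and not the enbd locking code: range_lock() and
probe_2gb_locks() are names I have made up for illustration, and I am
assuming plain POSIX fcntl(F_SETLK) byte-range locks with a 64-bit off_t
(hence the _FILE_OFFSET_BITS define). It tries a write lock on one 4KB
range just below the 2GB mark and one just above it.

```c
/* Sketch only: probe whether fcntl byte-range locks work past 2GB. */
#ifndef _FILE_OFFSET_BITS
#define _FILE_OFFSET_BITS 64    /* assumption: ask for a 64-bit off_t */
#endif

#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: non-blocking lock of the given type (F_RDLCK,
 * F_WRLCK or F_UNLCK) on the byte range [start, start + len).
 * Returns 0 on success, -1 with errno set on failure. */
static int range_lock(int fd, short type, off_t start, off_t len)
{
    struct flock fl = {0};

    assert(start >= 0 && len > 0);
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = start;
    fl.l_len = len;
    return fcntl(fd, F_SETLK, &fl);
}

/* Hypothetical probe: take and release a write lock on one block below
 * and one block above the 2GB mark. The file need not be that large,
 * since locks may cover ranges beyond the current end of file.
 * Returns 0 if both locks were granted, -1 otherwise. */
static int probe_2gb_locks(const char *path)
{
    const off_t two_gb = (off_t)1 << 31;
    const off_t blk = 4096;
    off_t starts[2];
    int fd, i, rc = 0;

    starts[0] = two_gb - 2 * blk;   /* entirely below 2GB */
    starts[1] = two_gb + blk;       /* entirely above 2GB */

    fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    for (i = 0; i < 2; i++) {
        if (range_lock(fd, F_WRLCK, starts[i], blk) < 0) {
            perror("fcntl(F_SETLK)");
            rc = -1;
        } else {
            range_lock(fd, F_UNLCK, starts[i], blk);
        }
    }
    close(fd);
    return rc;
}
```

Call probe_2gb_locks() on a scratch file on the same filesystem as the
resource. If the lower range locks and the upper one does not, that
would be consistent with what I am seeing here.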
> > Maybe you don't understand "raw device"? That's one of the /dev/rawX.
> Didn't try it.
Arne at least says that that works. It's cumbersome, though.
> > You bind those devices to existing block devices using the raw device
> > utilities,
> > then access the raw devices instead of the block device.
> >
> > It is normal to have two servers each writing a mirror to the other server.
> > This does not hurt when both services are under light write pressure, but under
> > heavy pressure one must use O_DIRECT or raw resources in order to avoid
> > VMS cross deadlock. If VMS is not acting, it cannot deadlock.
> Never mind, load seems to be lower when not cross-writing (i.e. master
> writes to 2 mirrors). 6 parallel mkfs.ext on mirrors (i.e. 12 /dev/nbX)
> gives 50% load on master and 10% on mirrors, as opposed to 100% load on
> all 3 machines when cross-writing.
What does top say is the process doing the work, and what does strace
say it is doing? :-).
Peter