[ENBD] 2.4.30 problems with big file transfers (1 GB)

Peter T. Breuer enbd@lists.community.tummy.com
Fri, 21 Feb 2003 15:26:34 +0100 (MET)


"A month of sundays ago ckoepp wrote:"
> OK - I tried the Option -w 0 on 2.4.30 and then everything works fine. I'm
> a little bit confused now, since the enbd-server manpage says "0" is the

So am I. I'll make sure the default is set to zero.

> default value, so it seems not to be true? Further I do understand from the
> manpage that on using a journaling filesystem ( I use ext3 for the moment)
> the value should be greater than "0". So what is now recommendable to use? BTW

Well, it should be greater than zero to preserve write order. This
only matters in case of having to actually /make use of/ the
journalling capabilities. I.e. if your disk crashes. Then you need to
have written the (meta)data to journal before writing it to disk!
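
To make the ordering point concrete, here is a toy model in Python. It is
only a sketch under my own assumptions, not how ext3 or enbd actually
store anything: the block names ("inode", "bitmap"), the helper names and
the "both blocks must agree" consistency rule are all invented for
illustration.

```python
def recover(journal, disk):
    """Replay every journal record onto a copy of the disk image."""
    disk = dict(disk)
    for block, value in journal:
        disk[block] = value
    return disk


def consistent(disk):
    """Hypothetical invariant: both metadata blocks carry the same version."""
    return disk["inode"] == disk["bitmap"]


# One logical update touches two blocks.
UPDATE = [("inode", "v2"), ("bitmap", "v2")]

# The crash leaves the in-place update half done in both scenarios.
half_written = {"inode": "v2", "bitmap": "v1"}

# Journal reached the disk first: replaying it completes the update.
journal_first = recover(UPDATE, half_written)

# Writes were reordered and the journal record never arrived:
# there is nothing to replay, so the half-done update stays.
disk_first = recover([], half_written)
```

In this model consistent(journal_first) holds and consistent(disk_first)
does not, which is the whole reason write order matters after a crash.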

Actually .. if you keep the journal on a local filesystem instead of on
the remote device, it should be almost irrelevant. Almost. Umm, well,
less relevant. Because at least the writes to the journal will be
ordered.

The question is really why you are getting sufficient congestion to
cause the server to have to block on some channels for a while in order
to receive a late request in the right order. I presume that is what is
happening.
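
Roughly what I mean by blocking, as a Python sketch. This is not the
enbd-server code; the sequence numbers, the function name and the "held"
counter are assumptions made up for the illustration. Requests carry a
sequence number, and anything that arrives ahead of a missing predecessor
has to wait until that predecessor turns up on another channel.

```python
from typing import Iterable, List, Tuple


def deliver_in_order(arrivals: Iterable[int]) -> Tuple[List[int], int]:
    """Apply requests strictly in sequence order.

    Returns the order in which requests were applied and how many of
    them arrived early and had to be held back.
    """
    pending = set()
    next_seq = 0
    applied: List[int] = []
    held = 0
    for seq in arrivals:
        if seq != next_seq:
            held += 1  # arrived before its predecessor: must wait
        pending.add(seq)
        # Release everything that is now contiguous with what was applied.
        while next_seq in pending:
            pending.remove(next_seq)
            applied.append(next_seq)
            next_seq += 1
    return applied, held


# A single channel delivers in order, so nothing is ever held back.
single_channel = deliver_in_order([0, 1, 2, 3])

# Two congested channels can interleave so that some requests arrive early.
two_channels = deliver_in_order([1, 0, 3, 2])
```

Here single_channel holds back nothing, while two_channels holds back two
requests (1 and 3) yet still applies everything in the order 0, 1, 2, 3.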

I would need to see the relevant log entries and output from
/proc/nbdinfo to form an opinion. You should of course see zero
reordering (blocking) if you only use a single channel (-n 1)
but that is not optimal for speed either. If you use two channels
then this should hardly ever cause reordering. With four channels
there will be more blocking, but it can never be slower overall than
one.

Really I would like to see the logs.

I would guess that your problem is having the journal on the remote
device, on the same partition as the filesystem. Even on a local disk
this leads to "i/o storms".
I would imagine that such storms could fill all channels and lead
to some writes arriving out of order, causing those channels to block
until the right request arrives on another channel first.

Try placing the journal on another partition, preferably local.

> when I run diff (which delivers no errors) and later fsck, the file and
> filesystem, respectively, is not corrupted in both cases (with or without -w
> 0).

It wouldn't be, unless you got a crash! Anyway, you are probably seeing
the locally cached image, not the truth.

> One last question... I saw that you have had a discussion with another
> mailinglist member about "Unresolved Symbols" and I run into the same
> problems. I wasn't able to find a solution in your discussion, so any
> updates RE: Solutions on this problem?

I don't recall for the moment.  Can you remind me of the situation?
Obviously you have loaded the driver or nothing would be happening!

Peter