[ENBD] Two ENBD issues
Peter T. Breuer
ptb@it.uc3m.es
Wed, 27 Sep 2000 23:35:24 +0200 (MET DST)
"A month of sundays ago Daniel Shane wrote:"
> Hi,
Hi .. good to hear from you. I've been completely whacked by start of
term, the department moving into a new building, etc.
> Sorry for the delay, I will come back to the mmap problem later, but I
> have two issues right now.
>
> I don't know if you have an idea but, here is what happens.
>
> When a client shuts down, nbd-server will sometimes use all the available
> CPU until another client reconnects (and sometimes not). Here is what a
> strace -p xxx shows :
> ]
> ....
> select(9, [8], NULL, [8], {360, 0}) = 1 (in [8], left {360, 0})
> read(8, "", 76) = 0
> select(9, [8], NULL, [8], {360, 0}) = 1 (in [8], left {360, 0})
> read(8, "", 76) = 0
Yes, well this is a wait (select) on a single socket for read.
It's also waiting for error. It's a 6 minute timeout, so what is
happening is obviously that it is getting an error return on the socket
immediately. It's followed by a zero read, so that seems likely.
I believe this must be the low level timeout on ALL network reads that I
introduced in order to combat the lack of a network timeout in Linux.
It doesn't help if the read starts and then blocks, but it helps some.
    if (select(self->sock+1,rfds,NULL,efds,timeout) <= 0) {
        PERR("Read timed out in readnet: %m\n");
        return -ETIME;
    }
    res = read(self->sock, buf, len);
You should see the error message in the log. This is probably what is
causing high CPU activity. Logging burns CPU.
(The timeout was 3*negotiate_timeout (3*120 = 360), but I reduced it in
2.4.14, or near.)
> 100% CPU usage. I had the same kind of problem once with broken pipes,
> maybe there was a signal that wasn't caught indicating that the client has
> broken the link. But the read returns 0...
I believe that this is just fast retry, error, log message, retry,
error, log message, ...
What should the cure be? The intention is to stop reads hanging for
ever. But I've inadvertently stopped the error from being reported. This
is really a call to readnet and it happens everywhere. I think part
of the problem is that I'm not reporting an error from select.
The select succeeds, but only because the socket is ready to report an
error! So I go on and let the socket report the read error. Hey!
But the read succeeds! It returns 0, not negative.
OK. This looks like a kernel or library bug, then? The select succeeds
but the read after it returns zero! The select should only succeed
when there are bytes to be read or an error condition. I'm kinda
nonplussed by that. Gah ...
Maybe I should check the descriptor sets. But we're already in country
I don't think we should be in!
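
For what it's worth, the sequence in the strace can be reproduced in
isolation. The following is only a minimal sketch, not the ENBD code: it
assumes a local AF_UNIX socketpair stands in for the client connection, and
the helper names (readable_within_1s, read_after_peer_close) are made up for
the illustration. Once the peer end is closed, select() marks the descriptor
readable and the following read() returns 0, which is the usual end-of-file
indication on a stream socket rather than an error return.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: 1 if select() reports fd readable within 1s, else 0. */
static int readable_within_1s(int fd)
{
    fd_set rfds;
    struct timeval tv;

    assert(fd >= 0 && fd < FD_SETSIZE);   /* FD_SET precondition */
    tv.tv_sec = 1;
    tv.tv_usec = 0;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    return select(fd + 1, &rfds, NULL, NULL, &tv) > 0 && FD_ISSET(fd, &rfds);
}

/*
 * Hypothetical demo: close one end of a stream socketpair, then report
 * what read() returns on the other end.
 * Returns -3 if the socketpair cannot be created and -2 if select()
 * did not report the descriptor readable.
 */
static long read_after_peer_close(void)
{
    int sv[2];
    char buf[76];
    long res;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -3;
    close(sv[1]);                          /* the "client" goes away */
    if (!readable_within_1s(sv[0])) {
        close(sv[0]);
        return -2;
    }
    res = (long)read(sv[0], buf, sizeof buf);
    close(sv[0]);
    return res;
}
```

If that holds for the TCP socket too, the zero read is the only signal that
the client has gone, so it has to be checked for explicitly.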
I believe the best cure is to change the following error check on the
read result. At the moment a NEGATIVE read result provokes a check to
see if the read asked to be rerun (-EAGAIN) and 10 retries at 1s
intervals before returning timeout (-ETIME). A timeout from the select
causes -ETIME to be returned directly. A different read error is
also reported back (as -ETIME). But zero bytes read just causes a
repeat attempt!
        res = read(self->sock, buf, len);
    }
    if (res < 0) {
        if (res != -EAGAIN) {
            PERR("Read failed in readnet: %m\n");
            ...
That should perhaps consider res == 0 an error too. What should one do
if the read returns zero? Perhaps return error? Perhaps try a few times
then return error? I'll vote for returning error immediately. So the
code then becomes:
    if (res <= 0) {
        switch (res) {
          default:
            PERR("Read failed in readnet: %m\n");
            return res;
            break;
          case 0:
            return -EIO;
            break;
          case -EAGAIN:
            // PTB try 10 times at 1s intervals then fail
            if (errs++ < 10) {
                microsleep(1000000);
                goto read;
            }
            // PTB give up, fail it and reset count
            errs = 0;
            PERR("Read eventually timed out in readnet: %m\n");
            return -ETIME;
            break;
        }
    }
I can see no place where the return value from readnet is checked for
anything other than < 0. So the change introduced here is just that
a zero read is an error. Can this happen with a slow net? If this
causes problems, then the 0 read case should be treated instead like
the -EAGAIN case.
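
As a self-contained illustration of that policy (a sketch under stated
assumptions, not the actual readnet): the function name readnet_sketch and
the timeout_ms parameter are invented here, logging is left out, and the
error is taken from errno because a plain read(2) returns -1 and sets errno
instead of returning a negative error code. A zero read is reported as -EIO
at once, a select timeout as -ETIME, and EAGAIN is retried 10 times at 1s
intervals.

```c
#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical stand-in for readnet: bytes read (> 0) or a negative errno. */
static long readnet_sketch(int sock, void *buf, size_t len, long timeout_ms)
{
    int errs = 0;

    assert(sock >= 0 && sock < FD_SETSIZE);   /* FD_SET precondition */
    for (;;) {
        fd_set rfds, efds;
        struct timeval tv;
        ssize_t res;

        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        FD_ZERO(&rfds);
        FD_SET(sock, &rfds);
        FD_ZERO(&efds);
        FD_SET(sock, &efds);

        /* Timeout or select failure: report a timeout directly. */
        if (select(sock + 1, &rfds, NULL, &efds, &tv) <= 0)
            return -ETIME;

        res = read(sock, buf, len);
        if (res > 0)
            return (long)res;
        if (res == 0)
            return -EIO;              /* zero read: treat as an error */
        if (errno != EAGAIN)
            return -errno;            /* any other read error */
        if (errs++ >= 10)
            return -ETIME;            /* gave up after 10 retries */
        sleep(1);                     /* retry at 1s intervals */
    }
}
```

Callers that only test for a result < 0 need no change with this shape,
since the zero case never reaches them.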
> Another strange thing is the "rollback" message I get when I am in an
> interactive shell. I will wait for some time and then execute a command
This is interesting. Rollback is never signalled across the net. It is
a clientside action only. It happens when the client decides to give
up. I can see no explicit message from the client either! It's a kernel
message. It must go to a root console, wherever that is.
Rollback occurs (1) when the client unplugs itself from the kernel and
(2) when it thinks something is wrong with the request in the kernel.
In (1) it will have received a network error message or will have lost
the server ack to its heartbeat pulse. In that case it
first asks the kernel to retract the request it had allocated itself
to handle (it will be retracted after 5s untreated anyway) and then
shuts itself down or goes to renegotiate. In (2) it will have received
error returns from its ioctls to the kernel or will have a
corrupt ack from the server on its hands. It's possible that in these
cases it should shut itself down too, but it doesn't.
> which causes a rollback to occur, but the rollback freezes the nbd link.
This will happen if there is only one channel operating. There should
be at least two.
> Like if the client could not resync with the server. Does the rollback
> relaunch an nbd-server on the other end? If I never enter interactive
> mode, the rollback never happens, so it must be a timeout issue (it
> happens near the 30s message in nbd-server, but it is sporadic,
> sometimes everything will be fine, sometimes after the 30s I will get a
> rollback on the next block request).
Indeed, this is a timing issue and doesn't sound serious. It's a timeout
during negotiation, it sounds like. If you can show me a trace I can be
more helpful (this is hard, I know).
> I am using one nbd-server on one port to process several clients in ro
That's correct. That's partially why I had to rewrite the server.
Precisely to support that better.
> mode. Since the master nbd-server forks at each request and uses a new
> port, I don't think that this should cause any problems. Should I have
> one server per client?
No, but there should be multiple channels per server/client. I don't
think you should observe a freeze at rollback then.
> Another strange fact, if I boot several clients at the same time, the
> nbd-server won't serve more than 4 at any one time, so we have to boot
Mmmm .. I don't know any reason why this should be so! It forks a
copy to handle each connection.
> machines by pairs of 3. More and one machine will not sync when the
> nbd-client is executed. Again, maybe that one server per client would
> work, albeit use more memory.
Well, it would, but I placed no limit on the forks, I can assure you!
> I don't know if you have any ideas on these issues, but the most
> important one is the CPU usage when one client breaks off (let's say we
> power off). Sometimes, the nbd-server will still be at 100% even when we
I think that was a logging issue. See if you agree with my analysis. I
don't think the select should return if the next read will return 0
bytes.
> reboot a station. At other times, a reboot of a station will fix the
> problem.
Peter