[ENBD] 2.4.32
Peter T. Breuer
ptb at it.uc3m.es
Fri Mar 5 13:07:03 MST 2004
"Also sprach Anders Blomdell:"
> >> I'll give explicit blocksize, remove '-e' and give '-p 60'
> >
> > Try that. I will also stop -e having an effect unless ALL daemons are
> > dead.
>
> Now two of the servers are mostly hung (can't logon, but answers to ping
They aren't hung. Whatever happens inside enbd does not affect the rest
of the kernel. When requests are blocked then i/o _to that device_
ceases, but that is all. If you find that you can't log on, then your
home directory is on the remote devices! And you are seeing precisely
what you ought to see.
You can clear the blocked requests by logging on as root (whose home
will be local) and echoing 0 to /proc/nbdinfo. Or you can set up a
special login whose shell does exactly that!
But they will time out in 60s anyway if you set -p 60.
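
The echo to /proc/nbdinfo can also be scripted, for instance as the
program behind such a special login. What follows is only a minimal
sketch in Python, assuming the proc file accepts a plain "0" followed by
a newline, as echo would send it. The function name and the proc_path
parameter are made up for illustration and are not part of enbd.

```python
def clear_blocked_requests(proc_path="/proc/nbdinfo"):
    """Write 0 to the given file, mimicking `echo 0 > /proc/nbdinfo`.

    proc_path is a hypothetical parameter, added so the function can be
    pointed at an ordinary file when trying it out.
    """
    with open(proc_path, "w") as proc_file:
        proc_file.write("0\n")
```

Called with no argument, and as root, it writes to /proc/nbdinfo itself.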
> and echoes linefeeds at console). This is what the remaining server says:
>
> enbd-client 2071: client (-1) manager launched daemon 2 (2171) for server-01:30051
> enbd-client 2171: client (2) opens device /dev/ndc3
OK. Is this ndc? It looks like it.
> enbd-client 2071: client (-1) childminder launched pid 2171 (2)
> enbd-client 2170: client (1) opened socket 5 to server-01:30051
> enbd-client 2171: client (2) opened device /dev/ndc3 ok
> enbd-client 2071: client (-1) manager launched daemon 3 (2172) for server-01:30051
All fine, I suppose.
> enbd-client 2172: client (3) opens device /dev/ndc4
> enbd-client 2071: client (-1) childminder launched pid 2172 (3)
> enbd-client 2172: client (3) opened device /dev/ndc4 ok
OK.
> enbd-client 2167: client (3) opened socket 5 to server-02:30021
> enbd-client 2171: client (2) opened socket 5 to server-01:30051
> enbd-client 2172: client (3) opened socket 5 to server-01:30051
> enbd-client 2165: client (2) opened socket 5 to server-02:30021
> enbd-client 2067: sighandler relaunches child from manager
> enbd-client 2067: client (-1) reaped dead child 2157
Well, here we have a different client altogether! This is none of the
ones mentioned above.
> enbd-client 2067: client (-1) manager launched daemon 1 (2174) for server-01:30021
> enbd-client 2067: client (-1) childminder launched pid 2174 (1)
> enbd-client 2174: client (1) opens device /dev/nda2
Aha. It's for nda, not ndc.
> enbd-client 2174: client (1) opened device /dev/nda2 ok
> enbd-client 2174: client (1) opened socket 5 to server-01:30021
> enbd-client 2067: sighandler relaunches child from manager
> enbd-client 2067: client (-1) reaped dead child 2158
> enbd-client 2067: client (-1) manager launched daemon 2 (2175) for server-01:30021
> enbd-client 2067: client (-1) childminder launched pid 2175 (2)
> enbd-client 2175: client (2) opens device /dev/nda3
> enbd-client 2175: client (2) opened device /dev/nda3 ok
> enbd-client 2175: client (2) opened socket 5 to server-01:30021
> enbd-client 2073: <# 298> managersighandler received signal 17
All looks OK, but I really can't tell which is which! 17 is SIGCHLD
so a child died.
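
Signal numbers differ between platforms, so the reading of 17 as SIGCHLD
holds for Linux on x86 but not everywhere. A small sketch in Python for
looking a number up on the machine at hand; the helper name signal_name
is invented for this example:

```python
import signal

def signal_name(number):
    """Return this platform's symbolic name for a signal number."""
    try:
        return signal.Signals(number).name
    except ValueError:
        # The number does not correspond to any signal known here.
        return "unknown"

if __name__ == "__main__":
    print(signal_name(17))
```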
> enbd-client 2073: sighandler relaunches child from manager
> enbd-client 2073: client (-1) reaped dead child 2160
And surely we have not seen that one before? How many of these things
are there?
> enbd-client 2073: client (-1) reaped dead child 2159
> enbd-client 2073: client (-1) manager launched daemon 1 (2176) for
> server-02:30051
> enbd-client 2073: client (-1) childminder launched pid 2176 (1)
> enbd-client 2073: client (-1) manager launched daemon 3 (2177) for
> server-02:30051
> enbd-client 2073: client (-1) childminder launched pid 2177 (3)
> enbd-client 2176: client (1) opens device /dev/ndd2
> enbd-client 2177: client (3) opens device /dev/ndd4
> enbd-client 2176: client (1) opened device /dev/ndd2 ok
> enbd-client 2177: client (3) opened device /dev/ndd4 ok
> enbd-client 2176: client (1) opened socket 5 to server-02:30051
> enbd-client 2177: client (3) opened socket 5 to server-02:30051
This all looks fine.
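
Telling the managers apart by eye is tedious. One way to sort a log like
the one above is to group the lines by manager pid. The following is a
rough sketch in Python, assuming the three message formats seen in the
quoted log, with the quote prefix stripped and wrapped lines rejoined.
The function name summarize and the layout of its result are invented
here.

```python
import re
from collections import defaultdict

# Patterns for three message types that appear in the quoted log.
LAUNCH = re.compile(
    r"enbd-client (\d+): client \(-1\) manager launched daemon (\d+) "
    r"\((\d+)\) for (\S+)"
)
OPENED = re.compile(r"enbd-client (\d+): client \(\d+\) opened device (\S+) ok")
REAPED = re.compile(r"enbd-client (\d+): client \(-1\) reaped dead child (\d+)")

def summarize(lines):
    """Group children, devices and reaped pids under each manager pid."""
    managers = defaultdict(
        lambda: {"server": None, "children": {}, "reaped": []}
    )
    devices = {}  # child pid -> device it reported opening
    for line in lines:
        if m := LAUNCH.search(line):
            manager, _slot, child, server = m.groups()
            managers[manager]["server"] = server
            managers[manager]["children"][child] = None
        elif m := OPENED.search(line):
            devices[m.group(1)] = m.group(2)
        elif m := REAPED.search(line):
            managers[m.group(1)]["reaped"].append(m.group(2))
    # Attach each child's device once every line has been read.
    for info in managers.values():
        for child in info["children"]:
            info["children"][child] = devices.get(child)
    return dict(managers)

# A few lines copied from the log above.
SAMPLE = [
    "enbd-client 2071: client (-1) manager launched daemon 2 (2171) for server-01:30051",
    "enbd-client 2171: client (2) opened device /dev/ndc3 ok",
    "enbd-client 2067: client (-1) reaped dead child 2157",
    "enbd-client 2067: client (-1) manager launched daemon 1 (2174) for server-01:30021",
    "enbd-client 2174: client (1) opened device /dev/nda2 ok",
]

if __name__ == "__main__":
    for manager, info in sorted(summarize(SAMPLE).items()):
        print(manager, info["server"], info["children"], info["reaped"])
```

A child that appears under "reaped" but not under "children" was launched
before the log excerpt begins.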
> Would tcpdumps be of any help?
No - the above all looks fine, apart from the fact that some daemons
keep dying, which strongly suggests that the other end is not
responding. What did you do exactly? What does the other end say?
And I am also puzzled about this whole setup. Would you mind
describing it? I have the feeling that there is a loop somewhere
in here.
Peter