[ENBD] ENBD timeout question...
Peter T. Breuer
ptb@it.uc3m.es
Tue, 31 Oct 2000 22:16:44 +0100 (MET)
"A month of sundays ago Daniel Shane wrote:"
> Actually, I would like all the nbd-servers created by the master server
> (in regards to a specific client) to die after the client stops
> responding for X seconds (which could be the -t option or not...). The
Yes, that's reasonable. (It doesn't "really" solve anything, but it's
good enough for the moment - I mean that one can still flood the
machine by connecting randomly, starting a daemon on the server,
then disconnecting; one just has to do it faster than molasses ...)
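
A rough sketch of that kind of idle timeout, not taken from the actual
nbd-server code: wait on the client socket with select() and give up if
nothing arrives within the interval. The helper name wait_for_client and
its return convention are assumptions made for illustration only.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper (not from nbd-server.c): wait up to timeout_secs
 * for the client socket fd to become readable.
 * Returns 1 if data or EOF is pending, 0 if the client stayed silent
 * for the whole interval, -1 on a real error. */
static int wait_for_client(int fd, int timeout_secs)
{
    for (;;) {
        fd_set rfds;
        struct timeval tv;
        int n;

        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        tv.tv_sec = timeout_secs;
        tv.tv_usec = 0;

        n = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (n < 0 && errno == EINTR)
            continue;   /* interrupted by a signal: wait again from scratch */
        if (n < 0)
            return -1;
        return n > 0;
    }
}
```

A server loop could call this before each read and exit when it returns
0, so that an abandoned daemon cleans itself up without a cron job.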
> thing is there is no way for me to know if a nbd-server is inactive or
Well, to be honest it's the same sort of problem as the NFS server one. An
NFS server can't tell when the client's gone away either. That's what you
see with those stale file handle problems (the server reused the handle
because nobody asked for it for a long while). To fix that the solution is
known: a "statd" that, when the client machine reboots, broadcasts an
"I died" so that everybody still holding onto connections in the hope
of a revival can take note and throw them away.
> not. In any case, a timeout seems cleaner than a cron script that kills
> all the stale nbd-server processes, if there is a way to detect stale
> nbd-servers.
I need to know whether we're talking about session-slaves
(grandchildren) or session-masters (children of the chief honcho
server). The slaves will continuously time out and die (if they don't
die but retry instead, it's a bug I fixed recently). The session-master
will resuscitate them when they die.
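
To make the master/slave relationship concrete, here is a minimal sketch
of a supervising loop, assuming a single slave per master and a cap on
relaunches so that the example terminates. The names slave_fn,
launch_slave, supervise and timed_out_slave are hypothetical and do not
come from the ENBD sources.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef int (*slave_fn)(void);

/* Stand-in slave for illustration: it "times out" at once and dies. */
static int timed_out_slave(void)
{
    return 1;
}

/* Fork one slave that runs run() and exits with its return value.
 * Returns the child's pid in the parent, or -1 if fork() failed. */
static pid_t launch_slave(slave_fn run)
{
    pid_t pid = fork();
    if (pid == 0)
        _exit(run());
    return pid;
}

/* Hypothetical session-master loop: keep one slave alive, relaunching
 * it each time it dies, for at most max_launches launches.
 * Returns the number of slaves actually launched. */
static int supervise(slave_fn run, int max_launches)
{
    int launches = 0;

    while (launches < max_launches) {
        int status;
        pid_t pid = launch_slave(run);

        if (pid < 0)
            break;      /* fork failed: stop rather than spin */
        launches++;

        /* Block until this slave dies, retrying if a signal interrupts. */
        while (waitpid(pid, &status, 0) < 0 && errno == EINTR)
            ;
    }
    return launches;
}
```

With the stand-in slave, supervise(timed_out_slave, 3) launches three
short-lived children in a row. A real master would presumably loop
without a cap and stop only when told to.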
> I'm pretty sure the implementation of this is kind of trivial... what do
> you think?
It is. I'll try and get to it tomorrow. But I would like some
indication of where to look. I am deep in chasing SMP lockups on
very fast machines with strange APICs, and it is costing me time, so
I am angling for more data on just what is going on first. The data I
need is a pstree with indications of which processes seem to be connected
and which not (via netstat and fuser?).
> Well... HUP is supposed to restart the process? Although I don't know if
Possibly. I always thought of it as "reread config".
> it's implemented in nbd-server.c. You catch it but I don't see any code
> apart from a propagate(n) and a write, so the renegotiation basically
> doesn't happen I think... (not sure).
I also didn't see what happened. The propagate will send it to the
children too.
> > presumably for the session port. Can you be more specific about WHICH
> > server one should get rid of? Grandfather, father or child?
>
> Oh! I thought we had only two levels, master and slaves...
The grandfather has the control port. When new connects come in, they
talk to that port and negotiate a session port and a number of channels
to run over it. The father opens the session port and launches children
which talk over it on the channels, while the father looks after them,
restarting them if they die.
Peter