[ENBD] Anyone got swapping over NBD-2.4.25 working?

Peter T. Breuer enbd@lists.community.tummy.com
Fri, 4 Jan 2002 19:32:02 +0100 (MET)


"A month of sundays ago Alan Messer wrote:"
> Jason A. Pattie wrote:
> 
> > I have had similar experiences, except it didn't last as long as you 
> > were able to get it to work.  The explanation I got from PTB was the 
> > fact that each child thread (being in userspace) requires a specific 
> > amount of memory in order to run.  Well, if a child thread is "swapped 
> > out", but it needs to be resurrected to do some work, but there isn't 
> > enough memory to pull it back and if all the other child processes have 
> > been "swapped out", then ... well, you see the quandary.  (It's something 
> > like that)
> 
> 
> Indeed. I suspected this might be the problem. Did you try any of 
> the old patches to overcome this problem? Aren't there any options 
> in Linux to 'pin down' (make unswappable) certain memory regions? 
> Perhaps this would help for critical sections? Of course, code 
> can be swapped too, which might be more of a problem.

The 2.4.27pre out there on the ftp site indeed pins the _server_ process
in memory.  You may want the same on the client side.  I suspect that
that will merely shift the problem elsewhere.  It seems to me that
memory management must be fundamentally involved in any scheme to swap
over networking, and that doing it over tcp is just plain impractical.
Perhaps over udp.

Now, you can run enbd over ANY medium. All that needs to be changed
is the single module "stream.c", which provides the methods init,
open, read, write that I expect to need on a streaming medium. If
somebody would like to provide a udp option (it allows for tcp and ssl
over tcp at present), I would be grateful. It should even be simple.
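
A rough sketch of what a udp backend behind those four methods might
look like follows. Everything in it is an assumption made for
illustration: the layout of "struct stream", the method signatures and
the udp_* names are hypothetical and are not taken from the real
stream.c. It also ignores datagram loss, duplication and reordering,
which a real udp transport would have to handle itself.

```c
/* Hypothetical sketch only: the real stream.c interface may differ. */
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Assumed method table: init, open, read, write, as named in the text. */
struct stream {
    int fd;                     /* socket descriptor, -1 while closed */
    struct sockaddr_in peer;    /* remote endpoint */
    int     (*init) (struct stream *s, const char *host, unsigned short port);
    int     (*open) (struct stream *s);
    ssize_t (*read) (struct stream *s, void *buf, size_t len);
    ssize_t (*write)(struct stream *s, const void *buf, size_t len);
};

/* Record the peer address; no socket is created yet. */
static int udp_init(struct stream *s, const char *host, unsigned short port)
{
    s->fd = -1;
    memset(&s->peer, 0, sizeof s->peer);
    s->peer.sin_family = AF_INET;
    s->peer.sin_port = htons(port);
    /* inet_pton() returns 1 only for a valid dotted-quad address */
    return inet_pton(AF_INET, host, &s->peer.sin_addr) == 1 ? 0 : -1;
}

/* Create the datagram socket and fix its default peer. */
static int udp_open(struct stream *s)
{
    s->fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (s->fd < 0)
        return -1;
    /* connect() on a datagram socket sends nothing on the wire */
    if (connect(s->fd, (struct sockaddr *)&s->peer, sizeof s->peer) < 0) {
        close(s->fd);
        s->fd = -1;
        return -1;
    }
    return 0;
}

/* Receive one datagram from the peer. */
static ssize_t udp_read(struct stream *s, void *buf, size_t len)
{
    return recv(s->fd, buf, len, 0);
}

/* Send one datagram to the peer. */
static ssize_t udp_write(struct stream *s, const void *buf, size_t len)
{
    return send(s->fd, buf, len, 0);
}

/* Plug the udp methods into a stream object. */
static void stream_use_udp(struct stream *s)
{
    s->init  = udp_init;
    s->open  = udp_open;
    s->read  = udp_read;
    s->write = udp_write;
}
```

A caller would do stream_use_udp(&s), then s.init(&s, "192.0.2.1", port)
and s.open(&s), and from there use s.read and s.write exactly as it
would with the tcp methods. The address and port here are placeholders.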

I suspect that udp would not cause any memory deadlocks.

> > You can increase the number of child processes that get spawned, but you 
> > will just be staving off the inevitable.
> 
> Interesting. If I understand correctly, the kernel is quite happy, 
> but the swap action gets stuck. Therefore, other swaps and 
> processes can continue to run? I'll try a few more processes to 

Not necessarily. If you have experience of nfs, for example, you will
know that all sorts of interactions take place and other processes begin
to stick in io too, for varieties of reasons.  You might get sendmail
stuck trying to read a possible .forward off the downed mount, for
example, even though sendmail is running on a different partition.

> see if this is true.

The 2.4.27 server code contains:

           err = negotiate(self);
           if (err >= 0)  {
                 // PTB success
                 self->flags |= F_NEGOTIATED;
                 mlockall(MCL_CURRENT|MCL_FUTURE);  // <-***** THIS
                 goto mainloop;
           }

If you want to do the same clientside, you'll need to put the call
somewhere in nbd-client.c. Let's see ... have to look for an occurrence
of fork() .. looks like the "launch" routine is appropriate. Well, I
suppose you could put it just before the place where it actually starts
its work cycle:

     mainloop:

      // PTB now need to be able to exit from select
      setsighandlers(slavesighandler);
      // <--- HERE! (the mlockall() call goes on this line)
      // PTB do protocol
      err = mainloop(self);
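
For anyone trying this, a minimal sketch of a wrapper that could be
called at the marked spot is given below. The helper name
pin_process_memory is hypothetical and does not exist in nbd-client.c;
only mlockall() and its two flags are real. Unlike the server snippet
above, it checks the return value, since mlockall() can fail when the
process lacks the privilege or the locked-memory limit is too low.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper, not part of nbd-client.c.
 * Locks all current and future pages of the calling process into RAM
 * so that they cannot be swapped out.
 * Returns 0 on success, -1 on failure (errno is set by mlockall).
 */
static int pin_process_memory(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) < 0) {
        fprintf(stderr, "mlockall failed: %s\n", strerror(errno));
        return -1;
    }
    return 0;
}
```

Memory locks are not inherited by a child created with fork(), so the
call has to be made in each child after it has been launched, which is
why a spot after the fork and just before the work cycle fits.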


Try it and let me know ...

Peter