[ENBD] 2.4.32 more weirdness

Peter T. Breuer ptb at it.uc3m.es
Thu Mar 11 15:36:18 MST 2004


"Also sprach Anders Blomdell:"
> > You compile enbd userland against 2.4 code (i.e.  gcc headers,
> > LINUXDIR=/usr).  You apply the enbd patch for 2.6.3 and compile the
> > kernel module from within the kernel source hierarchy.
> That's what I have done...

OK.

> > I think I put some kind of message to that effect in an echo in the 
> > Makefile.
> Missed that.

 module:
         test $(KERNELMAJOR) = 2.6 && { echo please apply kernel patch instead; exit 0; }; \
         ...


> Tried raw devices, no hangs so far, but consistent 20% load during raid resync
> (to be expected, there are 6 resyncs active instead of 2), but this is 
> weird:

I'd still like to know what is taking the cpu. Is it really that
userland select() is a busy wait in 2.6? Can anyone confirm?
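
One rough way to look at the select() question from userland is to
compare cpu time against wall-clock time while a process waits in
select() on a descriptor that never becomes readable. The sketch below
is only that: it measures the Python process itself, not the enbd
daemons, and the function name select_cpu_fraction is made up for the
illustration. A fraction near 0 means select() is sleeping; a fraction
near 1 would mean it is spinning.

```python
import select
import socket
import time


def select_cpu_fraction(timeout=0.2, rounds=3):
    """Return cpu time / wall time spent waiting in select() on an idle socket."""
    # Nothing is ever written to the pair, so select() has to wait out the timeout.
    a, b = socket.socketpair()
    try:
        wall_start = time.monotonic()
        cpu_start = time.process_time()
        for _ in range(rounds):
            select.select([a], [], [], timeout)
        cpu = time.process_time() - cpu_start
        wall = time.monotonic() - wall_start
    finally:
        a.close()
        b.close()
    return cpu / wall


if __name__ == "__main__":
    print("fraction of wall time spent on cpu: %.3f" % select_cpu_fraction())
```

Running the same comparison under 2.4 and 2.6 on the same machine would
at least show whether the wait itself costs cpu.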

> 
> [root@newsperry-03 root]# date ; grep now /proc/nbdinfo
> tor mar 11 21:38:37 CET 2004
> [a] B/s now:    0       (0R+0W)
> [b] B/s now:    0       (0R+0W)
> [c] B/s now:    0       (0R+0W)
> [d] B/s now:    0       (0R+0W)

> [root@newsperry-03 root]# date ; grep now /proc/nbdinfo
> tor mar 11 21:38:54 CET 2004
> [a] B/s now:    1.86M   (0R+1.86MW)
> [b] B/s now:    1.86M   (0R+1.86MW)
> [c] B/s now:    3.22M   (0R+3.22MW)
> [d] B/s now:    3.22M   (0R+3.22MW)
> [root@newsperry-03 root]# date ; grep now /proc/nbdinfo
> tor mar 11 21:39:03 CET 2004
> [a] B/s now:    11.0K   (0R+11.0KW)
> [b] B/s now:    11.0K   (0R+11.0KW)
> [c] B/s now:    95.0K   (0R+95.0KW)
> [d] B/s now:    95.0K   (0R+95.0KW)
> [root@newsperry-03 root]# date ; grep now /proc/nbdinfo
> tor mar 11 21:39:10 CET 2004
> [a] B/s now:    0       (0R+0W)
> [b] B/s now:    0       (0R+0W)
> [c] B/s now:    1.00K   (0R+1.00KW)
> [d] B/s now:    1.00K   (0R+1.00KW)
> 
> Resyncs seem to oscillate (as do true writes), interesting behaviour
> really, but not what I want on my servers...
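
To follow the rates over a longer period than a few manual greps, the
"B/s now" lines can be turned into numbers. This is a minimal sketch
that assumes the line layout shown in the output above and assumes that
the K and M suffixes are binary multiples (1024 based), which the
output itself does not confirm. The function names are made up for the
illustration.

```python
import re

# Matches lines such as "[a] B/s now:    1.86M   (0R+1.86MW)".
_NOW_LINE = re.compile(r"^\[(\w+)\]\s+B/s now:\s+(\S+)")
# Assumption: suffixes are powers of 1024.
_SUFFIX = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_rate(text):
    """Convert a figure such as '1.86M' or '0' to bytes per second."""
    if text[-1] in _SUFFIX:
        return float(text[:-1]) * _SUFFIX[text[-1]]
    return float(text)


def parse_now_lines(lines):
    """Map each device letter to its current rate, ignoring other lines."""
    rates = {}
    for line in lines:
        match = _NOW_LINE.match(line.strip())
        if match:
            rates[match.group(1)] = parse_rate(match.group(2))
    return rates


if __name__ == "__main__":
    with open("/proc/nbdinfo") as info:
        print(parse_now_lines(info))
```

Calling parse_now_lines() once a second and logging the result gives a
time series in which the period of the oscillation is easier to see.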

Oscillation is OK in any case - they are being throttled and not getting
good feedback, so you would expect overshoot and undershoot. It's
typical of a control system without first-derivative (i.e.
capacitative) feedback in the loop. I suspect that it's VMS again.
There was a huge chop out of old code at one point in VMS which ended 
in a much simplified system that probably is not even first-order
predictive. 
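
The effect of stale feedback can be seen in a toy model. The sketch
below is not the kernel's throttle; it is a made-up proportional
controller (simulate_throttle and all its parameters are hypothetical)
that pushes a rate toward a target using a measurement that is `delay`
steps old. With prompt feedback the rate creeps up to the target from
below; with a two-step lag it runs past the target and then falls
short of it.

```python
def simulate_throttle(delay, gain=0.5, target=1.0, steps=40):
    """Return the rate at each step when the controller sees `delay`-step-old rates."""
    rates = [0.0]
    for t in range(steps):
        seen_index = t - delay
        # Before any measurement has arrived the controller sees a rate of 0.
        seen = rates[seen_index] if seen_index >= 0 else 0.0
        new_rate = rates[-1] + gain * (target - seen)
        rates.append(max(0.0, new_rate))  # a rate cannot go negative
    return rates


if __name__ == "__main__":
    for delay in (0, 2):
        rates = simulate_throttle(delay)
        peak = rates.index(max(rates))
        print("delay=%d peak=%.3f lowest after peak=%.3f"
              % (delay, rates[peak], min(rates[peak:])))
```

The model has no derivative term at all, so it only illustrates the
"not getting good feedback" part of the argument.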

You might try taking the throttle off totally (see
/proc/sys/raid...). And please move to Paul Clements' bitmap patch
for 2.6 instead of using kernel raid1.

  http://parisc-linux.org/~jejb/md_bitmap


> So as said before, I'll stick to 1 server and 2 replicas.

I don't see anything wrong in the above, and oscillation would also
come from the file system flushes resonating a bit. But whatever you
prefer.

Peter

