[ENBD] Re: ENBD on Debian 2.6.7/8 kernel

P.T. Breuer ptb at it.uc3m.es
Thu Oct 14 08:23:56 MDT 2004


In article <416E890E.1000803 at ociweb.com> you wrote:
> Not to be annoying, but could you test it across the network (i.e.,
> physical network interfaces, not localhost)?  I know there shouldn't

OK.

> be any differences, but I have 3 boxes all installed with the exact
> same version of Debian testing/sarge.  I put the exact same
> executables on all 3 systems.  And I get the ALRM error messages when
> attempting to talk amongst the systems.

I also am noticing sporadic ALRMs (and nonsporadic ones, after I revive
my laptop from hibernation on the train journey :-). They wouldn't be a
problem, but the 2.6 driver still has that annoying bug that clients
won't die properly until you echo a 0 to /proc/nbdinfo, so after
receiving the sigalrm, they don't die properly until I put them out of
their misery.  Then they get reborn properly, like they should. It
seems to be some 2.6 kernel thing that I'll now make a real effort
to track down, since I am now sitting at the console of a 2.6.8.1
machine ...

I'm prepared to believe that sporadic timeouts would have the same
effect! But what causes them?

Anyway, obviously you can increase the timeouts to silly numbers to
avoid it happening. -p something, I think I recall. But perhaps 2.6
really does just produce strange timings? What is your platform? Is it
SMP? That's famous for timing booboos in the kernel code.

> However, the funny thing is that the enbd-maketest works perfectly
> fine.  I get successful status results.  But, the moment I try to
> communicate across the network, it gives those ALRM errors and doesn't
> work and the enbd-client processes terminate and the /proc/nbdinfo
> interface says that all devices are closed.

Well, as you know, the network is just the network. Precise addresses
should not matter. Whether it's the same machine or a different one
should make no difference.

I'll try it and see!

Peter

