Q. about v. 2.4.15; was: Re: [ENBD] nbd with an SMP kernel?
Leonid Andreev
leonid@latte.harvard.edu
Mon, 13 Nov 2000 11:15:41 -0500 (EST)
On Thu, 9 Nov 2000, Peter T. Breuer wrote:
> "A month of sundays ago Peter T. Breuer wrote:"
> > "A month of sundays ago Leonid Andreev wrote:"
> > > Well, sending a SIGPWR requires you to be around to send it (or to
> >
> > Mmmm. There is a parameter NBD_MAX_LIVES that controls how many times
> > ...
>
> I implemented the changes for timeout in 2.4.15. I'm closing 2.4.15 now
> so please check. You may have to reduce NBD_MAX_LIVES to 1 or 2 to
Hi,
I didn't have any time to test it last week, so I only got to it
last night (so this is the finally-finally closed 2.4.15 I'm experimenting
with). So far I couldn't reproduce the auto-restart as described above.
I compiled the package with NBD_MAX_LIVES=0. Here's how I test:
server:
nice -19 nbd-server 4017 /tmp/file -i "NBDabcdefNBD" -t 120 -b 1024
client:
insmod -f nbd.o rahead=20 merge_requests=0 sync_intvl=1
nice -19 nbd-client localhost 4017 localhost localhost -b 1024 -t 120 -d 1 /dev/nda
Then I mount the device, make sure it's working, kill the server,
restart it, and see what happens. The client doesn't seem to be able to
reconnect on its own after 5 minutes. Also, when I compile with the default
NBD_MAX_LIVES=30, sending SIGPWR to the client in this situation doesn't
seem to produce a successful reconnect either. I'll try again today and
will send you more error messages/statistics if I have no luck.
On the other hand, my 2.4.14 installation has been running in an almost
production environment (one SMP box serves an 8G partition to another SMP
box that uses it in a raid1 setup and runs a moderately busy mail on it)
for over a week without any problems (knock on wood).
-L.
> see the effect. The client times out on the server and restarts
> if it has to do more than NBD_MAX_LIVES restarts inside 5 minutes
> on each connection it has open. It times out the individual
> connections, and then when the last one goes, it restarts.
>
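
The restart rule quoted above (more than NBD_MAX_LIVES restarts inside 5
minutes on a connection triggers a full restart) can be sketched roughly as
follows. This is an illustration only, not the ENBD client code: the class
and method names are hypothetical, and treating "inside 5 minutes" as a
sliding window with one counter per connection is an assumption.

```python
from collections import deque

WINDOW_SECONDS = 5 * 60  # "inside 5 minutes", taken from the quoted description


class RestartTracker:
    """Hypothetical per-connection counter of recent slave restarts."""

    def __init__(self, max_lives, window=WINDOW_SECONDS):
        self.max_lives = max_lives  # plays the role of NBD_MAX_LIVES
        self.window = window
        self.restarts = deque()     # timestamps of recent restarts, oldest first

    def record_restart(self, now):
        """Record a restart at time `now` (in seconds).

        Returns True when the number of restarts inside the window exceeds
        max_lives, i.e. when a full restart (as though from SIGPWR) is due.
        """
        self.restarts.append(now)
        # Forget restarts that have fallen out of the window (assumed sliding).
        while self.restarts and now - self.restarts[0] >= self.window:
            self.restarts.popleft()
        return len(self.restarts) > self.max_lives
```

Under these assumptions, a tracker built with max_lives=0 signals a full
restart on the very first slave restart, while max_lives=30 needs 31
restarts within the same 5 minutes.
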
> 2.4.15 = 2.4.14 + correct non-threadsafe hashentry count in
> caching daemons (metadata problem with more than one daemon
> leading to silent corruption on write with -jh option).
> Added r/w locks in hash code (possibly safe). Added ondisk
> order links to hash entries to allow sequences of reads and
> writes without hashtable searches between each sector (works).
> Correct SIGUSR1/2 entries in manpage. Added free hash entry list
> to allow deallocations after fails in checks. Do full restart
> of client after SIGPWR to enable reconnect to restarted server
> (add manpage entry). Incorporate bugfix in server for
> getsize on reiserfs partition by Wang Gang. Add negotiation
> on pulse interval by Wang Gang. Add nested alarm timeouts.
> Add raid mode (show_errs) by Wang Gang. Mended usage printout
> in client. Add timeout to restart as though from SIGPWR when
> all client slaves die repeatedly and quickly, as though failing
> against a dead server
>
> ftp://oboe.it.uc3m.es/pub/Programs/nbd-2.4.15.tgz
>
> Peter