[ENBD] 2.4.32
Peter T. Breuer
ptb at it.uc3m.es
Sat Mar 6 14:14:17 MST 2004
"Also sprach Peter T. Breuer:"
> Anyone want to confirm or deny that 2.4.32 is now stable and correct?
Actually I see some problems under kernel 2.6.3 - but they seem to be
userspace, not kernelspace.
In particular the client seems to get sigsegv sometimes after setting
alarm timers to 0 0 (to disable them) with setitimer. And there was a
bug in resetting the new microsecond timers that exacerbated that. I've
cured the bug and as a result the segfault is now infrequent, but still
there:
seeking and writing....5%....10%....15%....20%....25%....30%....35%....40%....45%....50%....55%....60%....65%....70%....75%....80%....85%....90%....95%....done
flushing buffers..enbd-client 4499: sighandler relaunches child from manager
enbd-client 4499: client (-1) reaped dead child 4534
(boom).
Here's the wooonderful strace of the event:
gettimeofday({1078607006, 481520}, NULL) = 0
setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={0, 29906}}, NULL) = 0
uh, that was me setting an alarm timeout of 30ms.
rt_sigaction(SIGALRM, {0x8052820, [], SA_RESTART|0x4000000}, {0x8052820, [], SA_RESTART|0x4000000}, 8) = 0
fcntl(4, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=524288, len=0}) = ? ERESTARTSYS (To be restarted)
that was the locking operation that was supposed to be guarded by the
30ms timeout. Hey, isn't 30ms a bit small? Maybe that was supposed to
be 30s :). Could be a bug.
--- SIGALRM (Alarm clock) ---
gettimeofday({1078607006, 511593}, NULL) = 0
setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={55, 302116}}, NULL) = 0
sheer weirdism - the alarm went off and apparently we reset the timer
to 55s, probably for an enclosing timer loop.
rt_sigaction(SIGALRM, {0x8052820, [], SA_RESTART|0x4000000}, {0x8052820, [], SA_RESTART|0x4000000}, 8) = 0
setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={0, 0}}, NULL) = 0
well, here we turned the timer off, because we did whatever we were
doing.
rt_sigaction(SIGALRM, {SIG_IGN}, {0x8052820, [], SA_RESTART|0x4000000}, 8) = 0
and we indeed say to ignore the alarm in case it ever goes off.
--- SIGSEGV (Segmentation fault) ---
and boom. Dunno where.
I'm working on it. Only under kernel 2.6, which appears to behave
fundamentally differently in some aspects that affect userspace. If
only I knew what they were ...
Just thought I'd let you know that I am now testing under 2.6.3. Why
won't my dns server run under it?
socket(PF_INET6, SOCK_DGRAM, 0) = 5
bind(5, {sin_family=AF_INET6, sin6_port=htons(53), inet_pton(AF_INET6, "fe80::220:e0ff:fe8f:1c7", &sin6_addr), sin6_flowinfo=htonl(0)}}, 28) = -1 ENODEV (No such device)
write(2, "dnsmasq: bind failed: No such de"..., 37dnsmasq: bind failed: No such device) = 37
_exit(1) = ?
but ipv6 is in the kernel!
Peter