[ENBD] 20G dd hang up!
Kuniyasu SUZAKI
enbd@lists.community.tummy.com
Wed, 09 Jan 2002 23:41:31 +0900
Dear,
>>From: "Peter T. Breuer" <ptb@it.uc3m.es>
>>Subject: Re: [ENBD] 20G dd hang up!
>>
>>After maketest succeeds in localhost to localhost configuration, you
>>must then get it to succeed in local to remote configuration. I am 99%
>>certain that you have it working in that configuration and that you are
>>reporting long-term, not short-term, errors, but I must be sure. So
>>please let me know.
I see. I attach the log of "make test", which was run against a remote
machine. I used ENBD 2.4.26 and the test succeeded.
Server machine: Dell Precision 220
    Pentium III 933 MHz
    Memory 512M
    100M Ether (NIC 3Com 3c905C)
    RedHat 6.2J
    Kernel 2.2.18
Client machine: IBM ThinkPad X20
    Pentium III 600 MHz
    Memory 320M
    100M Ether
    RedHat 6.2J
    Kernel 2.2.18
    Disk 20G
>>"hung up" is also not a precisely defined term! Do you mean the network
>>became sluggish and tcp began to time out? Or do you mean that the
>>client machine locked solid - i.e. deadlocked or became stuck in some
>>internal kernel loop?
"dd" worked about 10-20 seconds. At the time the hard disk of the
server machine was also active. After that the console of the client
machine froze: I could not use the keyboard or the console, and "ping"
from another machine got no answer.
The details of the situation were as follows.
On the nbd-server
# /usr/local/sbin/nbd-server 5058 /dev/sda1 -t 120 -b 1024 -0
On the nbd-client
# ./nbd-client machineA:5058 -n 4 -b 1024 -t 120 /dev/nda
# dd if=/dev/zero of=/dev/nda bs=1048576 count=19539
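For scale, the dd command above asks for 19539 blocks of 1 MiB each. A
minimal sketch of that arithmetic follows. It assumes a shell whose
$(( )) arithmetic is 64-bit (a 32-bit shell would overflow), and the
variable names are only illustrative.

```shell
# Size of the write requested by the dd command above.
bs=1048576      # block size in bytes (1 MiB), from bs=
count=19539     # number of blocks, from count=

# Assumes 64-bit shell arithmetic; the product does not fit in 32 bits.
total_bytes=$((bs * count))

echo "total bytes: $total_bytes"
echo "whole GiB:   $((total_bytes / 1073741824))"
```

So the write covers roughly the whole 20G disk in a single dd run.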
The nbd-client and nbd-server printed the following messages when the
client machine froze.
On the nbd-client
NBD #968[0]: nbd_rollback rollback req c0272260 from slot 0!
NBD #968[1]: nbd_rollback rollback req c02727d8 from slot 1!
NBD #968[2]: nbd_rollback rollback req c0272378 from slot 2!
NBD #968[3]: nbd_rollback rollback req c0272308 from slot 3!
On the nbd-server
nbd-server: mainloop [RANGE! (+14858796828755968)]
nbd-server: mainloop [RANGE! (+17110596642441216)]
nbd-server: mainloop [RANGE! (+15984696735598592)]
nbd-server: mainloop [RANGE! (+40799591842744320)]
nbd-server: server (-1) relaunches child after SIGCHLD
nbd-server: server (-1) main childminder checking pid 1123
nbd-server: server (-1) main childminder checking pid 1121
nbd-server: server (1) set default signal handlers for slave server 1128
nbd-server: server (-1) main childminder checking pid 1122
nbd-server: server (-1) main childminder checking pid 1120
nbd-server: server (-1) main childminder checking pid 1123
nbd-server: server (-1) main childminder checking pid 1128
nbd-server: server (-1) main childminder checking pid 1122
nbd-server: server (2) set default signal handlers for slave server 1129
nbd-server: server (-1) main childminder checking pid 1120
nbd-server: server (-1) main childminder checking pid 1123
nbd-server: server (0) set default signal handlers for slave server 1130
nbd-server: server (-1) main childminder checking pid 1128
nbd-server: server (-1) main childminder checking pid 1129
nbd-server: server (-1) main childminder checking pid 1120
nbd-server: server (-1) main childminder checking pid 1130
nbd-server: server (-1) main childminder checking pid 1128
nbd-server: server (-1) main childminder checking pid 1129
nbd-server: server (-1) main childminder checking pid 1120
nbd-server: server (3) set default signal handlers for slave server 1131
nbd-server: server (-1) relaunches child after SIGCHLD
nbd-server: server (-1) main childminder checking pid 1130
nbd-server: server (-1) main childminder checking pid 1128
nbd-server: server (-1) main childminder checking pid 1129
nbd-server: server (-1) main childminder checking pid 1131
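The four RANGE! values in the server log can be compared with the
amount of data dd was asked to write. The sketch below assumes, without
confirmation from the ENBD sources, that RANGE! reports a request
offset the server considers outside the exported device. It also uses
the dd size (19539 MiB) as a stand-in for the real size of /dev/sda1,
which is not given in this message. It needs 64-bit shell arithmetic,
and the variable names are illustrative.

```shell
# Assumption: the device is about as large as the dd write (19539 MiB).
device_bytes=$((1048576 * 19539))

# Offsets copied from the "mainloop [RANGE! (+...)]" lines in the log.
out_of_range=0
for offset in 14858796828755968 17110596642441216 \
              15984696735598592 40799591842744320; do
    if [ "$offset" -gt "$device_bytes" ]; then
        echo "offset $offset is beyond $device_bytes"
        out_of_range=$((out_of_range + 1))
    fi
done

echo "$out_of_range of 4 offsets are beyond the assumed device size"
```

Under those assumptions every logged offset is far larger than the
device, whatever unit the offsets are expressed in.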
What should I do?
Is there anybody who has been able to transfer gigabytes of data with
the "dd" command via ENBD?
Kuniyasu SUZAKI, National Institute of Advanced Industrial Science and Technology,
Tsukuba Central 2, Umezono 1-1-1, Tsukuba, Ibaraki 305-8568, JAPAN
Project NTC (Network Transferable Computer) http://www.etl.go.jp/~suzaki/English/NTC
----------------- log of "make test" ----------------------------
machineA% make test
server:machineA <- -> client:machineB
echo; echo; \
stty -echo </dev/tty ; \
sh -c "sudo echo kill nbd-server" ; \
stty echo </dev/tty ; \
sh -c "sudo killall nbd-server; sleep 1; sudo killall -9 nbd-server"
Password:
kill nbd-server
nbd-server: no process killed
nbd-server: no process killed
make: [kill-server] Error 1 (ignored)
echo; echo; \
stty -echo </dev/tty ; \
sh -c "sudo echo nbd-server" ; \
stty echo </dev/tty ; \
rsync -uav --rsh=ssh /tmp/nbd-server /tmp/ ; \
for i in /tmp/core0 /tmp/core1; do sh -c "test -s $i || dd if=/dev/zero bs=4096 count=4096 >$i" ; done ; \
sh -c "sudo nice -19 /tmp/nbd-server 5058 /tmp/core0 /tmp/core1 -i "NBDabcdefNBD" -t 120 -b 4096 -0 ; pstree -p | grep nbd-server; sleep 300" &
nbd-server
building file list ... done
wrote 73 bytes read 16 bytes 178.00 bytes/sec
total size is 135848 speedup is 1526.38
delay 5s .. |-nbd-server(6213)
nbd-server: server (-2) locked /var/state/nbd/server-NBDabcdefNBD.client_ips
nbd-server: server (-2) pinged service nbd-cstatd at 127.0.0.1:5051
nbd-server: with news "notice server-start 5058 127.0.0.1
quit
"
nbd-server: main server (-2) failed connect to 11.11.11.11 on port 5051
nbd-server: main server (-2) failed connect to 10.10.10.10 on port 5051
nbd-server: server (-2) unlocked /var/state/nbd/server-NBDabcdefNBD.client_ips
nbd-server: server (-2) set new signal handlers for master server 6213
nbd-server: connectme notice: setsockopt RCVTIMEO failed with Protocol not available
.
echo; echo; \
stty -echo </dev/tty ; \
ssh machineB "sudo echo kill nbd-client" ; \
stty echo </dev/tty ; \
ssh -n machineB "sudo killall -USR1 nbd-client ; sleep 1; sudo killall -9 nbd-client; sleep 4; sudo /sbin/rmmod nbd"
ntc@machineB's password:
Password:
Sorry, try again.
Password:
Sorry, try again.
Password:
kill nbd-client
ntc@machineB's password:
nbd-client: no process killed
nbd-client: no process killed
rmmod: module nbd is not loaded
make: [kill-client] Error 1 (ignored)
echo; echo; \
stty -echo </dev/tty ; \
ssh machineB "sudo echo nbd-client" ; \
stty echo </dev/tty ; \
copy(){ rsync -uav --rsh=ssh $1 machineB:$2 ; } ; copy /tmp/nbd.o /tmp/ ; \
copy(){ rsync -uav --rsh=ssh $1 machineB:$2 ; } ; copy /tmp/nbd-client /tmp/ ; \
copy(){ rsync -uav --rsh=ssh $1 machineB:$2 ; } ; copy /home/ntc/work/nbd/nbd-2.4.26/MAKEDEV /tmp/ ; \
ssh -n machineB "cd /dev; sudo /tmp/MAKEDEV /dev/nda" ;
ntc@machineB's password:
nbd-client
ntc@machineB's password:
building file list ... done
nbd.o
wrote 42032 bytes read 32 bytes 12018.29 bytes/sec
total size is 41920 speedup is 1.00
ntc@machineB's password:
building file list ... done
nbd-client
wrote 123018 bytes read 32 bytes 22372.73 bytes/sec
total size is 122893 speedup is 1.00
ntc@machineB's password:
building file list ... done
MAKEDEV
wrote 1283 bytes read 32 bytes 526.00 bytes/sec
total size is 1173 speedup is 0.89
ntc@machineB's password:
ssh -n machineB "sudo /sbin/insmod /tmp/nbd.o rahead=20 merge_requests=0 sync_intvl=0" ; \
ssh -n machineB "sudo nice -19 /tmp/nbd-client machineA:5058 -n 4 -b 4096 -i "NBDabcdefNBD" -t 120 -p 5 -d 1 /dev/nda ; pstree -p | grep nbd-client; sleep 90"
ntc@machineB's password:
ntc@machineB's password:
nbd-client: client (-1) manager opened NBD device /dev/nda (2b00)
nbd-client: client (-1) starts introduction sequence on port 5058
nbd-server: server (-2) opened port 5058 (socket 8) for client 10.10.10.10
nbd-server: server (-1) set default signal handlers for session server 6268
nbd-server: server (-1) read passwd ok
nbd-server: server (-1) got cliserv magic ok
|-nbd-client(7878)
nbd-server: server (-1) received id device 2b00 ok
nbd-server: server (-1) sent size 8388608 ok
nbd-server: server (-1) sent sig ok
nbd-server: server (-1) suggested ro flags 0 ok
nbd-server: server (-1) received blksize 4096 ok
nbd-server: server (-1) sent/negotiated blksize 4096 ok
nbd-server: server (-1) received pulse_intvl 5 ok
nbd-server: server (-1) sent/negotiated pulse interval 10 ok
nbd-server: server (-1) agreed 4 channels ok
nbd-server: server (-1) selected free port at 5059
nbd-server: server (-1) posted port 5059 ok
nbd-client: client (-1) got session port 5059 ok
nbd-client: client (-1) introduction sequence ends ok
nbd-client: setkernel client (-1) set device ro flag 0
checking 127.0.0.1
checking 11.11.11.11
checking 10.10.10.10
nbd-server: server (-1) manager started new process group 6268
nbd-server: server (3) set default signal handlers for slave server 6272
nbd-server: server (2) set default signal handlers for slave server 6271
nbd-server: server (3) opened port 5059 (socket 10) for client 10.10.10.10
nbd-server: server (2) opened port 5059 (socket 10) for client 10.10.10.10
nbd-server: server (2) set new signal handlers for slave server 6271
nbd-client: client (0) begins main loop
nbd-client: client (1) begins main loop
nbd-server: server (1) set default signal handlers for slave server 6270
nbd-server: server (1) opened port 5059 (socket 10) for client 10.10.10.10
nbd-server: server (1) set new signal handlers for slave server 6270
nbd-client: client (2) begins main loop
nbd-server: server (0) set default signal handlers for slave server 6269
nbd-server: server (0) opened port 5059 (socket 10) for client 10.10.10.10
nbd-server: server (0) set new signal handlers for slave server 6269
nbd-client: client (3) begins main loop
nbd-server: server (-1) set new signal handlers for session server 6268
nbd-server: server (3) set new signal handlers for slave server 6272
machineA%
----------------- End of log of "make test" ----------------------------