[ENBD] 20G dd hang up!

Peter T. Breuer enbd@lists.community.tummy.com
Tue, 8 Jan 2002 16:13:16 +0100 (MET)


"A month of sundays ago Kuniyasu SUZAKI wrote:"
>  >>He's in trouble already here.  The TEST must succeed.  There is no point
>  >>in proceeding until the test can be run.  Maybe he means that he
>  >>succeeded with a localhost to localhost test, and then attempted local
>  >>to remote but didn't get anywhere.
>  >>
>  >>If that is the case he needs to make it clear and investigate.
> 
> Yes. The situation was caused by host machine.  It seemed that the

I'm afraid I don't understand that "yes". Do you mean that the "make
test" succeeds in localhost to localhost configuration? Please be very
exact. I must know.

After "make test" succeeds in localhost to localhost configuration, you
must then get it to succeed in local to remote configuration. I am 99%
certain that you have it working in that configuration and that you are
reporting long-term, not short-term, errors, but I must be sure. So
please let me know.

> write request by "dd" seemed too heavy.  the client machine became

"Seems" is unfortunately not a word I can work with.

> overload and it hung up. 

"hung up" is also not a precisely defined term! Do you mean the network
became sluggish and tcp began to time out? Or do you mean that the
client machine locked solid - i.e. deadlocked or became stuck in some
internal kernel loop?
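
One rough way to tell those two cases apart from another machine is
sketched below. It is only an illustration: the function name, the host
and the port are hypothetical, and it assumes some TCP service (ssh, for
instance) normally listens on the client. An accepted or refused
connection shows the kernel is still answering on the network; a timeout
on its own is ambiguous and needs a look at the console as well.

```python
import socket

def probe(host, port, timeout=3.0):
    """Classify one TCP connection attempt to (host, port)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "responsive"   # handshake completed: kernel is alive
    except ConnectionRefusedError:
        return "refused"          # kernel answered with a reset: still alive
    except socket.timeout:
        return "timeout"          # no answer: network trouble or a locked machine
    except OSError:
        return "error"            # e.g. no route to host

# Hypothetical usage:
#   probe("client.example.org", 22)
```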

> Do we need to control the write request with timeout option (-t) or
> something?

You do not.  The situation you describe is an error situation that must
be resolved before you can do anything.  But in any case I am fairly
sure from your earlier reports that you are using the wrong version of
enbd for me to be able to help you.  You will need the latest version:
2.4.26a and 2.4.27pre1 for me to be able to compare notes.
^^^^^^^^^^^^^^^^^^^^^^

>  >>> What network cards are you using? I have had a whole load of similar
>  >>> problems with netgear fa311a cards which I've now pulled out of every
>  >>> machine I have - almost anything seems to be better than those cards
>  >>> which just drop packets.
>  >>
>  >>Interesting. But it should be covered up by tcp, surely?
> 
> We used 100M Ether NIC(3Com 3c905C Tornado). The performance measured
> by "netperf" was approximately 95 Mbps. It was nice but it was still
> narrow for "dd" write request.

I'm afraid I don't quite understand what you are telling me here! I think
you are saying that 95Mb/s is the limiting factor on dd performance?
I.e. the disks at either end are faster than 10MB/s in practice?
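
For reference, the arithmetic behind that reading can be sketched as
follows. It is a back-of-envelope estimate only: it assumes the "20G" in
the subject line means a 20 GB dd write, takes 1 MB as 10**6 bytes, and
ignores protocol overhead. The function names are made up for the
illustration.

```python
def mbit_to_mbyte_per_s(mbit_per_s):
    """Convert a link rate in Mbit/s to MB/s (8 bits per byte)."""
    return mbit_per_s / 8.0

def transfer_seconds(size_bytes, mbyte_per_s):
    """Time to move size_bytes at a steady rate given in MB/s."""
    return size_bytes / (mbyte_per_s * 1e6)

link_rate = mbit_to_mbyte_per_s(95)               # netperf figure from the report
seconds = transfer_seconds(20 * 10**9, link_rate)  # assumed 20 GB write

print(f"link rate: {link_rate:.3f} MB/s")
print(f"20 GB at that rate: about {seconds / 60:.0f} minutes")
```

So 95 Mb/s is a little under 12 MB/s, and a 20 GB write at the full link
rate would take roughly half an hour if the disks at both ends can keep up.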


What kernel are you using?  What version of enbd?  What is the client
machine (#cpus, disks, ..)?

Peter