[Linux-HA] Problem with function send_ordered_nodemsg
Andrew Beekhof
beekhof at gmail.com
Wed Jun 18 12:59:22 MDT 2008
Have you considered trying the OpenAIS messaging stack instead?
The CRM runs on it equally well.
On Wed, Jun 18, 2008 at 17:04, Audet, Jean-Michel
<Jean-Michel.Audet at ca.kontron.com> wrote:
> Hi,
> I already sent this message and never got any feedback. Here is my problem.
>
> I have heartbeat 2.1.3 (same problem with 2.1.2).
> I am using a Master/Slave model.
>
> I am using the heartbeat communication link to transfer data between 2 nodes. The payload is state and data. Because message size is limited over Ethernet, I transfer it in multiple 8 KB chunks, for up to 1 MB in total (approx. 120 * 8 KB).
>
> The problem is that after a number of data sets (maybe 300 or 400, sometimes more, sometimes less, but it always happens), the function send_ordered_nodemsg hangs and I am not able to transfer data anymore. From the debug information, it looks like it hangs in the function socket_resume_io_read.
>
> I have tried Unicast and Broadcast.
>
> According to Dejan, it may be that I am pushing the heartbeat communication layer to its limit. I am a little surprised that 1 MB of data can be a problem.
>
> I am stuck now and I need a solution, because my application is not usable and I may have to look at another HA package (I really don't want to).
>
> Any input or suggestions will be greatly appreciated.
> Would it be a good idea to create a separate communication link (client/server)?
>
> Jean-Michel Audet
> Kontron Canada
>
>
> -----Original Message-----
> From: Audet, Jean-Michel
> Sent: Thursday, June 05, 2008 11:13 AM
> To: 'General Linux-HA mailing list'
> Subject: Problem with function send_ordered_nodemsg
>
>
> Hi,
> I currently have a problem with my software: it hangs when I call the function send_ordered_nodemsg (sendnodemsg exhibits the same problem). I am able to send many messages (many dozens) and then it hangs. With extra debugging, I found that it hangs somewhere in the function socket_resume_io_read. I based my code on the CIB implementation.
>
>
> I am asking for any help in finding the problem. I know that the CIB uses this function, so I think the problem is on my side, or I don't know exactly how to use it, but I have been trying to find this problem for many days now.
>
> Maybe somebody has some experience with this function and has hit the same problem before.
>
> Any help will be more than appreciated.
>
> Jean-Michel Audet
>
>
> _______________________________________________
> Linux-HA mailing list
> Linux-HA at lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>
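For reference, the chunking scheme described in the quoted message (splitting up to 1 MB into 8 KB pieces) can be sketched as below. This is only an illustration in Python and does not use the heartbeat API: the 8-byte header layout and the names split_into_chunks and reassemble are hypothetical, chosen for this example. Note that with 8 KiB chunks, exactly 1 MiB works out to 128 messages, close to the "approx. 120" mentioned above.

```python
import struct

CHUNK_SIZE = 8 * 1024            # 8 KiB of payload per message, as in the report
HEADER = struct.Struct("!II")    # hypothetical framing: (sequence number, total chunks)


def split_into_chunks(payload, chunk_size=CHUNK_SIZE):
    """Split payload into framed chunks, each carrying its index and the total count."""
    total = max(1, -(-len(payload) // chunk_size))  # ceiling division, at least one chunk
    return [
        HEADER.pack(seq, total) + payload[seq * chunk_size:(seq + 1) * chunk_size]
        for seq in range(total)
    ]


def reassemble(chunks):
    """Rebuild the payload from framed chunks received in any order."""
    parts = {}
    total = None
    for chunk in chunks:
        seq, total = HEADER.unpack_from(chunk)
        parts[seq] = chunk[HEADER.size:]
    if total is None or len(parts) != total:
        raise ValueError("missing chunks")
    return b"".join(parts[i] for i in range(total))


payload = bytes(range(256)) * 4096   # exactly 1 MiB of test data
chunks = split_into_chunks(payload)
```

Carrying the sequence number and total count in every chunk lets the receiver detect a missing piece instead of waiting indefinitely; whether that helps with the hang reported above is not something this sketch can show.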