[Linux-ha-dev] Large delays in sending ordered HA messages

Andrew Beekhof lists at beekhof.net
Mon Oct 11 02:54:03 MDT 2004


On Oct 9, 2004, at 11:10 PM, Guochun Shi wrote:

> There is a corner case that is not covered by the fix:
>
> Client A sends one message (order_seq=1) and exits. It is then started
> again, and the first message could be delayed; a receiving client will
> then see a message with order_seq=2 and happily deliver it.
>
> The point is that a receiving client cannot tell whether a message
> comes from a normal client or a restarted client.
>
> When a client joins or leaves, heartbeat broadcasts a join/leave
> message to the cluster. However, this message cannot be used to mark
> the start or end of a client on the receiving side, since it could
> itself be delayed.
>
> One way to solve this problem is to add a client generation number to
> each client->cluster message. Heartbeat would need to maintain a data
> structure (a hashtable?) for each type of client. Using this client
> generation number, a receiving client can easily tell whether a client
> has restarted.
>
> Any comments?

Does this happen only if order_seq=1, or was that just an example?
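
To make the proposal concrete, here is a rough sketch of how a receiving
client might use a generation number. It is Python rather than heartbeat's
C, and every name in it (Receiver, receive, generation, pending) is
hypothetical, not heartbeat code. It assumes generations start at 1 and
increase on each restart, that order_seq restarts at 1 with each
generation, and that the delayed message in the corner case is the first
one sent after the restart.

```python
class Receiver:
    """Hypothetical receiving client that tracks a generation per sender."""

    def __init__(self):
        self.gen = {}        # client -> newest generation seen so far
        self.expected = {}   # client -> next order_seq to deliver
        self.pending = {}    # client -> {order_seq: payload} held back
        self.delivered = []  # (client, generation, order_seq, payload)

    def receive(self, client, generation, order_seq, payload):
        known = self.gen.get(client)
        if known is not None and generation < known:
            return  # message from an older incarnation: drop it
        if known is None or generation > known:
            # New or restarted client: start ordering again from 1.
            self.gen[client] = generation
            self.expected[client] = 1
            self.pending[client] = {}
        if order_seq < self.expected[client]:
            return  # duplicate or retransmission: drop it
        self.pending[client][order_seq] = payload
        # Deliver every message that is now in sequence.
        while self.expected[client] in self.pending[client]:
            seq = self.expected[client]
            msg = self.pending[client].pop(seq)
            self.delivered.append((client, generation, seq, msg))
            self.expected[client] = seq + 1


r = Receiver()
r.receive("A", 1, 1, "old-1")   # first incarnation of client A
r.receive("A", 2, 2, "new-2")   # after restart; order_seq=1 is delayed
r.receive("A", 2, 1, "new-1")   # the delayed message finally arrives
for entry in r.delivered:
    print(entry)
```

Without the generation check, "new-2" would look like the next expected
message after "old-1" and would be delivered ahead of "new-1". With it,
the receiver notices the restart and holds "new-2" until "new-1" arrives.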

>
> thanks
> -Guochun
>
>
> At 03:01 PM 10/8/2004 -0500, you wrote:
>> I committed the fix. Please let me know if you find anything not
>> working.
>>
>> thanks
>> -Guochun
>>
>> At 12:43 PM 10/7/2004 +0200, you wrote:
>>> The CRM currently suffers from this exact problem if nodes leave and
>>> come back, so it will certainly benefit from the fix.  Nice work on
>>> the neatness of the fix too.
>>>
>>> Looking forward to the update :)
>>>
>>> andrew
>>>
>>> On Oct 6, 2004, at 9:36 PM, Alan Robertson wrote:
>>>
>>>> Guochun Shi wrote:
>>>>> At 11:42 AM 10/6/2004 +0200, you wrote:
>>>>>> On 2004-10-05T23:27:23, Guochun Shi <gshi at ncsa.uiuc.edu> wrote:
>>>>>>
>>>>>>
>>>>>>> hi,
>>>>>>>
>>>>>>> I have a simple way to accomplish the goal. Let's say the last
>>>>>>> heartbeat seq number recorded in heartbeat for a node (node A) is
>>>>>>> seq_last. Then any ordered message from node A with a heartbeat seq
>>>>>>> number > seq_last can be delivered to an application immediately;
>>>>>>> this is the first ordered message delivered. Delivery of later
>>>>>>> ordered messages can be computed from the ordered seq. The first
>>>>>>> ordered message can be delivered this way because any message with
>>>>>>> a heartbeat seq number > seq_last will be received (unless the
>>>>>>> heartbeat retransmission mechanism is broken), so we will not miss
>>>>>>> any later ordered message. Messages with a seq less than seq_last
>>>>>>> (out-of-order or retransmitted messages) will be discarded.
>>>>>>
>>>>>> Of course, you can only do this for the first ordered message from a
>>>>>> node after a startup (or after a node returns after having dropped
>>>>>> out of membership).
>>>>> Once we are done with the first ordered message, we can order
>>>>> subsequent messages using the ordered-seq number.
>>>>>> Right?
>>>>>>
>>>>>> But yes, this should work.
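
The seq_last rule discussed above can be sketched roughly as follows.
This is an illustration in Python, not the committed fix; the class and
attribute names (OrderedReceiver, next_order_seq, pending) are made up
for the example. One assumption goes beyond the description: a message
whose heartbeat seq equals seq_last is treated as already seen and is
discarded.

```python
class OrderedReceiver:
    """Hypothetical per-node state for delivering ordered messages."""

    def __init__(self, seq_last):
        self.seq_last = seq_last    # last heartbeat seq recorded for the node
        self.next_order_seq = None  # unknown until the first ordered message
        self.pending = {}           # order_seq -> payload held back
        self.delivered = []

    def receive(self, hb_seq, order_seq, payload):
        if self.next_order_seq is None:
            if hb_seq <= self.seq_last:
                return  # out-of-order or retransmitted: discard
            # First ordered message past seq_last: deliver immediately.
            self.delivered.append(payload)
            self.next_order_seq = order_seq + 1
            return
        # From here on, ordering is driven by order_seq alone.
        if order_seq < self.next_order_seq:
            return  # already delivered: discard
        self.pending[order_seq] = payload
        while self.next_order_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_order_seq))
            self.next_order_seq += 1


rx = OrderedReceiver(seq_last=100)
rx.receive(98, 4, "stale")   # heartbeat seq below seq_last: discarded
rx.receive(101, 5, "m5")     # first ordered message: delivered at once
rx.receive(103, 7, "m7")     # ahead of sequence: held back
rx.receive(102, 6, "m6")     # fills the gap, so m6 and then m7 go out
print(rx.delivered)
```

The sketch relies on the same premise as the description: every message
with a heartbeat seq above seq_last eventually arrives, so a held-back
message is never stuck waiting forever.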
>>>>
>>>> This sounds like a simple fix to a problem that looked complicated 
>>>> when I last looked at it.
>>>>
>>>> Good Job Guochun!
>>>>
>>>> Please code this up and test it.  Check for memory leaks (obviously 
>>>> ;-)), and then commit it.
>>>>
>>>>        Thanks!
>>>>
>>>>
>>>> -- 
>>>>    Alan Robertson <alanr at unix.sh>
>>>>
>>>> "Openness is the foundation and preservative of friendship...  Let 
>>>> me claim from you at all times your undisguised opinions." - 
>>>> William Wilberforce
>>>> _______________________________________________________
>>>> Linux-HA-Dev: Linux-HA-Dev at lists.linux-ha.org
>>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
>>>> Home Page: http://linux-ha.org/
>>> -- 
>>> Andrew Beekhof
>>>
>>> "Ooo Ahhh, Glenn McRath" - TISM
>>>
>>
>
>
-- 
Andrew Beekhof

"Ooo Ahhh, Glenn McRath" - TISM


