[Linux-ha-dev] Large delays in sending ordered HA messages
Andrew Beekhof
lists at beekhof.net
Mon Oct 11 02:54:03 MDT 2004
On Oct 9, 2004, at 11:10 PM, Guochun Shi wrote:
> there is a corner case which is not covered by the fix:
>
> Client A sends one message (order_seq = 1) and exits. It is then
> started again, and its first message could be delayed; a receiving
> client will then see a message with order_seq = 2 and happily
> deliver it.
>
> The point is a receiving client cannot tell if a message comes from a
> normal client or a restarted client.
>
> When a client joins or leaves, heartbeat broadcasts a join/leave
> message to the cluster. However, this message cannot be used on the
> receiving side to mark the start/end of a client, since it could
> itself be delayed.
>
> One way to solve this problem is to add a client generation number to
> each client->cluster message. Heartbeat needs to maintain a data
> structure (hashtable?) for each type of client. Using this client
> generation number, a receiving client can easily tell whether a
> client has restarted.
>
> Any comments?
Does this happen only if order_seq = 1, or was that just an example?
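
For reference, the generation-number idea quoted above could be
sketched roughly as follows. This is only an illustration in Python,
not heartbeat code: the names (ClientState, OrderedReceiver, receive)
are hypothetical, and it assumes that order_seq restarts at 1 with
every new generation and that generations only ever increase.

```python
from dataclasses import dataclass, field


@dataclass
class ClientState:
    generation: int = 0          # 0 means "no generation seen yet" (assumption)
    next_seq: int = 1            # next order_seq we expect to deliver
    pending: dict = field(default_factory=dict)  # order_seq -> payload


class OrderedReceiver:
    """Hypothetical receiver-side bookkeeping, keyed by (node, client type)."""

    def __init__(self):
        self.clients = {}

    def receive(self, node, client_type, generation, order_seq, payload):
        """Return the list of payloads that become deliverable, in order."""
        state = self.clients.setdefault((node, client_type), ClientState())

        if generation < state.generation:
            return []            # message from an earlier incarnation: drop it

        if generation > state.generation:
            # The client has restarted: forget the old incarnation entirely.
            state.generation = generation
            state.next_seq = 1
            state.pending.clear()

        if order_seq < state.next_seq:
            return []            # duplicate or retransmission

        # Buffer the message, then deliver everything that is now in sequence.
        state.pending[order_seq] = payload
        delivered = []
        while state.next_seq in state.pending:
            delivered.append(state.pending.pop(state.next_seq))
            state.next_seq += 1
        return delivered
```

With this bookkeeping, a message with order_seq = 2 from the restarted
client is held back until the delayed order_seq = 1 of the same
generation arrives, instead of being mistaken for the successor of the
old incarnation's first message.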
>
> thanks
> -Guochun
>
>
> At 03:01 PM 10/8/2004 -0500, you wrote:
>> I committed the fix. Please let me know if you find anything that is
>> not working.
>>
>> thanks
>> -Guochun
>>
>> At 12:43 PM 10/7/2004 +0200, you wrote:
>>> The CRM currently suffers this exact problem if nodes leave and come
>>> back, so it will certainly benefit from the fix. Nice work on the
>>> neatness of the fix too.
>>>
>>> Looking forward to the update :)
>>>
>>> andrew
>>>
>>> On Oct 6, 2004, at 9:36 PM, Alan Robertson wrote:
>>>
>>>> Guochun Shi wrote:
>>>>> At 11:42 AM 10/6/2004 +0200, you wrote:
>>>>>> On 2004-10-05T23:27:23, Guochun Shi <gshi at ncsa.uiuc.edu> wrote:
>>>>>>
>>>>>>
>>>>>>> hi,
>>>>>>>
>>>>>>> I have a simple way to accomplish the goal. Say the last
>>>>>>> heartbeat seq number recorded for a node (node A) in heartbeat
>>>>>>> is seq_last. Then any ordered message from node A with a
>>>>>>> heartbeat seq number > seq_last can be delivered to an
>>>>>>> application immediately; this is the first ordered message
>>>>>>> delivered. Delivery of later ordered messages can be computed
>>>>>>> from the ordered seq. The first ordered message can be
>>>>>>> delivered this way because any message with heartbeat seq
>>>>>>> number > seq_last will be received (unless heartbeat's
>>>>>>> retransmission mechanism is broken), so we will not miss any
>>>>>>> later ordered message. Messages with seq less than seq_last
>>>>>>> (out-of-order or retransmitted messages) will be discarded.
>>>>>>
>>>>>> Of course, you can only do this for the first ordered message
>>>>>> from a node after a startup (or after a node returns after
>>>>>> having dropped out of membership).
>>>>> Once we are done with the first ordered message, we can order
>>>>> subsequent messages using the ordered-seq number.
>>>>>> Right?
>>>>>>
>>>>>> But yes, this should work.
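
A rough sketch of the rule discussed above, in Python for brevity. It
is not the committed fix: NodeOrdering and receive are hypothetical
names, a message whose heartbeat seq equals seq_last is treated as
already seen (the thread only covers "greater" and "less"), and
messages ordered before the first delivered one are simply dropped.

```python
class NodeOrdering:
    """Hypothetical per-node state on the receiving side."""

    def __init__(self, seq_last):
        self.seq_last = seq_last      # last heartbeat seq recorded for the node
        self.next_order_seq = None    # unknown until the first ordered message
        self.pending = {}             # order_seq -> payload, waiting for a gap

    def receive(self, hb_seq, order_seq, payload):
        """Return the payloads that can be handed to the application now."""
        if self.next_order_seq is None:
            if hb_seq <= self.seq_last:
                # Out-of-order or retransmitted message from before the
                # recorded point: discard (equality is an assumption here).
                return []
            # First ordered message past seq_last: deliver it immediately
            # and start tracking the ordered seq from here on.
            self.next_order_seq = order_seq + 1
            return [payload]

        if order_seq < self.next_order_seq:
            return []                 # already delivered or superseded

        # From now on, order purely by the ordered seq number.
        self.pending[order_seq] = payload
        delivered = []
        while self.next_order_seq in self.pending:
            delivered.append(self.pending.pop(self.next_order_seq))
            self.next_order_seq += 1
        return delivered
```

The state would be reset (a fresh NodeOrdering) whenever the node
starts up or returns to membership, so the immediate-delivery case
only ever applies to the first ordered message.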
>>>>
>>>> This sounds like a simple fix to a problem that looked complicated
>>>> when I last looked at it.
>>>>
>>>> Good Job Guochun!
>>>>
>>>> Please code this up and test it. Check for memory leaks (obviously
>>>> ;-)), and then commit it.
>>>>
>>>> Thanks!
>>>>
>>>>
>>>> --
>>>> Alan Robertson <alanr at unix.sh>
>>>>
>>>> "Openness is the foundation and preservative of friendship... Let
>>>> me claim from you at all times your undisguised opinions." -
>>>> William Wilberforce
>>>> _______________________________________________________
>>>> Linux-HA-Dev: Linux-HA-Dev at lists.linux-ha.org
>>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
>>>> Home Page: http://linux-ha.org/
>>> --
>>> Andrew Beekhof
>>>
>>> "Ooo Ahhh, Glenn McRath" - TISM
>>>
--
Andrew Beekhof
"Ooo Ahhh, Glenn McRath" - TISM