[Linux-ha-dev] total ordering protocol in multicast
gshi at ncsa.uiuc.edu
Wed Jun 9 12:59:58 MDT 2004
In Monday's meeting we discussed Spread ,TIPC, and the possibility they can be used in heartbeat. Here I want to continue discussion on total ordering protocol
Generally there are two familis of protocols
The first one is so-called "pre-transmission" protocols in that one node needs to contend for an ordering capability to order messages. The Amoeba system, Totem protocol and Isis all belongs to this category.
The Amoeba system has a central controller, every message is sent to it via point to point communication and the central controller put a global sequence number in it and
multicast it. The Totem protocol uses a revolving token that holds a sequence-number for messages. The holder of the token can emit one or more multicast messages and update
the token sequence number accordingly. The Isis ABCAST protocol also employs a token holder within each group of communication processes. ABCAST messages are multicast at will, and their delivery is delayed by all receiving processes except for the token holder. Periodically, the token holder sends a message indicating its order of delivery for all received ABCAST messages, and all the other processes comply with it.
Spread is similar to Totem protocol according to Bob's description.
The advantage of "pre-transmission" protocol is that once the ordering capability is obtained, it is simple and efficient to order all messages. The disadvantage is that it has a bottleneck, either the token or the central-controller. ABCAST and Abmoeba also require extra messages to place an order to messages.
The other category is so-called "post-transmission" protocols that are completely distributed algorithms that build a total order from the local information and reach agreement. Trans/total and Transis belong to this. In Trans protocol ACKs are added into the next message one node is to multicast, thus eliminate pure ACK messages. ACK for a message is only sent once. If one node get an ACK for a message but have not get the message, an NACK will be send out. The causal ordering is constructed by chains of ACKs and NACKs. A simple example:
A1 is send out(A denotes process, 1 is the local seq number)
B1, a1 is send out (a1 denotes ACK for messageA1)
upon receiving the message C1,b1 and no NACK is send out, we can be sure that this node have received all A1 B1 C1 messages because C1 ACKs B1 and B1 ACKs A1. Besides that we have a casual ordering of the message A1->B1->C1
Totol is the protocol to compute a total order based the causal order computed in the Trans, it is complicated but requires no extra messages.
Transis is a derived protocol from the Trans/total; it slightly modifies trans protocol and uses a different method (still complicated) to compute a total order.
The advantage of post-transmission is that every node can transmit at will and compute the total order based on the local received messages. No extra message is required thus save bandwidth. The disadvantage is that it needs messages from some number of nodes ( this number is portion of the total node numbers) to compute the total order thus there is a delay between receive and delivery of a message
comments are welcome
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the Linux-HA-Dev