[OCF] Re: question about membership in Event Notification API

Guochun Shi gshi at ncsa.uiuc.edu
Wed Mar 9 13:36:10 MST 2005


Ram, thanks for the quick reply.

At 11:44 AM 3/9/2005 -0800, you wrote:
>On Wed, 2005-03-09 at 09:42, Guochun Shi wrote:
>> Hi, list,
>> 
>> I am maintaining Cluster Membership Consensus (CCM), which was written by Ram Pai. I have a few questions about membership events.
>> 
>> In [OCF]Event Notification API Proposal (draft 3) 
>> 
>> "
>>      typedef enum {
>>                 OC_EV_MS_INVALID = OC_EV_SET_CLASS(OC_EV_MEMB_CLASS, 0),
>>                 OC_EV_MS_NEW_MEMBERSHIP,
>>                 OC_EV_MS_NOT_PRIMARY,
>>                 OC_EV_MS_PRIMARY_RESTORED,
>>                 OC_EV_MS_EVICTED
>>         } oc_memb_event_t;
>> 
>> 
>>         Membership Events:
>>         -----------------
>> 
>>         OC_EV_MS_NEW_MEMBERSHIP is delivered to nodes in the primary
>>         sub-cluster (active node membership) when a membership change
>>         occurs.
>> 
>>         OC_EV_MS_NOT_PRIMARY is delivered to nodes when membership
>>         agreement is no longer possible and this node can not
>>         accurately determine if it is part of the primary sub-cluster
>>         (active node membership).  For example, this event might be
>>         delivered in a HiAv cluster to nodes that have lost quorum.
>> 
>>         OC_EV_MS_PRIMARY_RESTORED is delivered when connectivity is
>>         restored after a transient outage and membership returns to the
>>         exact same state as it was before the OC_EV_NOT_PRIMARY event.
>> 
>>         OC_EV_MS_EVICTED is delivered when connectivity is restored and
>>         a new primary sub-cluster (active node membership) has been
>>         accepted elsewhere in the cluster which no longer includes the
>>         local node.  If delivered, this will be the last event delivered
>>         to the called function, and membership notification service
>>         terminates.
>> 
>>         Applications are not expected to gracefully recover from this
>>         event.  Usually, there is too much invalid or stale state
>>         that must be flushed.
>> 
>>         An implementation may choose to handle eviction in its own
>>         way, and NOT deliver this event.  Most implementations will
>>         reboot or be killed by their peers.  Delivery of this event
>>         is optional for implementations that handle eviction by
>>         alternate means, such as STONITH...
>>         NOTE: no attempt has been made to allow re-connection of
>>         an evicted member node.
>> 
>> 
>> "
>> Since I see some differences between the CCM implementation and the draft description, I want to make sure I understand it correctly.
>> 
>> 1. In CCM, a membership without quorum is delivered as OC_EV_MS_INVALID but interpreted on the client side as a "NO QUORUM MEMBERSHIP"
>> event. According to the draft, it should be an OC_EV_MS_NOT_PRIMARY event.
>
>brushing my rather old memory: there is a difference between
>OC_EV_MS_INVALID and OC_EV_MS_NOT_PRIMARY
>
>OC_EV_MS_INVALID: means I am exactly sure that I am not part of any
>membership.

In CCM, OC_EV_MS_INVALID is overloaded to mean a no-quorum membership if the special flag is set to true.
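
To make the overloading concrete, here is a rough sketch of how a client could tell the two meanings apart. The enum values, the client_view_t type, interpret_event() and the no_quorum_flag parameter are all hypothetical names I made up for illustration; they are not the actual CCM fields or the draft's API.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-in for the draft's enum. The numeric values are assumptions;
 * the draft derives them from OC_EV_SET_CLASS(OC_EV_MEMB_CLASS, 0). */
typedef enum {
    OC_EV_MS_INVALID = 0,
    OC_EV_MS_NEW_MEMBERSHIP,
    OC_EV_MS_NOT_PRIMARY,
    OC_EV_MS_PRIMARY_RESTORED,
    OC_EV_MS_EVICTED
} oc_memb_event_t;

/* Hypothetical client-side interpretation of a delivered event. */
typedef enum {
    VIEW_NO_MEMBERSHIP,        /* plain OC_EV_MS_INVALID */
    VIEW_NO_QUORUM_MEMBERSHIP, /* OC_EV_MS_INVALID with the flag set */
    VIEW_OTHER                 /* any other event, not examined here */
} client_view_t;

/* no_quorum_flag stands for the "special flag" mentioned above;
 * its real name and location in CCM are not shown here. */
client_view_t interpret_event(oc_memb_event_t ev, bool no_quorum_flag)
{
    assert(ev <= OC_EV_MS_EVICTED);
    if (ev == OC_EV_MS_INVALID)
        return no_quorum_flag ? VIEW_NO_QUORUM_MEMBERSHIP : VIEW_NO_MEMBERSHIP;
    return VIEW_OTHER;
}
```

The point is only that one event code carries two meanings, so the client has to look at extra state to decide which one it got.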


>OC_EV_MS_NOT_PRIMARY: means I am in a transient state and I am not exactly
>sure about the status of my membership. An event of this kind will have
>a follow-up event saying either OC_EV_MS_INVALID or
>OC_EV_MS_PRIMARY_RESTORED
>

OK, so this membership event looks more like an event that invalidates the previous membership.
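
If I read Ram's description right, the follow-up rule could be checked with something like the sketch below. followup_allowed() is a hypothetical helper, the enum values are stand-ins, and the function encodes only Ram's statement as quoted above. The draft text arguably also allows OC_EV_MS_EVICTED once connectivity is restored, which this sketch deliberately does not cover.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-in for the draft's enum; numeric values are assumptions. */
typedef enum {
    OC_EV_MS_INVALID = 0,
    OC_EV_MS_NEW_MEMBERSHIP,
    OC_EV_MS_NOT_PRIMARY,
    OC_EV_MS_PRIMARY_RESTORED,
    OC_EV_MS_EVICTED
} oc_memb_event_t;

/* Returns true if `next` may follow `prev` under Ram's description:
 * after OC_EV_MS_NOT_PRIMARY, only OC_EV_MS_INVALID or
 * OC_EV_MS_PRIMARY_RESTORED is expected. Transitions from any other
 * event are left unconstrained by this sketch. */
bool followup_allowed(oc_memb_event_t prev, oc_memb_event_t next)
{
    assert(prev <= OC_EV_MS_EVICTED && next <= OC_EV_MS_EVICTED);
    if (prev != OC_EV_MS_NOT_PRIMARY)
        return true;
    return next == OC_EV_MS_INVALID || next == OC_EV_MS_PRIMARY_RESTORED;
}
```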

In
"
OC_EV_MS_NOT_PRIMARY is delivered to nodes when membership
         agreement is no longer possible and this node can not
         accurately determine if it is part of the primary sub-cluster
         (active node membership).  For example, this event might be
         delivered in a HiAv cluster to nodes that have lost quorum.
"
the line "For example, this event might be delivered in a HiAv cluster to nodes that have lost quorum" is very confusing,
because I thought a node without quorum would always be delivered an OC_EV_MS_NOT_PRIMARY event.

What events should the membership service deliver in the following cases:

a). Total 3 nodes, only one node is running.
b). Total 3 nodes, all 3 nodes are running.
c). Total 3 nodes n1, n2, n3; the communication between n2 and n3 is broken. The primary membership is either (n1,n2) or (n1,n3).
Let's say it is (n1,n3): what membership event should n1 get? What membership event should n2 get?
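
For the quorum side of these cases, here is the arithmetic I have in mind, assuming quorum means a strict majority of the configured nodes. That is my assumption, not something the draft states, and has_quorum() is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>

/* Strict-majority quorum: a partition has quorum when it holds more than
 * half of the configured nodes. The majority rule itself is an assumption. */
bool has_quorum(unsigned total_nodes, unsigned partition_size)
{
    assert(total_nodes > 0 && partition_size <= total_nodes);
    return 2 * partition_size > total_nodes;
}
```

Under that assumption, case a) (1 of 3) has no quorum, case b) (3 of 3) has quorum, and in case c) the partition (n1,n3) has quorum while n2 on its own does not. Which event each node should receive is still the open question.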






>> 
>> 2. What's the purpose of having the OC_EV_MS_PRIMARY_RESTORED event? We can always deliver an OC_EV_MS_NEW_MEMBERSHIP event when connectivity is restored (which means some nodes rejoin, because they left us when connectivity was lost). I don't see much use for this event on the client side.
>> 
>
>No. If you lose connectivity with the rest of the cluster, and later
>regain connectivity, and realize that the rest of the cluster has
>gone through a further membership transition, then you are essentially
>evicted from the cluster: OC_EV_MS_EVICTED. However, if the rest of
>the cluster has not gone through a further transition and still sees
>you as belonging to the current membership, then you just treat yourself
>as being part of the membership. So you send an OC_EV_MS_PRIMARY_RESTORED
>event as a follow-up to the OC_EV_MS_NOT_PRIMARY event.

If you lose connectivity and become a single-member cluster, and later regain connectivity and find that the cluster still has you in its membership, the cluster is probably malfunctioning.
The edge case is that you detect the loss of connectivity earlier than the separated cluster does; then, before the separated cluster notices the loss and triggers its membership protocol,
connectivity is regained and you detect that before the other nodes do. The chance of this happening is very small, nearly impossible.

Furthermore, defining "regained connectivity" could be tricky, since you need to communicate with other nodes to establish it. Simply triggering a new round of the protocol is simpler IMHO.
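
For reference, the rule Ram describes above boils down to roughly the sketch below. The "generation" counters, the still_listed flag and event_on_reconnect() are hypothetical names for illustration; neither the draft nor CCM is claimed to expose them in this form.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-in for the draft's enum; numeric values are assumptions. */
typedef enum {
    OC_EV_MS_INVALID = 0,
    OC_EV_MS_NEW_MEMBERSHIP,
    OC_EV_MS_NOT_PRIMARY,
    OC_EV_MS_PRIMARY_RESTORED,
    OC_EV_MS_EVICTED
} oc_memb_event_t;

/* local_gen:    membership generation this node last agreed on (hypothetical)
 * cluster_gen:  generation the rest of the cluster is at now (hypothetical)
 * still_listed: whether the cluster's current membership includes this node */
oc_memb_event_t event_on_reconnect(unsigned local_gen, unsigned cluster_gen,
                                   bool still_listed)
{
    assert(local_gen <= cluster_gen);
    /* No further transition and we are still a member: state is unchanged. */
    if (cluster_gen == local_gen && still_listed)
        return OC_EV_MS_PRIMARY_RESTORED;
    /* The cluster moved on without us. */
    return OC_EV_MS_EVICTED;
}
```

Note that the inputs to this check are exactly the part that is hard to obtain, since learning the cluster's current state already requires talking to the other nodes.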

thanks
-Guochun


