[OCF] Re: question about membership in Event Notification API
Ram
linuxram at us.ibm.com
Wed Mar 9 12:44:02 MST 2005
On Wed, 2005-03-09 at 09:42, Guochun Shi wrote:
> Hi, list,
>
> I am maintaining Cluster Membership Consensus (CCM) which was written by Ram Pai. I have a few questions about membership events.
>
> In [OCF]Event Notification API Proposal (draft 3)
>
> "
> typedef enum {
> OC_EV_MS_INVALID = OC_EV_SET_CLASS(OC_EV_MEMB_CLASS, 0),
> OC_EV_MS_NEW_MEMBERSHIP,
> OC_EV_MS_NOT_PRIMARY,
> OC_EV_MS_PRIMARY_RESTORED,
> OC_EV_MS_EVICTED
> } oc_memb_event_t;
>
>
> Membership Events:
> -----------------
>
> OC_EV_MS_NEW_MEMBERSHIP is delivered to nodes in the primary
> sub-cluster (active node membership) when a membership change
> occurs.
>
> OC_EV_MS_NOT_PRIMARY is delivered to nodes when membership
> agreement is no longer possible and this node can not
> accurately determine if it is part of the primary sub-cluster
> (active node membership). For example, this event might be
> delivered in a HiAv cluster to nodes that have lost quorum.
>
> OC_EV_MS_PRIMARY_RESTORED is delivered when connectivity is
> restored after a transient outage and membership returns to the
> exact same state as it was before the OC_EV_MS_NOT_PRIMARY event.
>
> OC_EV_MS_EVICTED is delivered when connectivity is restored and
> a new primary sub-cluster (active node membership) has been
> accepted elsewhere in the cluster which no longer includes the
> local node. If delivered, this will be the last event delivered
> to the called function, and membership notification service
> terminates.
>
> Applications are not expected to gracefully recover from this
> event. Usually, there is too much invalid or stale state
> that must be flushed.
>
> An implementation may choose to handle eviction in its own
> way, and NOT deliver this event. Most implementations will
> reboot or be killed by their peers. Delivery of this event
> is optional for implementations that handle eviction by
> alternate means, such as STONITH...
> NOTE: no attempt has been made to allow re-connection of
> an evicted member node.
>
>
> "
> Since I see some differences between the CCM implementation and the draft description, I want to make sure I understand it correctly.
>
> 1. In CCM, a membership without quorum is delivered as OC_EV_MS_INVALID but interpreted on the client side as a "NO QUORUM MEMBERSHIP"
> event. According to the draft, it should be an OC_EV_MS_NOT_PRIMARY event.
Brushing up my rather old memory: there is a difference between
OC_EV_MS_INVALID and OC_EV_MS_NOT_PRIMARY.
OC_EV_MS_INVALID: means I am certain that I am not part of any
membership.
OC_EV_MS_NOT_PRIMARY: means I am in a transient state and I am not
exactly sure about the status of my membership. An event of this kind
will be followed by either OC_EV_MS_INVALID or
OC_EV_MS_PRIMARY_RESTORED.
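
To make the distinction concrete, here is a rough sketch of how a client
might map each event to how sure it is about its own membership. The
memb_certainty_t type and the memb_certainty() helper are made up for
illustration, and the enum uses plain stand-in values because
OC_EV_SET_CLASS() is not reproduced here. The NEW_MEMBERSHIP and EVICTED
cases follow the draft text quoted above, not CCM's actual behaviour.

```c
#include <assert.h>

/* Stand-in values: the draft derives the first one with
 * OC_EV_SET_CLASS(OC_EV_MEMB_CLASS, 0), which is not shown here. */
typedef enum {
    OC_EV_MS_INVALID = 0,
    OC_EV_MS_NEW_MEMBERSHIP,
    OC_EV_MS_NOT_PRIMARY,
    OC_EV_MS_PRIMARY_RESTORED,
    OC_EV_MS_EVICTED
} oc_memb_event_t;

/* Hypothetical: what the local node knows about its own membership. */
typedef enum {
    MEMB_NO,      /* sure it is not part of any membership */
    MEMB_UNSURE,  /* transient state, status not known yet */
    MEMB_YES      /* sure it is part of the current membership */
} memb_certainty_t;

/* Hypothetical helper: classify an event by certainty. */
static memb_certainty_t memb_certainty(oc_memb_event_t ev)
{
    switch (ev) {
    case OC_EV_MS_NEW_MEMBERSHIP:   /* delivered to the primary sub-cluster */
    case OC_EV_MS_PRIMARY_RESTORED: /* back to the pre-outage membership */
        return MEMB_YES;
    case OC_EV_MS_NOT_PRIMARY:      /* a follow-up event is still expected */
        return MEMB_UNSURE;
    case OC_EV_MS_INVALID:
    case OC_EV_MS_EVICTED:          /* new primary excludes the local node */
    default:
        return MEMB_NO;
    }
}
```

A client that sees MEMB_UNSURE would simply wait for the follow-up event
before acting on the membership.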
>
> 2. What's the purpose of having the OC_EV_MS_PRIMARY_RESTORED event? We can always deliver an OC_EV_MS_NEW_MEMBERSHIP if connectivity is restored (which means some nodes join because they left us when the connectivity was lost). I don't see much use for this event on the client side.
>
No. If you lose connectivity with the rest of the cluster, later
regain it, and realize that the rest of the cluster has gone through
further membership transitions, then you are essentially evicted from
the cluster: OC_EV_MS_EVICTED. However, if the rest of the cluster has
not gone through any further transition and still sees you as
belonging to the current membership, then you just treat yourself as
being part of the membership. So you send an OC_EV_MS_PRIMARY_RESTORED
event as a follow-up to the OC_EV_MS_NOT_PRIMARY event.
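
As a rough sketch of that decision (not the CCM code), the choice of
follow-up event once connectivity comes back could look like the
following. The function name and its two boolean inputs are made up for
illustration, and the enum again uses stand-in values. Treating "no
further transition, but the peers no longer list me" as an eviction is
my assumption; the description above does not cover that case.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in values: the draft derives the first one with
 * OC_EV_SET_CLASS(OC_EV_MEMB_CLASS, 0), which is not shown here. */
typedef enum {
    OC_EV_MS_INVALID = 0,
    OC_EV_MS_NEW_MEMBERSHIP,
    OC_EV_MS_NOT_PRIMARY,
    OC_EV_MS_PRIMARY_RESTORED,
    OC_EV_MS_EVICTED
} oc_memb_event_t;

/*
 * Hypothetical helper: pick the event that follows OC_EV_MS_NOT_PRIMARY
 * once connectivity to the rest of the cluster is back.
 *
 * peers_moved_on      - the rest of the cluster went through a further
 *                       membership transition while we were cut off
 * peers_still_list_me - the peers still see this node as a member
 */
static oc_memb_event_t event_after_reconnect(bool peers_moved_on,
                                             bool peers_still_list_me)
{
    /* Nothing changed while we were away: resume the old membership. */
    if (!peers_moved_on && peers_still_list_me)
        return OC_EV_MS_PRIMARY_RESTORED;

    /* The cluster moved on without us (or dropped us: an assumption). */
    return OC_EV_MS_EVICTED;
}
```

This is why OC_EV_MS_NEW_MEMBERSHIP is not enough on its own: the client
can tell "same membership as before the outage" apart from a real change.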
RP
> thanks
> -Guochun