[OCF] question about membership in Event Notification API
Guochun Shi
gshi at ncsa.uiuc.edu
Wed Mar 9 10:42:46 MST 2005
Hi, list,
I am maintaining Cluster Membership Consensus (CCM), which was written by Ram Pai. I have a few questions about membership events.
The [OCF] Event Notification API Proposal (draft 3) says:
"
typedef enum {
OC_EV_MS_INVALID = OC_EV_SET_CLASS(OC_EV_MEMB_CLASS, 0),
OC_EV_MS_NEW_MEMBERSHIP,
OC_EV_MS_NOT_PRIMARY,
OC_EV_MS_PRIMARY_RESTORED,
OC_EV_MS_EVICTED
} oc_memb_event_t;
Membership Events:
-----------------
OC_EV_MS_NEW_MEMBERSHIP is delivered to nodes in the primary
sub-cluster (active node membership) when a membership change
occurs.
OC_EV_MS_NOT_PRIMARY is delivered to nodes when membership
agreement is no longer possible and this node can not
accurately determine if it is part of the primary sub-cluster
(active node membership). For example, this event might be
delivered in a HiAv cluster to nodes that have lost quorum.
OC_EV_MS_PRIMARY_RESTORED is delivered when connectivity is
restored after a transient outage and membership returns to the
exact same state as it was before the OC_EV_NOT_PRIMARY event.
OC_EV_MS_EVICTED is delivered when connectivity is restored and
a new primary sub-cluster (active node membership) has been
accepted elsewhere in the cluster which no longer includes the
local node. If delivered, this will be the last event delivered
to the called function, and membership notification service
terminates.
Applications are not expected to gracefully recover from this
event. Usually, there is too much invalid or stale state
that must be flushed.
An implementation may choose to handle eviction in its own
way, and NOT deliver this event. Most implementations will
reboot or be killed by their peers. Delivery of this event
is optional for implementations that handle eviction by
alternate means, such as STONITH...
NOTE: no attempt has been made to allow re-connection of
an evicted member node.
"
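To make the quoted semantics concrete, here is a minimal sketch of how a client might react to each event. The macro definitions, the client_state_t struct, and handle_memb_event() are hypothetical stand-ins for illustration only; they are not taken from the draft or from the CCM code. The epoch counter is just one way to show that, per the draft text, OC_EV_MS_PRIMARY_RESTORED returns to the same membership while OC_EV_MS_NEW_MEMBERSHIP does not.

```c
#include <stdbool.h>

/* Assumed stand-ins: the real class value and bit layout live in the
 * OCF header and may differ. */
#define OC_EV_MEMB_CLASS 1
#define OC_EV_SET_CLASS(cls, ev) (((cls) << 16) | ((ev) & 0xFFFF))

typedef enum {
    OC_EV_MS_INVALID = OC_EV_SET_CLASS(OC_EV_MEMB_CLASS, 0),
    OC_EV_MS_NEW_MEMBERSHIP,
    OC_EV_MS_NOT_PRIMARY,
    OC_EV_MS_PRIMARY_RESTORED,
    OC_EV_MS_EVICTED
} oc_memb_event_t;

/* Hypothetical client-side state, not part of the draft. */
typedef struct {
    bool in_primary;  /* may the client act on the current membership? */
    bool terminated;  /* set once OC_EV_MS_EVICTED has been seen */
    unsigned epoch;   /* bumped each time a new membership is adopted */
} client_state_t;

/* Apply one event to the client state and return a short label. */
const char *handle_memb_event(client_state_t *st, oc_memb_event_t ev)
{
    if (st->terminated) {
        /* The draft says EVICTED is the last event delivered. */
        return "ignored: service terminated";
    }
    switch (ev) {
    case OC_EV_MS_NEW_MEMBERSHIP:
        st->in_primary = true;
        st->epoch++;            /* membership changed: rebuild state */
        return "new membership";
    case OC_EV_MS_NOT_PRIMARY:
        st->in_primary = false; /* e.g. quorum lost: stop acting */
        return "not primary";
    case OC_EV_MS_PRIMARY_RESTORED:
        st->in_primary = true;  /* same membership as before: epoch kept */
        return "primary restored";
    case OC_EV_MS_EVICTED:
        st->in_primary = false;
        st->terminated = true;
        return "evicted";
    case OC_EV_MS_INVALID:
    default:
        return "invalid";
    }
}
```

In this sketch a client that sees NOT_PRIMARY followed by PRIMARY_RESTORED resumes with its epoch unchanged, whereas NEW_MEMBERSHIP always bumps it.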
Since I see some differences between the CCM implementation and the draft description, I want to make sure I understand the draft correctly.
1. In CCM, a membership without quorum is delivered as OC_EV_MS_INVALID but interpreted on the client side as a "NO QUORUM MEMBERSHIP" event. According to the draft, it should be an OC_EV_MS_NOT_PRIMARY event.
2. What is the purpose of the OC_EV_MS_PRIMARY_RESTORED event? We can always deliver an OC_EV_MS_NEW_MEMBERSHIP event when connectivity is restored (which means some nodes rejoin, since they left us when connectivity was lost). I don't see much use for this event on the client side.
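Both questions come down to which event the membership layer picks for a given situation. The following is only a sketch of one possible reading of the draft, not how CCM works today: the EV_* names stand in for the corresponding OC_EV_MS_* values, and memb_view_t, same_view(), and choose_event() are hypothetical helpers. It also treats loss of quorum as the only trigger for NOT_PRIMARY, which the draft gives merely as an example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical short names for the draft's OC_EV_MS_* events. */
typedef enum {
    EV_NEW_MEMBERSHIP,   /* stands for OC_EV_MS_NEW_MEMBERSHIP */
    EV_NOT_PRIMARY,      /* stands for OC_EV_MS_NOT_PRIMARY */
    EV_PRIMARY_RESTORED  /* stands for OC_EV_MS_PRIMARY_RESTORED */
} memb_event_kind_t;

/* A membership view: node ids kept in sorted order (assumption). */
typedef struct {
    const int *nodes;
    size_t count;
} memb_view_t;

/* Two views match if they list exactly the same sorted node ids. */
bool same_view(const memb_view_t *a, const memb_view_t *b)
{
    return a->count == b->count &&
           (a->count == 0 ||
            memcmp(a->nodes, b->nodes, a->count * sizeof(int)) == 0);
}

/* Pick the event to deliver for the current view.
 * was_primary: whether the last delivered event left us in the primary.
 * last_primary: the view we held before any NOT_PRIMARY event. */
memb_event_kind_t choose_event(bool has_quorum, bool was_primary,
                               const memb_view_t *last_primary,
                               const memb_view_t *current)
{
    if (!has_quorum) {
        return EV_NOT_PRIMARY;       /* question 1: not INVALID */
    }
    if (!was_primary && same_view(last_primary, current)) {
        return EV_PRIMARY_RESTORED;  /* question 2: nothing changed */
    }
    return EV_NEW_MEMBERSHIP;
}
```

Under this reading, PRIMARY_RESTORED only differs from NEW_MEMBERSHIP in telling the client that the view is identical to the one it already had, so the client could skip rebuilding its state.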
thanks
-Guochun