[OCF] question about membership in Event Notification API

Guochun Shi gshi at ncsa.uiuc.edu
Wed Mar 9 10:42:46 MST 2005


Hi, list,

I am maintaining Cluster Membership Consensus (CCM), which was written by Ram Pai. I have a few questions about membership events.

The [OCF] Event Notification API Proposal (draft 3) says:

"
     typedef enum {
                OC_EV_MS_INVALID = OC_EV_SET_CLASS(OC_EV_MEMB_CLASS, 0),
                OC_EV_MS_NEW_MEMBERSHIP,
                OC_EV_MS_NOT_PRIMARY,
                OC_EV_MS_PRIMARY_RESTORED,
                OC_EV_MS_EVICTED
        } oc_memb_event_t;


        Membership Events:
        -----------------

        OC_EV_MS_NEW_MEMBERSHIP is delivered to nodes in the primary
        sub-cluster (active node membership) when a membership change
        occurs.

        OC_EV_MS_NOT_PRIMARY is delivered to nodes when membership
        agreement is no longer possible and this node can not
        accurately determine if it is part of the primary sub-cluster
        (active node membership).  For example, this event might be
        delivered in a HiAv cluster to nodes that have lost quorum.

        OC_EV_MS_PRIMARY_RESTORED is delivered when connectivity is
        restored after a transient outage and membership returns to the
        exact same state as it was before the OC_EV_NOT_PRIMARY event.

        OC_EV_MS_EVICTED is delivered when connectivity is restored and
        a new primary sub-cluster (active node membership) has been
        accepted elsewhere in the cluster which no longer includes the
        local node.  If delivered, this will be the last event delivered
        to the called function, and membership notification service
        terminates.

        Applications are not expected to gracefully recover from this
        event.  Usually, there is too much invalid or stale state
        that must be flushed.

        An implementation may choose to handle eviction in its own
        way, and NOT deliver this event.  Most implementations will
        reboot or be killed by their peers.  Delivery of this event
        is optional for implementations that handle eviction by
        alternate means, such as STONITH...
        NOTE: no attempt has been made to allow re-connection of
        an evicted member node.


"
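To make my reading of the quoted text concrete, here is a minimal sketch of how a client might react to each event. The values of OC_EV_MEMB_CLASS and OC_EV_SET_CLASS are placeholders I made up so the fragment compiles on its own, and client_state_t and next_state() are hypothetical names, not part of the draft or of CCM.

```c
/* Placeholder definitions (assumptions, only here so the sketch compiles). */
#define OC_EV_MEMB_CLASS 2
#define OC_EV_SET_CLASS(cl, ev) (((cl) << 16) | (ev))

typedef enum {
    OC_EV_MS_INVALID = OC_EV_SET_CLASS(OC_EV_MEMB_CLASS, 0),
    OC_EV_MS_NEW_MEMBERSHIP,
    OC_EV_MS_NOT_PRIMARY,
    OC_EV_MS_PRIMARY_RESTORED,
    OC_EV_MS_EVICTED
} oc_memb_event_t;

/* Hypothetical client-side states, not defined by the draft. */
typedef enum {
    CLIENT_ACTIVE,      /* in the primary sub-cluster                 */
    CLIENT_SUSPENDED,   /* membership agreement currently impossible  */
    CLIENT_TERMINATED   /* evicted; no further events are expected    */
} client_state_t;

/* Map a membership event to the client's next state. */
client_state_t next_state(client_state_t cur, oc_memb_event_t ev)
{
    if (cur == CLIENT_TERMINATED)
        return cur;                 /* EVICTED is the last event delivered */

    switch (ev) {
    case OC_EV_MS_NEW_MEMBERSHIP:   /* membership changed: rebuild state */
        return CLIENT_ACTIVE;
    case OC_EV_MS_NOT_PRIMARY:      /* e.g. quorum lost: stop acting on membership */
        return CLIENT_SUSPENDED;
    case OC_EV_MS_PRIMARY_RESTORED: /* same membership as before: resume */
        return CLIENT_ACTIVE;
    case OC_EV_MS_EVICTED:
        return CLIENT_TERMINATED;
    default:                        /* OC_EV_MS_INVALID or unknown: ignore */
        return cur;
    }
}
```
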
Since I see some differences between the CCM implementation and the draft's description, I want to make sure I understand it correctly.

1. In CCM, a membership without quorum is delivered as OC_EV_MS_INVALID but is interpreted on the client side as a "NO QUORUM MEMBERSHIP" event. According to the draft, it should be an OC_EV_MS_NOT_PRIMARY event.
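If the two are meant to be equivalent, a client could bridge the difference with a small shim like the one below. This is only a sketch: normalize_ccm_event() is a hypothetical helper, the macro and class values are placeholders, and it assumes CCM uses OC_EV_MS_INVALID for nothing other than the no-quorum case.

```c
/* Placeholder definitions (assumptions, only here so the sketch compiles). */
#define OC_EV_MEMB_CLASS 2
#define OC_EV_SET_CLASS(cl, ev) (((cl) << 16) | (ev))

typedef enum {
    OC_EV_MS_INVALID = OC_EV_SET_CLASS(OC_EV_MEMB_CLASS, 0),
    OC_EV_MS_NEW_MEMBERSHIP,
    OC_EV_MS_NOT_PRIMARY,
    OC_EV_MS_PRIMARY_RESTORED,
    OC_EV_MS_EVICTED
} oc_memb_event_t;

/* Translate CCM's current no-quorum delivery into the event the draft
 * describes; every other event passes through unchanged. */
oc_memb_event_t normalize_ccm_event(oc_memb_event_t ev)
{
    return (ev == OC_EV_MS_INVALID) ? OC_EV_MS_NOT_PRIMARY : ev;
}
```
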

2. What is the purpose of the OC_EV_MS_PRIMARY_RESTORED event? We can always deliver an OC_EV_MS_NEW_MEMBERSHIP event when connectivity is restored (which means that some nodes rejoin after having left us when connectivity was lost). I don't see much use for this event on the client side.
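The only distinction I can read out of the draft is "exact same state as before". If only OC_EV_MS_NEW_MEMBERSHIP were delivered, a client that cared about that distinction would have to keep the old membership and compare it itself, roughly as sketched here. memb_snapshot_t and same_membership() are hypothetical, and the sketch assumes node ids are stored in sorted order.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical snapshot of a membership: node ids in ascending order. */
typedef struct {
    size_t          n;    /* number of member nodes */
    const unsigned *ids;  /* sorted node ids        */
} memb_snapshot_t;

/* True when both snapshots contain exactly the same nodes. */
bool same_membership(const memb_snapshot_t *a, const memb_snapshot_t *b)
{
    if (a->n != b->n)
        return false;
    return a->n == 0 ||
           memcmp(a->ids, b->ids, a->n * sizeof *a->ids) == 0;
}
```
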

thanks
-Guochun







