hot-standby services: WAS How do I set up a resource as "secondary only"?

Alan Robertson alanr@unix.sh
Tue, 15 Oct 2002 14:55:23 -0600


This discussion is slightly academic, but will hopefully be somewhat helpful.

We currently implement failover services in linux-ha.

I define the term failover service to be a "cold standby" application.  That 
is, the service runs on exactly one machine at a time.

There is a second kind of service that one can define (telecom folks do 
this): the hot-standby service.

A hot-standby service tries to always run on every "eligible" node in the 
cluster at once.  But it does not run in the same mode on every node in the 
cluster.

One node is called the active node, and the other nodes are called the 
standby nodes.

Such services are started in standby mode when the cluster service comes up.
Rather than ensuring that no more than one node in the cluster is running 
the service, the cluster manager must ensure that no more than one node is 
running as primary (that is, in active mode) for the given service.

When the active role moves off a node, the cluster manager switches the 
service on that node to standby instead of stopping it.
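
As a rough illustration of that rule, here is a minimal Python sketch of how 
a cluster manager might plan a switchover.  Nothing here is existing linux-ha 
code: the function names plan_switchover and apply_steps, the node names, and 
the dictionary of per-node modes are all assumptions made for the example. 
The old active node is switched to standby first, and only then is the new 
node switched to active, so at no point are two nodes active.

```python
from typing import Dict, List, Tuple

# Mode names taken from the post; everything else is hypothetical.
ACTIVE, STANDBY = "active", "standby"


def plan_switchover(modes: Dict[str, str], new_active: str) -> List[Tuple[str, str]]:
    """Return ordered (node, operation) steps that make new_active the only
    active node.  Current active nodes are switched to standby, not stopped."""
    if new_active not in modes:
        raise KeyError(f"unknown node: {new_active}")
    # Demote first, so that two nodes are never active at the same time.
    steps = [(node, "standby")
             for node, mode in sorted(modes.items())
             if mode == ACTIVE and node != new_active]
    if modes[new_active] != ACTIVE:
        steps.append((new_active, "active"))
    return steps


def apply_steps(modes: Dict[str, str], steps: List[Tuple[str, str]]) -> Dict[str, str]:
    """Apply the planned steps to a copy of the per-node mode table."""
    result = dict(modes)
    for node, operation in steps:
        result[node] = ACTIVE if operation == "active" else STANDBY
    return result


if __name__ == "__main__":
    cluster = {"node1": ACTIVE, "node2": STANDBY, "node3": STANDBY}
    steps = plan_switchover(cluster, "node2")
    print(steps)
    print(apply_steps(cluster, steps))
```

A real cluster manager would also have to handle a step that fails partway 
through; the sketch ignores that.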

The operations on such a resource might be:

	standby      - begin service in / switch service to standby mode
	active       - begin service in / switch service to active mode
	stop         - stop running service completely
	status       - return active, standby or stopped
	monitor      - is service running correctly in the current mode?
	canprimary   - can this service become primary now?
	cansecondary - can this service become secondary now?
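
To make the proposed operations concrete, here is a toy in-memory model in 
Python.  It is only a sketch under stated assumptions: the class name 
HotStandbyResource and the healthy and in_sync flags are hypothetical 
stand-ins for whatever a real resource would check, and the rule that 
canprimary requires an up-to-date replica is my guess at one plausible 
policy, not part of the proposal.

```python
from enum import Enum


class Mode(Enum):
    STOPPED = "stopped"
    STANDBY = "standby"
    ACTIVE = "active"


class HotStandbyResource:
    """Toy model of a resource with the operations listed above."""

    def __init__(self, healthy: bool = True, in_sync: bool = True) -> None:
        self.mode = Mode.STOPPED
        self.healthy = healthy    # hypothetical: service passes its checks
        self.in_sync = in_sync    # hypothetical: replica data is up to date

    def standby(self) -> None:
        """Begin service in, or switch service to, standby mode."""
        if not self.cansecondary():
            raise RuntimeError("cannot become secondary now")
        self.mode = Mode.STANDBY

    def active(self) -> None:
        """Begin service in, or switch service to, active mode."""
        if not self.canprimary():
            raise RuntimeError("cannot become primary now")
        self.mode = Mode.ACTIVE

    def stop(self) -> None:
        """Stop running the service completely."""
        self.mode = Mode.STOPPED

    def status(self) -> str:
        """Return 'active', 'standby' or 'stopped'."""
        return self.mode.value

    def monitor(self) -> bool:
        """Is the service running correctly in its current mode?"""
        return self.mode is not Mode.STOPPED and self.healthy

    def canprimary(self) -> bool:
        return self.healthy and self.in_sync

    def cansecondary(self) -> bool:
        return self.healthy


if __name__ == "__main__":
    r = HotStandbyResource()
    r.standby()           # started in standby when the cluster comes up
    r.active()            # promoted by the cluster manager
    print(r.status(), r.monitor())
```

Note that status reports the mode while monitor reports health, so a cluster 
manager can tell "standby and fine" apart from "standby and broken".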

As I understand it, such a service would meet Nathan's need.  It would also 
work nicely for DRBD, rsync, and other replication-type services.

I believe that such services are useful and, if implemented this way, could 
be completely understood by the cluster manager, rather than being a kludge 
or mystery of any kind as they are now.

LMB, Ragnar: I think we should consider this model for the OCF.

	-- Alan Robertson
	   alanr@unix.sh