[Linux-ha-dev] New master/slave resource agent for DB2 databases in HADR (High Availability Disaster Recovery) mode
Lars Marowsky-Bree
lmb at novell.com
Wed Feb 9 08:01:46 MST 2011
On 2011-02-09T15:35:01, Lars Ellenberg <lars.ellenberg at linbit.com> wrote:
> > So don't do that :-)
> > Put up a wiki page with instructions for how to download+use the new
> > agent and give feedback.
>
> How about a staging area?
> /usr/lib/ocf/resource.d/staging/
>
> we can also add a
> /usr/lib/ocf/resource.d/deprecated/
I like the idea of putting infrastructure in place to ease canary
testing.
But we probably don't even need the symlink; if they've got to change
something anyway, they may as well change the provider from "stable" to
"staging" (and back, if they want to).
If the RA is really backwards compatible, they can do that even w/o
service outage - this should cause a reprobe, and if all is well, the
new RA will just attach to the existing instance.
However, see below.
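
To make the provider switch concrete, here is a minimal sketch. It assumes only the usual OCF convention that a resource written as ocf:<provider>:<agent> is looked up at /usr/lib/ocf/resource.d/<provider>/<agent>. The provider names "stable" and "staging" are the ones proposed in this thread, "db2" is just an example agent name, and the helper names agent_path and switch_provider are made up for illustration; they are not part of any cluster tool.

```python
from pathlib import PurePosixPath

# Directory under which OCF providers live (as quoted in this thread).
OCF_RA_ROOT = PurePosixPath("/usr/lib/ocf/resource.d")


def agent_path(resource_spec: str, root: PurePosixPath = OCF_RA_ROOT) -> PurePosixPath:
    """Map 'ocf:<provider>:<agent>' to the script that would be executed."""
    cls, provider, agent = resource_spec.split(":")
    if cls != "ocf":
        raise ValueError(f"not an OCF resource: {resource_spec!r}")
    return root / provider / agent


def switch_provider(resource_spec: str, new_provider: str) -> str:
    """Return the same resource spec with only the provider replaced."""
    cls, _old_provider, agent = resource_spec.split(":")
    return f"{cls}:{new_provider}:{agent}"


if __name__ == "__main__":
    spec = "ocf:stable:db2"                     # hypothetical current config
    canary = switch_provider(spec, "staging")   # opt in to the canary RA
    print(spec, "->", agent_path(spec))
    print(canary, "->", agent_path(canary))
```

The point of the sketch is that opting in or out is a one-token change in the resource definition, with no per-node file system changes.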
> Once settled, we copy over the staging one to the "real" directory,
> replacing the "original" one, and add a "please fix your config" to the
> thing that remains in staging/, so we will be able to start a further
> rewrite with the next "merge window".
This bit I don't like so much: yes, they should explicitly need to
accept that they're participating in the canary testing of a new
version.
What I'd like to avoid is for them to have to change anything when we
promote the canary-tested version to "stable".
Two ideas:
- If we do this via symlinks (e.g., if they switch to the staging
  branch for an RA, we adjust the symlink), they don't need to change
  anything. We can adjust the symlink back automatically once that
  version has become the stable one.
  Downside: it's a per-node change, and possibly another point of
  divergence in the cluster. Upside: it is a per-node change, so
  rolling upgrades are more readily possible.
- Alternatively, we can add a "version" meta-attribute that is parsed by
  the lrmd/OCF plugin to choose which one to run.
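
The two ideas above could be sketched roughly as follows. This is an illustration under stated assumptions, not how lrmd or any existing tool behaves: the directory layout (a per-agent symlink next to "stable/" and "staging/" directories), the function names, and the meaning of version=staging are all hypothetical. No "version" meta-attribute exists today.

```python
import os
import tempfile


def point_agent_at(link_path: str, target: str) -> None:
    """Idea 1: repoint a per-node symlink at the chosen branch.

    The new link is created under a temporary name and renamed over the
    old one, so on POSIX there is no moment at which the link is missing.
    """
    tmp = f"{link_path}.tmp.{os.getpid()}"
    if os.path.lexists(tmp):
        os.unlink(tmp)
    os.symlink(target, tmp)
    os.replace(tmp, link_path)  # rename() replaces the link itself


def choose_agent(root: str, provider: str, agent: str, meta: dict) -> str:
    """Idea 2: pick the script from a hypothetical 'version' meta-attribute.

    Anything other than version=staging falls back to the configured
    provider, so promoting the canary needs no config change.
    """
    branch = "staging" if meta.get("version") == "staging" else provider
    return os.path.join(root, branch, agent)


def demo() -> tuple:
    """Exercise the symlink variant in a scratch directory."""
    with tempfile.TemporaryDirectory() as root:
        for branch in ("stable", "staging"):
            os.makedirs(os.path.join(root, branch))
            with open(os.path.join(root, branch, "db2"), "w") as fh:
                fh.write(f"#!/bin/sh\n# placeholder {branch} agent\n")
        link = os.path.join(root, "db2")
        point_agent_at(link, "staging/db2")   # node opts in to the canary
        during_canary = os.readlink(link)
        point_agent_at(link, "stable/db2")    # canary promoted, switch back
        after_promotion = os.readlink(link)
        return during_canary, after_promotion


if __name__ == "__main__":
    print(demo())
    print(choose_agent("/usr/lib/ocf/resource.d", "stable", "db2",
                       {"version": "staging"}))
```

Note how the symlink variant keeps its state on each node's file system, while the meta-attribute variant keeps it in the cluster configuration; that is the per-node versus cluster-wide trade-off described above.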
In practice though, neither of these two ideas, nor the staging scheme
as a whole, is without complexity.
I'd really feel more comfortable if we could instead get reasonable test
coverage so that we don't have to be afraid. The "canary testing" is the
icing on the cake, but not a replacement for automatic regression
testing.
Regards,
Lars
--
Architect Storage/HA, OPS Engineering, Novell, Inc.
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde