[OCF] Re: [Linux-ha-dev] Adding "reload" to the OCF specification
Alan Robertson
alanr at unix.sh
Thu Jun 15 09:58:37 MDT 2006
Lars Marowsky-Bree wrote:
> On 2006-06-15T08:56:00, Alan Robertson <alanr at unix.sh> wrote:
>
>> Many LSB init scripts implement a 'reload' action which permits them to
>> reread their configurations without interrupting service.
>>
>> By design, OCF spec is upwards-compatible with the LSB.
>>
>> I think it would be good to specifically add the reload operation to the
>> OCF spec.
>>
>> Saying something like this:
>> ----------------------------------------------------------------------
>> The reload operation is an optional operation which can be supported by
>> OCF resource agents. This operation will cause the resource to examine
>> its parameters and its configuration files, and continue running with
>> these new configuration values, without interrupting service in a way
>> which is visible to resources which depend on it.
>>
>> If an OCF resource agent wishes to support the reload operation, it is
>> required to list it in the <operations> section of the metadata given by
>> the meta-data operation.
>>
>> Even though a resource supports a reload operation, a conforming cluster
>> manager need not make use of it. Of course, if it does, then service
>> updates can be made with fewer service interruptions, so this is likely
>> to be seen as a desirable feature.
>> ----------------------------------------------------------------------
>>
>> And as to whether there are resource agents which could make use of this
>> feature - the answer is yes. I have a customer who would like such a
>> capability today. At the moment, they go far out of their way to work
>> around not having it.
>>
>> As written, this is optional for both resource agents and cluster
>> managers, and it seems like a reasonable addition to the OCF spec.
>>
>> My guess is that this would be an easy addition for many cluster
>> managers to support. Of course, nothing is impossible to him who doesn't
>> have to do it.
>
>
> 1. The administrator of course may not change the parameters so much
> that the RA can no longer identify the already running resource
> instance. (Do we need to provide hints in the meta data to identify
> which parameters are safe to change and which are not?)
Good question...
One possible answer...
If the RA can't identify the resource after they're changed, then it
shouldn't claim to support reload ;-)
Another possible answer:
IIRC, some of the RA parameters are marked <unique> in the metadata
indicating that together they uniquely identify the resource. If you
changed any of those, then it would need to be stopped then restarted
instead. [I'm aware that this would cause us (heartbeat/CRM) some more
difficulty in implementing it].
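
A minimal sketch of this second answer, as a cluster manager might apply it: reload only when the agent advertises the operation and none of the changed parameters is marked unique; otherwise fall back to stop and start. The function name, the sample metadata and the parameter names are hypothetical. The sketch also assumes the metadata lists operations as <action name="..."> elements and marks identifying parameters with unique="1"; adjust the element names if the actual metadata differs.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata, for illustration only. The element and attribute
# names (<parameter unique="1">, <action name="reload">) are assumptions.
SAMPLE_METADATA = """\
<resource-agent name="example">
  <parameters>
    <parameter name="ip" unique="1"/>
    <parameter name="log_level" unique="0"/>
  </parameters>
  <actions>
    <action name="start"/>
    <action name="stop"/>
    <action name="reload"/>
  </actions>
</resource-agent>
"""


def choose_action(metadata_xml, old_params, new_params):
    """Return "none", "reload", or "restart" for a parameter change."""
    root = ET.fromstring(metadata_xml)

    # Parameters that together identify the running resource instance.
    unique = {
        p.get("name") for p in root.iter("parameter") if p.get("unique") == "1"
    }
    # The agent must list reload in its metadata to claim support for it.
    supports_reload = any(
        a.get("name") == "reload" for a in root.iter("action")
    )

    # Names whose value was added, removed, or modified.
    changed = {
        name
        for name in set(old_params) | set(new_params)
        if old_params.get(name) != new_params.get(name)
    }

    if not changed:
        return "none"
    if not supports_reload or changed & unique:
        # Changing an identifying parameter means the agent could no longer
        # find the running instance, so stop and start it instead.
        return "restart"
    return "reload"
```

With the sample metadata, changing only log_level would select "reload", while changing ip would select "restart".
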
> 2. If reload fails, should the resource be treated as failed, or merely
> the reload? (I'm inclined towards the first one, it's easier to code I
> think ;-)
I agree with your assessment.
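
Treating a failed reload as a failed resource, rather than only a failed operation, could look roughly like this. The function name, the state strings and the run_action callable are hypothetical; the only thing borrowed from OCF is the convention that exit code 0 means success.

```python
OCF_SUCCESS = 0  # OCF convention: exit code 0 means the action succeeded


def reload_resource(run_action):
    """Run 'reload' and return the state the cluster manager should record.

    run_action is a hypothetical callable that invokes the resource agent
    with the given action name and returns its exit code.
    """
    exit_code = run_action("reload")
    if exit_code == OCF_SUCCESS:
        return "running"
    # Policy from the thread: a failed reload marks the whole resource as
    # failed, so the manager's ordinary failure recovery takes over.
    return "failed"
```

The appeal of this choice is that the manager needs no separate "reload failed but resource fine" state; whatever it already does for a failed resource applies unchanged.
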
> 3. You say "without this being visible to the resources depending on
> it", but, then, if it is not visible at _all_ (ie, _absolutely_ no
> observable change _anywhere_), there wouldn't be a point in reloading
> it, would there?
What I meant was without it requiring the resources depending on it to
be restarted. That's a better phrase I think... Of course, it has some
effect, but it shouldn't affect the dependent resources negatively.
> I think in the interest of purity & simplicity, this question could be
> phrased as: "If it doesn't have an impact on our view of the cluster,
> why do we need to care - can't this be done outside our control then?"
No. If it involves an RA parameter, those have to be updated _through_
the system. That's the case for the customer I have in mind. What
they're doing now would curl your hair - even as short as you keep your
hair ;-)
> It's somewhat similar to the question as to whether or not we should
> have a "yellow" monitor result: Yes, it would go some way to make the
> cluster more of a general management tool, but it is also not strictly
> necessary; the stance you took back then was that you didn't want that,
> if I wasn't mistaken. We left this health monitoring to more dedicated
> apps. Continuing that line of thought seems to suggest that we don't
> need "reload" either.
This DOES change things - in real life. Stopping and restarting a
resource is sometimes a HUGE (30 minute) job. The ability to avoid this
is a big deal in some cases.
> I don't want to nit-pick here, but I'd like to understand whether your
> thinking has changed in general here, in which case I'd like to re-raise
> the point of the yellow "warning" monitor status too ;-) Or in what case
> this feature is different?
I don't see the connection between these two -- at all.
One is 'I know I need to notify the RA that something has changed, but I
want to avoid a half-hour needed to restart the resource and everything
which depends on it'. This is a real issue. With real impacts, and a
clear real usefulness. And, it's done at an administrator's request.
If you want to have the red-green-yellow argument for resources again,
we can have it again if you really want to. I'm clearer on it than I
was last time we had that discussion. But, it is TOTALLY unrelated to
this issue, and there is absolutely no reason to drag it in to muddy the
waters here.
This proposal should be considered on its own merits, not by false
association with some other proposal. That's a form of guilt by (false)
association.
--
Alan Robertson <alanr at unix.sh>
"Openness is the foundation and preservative of friendship... Let me
claim from you at all times your undisguised opinions." - William
Wilberforce