[Linux-ha-dev] 2.0.8 bug in handling of clock wrap in longclock.c

Lars Marowsky-Bree lmb at suse.de
Mon Jun 18 08:18:08 MDT 2007


On 2007-06-18T14:54:42, David Lee <t.d.lee at durham.ac.uk> wrote:

> But it sounds as though "time_longclock()" DOES change data; that it has a
> "write" side-effect.  If so, then "cl_log()" should not be calling an
> anticipated read-only function (such as "time_longclock()") in the first
> place, should it?

Well, I'd argue it more the other way around, but for the same reason -
time_longclock() shouldn't call cl_log(), because it's quite obvious
that cl_log needs to generate a timestamp.

However, the fix does avoid that recursion, by moving the possibly
recursive call to a location where the need for recursion has been
removed.

> There is ambiguity about "time_longclock()".  The declaration in
> "include/clplumbing/longclock.h" simply says:
>        Returns current time as a longclock_t.
> 
> That, to me, sounds "read-only".  But clearly, it has "write" side-effects
> of some sort.
> 
> The description and action of "time_longclock()" are thus mutually
> contradictory.
> 
> Shouldn't THAT be the issue that we try to fix?
> 
> Glancing through the overall source code suggests that time_longclock()'s
> many callers are generally expecting an innocent, read-only behaviour, not
> something that has data-change side-effects.
> 
> Perhaps "time_longclock()" should become truly read-only.  If some sort of
> data-change is required by a subset of its callers, then perhaps a
> parallel function (or a flag to the same function) should be designed, for
> use specifically by that subset of its current callers which really do
> need to effect some sort of change.

It _is_ required, for essentially all callers - wrapping of the
longclock type must be handled, and for that, cl_longclock must keep
(and modify) state.

cl_log() also shouldn't have any side effects, in an ideal world -
cl_longclock shouldn't have to expect to be recursively called, no? To
the callers, it must look like neither one of them does. However, we're
deep within interdependent core code here, and while we can present
that impression to external callers, internally we have further
considerations to take into account.

I think the fix we're looking at is actually pretty close to the best
one, short of removing the cl_log() message there completely. (Who cares
about the wrap? ;-)

Introducing a special set of APIs is not going to be any less pleasant,
because then we have to track which API to call. We've had similar
issues with cl_log being called within the IPC socket code.



Regards,
    Lars

-- 
Teamlead Kernel, SuSE Labs, Research and Development
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde



More information about the Linux-HA-Dev mailing list