[LinuxFailSafe] crsd problem

rf@q-leap.de
Mon, 23 Jun 2003 11:35:39 +0200


>>>>> "Lars" == Lars Marowsky-Bree <lmb@suse.de> writes:

    Lars> On 2003-06-06T21:43:06, rf@q-leap.de said:

    >> We have a problem with the crsd daemon. Approximately 6 days after
    >> crsd has started, its IPC communication always fails. This can have
    >> the unfortunate effect that resetting no longer works when a
    >> failover has to be done. The fact that this always happens after the
    >> same time period suggests that some integer counter is overflowing.
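
As a rough sanity check on the overflow hypothesis, the sketch below computes how long a counter of a given width takes to wrap at a given tick rate. The widths and tick rates are illustrative assumptions, not values taken from crsd, and the helper name wrap_days is hypothetical.

```python
SECONDS_PER_DAY = 86400

def wrap_days(bits, ticks_per_second):
    """Days until an unsigned counter with `bits` bits wraps at the given tick rate."""
    return (2 ** bits) / ticks_per_second / SECONDS_PER_DAY

# Illustrative combinations only; none of these is known to be what crsd uses.
for bits, hz in [(32, 100), (31, 1000), (29, 1000), (19, 1)]:
    print(f"{bits}-bit counter at {hz} ticks/s wraps after {wrap_days(bits, hz):.2f} days")
```

A period close to six days would match, for example, about 2^19 seconds or about 2^29 milliseconds. Whether crsd actually keeps such a counter is not established here.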

    Lars> Does anything clean out the /tmp directory and remove the IPC
    Lars> socket...?

No, the IPC file is still there after the error message (it is not a socket but
an mmapped file):

-rwx------    1 root     root         8220 Jun  4 19:14 /var/run/failsafe/comm/crsd-ipc_ha-test-1
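
The same check can be scripted. This is a minimal sketch that reports whether a path exists and whether it is a regular file or a socket. The helper name ipc_file_kind is hypothetical; the path is the one from the listing above.

```python
import os
import stat

def ipc_file_kind(path):
    """Return 'missing', 'socket', 'regular', or 'other' for the given path."""
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return "missing"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISREG(mode):
        return "regular"
    return "other"

# Path taken from the ls output above; run as a user allowed to read that directory.
print(ipc_file_kind("/var/run/failsafe/comm/crsd-ipc_ha-test-1"))
```

The leading "-" in the ls mode string already indicates a regular file, so on the affected node this would be expected to report "regular" rather than "socket".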