[Linux-ha-dev] Info: CRM Memory Leaks

Andrew Beekhof abeekhof at suse.de
Fri Feb 9 08:43:50 MST 2007


As many of you may have already read, there are/were a number of  
significant memory leaks in the CRM.

First of all, don't panic.

The bad news is that they've been there for some time. However, that
also means that if you've not noticed any ill effects so far, then
you're unlikely to do so in the future.

The size of the leaks is proportional to:
* cluster size
* number of resources
* cluster activity (i.e. resources restarting, or nodes
joining/leaving the cluster)

If there is no cluster activity, then no further memory will be leaked.


== WORST CASE SCENARIO ==

The worst case scenario is that the crmd process uses up all the  
available memory and is shot.
In such cases, the master heartbeat process will simply respawn us  
and we'll re-join the cluster.

If you're _really_ unlucky, heartbeat will be a fraction too slow
doing this and (if STONITH is enabled) the node may be fenced (to
ensure the node is dead before starting its resources).


== WHAT NOW ==

Thanks to Valgrind, I've fixed all but the smallest leaks - so the  
next version will be as leak free as I can make it (excepting library  
functions that I can't do anything about).

For those who want/need the leaks fixed NOW, I've attached a patch
against 2.0.8.


== NEVER AGAIN ==

The embarrassment of such a leak being present has prompted me to  
make running the CRM under Valgrind exceedingly easy.  Simply give  
the --enable-valgrind option to configure and all 5 CRM processes  
will complain bitterly if they're leaking memory (with a stack trace  
of who allocated it!).  Just remember to start a Valgrind listener on
localhost:1234.


When running tools like Valgrind, please remember that there are  
results that I cannot do anything about. E.g.:
* library functions that leak every time they're called
* library functions that create "global" data and offer no way to  
clean it up^

For this reason, I have created a Valgrind suppression file which I  
can make available if people are interested.

^ Technically this isn't a leak, since it's a fixed size regardless of
the number of times a function is called, but it does make spotting  
real leaks harder.
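
To illustrate the two kinds of results listed above, here is a minimal
sketch in C. All three function names are made up for this example and
are not part of the CRM, heartbeat, or any real library. Under
Valgrind's memcheck, the first pattern would normally be reported as
"definitely lost" and grows with every call, while the second would
normally show up as "still reachable" and stays the same size no
matter how often it is called.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper that leaks on every call: it makes a private
 * copy of its argument and never frees it, so the amount of lost
 * memory grows with the number of calls. */
size_t leaky_length(const char *s)
{
    char *copy = malloc(strlen(s) + 1);
    if (copy == NULL) {
        return 0;
    }
    strcpy(copy, s);
    return strlen(copy);    /* copy is never freed */
}

/* Hypothetical helper that creates "global" data: the buffer is
 * allocated on first use and kept for the life of the process, with
 * no function offered to release it. The size is fixed, so this is
 * not a growing leak, but it still appears in a leak report. */
static char *global_cache = NULL;

const char *cached_buffer(void)
{
    if (global_cache == NULL) {
        global_cache = calloc(64, 1);
    }
    return global_cache;
}

/* Call both helpers repeatedly: the first loses more memory on each
 * iteration, the second keeps handing back the same allocation. */
void exercise_helpers(int calls)
{
    const char *first = cached_buffer();
    for (int i = 0; i < calls; i++) {
        assert(leaky_length("crmd") == 4);
        assert(cached_buffer() == first);
    }
}
```

A suppression file is aimed at the second pattern (and at the first
when it lives in a library that can't be changed), so that whatever
remains in the report is something that can actually be fixed.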


-------------- next part --------------
A non-text attachment was scrubbed...
Name: crm-leaks.patch
Type: application/octet-stream
Size: 16011 bytes
Desc: not available
Url : http://lists.community.tummy.com/pipermail/linux-ha-dev/attachments/20070209/d24e4d8a/crm-leaks-0001.obj

