There are two issues with the current implementation of watch rule timers, which trigger periodic evaluation of watch rules (which are used by the CWLiteAlarm resource).
Long term we need to remove all the watch rule stuff and mandate use of ceilometer instead, but short term we have the following issues:
1. The watch tasks are started after the fork when multiple worker processes are specified. In practice this does not appear to result in duplicate tasks, because of the global StackWatch object and common ThreadGroupManager, but it's suboptimal and could have undesired side effects: we only ever want one watch task, as StackWatch doesn't use the stack lock.
2. There is no way to globally disable the watch tasks. This is an issue for multi-engine deployments, where due to the aforementioned lack of locking, the engines will race each other running the same watch tasks for every stack. The solution is to globally disable watch tasks and mandate that only ceilometer alarms be used in a scaled-out deployment (which makes sense anyway, given the crude and unscalable implementation of heat internal alarming).
Currently, if someone creates a stack containing a CWLiteAlarm resource when running multiple engines, things will appear to work, but are highly likely to be racy, so we should add a switch to disable watch tasks and document that they should be disabled when running more than one heat-engine process.
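A minimal sketch of what such a switch could look like, assuming a boolean option that is checked before any watch timers are created. The option name `enable_watch_tasks` and the `EngineConfig` and `WatchScheduler` classes are hypothetical stand-ins for illustration, not existing heat code or settings:

```python
from dataclasses import dataclass, field


@dataclass
class EngineConfig:
    # Hypothetical option name; not an existing heat setting.
    enable_watch_tasks: bool = True


@dataclass
class WatchScheduler:
    """Simplified stand-in for the code that starts watch rule timers."""
    config: EngineConfig
    started: list = field(default_factory=list)

    def start_watch_tasks(self, stack_ids):
        # Skip timer creation entirely when the operator has disabled watch
        # tasks, e.g. because more than one heat-engine process is running.
        if not self.config.enable_watch_tasks:
            return []
        for stack_id in stack_ids:
            # A real engine would register a periodic timer here.
            self.started.append(stack_id)
        return list(self.started)
```

With the flag off, `start_watch_tasks` returns an empty list and registers nothing, so a CWLiteAlarm in a multi-engine deployment would simply never be evaluated by the internal watch code instead of being evaluated racily.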
Hi shardy,
The ThreadGroupManager created in create_periodic_tasks, which is triggered first, is overwritten by the service start's own ThreadGroupManager. I guess there needs to be a check in EngineService.start before creating the ThreadGroupManager again. StackWatch has its own disconnected ThreadGroupManager.
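A rough sketch of the kind of check meant here, assuming the manager is held in an attribute that both methods can see. The classes below are simplified stand-ins that only borrow the names from heat, and the `thread_group_mgr` attribute and `add_timer` signature are assumptions for illustration:

```python
class ThreadGroupManager:
    """Stand-in for heat's ThreadGroupManager; it only records timers."""

    def __init__(self):
        self.timers = []

    def add_timer(self, name):
        self.timers.append(name)


class EngineService:
    """Stand-in for the engine service, reduced to the two relevant methods."""

    def __init__(self):
        # Assumed attribute name for the shared manager.
        self.thread_group_mgr = None

    def create_periodic_tasks(self):
        # Runs first and may be the one that creates the manager.
        if self.thread_group_mgr is None:
            self.thread_group_mgr = ThreadGroupManager()
        self.thread_group_mgr.add_timer("watch")

    def start(self):
        # The guard: only create a manager if none exists yet, so timers
        # registered by create_periodic_tasks are not discarded.
        if self.thread_group_mgr is None:
            self.thread_group_mgr = ThreadGroupManager()
```

Calling `create_periodic_tasks()` and then `start()` leaves the same manager instance in place, with the watch timer still registered on it.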