Comment 4 for bug 1527330

Oleksandr Martsyniuk (omartsyniuk) wrote :

I've investigated the problem and found details that may be useful for other plugin developers.

I went through the Astute logs provided by the support team and noticed that running the update_core_repos task triggered not only the plugin's post-deployment tasks but its pre-deployment tasks as well.

In our case the problem was with the netconfig pre-deployment task.
The plugin's pre-deployment tasks for base-os nodes include a call to the netconfig task. It looks like the keepalived service that provides the custom Contrail VIPs was not able to work properly after the netconfig run. The Contrail cluster was broken because the VIPs carrying the Contrail service endpoints were unavailable.

To work around this problem, it was decided to restart keepalived after each netconfig run.
A change to the plugin code was introduced that updates the plugin's Puppet manifests to restart keepalived on each run. This ensures that keepalived continues to serve the VIPs after the netconfig task is re-run. You may review the code at https://review.openstack.org/#/c/260469. This change is proposed for the plugin's stable/2.1 branch.
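
For other plugin developers, the general idea of an unconditional restart can be sketched roughly as follows. This is only an illustration and not the code from the review above: the resource title and the search path are my assumptions, and the actual manifests may do this differently.

```puppet
# Hypothetical sketch, NOT the actual change from the review.
# An exec resource with no refreshonly/onlyif/unless/creates guard
# is executed on every Puppet run, so keepalived is restarted each
# time the manifest is applied (for example after netconfig is re-run).
exec { 'restart-keepalived-after-netconfig':   # title is an assumed name
  command => 'service keepalived restart',
  path    => ['/usr/sbin', '/usr/bin', '/sbin', '/bin'],
}
```

The trade-off of this approach is that keepalived is restarted even when nothing has changed, which may cause a brief VIP interruption on each run.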