Comment 3 for bug 1645474

Anton Chevychalov (achevychalov) wrote :

Here is the actual scenario and the root cause of this problem:

Facts:

1. We have the primary-openstack-controller task.
2. That task is repeatable. If it fails, it is re-triggered.
3. That task triggers Nova installation and configuration.
4. Installation and configuration trigger sync-db.
5. sync-db cannot retry a step if something goes wrong with the connection to the DB.

So this is the failure scenario:

1. sync-db was running while wsrep was starting and re-balancing nodes.
2. The mysql node on the primary controller failed while sync-db was running.
3. sync-db was not able to recover and failed.
4. The primary-openstack-controller task failed too.
5. The primary-openstack-controller task was triggered one more time.
6. Nova installation was not triggered because nova was already installed and configured.
7. sync-db was not triggered either because it depends on the nova installation process.
8. The primary-openstack-controller task succeeded because there were no failures.
9. So the deployment was reported as successful too.
10. But the nova database had not been filled with data and was broken.
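
The sequence above can be reproduced with a toy model. This is only a sketch of the logic, not the actual deployment code: the `Node` class, `run_task`, and the `db_available` flag are hypothetical names introduced for illustration.

```python
class Node:
    """Toy model of the controller state (hypothetical, for illustration)."""

    def __init__(self):
        self.nova_installed = False
        self.db_synced = False


def run_task(node, db_available):
    """One run of the task. Returns True on success, False on failure."""
    changed = False
    if not node.nova_installed:
        node.nova_installed = True  # install and configure nova
        changed = True
    if changed:  # sync-db only fires when the installation step changed something
        if not db_available:
            return False  # connection to the DB lost, sync-db fails
        node.db_synced = True
    return True


node = Node()
first = run_task(node, db_available=False)  # mysql node fails during sync-db
second = run_task(node, db_available=True)  # task is re-triggered
print(first, second, node.db_synced)  # False True False
```

The second run reports success although `db_synced` is still False, which matches steps 6 to 10.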

Possible solutions:
1. Always trigger sync-db on primary-openstack-controller.
2. Add a check of the current DB state.
3. Somehow fix the wsrep and mysql startup procedure.
4. Fix sync-db so that it retries failed steps.
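
As a rough illustration of how solutions 2 and 4 could work together, here is a minimal sketch. It assumes two hypothetical callables, `is_synced` (reports whether the schema is in place) and `sync` (runs the DB sync and raises `ConnectionError` when the DB connection drops); neither is an existing API, and the attempt count and delay are arbitrary.

```python
import time


def ensure_db_synced(is_synced, sync, attempts=5, delay=0.0):
    """Check the DB state first, then retry sync until it is confirmed."""
    if is_synced():  # solution 2: check the current DB state
        return True
    for attempt in range(1, attempts + 1):
        try:
            sync()  # solution 4: repeat the step on connection errors
        except ConnectionError:
            time.sleep(delay * attempt)  # simple linear backoff
            continue
        if is_synced():
            return True
    return False


# Hypothetical stand-in for a sync that loses the DB connection twice.
state = {"calls": 0, "synced": False}


def flaky_sync():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("lost connection to MySQL")
    state["synced"] = True


result = ensure_db_synced(lambda: state["synced"], flaky_sync)
print(result, state["calls"])  # True 3
```

Because the check does not depend on whether nova was just installed, a re-triggered task would still notice the unsynced database.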