(/Stage[main]/Galera/Service[mysql-service]/ensure) change from stopped to running failed: execution expired

Bug #1368605 reported by Anastasia Palkina
This bug affects 1 person
Affects: Fuel for OpenStack
Status: Invalid
Importance: Medium
Assigned to: Sergii Golovatiuk
Milestone: -

Bug Description

"build_id": "2014-09-11_01-04-40", "ostf_sha": "1de6ed1c0b72f6687ffb4bebc2c939b135a88e34", "build_number": "3", "auth_required": true, "api": "1.0", "nailgun_sha": "720e83bca37561fbc0452ad4e99f1f8cfe8e40cf", "production": "docker", "fuelmain_sha": "d899675a5a393625f8166b29099d26f45d527035", "astute_sha": "b622d9b36dbdd1e03b282b9ee5b7435ba649e711", "feature_groups": ["experimental"], "release": "5.1", "release_versions": {"2014.1.1-5.1": {"VERSION": {"build_id": "2014-09-11_01-04-40", "ostf_sha": "1de6ed1c0b72f6687ffb4bebc2c939b135a88e34", "build_number": "3", "api": "1.0", "nailgun_sha": "720e83bca37561fbc0452ad4e99f1f8cfe8e40cf", "production": "docker", "fuelmain_sha": "d899675a5a393625f8166b29099d26f45d527035", "astute_sha": "b622d9b36dbdd1e03b282b9ee5b7435ba649e711", "feature_groups": ["experimental"], "release": "5.1", "fuellib_sha": "6fc7ac9041894aa76b2e18d385149166e34f7b23"}}}, "fuellib_sha": "6fc7ac9041894aa76b2e18d385149166e34f7b23"

1. Create new environment (Ubuntu, HA mode)
2. Choose VLAN segmentation
3. Choose both ceph
4. Choose rados
5. Add 3 controller+ceph, 2 compute+ceph
6. Start deployment. The deployment fails.
7. There are errors on second controller in puppet.log (node-4):

2014-09-11 12:20:27 ERR (/Stage[main]/Galera/Service[mysql-service]/ensure) change from stopped to running failed: execution expired
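
When scanning a long puppet.log for this kind of failure, it can help to pull the timestamp, level, resource and message out of each line. The following is a minimal sketch that assumes every entry follows the same layout as the line quoted above (date, time, level, resource path in parentheses, message). The names `parse_puppet_line`, `PuppetLogEntry` and `is_timeout` are made up for illustration and are not part of Fuel or Puppet.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout, inferred only from the line quoted in this report:
#   <date> <time> <LEVEL> (<resource path>) <message>
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\((?P<resource>[^)]*)\)\s+"
    r"(?P<message>.*)$"
)


class PuppetLogEntry(NamedTuple):
    timestamp: str
    level: str
    resource: str
    message: str


def parse_puppet_line(line: str) -> Optional[PuppetLogEntry]:
    """Return the parsed entry, or None if the line does not match the assumed layout."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return PuppetLogEntry(
        match["ts"], match["level"], match["resource"], match["message"]
    )


def is_timeout(entry: PuppetLogEntry) -> bool:
    """True for ERR entries whose message reports an expired execution."""
    return entry.level == "ERR" and "execution expired" in entry.message


sample = (
    "2014-09-11 12:20:27 ERR "
    "(/Stage[main]/Galera/Service[mysql-service]/ensure) "
    "change from stopped to running failed: execution expired"
)
entry = parse_puppet_line(sample)
```

Lines that do not fit the assumed layout return `None` rather than raising, so the function can be applied to every line of a log file and the misses filtered out.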

Anastasia Palkina (apalkina) wrote :
Sergii Golovatiuk (sgolovatiuk) wrote :

Digging into the issue, I found that Galera was not able to complete SST (State Snapshot Transfer) within the normal time window. This happens only in rare conditions: under very high load, or when several deployments run in parallel.
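
To illustrate the timing problem described here, the sketch below polls a node's state until it reports `Synced` or a deadline passes. It is only an illustration, not how Fuel or Puppet start the service. `wait_for_synced` and its parameters are hypothetical, and `get_state` stands for any caller-supplied function that returns the node's current `wsrep_local_state_comment` value.

```python
import time
from typing import Callable


def wait_for_synced(
    get_state: Callable[[], str],
    timeout_s: float,
    poll_interval_s: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll get_state() until it returns "Synced" or timeout_s elapses.

    Returns True if the node reached "Synced" in time, False otherwise.
    The state is always checked at least once, even with timeout_s == 0.
    """
    deadline = clock() + timeout_s
    while True:
        if get_state() == "Synced":
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)


# Usage with a fake state source; the intermediate values are illustrative.
states = iter(["Joining", "Joined", "Synced"])
reached = wait_for_synced(
    lambda: next(states), timeout_s=10, poll_interval_s=0, sleep=lambda _s: None
)
```

A fixed timeout that works on an idle host can be too short when SST runs under heavy load, which matches the rare conditions mentioned in this comment.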

Changed in fuel:
milestone: none → 6.0
Sergii Golovatiuk (sgolovatiuk) wrote :

1. Create new environment (Ubuntu, HA mode)
2. Choose VLAN segmentation
3. Choose both ceph
4. Choose rados
5. Add 3 controller+ceph, 2 compute+ceph
6. Start deployment.

Add a second environment and deploy both in parallel.

Changed in fuel:
status: New → Confirmed
Kirill Omelchenko (komelchenko) wrote :

Reproduced:

Configuration:
- HA, NovaFlat, CinderLVM
- 3x Controllers
- 1x Compute
- 1x Storage

Scenario:
1. Start deployment
2. Wait until the first controller's setup status changes to ready
3. Stop the deployment
4. Wait until all nodes except the first controller go to 'Pending addition'
5. Deploy changes

Deployment fails with puppet errors on the first controller: http://paste.openstack.org/show/111679/

Changed in fuel:
status: Confirmed → Invalid
Sergii Golovatiuk (sgolovatiuk) wrote :

According to the log, the clocks on the nodes are out of sync, which causes the issue with Galera sync.
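
A quick way to confirm this kind of diagnosis is to compare clock readings taken from each node at roughly the same moment. The sketch below assumes the readings have already been collected as epoch seconds (how they are collected is out of scope). The function names, the sample values, the 1-second tolerance and the node names other than node-4 are illustrative assumptions, not values from this report.

```python
import statistics
from typing import Dict, List


def max_clock_skew(node_times: Dict[str, float]) -> float:
    """Largest difference, in seconds, between any two node clocks."""
    if not node_times:
        return 0.0
    values = list(node_times.values())
    return max(values) - min(values)


def nodes_out_of_sync(
    node_times: Dict[str, float], tolerance_s: float = 1.0
) -> List[str]:
    """Nodes whose clock differs from the median reading by more than tolerance_s."""
    if not node_times:
        return []
    reference = statistics.median(node_times.values())
    return sorted(
        node
        for node, reading in node_times.items()
        if abs(reading - reference) > tolerance_s
    )


# Hypothetical readings in epoch seconds, taken at about the same instant.
readings = {"node-1": 1000.0, "node-2": 1000.2, "node-4": 1042.0}
skew = max_clock_skew(readings)
drifting = nodes_out_of_sync(readings)
```

Using the median as the reference keeps a single badly drifting node from shifting the baseline that the other nodes are compared against.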
