Mysql deadlock during deployment

Bug #1431702 reported by Dmitry Ukov
This bug affects 1 person
Affects              Status    Importance  Assigned to                Milestone
Fuel for OpenStack   Triaged   High        Fuel Library (Deprecated)
5.0.x                Triaged   High        Fuel Library (Deprecated)
5.1.x                Triaged   High        Fuel Library (Deprecated)
6.0.x                Triaged   High        Fuel Library (Deprecated)
6.1.x                Triaged   High        Fuel Library (Deprecated)

Bug Description

VERSION:
  feature_groups:
    - experimental
  production: "docker"
  release: "6.0"
  api: "1.0"
  build_number: "30"
  build_id: "2015-03-11_23-55-26"
  astute_sha: "16b252d93be6aaa73030b8100cf8c5ca6a970a91"
  fuellib_sha: "81d750341d31f396f0eb37f990fd2a7f67451a74"
  ostf_sha: "a9afb68710d809570460c29d6c3293219d3624d4"
  nailgun_sha: "4da25deb487d07fa641bf2b0d33cce2ab65b70a3"
  fuelmain_sha: "81d38d6f2903b5a8b4bee79ca45a54b76c1361b8"

1. Create an HA cluster with Nova network
2. Add 3 controllers and 1 compute node
3. Hit the "Deploy Changes" button
4. During deployment, the puppet log for one of the controllers will show the following error:
 (/Stage[main]/Nova::Db::Mysql/Nova::Db::Mysql::Host_access[node-39]/Database_user[nova@node-39]/ensure) change from absent to present failed: Execution of '/usr/bin/mysql mysql -e create user 'nova'@'node-39' identified by PASSWORD '*EC8ABB8C668BF55C0FCEF4CDB011E6902FC750B6'' returned 1: ERROR 1213 (40001) at line 1: Deadlock found when trying to get lock; try restarting transaction

What could be the reason for this behavior?

Tags: ha
Revision history for this message
Dmitry Ukov (dukov) wrote :
Stanislav Makar (smakar)
Changed in fuel:
assignee: nobody → Fuel Library Team (fuel-library)
importance: Undecided → Low
status: New → Confirmed
Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

It is possible that the patch https://review.openstack.org/#/c/123244 resolves the issue. For some reason its commit message does not reference a bug, but it should be related to https://bugzilla.redhat.com/show_bug.cgi?id=1141972.
MOS Oslo team, could you please verify this fix?

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

From the deployment perspective, everything looks all right: http://paste.openstack.org/show/192094/
So this issue should be related to oslo.db.

Revision history for this message
Dmitry Ukov (dukov) wrote :

Galera fell apart a few seconds before this error message. It seems that this is a network-related issue:
<6>Mar 13 14:02:27 node-74 kernel: ixgbe 0000:01:00.0: eth0: detected SFP+: 4
<6>Mar 13 14:02:27 node-74 kernel: ixgbe 0000:01:00.0: eth0: NIC Link is Up 10 Gbps, Flow Control: RX/TX
<6>Mar 13 14:02:27 node-74 kernel: bonding: bond0: link status definitely down for interface eth0, disabling it
<4>Mar 13 14:02:27 node-74 kernel: bonding: bond0: Warning: No 802.3ad response from the link partner for any adapters in the bond
<6>Mar 13 14:02:27 node-74 kernel: ixgbe 0000:01:00.1: eth1: detected SFP+: 3
<6>Mar 13 14:02:27 node-74 kernel: bond0: link status definitely up for interface eth0, 10000 Mbps full duplex.
<6>Mar 13 14:02:27 node-74 kernel: bonding: bond0: link status definitely down for interface eth1, disabling it
<6>Mar 13 14:02:27 node-74 kernel: ixgbe 0000:01:00.1: eth1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
<6>Mar 13 14:02:27 node-74 kernel: bond0: link status definitely up for interface eth1, 10000 Mbps full duplex.

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

@Dmitry, the nodes related to this issue are 31-40. node-74 is an artifact from other deployments and is not related to this issue.

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

The issue is that the puppet-mysql providers should be able to handle deadlocks and retry the failed operations. This should first be reported and fixed upstream, and then backported to Fuel.
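
To illustrate the retry idea, here is a minimal Python sketch (not the actual puppet-mysql provider code, which is written in Ruby). It retries a command only when the failure output contains MySQL error 1213, the deadlock error from the puppet log in the description. The names run_with_deadlock_retry and mysql_command_runner, the attempt count and the linear backoff are all assumptions made for the example.

```python
import subprocess
import time

# MySQL reports a deadlock as error 1213 (SQLSTATE 40001), as in the puppet log.
DEADLOCK_MARKER = "ERROR 1213"


def is_deadlock(stderr_text):
    """Return True if the mysql client output reports a deadlock."""
    return DEADLOCK_MARKER in stderr_text


def run_with_deadlock_retry(run, attempts=5, delay=1.0, sleep=time.sleep):
    """Call run() until it succeeds or fails for a non-deadlock reason.

    run() is a hypothetical callable returning (returncode, stderr_text).
    Returns the number of the attempt that succeeded.
    """
    for attempt in range(1, attempts + 1):
        returncode, stderr_text = run()
        if returncode == 0:
            return attempt
        # Only deadlocks are worth retrying; give up on anything else
        # and once the retry budget is exhausted.
        if not is_deadlock(stderr_text) or attempt == attempts:
            raise RuntimeError(
                "command failed after %d attempt(s): %s" % (attempt, stderr_text)
            )
        sleep(delay * attempt)  # simple linear backoff (an assumption)


def mysql_command_runner(sql, mysql_bin="/usr/bin/mysql"):
    """Build a run() callable that executes one statement with the mysql client.

    Mirrors the '/usr/bin/mysql mysql -e ...' invocation from the log.
    """
    def run():
        proc = subprocess.run(
            [mysql_bin, "mysql", "-e", sql],
            capture_output=True,
            text=True,
        )
        return proc.returncode, proc.stderr
    return run
```

A caller would wrap the statement, for example run_with_deadlock_retry(mysql_command_runner("FLUSH PRIVILEGES")). Errors other than a deadlock are raised immediately, so real failures are not hidden by the retry loop.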

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :
Revision history for this message
Dmitry Ukov (dukov) wrote :

Sounds absolutely right!

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-library (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/168831

tags: added: ha
Jay Pipes (jaypipes)
summary: - Mysql deadlock duiring deployment
+ Mysql deadlock during deployment
Revision history for this message
Jay Pipes (jaypipes) wrote :

This should not happen if the Nova::Db::Mysql/Nova::Db::Mysql::Host_access[Database_user] puppet module commands are only run on a single controller node. The fix for this is not really to tinker with wsrep_retry_autocommit, but rather to ensure that the above Puppet module/command is only ever run on a single node.

Same goes for the db_sync commands.
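
As a rough sketch of the "run on a single node only" approach, the Python below gates DB-mutating steps (user creation, db_sync) so that exactly one controller executes them. This is not how Fuel selects the node; picking the lexicographically first controller name is a hypothetical rule used only for illustration, and the node names and function names are made up for the example.

```python
def is_db_setup_node(node_name, controllers):
    """Return True only for the single controller chosen to run DB setup.

    The choice (first name in sorted order) is an assumed, deterministic
    rule so that every node reaches the same answer independently.
    """
    if not controllers:
        raise ValueError("controller list must not be empty")
    return node_name == sorted(controllers)[0]


def run_db_setup_once(node_name, controllers, steps):
    """Run each step only on the chosen node; other nodes skip them.

    steps is a list of zero-argument callables (hypothetical stand-ins
    for 'create database user' or 'db_sync').
    Returns the list of step results, or an empty list when skipped.
    """
    if not is_db_setup_node(node_name, controllers):
        return []
    return [step() for step in steps]
```

With controllers named node-37, node-38 and node-39 (illustrative names), only node-37 would run the steps, so concurrent "create user" statements from several Galera nodes could not collide.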

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

Thank you for the feedback, @Jay. I guess this issue is a duplicate of https://bugs.launchpad.net/fuel/+bug/1330875 then. And the patch https://review.openstack.org/116895 should address it as well

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

(but the patch should be extended to Nova, Glance, Cinder and other entities)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on fuel-library (master)

Change abandoned by Bogdan Dobrelya (<email address hidden>) on branch: master
Review: https://review.openstack.org/168831
