HA: setup-vnc-galera failed; SQL DB is also in lock wait on two OpenStack nodes

Bug #1542558 reported by shajuvk
Affects: Juniper Openstack (status tracked in Trunk)

Series  | Status        | Importance | Assigned to   | Milestone
R2.21.x | Fix Committed | Medium     | Sanju Abraham |
R2.22.x | Fix Committed | Medium     | Sanju Abraham |
R3.0    | Fix Committed | Medium     | Sanju Abraham |
Trunk   | Fix Committed | Medium     | Sanju Abraham |

Bug Description

setup-vnc-galera failed in a 3-node HA setup.

The SQL databases on the first two OpenStack nodes are stuck in a lock-wait state:
mysql> show databases;
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
mysql>

root@a5s6:~# tail -f /var/log/mysql/error.log
160205 17:20:22 [Warning] WSREP: gcs_caused() returned -103 (Software caused connection abort)
160205 17:20:22 [Warning] WSREP: gcs_caused() returned -103 (Software caused connection abort)
160205 17:20:22 [Warning] WSREP: gcs_caused() returned -103 (Software caused connection abort)

setup.log:
========
2016-02-05 16:45:28:800035: [root@10.84.14.7] out: [localhost] local: mysql --defaults-file=/etc/mysql/my.cnf -uroot -p732c5a68c88390aa4b4c -e "FLUSH PRIVILEGES"
2016-02-05 16:45:28:807709: [root@10.84.14.7] out: [localhost] local: echo "0 * * * * /opt/contrail/bin/contrail-token-clean.sh" >> /tmp/tmp7iRD7p/galera_cron
2016-02-05 16:45:28:815356: [root@10.84.14.7] out: [localhost] local: crontab /tmp/tmp7iRD7p/galera_cron
2016-02-05 16:45:28:816788: [root@10.84.14.7] out: [localhost] local: rm /tmp/tmp7iRD7p/galera_cron
2016-02-05 16:45:28:824417: [root@10.84.14.7] out: [localhost] local: ls -l /var/lib/mysql/ib_logfile1 | awk '{print $5}'
2016-02-05 16:45:28:824608: [root@10.84.14.7] out: [localhost] local: mysql -h192.168.10.1 -uroot -p732c5a68c88390aa4b4c -e "show global status where variable_name='wsrep_local_state'" | awk '{print $2}' | sed '1d'
2016-02-05 16:45:28:832266: [root@10.84.14.7] out: Waiting for first galera node to create new cluster.
2016-02-05 16:45:40:850753: [root@10.84.14.7] out: Traceback (most recent call last):
2016-02-05 16:45:40:851050: [root@10.84.14.7] out: File "/usr/bin/setup-vnc-galera", line 9, in <module>
2016-02-05 16:45:40:851392: [root@10.84.14.7] out: load_entry_point('ContrailProvisioning==0.1dev', 'console_scripts', 'setup-vnc-galera')()
2016-02-05 16:45:40:851484: [root@10.84.14.7] out: File "/usr/local/lib/python2.7/dist-packages/contrail_provisioning/openstack/ha/galera_setup.py", line 344, in main
2016-02-05 16:45:40:851571: [root@10.84.14.7] out: galera.setup()
2016-02-05 16:45:40:852033: [root@10.84.14.7] out: File "/usr/local/lib/python2.7/dist-packages/contrail_provisioning/common/base.py", line 304, in setup
2016-02-05 16:45:40:852127: [root@10.84.14.7] out: self.run_services()
2016-02-05 16:45:40:867750: [root@10.84.14.7] out: File "/usr/local/lib/python2.7/dist-packages/contrail_provisioning/openstack/ha/galera_setup.py", line 327, in run_services
2016-02-05 16:45:40:868173: [root@10.84.14.7] out: raise RuntimeError("Unable able to bring up galera in first node, please verify and continue.")
2016-02-05 16:45:40:868828: [root@10.84.14.7] out: RuntimeError: Unable able to bring up galera in first node, please verify and continue.
2016-02-05 16:45:40:868978: [root@10.84.14.7] out:
2016-02-05 16:45:40:869279:

2016-02-05 16:45:40:870853: Fatal error: sudo() received nonzero return code 1 while executing!
2016-02-05 16:45:40:870853:
2016-02-05 16:45:40:870853: Requested: setup-vnc-galera --self_ip 192.168.10.2 --keystone_ip 192.168.10.201 --galera_ip_list 192.168.10.1 192.168.10.2 192.168.10.3 --internal_vip 192.168.10.201 --openstack_index 2 --zoo_ip_list 192.168.10.1 192.168.10.3 192.168.10.4 --keystone_user keystone --keystone_pass keystone --cmon_user cmon --cmon_pass cmon --monitor_galera True --external_vip 10.84.14.215
2016-02-05 16:45:40:870853: Executed: sudo -S -p 'sudo password:' /bin/bash -l -c "cd /opt/contrail/bin && setup-vnc-galera --self_ip 192.168.10.2 --keystone_ip 192.168.10.201 --galera_ip_list 192.168.10.1 192.168.10.2 192.168.10.3 --internal_vip 192.168.10.201 --openstack_index 2 --zoo_ip_list 192.168.10.1 192.168.10.3 192.168.10.4 --keystone_user keystone --keystone_pass keystone --cmon_user cmon --cmon_pass cmon --monitor_galera True --external_vip 10.84.14.215"
2016-02-05 16:45:40:870853:
2016-02-05 16:45:40:870898: Aborting.
2016-02-05 16:45:40:870898: 2016-02-05 16:45:40:870705: Disconnecting from 10.84.24.68... done.
2016-02-05 16:45:40:985139: Disconnecting from 10.84.14.8... done.
2016-02-05 16:45:40:988674: Disconnecting from 10.84.14.6... done.
2016-02-05 16:45:41:102751: Disconnecting from 10.84.7.18... done.
2016-02-05 16:45:41:166796: Disconnecting from 10.84.24.231... done.
2016-02-05 16:45:41:280853: Disconnecting from 10.84.24.229... done.
2016-02-05 16:45:41:394863: Disconnecting from 10.84.24.230... done.
2016-02-05 16:45:41:508896: Disconnecting from 10.84.14.7... done.
2016-02-05 16:45:41:572869: root@a5s6:/opt/contrail/utils#
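For anyone reproducing this: the step that gave up in the log above is the wait on `wsrep_local_state` for the first Galera node. Below is a minimal sketch of such a wait, not the code from galera_setup.py. The function names, retry count, and delay are hypothetical; the only fact relied on is that Galera reports `wsrep_local_state` 4 for a Synced node. `run_query` is assumed to be any callable that returns the text printed by the `mysql -e "show global status ..."` command seen in the log.

```python
import time

SYNCED = 4  # wsrep_local_state value Galera reports for a Synced node


def parse_wsrep_local_state(mysql_output):
    """Extract the integer state from `show global status` batch output.

    The output is a header line followed by "wsrep_local_state<TAB>4".
    Returns None when no value is present, e.g. the server did not answer.
    """
    for line in mysql_output.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "wsrep_local_state":
            try:
                return int(fields[1])
            except ValueError:
                return None
    return None


def wait_for_synced(run_query, retries=10, delay=5.0, sleep=time.sleep):
    """Poll until the node reports Synced; return True on success."""
    for attempt in range(retries):
        if parse_wsrep_local_state(run_query()) == SYNCED:
            return True
        if attempt < retries - 1:
            sleep(delay)  # hypothetical back-off between polls
    return False
```

Passing the query as a callable keeps the polling logic testable without a running mysqld.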

Tags: blocker ha vmware
shajuvk (shajuvk)
information type: Proprietary → Public
tags: added: blocker
shajuvk (shajuvk)
tags: removed: blocker
shajuvk (shajuvk) wrote :

The initial setup_all run failed because of an unrelated issue, but when we ran setup_all a second time the SQL databases were locked.

tags: added: blocker
Sanju Abraham (asanju) wrote :

The issue seems to be caused by cleaning up the redo log without stopping mysql first. Removing the redo log while mysql is still running will lead to problems.

I have the fix for this but need a setup to verify. Will send an email to Shaju and Venu to reproduce it and test the fix before commit.
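To illustrate the ordering concern (this is a sketch, not the committed fix): the redo log files should only be removed while mysqld is stopped. The helper name and the `stop_mysql` / `start_mysql` callables below are hypothetical; the `ib_logfile*` pattern matches the `/var/lib/mysql/ib_logfile1` file checked in setup.log.

```python
import glob
import os


def remove_redo_logs(datadir, stop_mysql, start_mysql):
    """Delete ib_logfile* only while mysqld is stopped; return removed paths.

    stop_mysql must return only after mysqld has shut down cleanly.
    """
    stop_mysql()
    removed = []
    try:
        for path in sorted(glob.glob(os.path.join(datadir, "ib_logfile*"))):
            os.remove(path)
            removed.append(path)
    finally:
        # Restart even if a removal failed, so the node is not left down.
        start_mysql()
    return removed
```

In a real deployment the two callables would wrap the service stop and start commands for the distribution in use.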

OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.0

Review in progress for https://review.opencontrail.org/17592
Submitter: Sanju (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/17593
Submitter: Sanju (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.21.x

Review in progress for https://review.opencontrail.org/17594
Submitter: Sanju (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.22.x

Review in progress for https://review.opencontrail.org/17595
Submitter: Sanju (<email address hidden>)

Sanju Abraham (asanju) wrote :

The issue during the rerun was caused by an incorrect initialization sequence with the gcomm string set to init. In addition, run_services and install_mysql were cleaning up resources that are required for mysql startup.

The fix initializes the mysql and wsrep configuration and cleans up resources in a way that avoids lock wait timeouts.
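As general Galera background, and not a reproduction of the committed change: a node started with an empty `gcomm://` address bootstraps a new cluster, while a node given a list of peers joins an existing one. A minimal sketch of choosing the `wsrep_cluster_address` value follows. The function name and the `bootstrap` flag are hypothetical; `galera_ip_list` and `openstack_index` mirror the setup-vnc-galera arguments shown in the log.

```python
def wsrep_cluster_address(galera_ip_list, openstack_index, bootstrap):
    """Return the wsrep_cluster_address value for one node.

    Only the first node (openstack_index == 1) may bootstrap, and only
    when a new cluster is being created; every other case joins the peers.
    """
    if bootstrap and openstack_index == 1:
        return "gcomm://"  # empty address: form a new cluster
    return "gcomm://" + ",".join(galera_ip_list)
```

On a rerun the important point is that the bootstrap form is used by at most one node, and only when no cluster is already running.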

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/17592
Committed: http://github.org/Juniper/contrail-provisioning/commit/137e2b398b71b9e5b732842b9b1760626b736dcf
Submitter: Zuul
Branch: R3.0

commit 137e2b398b71b9e5b732842b9b1760626b736dcf
Author: Sanju Abraham <email address hidden>
Date: Mon Feb 22 14:52:42 2016 -0800

Closes-Bug: #1542558. This bug fixes the issues of mysql galera not being initialized / bootstrapped on setup_all re-runs

Change-Id: I2062266c2aee3f2a7756b3d3ed664fcf8780e281

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/17593
Committed: http://github.org/Juniper/contrail-provisioning/commit/549109c8476a00ddd98f50bb98ea2226253fe7b7
Submitter: Zuul
Branch: master

commit 549109c8476a00ddd98f50bb98ea2226253fe7b7
Author: Sanju Abraham <email address hidden>
Date: Mon Feb 22 14:55:01 2016 -0800

Closes-Bug: #1542558. This bug fixes the issues of mysql galera not being initialized / bootstrapped on setup_all re-runs

Change-Id: I726d81e5bea626f662340bd4389da08fbe05bfeb

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/17595
Committed: http://github.org/Juniper/contrail-provisioning/commit/b86e6763ed99f7efbb0c86d3b3692ba565b8d0fb
Submitter: Zuul
Branch: R2.22.x

commit b86e6763ed99f7efbb0c86d3b3692ba565b8d0fb
Author: Sanju Abraham <email address hidden>
Date: Mon Feb 22 14:56:49 2016 -0800

Closes-Bug: #1542558. This bug fixes the issues of mysql galera not being initialized / bootstrapped on setup_all re-runs

Change-Id: Ic89a513642891ba2ff61e4c73a971978a8e11f70

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/17594
Committed: http://github.org/Juniper/contrail-provisioning/commit/d73affdaa236a787c8a9e0b5f660fb8e16dd8a0a
Submitter: Zuul
Branch: R2.21.x

commit d73affdaa236a787c8a9e0b5f660fb8e16dd8a0a
Author: Sanju Abraham <email address hidden>
Date: Mon Feb 22 14:56:03 2016 -0800

Closes-Bug: #1542558. This bug fixes the issues of mysql galera not being initialized / bootstrapped on setup_all re-runs

Change-Id: I1943b0c62bb714a667672ff2b0da6eb4eb32d84d
