Activity log for bug #1624013

Date Who What changed Old value New value Message
2016-09-15 15:45:29 Bob Ball bug added bug
2016-09-15 15:45:29 Bob Ball attachment added fuel-snapshot-2016-09-15_14-58-14.tar.gz https://bugs.launchpad.net/bugs/1624013/+attachment/4741479/+files/fuel-snapshot-2016-09-15_14-58-14.tar.gz
2016-09-15 17:00:42 Bob Ball attachment added fuel-snapshot-2016-09-15_16-56-42.tar.gz https://bugs.launchpad.net/fuel/+bug/1624013/+attachment/4741517/+files/fuel-snapshot-2016-09-15_16-56-42.tar.gz
2016-09-15 19:27:28 Evgeniya Shumakher fuel: importance Undecided High
2016-09-15 19:27:37 Evgeniya Shumakher fuel: milestone 9.1
2016-09-15 19:27:51 Evgeniya Shumakher tags customer-found
2016-09-16 01:32:05 Andrey Danin bug added subscriber Andrey Danin
2016-09-16 10:58:23 Dmitry Pyzhov fuel: assignee MOS Linux (mos-linux)
2016-09-19 13:02:17 Vitaly Sedelnik fuel: status New Confirmed
2016-09-19 13:02:23 Vitaly Sedelnik fuel: milestone 9.1 9.2
2016-09-19 14:58:25 Ivan Suzdal fuel: status Confirmed Fix Committed
2016-09-21 08:29:05 Bob Ball fuel: status Fix Committed Confirmed
2016-09-21 10:26:07 Bob Ball description
  Old value:
    Detailed bug description:
    MOS 9 environment cannot deploy due to mysql crashing failures

    Puppet logs for the failed controller say:
    (/Stage[main]/Cluster::Mysql/Exec[wait-initial-sync]) mysql -uclustercheck -pOObsCqCTtkLkRHK52n0H0N8O -Nbe "show status like 'wsrep_local_state_comment'" | grep -q -e Synced && sleep 10 returned 1 instead of one of [0]
    (/Stage[main]/Cluster::Mysql/Exec[wait-initial-sync]) Failed to call refresh: mysql -uclustercheck -pOObsCqCTtkLkRHK52n0H0N8O -Nbe "show status like 'wsrep_local_state_comment'" | grep -q -e Synced && sleep 10 returned 1 instead of one of [0]
    (/Stage[main]/Cluster::Mysql/Exec[wait-initial-sync]/returns) ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (111)

    Logging in to failed controller node shows that, indeed, mysql is not running:
    root@node-4:~# service mysql status
    mysql stop/waiting

    /var/log/mysql/error.log is attached, and shows a segfault occurring, possibly from the wsrep post commit function:
    14:22:43 UTC - mysqld got signal 11 ;
    stack_bottom = 7fbebb747e88 thread_stack 0x30000
    /usr/sbin/mysqld(my_print_stacktrace+0x2c)[0x7fbebbf81b7c]
    /usr/sbin/mysqld(handle_fatal_signal+0x3c2)[0x7fbebbcd25c2]
    /lib/x86_64-linux-gnu/libpthread.so.0(+0x10330)[0x7fbeba9c7330]
    /usr/sbin/mysqld(thd_get_ha_data+0xc)[0x7fbebbd1f54c]
    /usr/sbin/mysqld(_Z20thd_binlog_trx_resetP3THD+0x2e)[0x7fbebbf2c79e]
    /usr/sbin/mysqld(_Z17wsrep_post_commitP3THDb+0xcc)[0x7fbebbe0c32c]
    /usr/sbin/mysqld(_Z12trans_commitP3THD+0x6f)[0x7fbebbdf2dcf]
    /usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x38d1)[0x7fbebbd60851]
    /usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x3c8)[0x7fbebbd649d8]
    /usr/sbin/mysqld(+0x508c24)[0x7fbebbd64c24]
    /usr/sbin/mysqld(_Z19do_handle_bootstrapP3THD+0x111)[0x7fbebbd64ff1]
    /usr/sbin/mysqld(+0x509060)[0x7fbebbd65060]
    /lib/x86_64-linux-gnu/libpthread.so.0(+0x8184)[0x7fbeba9bf184]
    /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fbeba0e237d]

    Steps to reproduce:
    Not sure which steps are needed, but my environment has:
      3x Controller (4 CPU, 6GB RAM, 80GB HDD)
      3x Qemu Compute/Cinder/Ceph-OSD (2 CPU, 1GB RAM, 50GB HDD)
    Each host has two interfaces - PXE (eth0) and a VLAN network (eth1).
    Public network is on a VLAN over eth1, and Neutron is also configured to use VLANs
    This has been reproduced several times on different hardware and with different Fuel 9 installations, with different Ubuntu repositories and with the XenServer plugin disabled as well as enabled.
    MD5 sum of ISO has been confirmed:
    # md5sum MirantisOpenStack-9.0.iso
    07461ba42d5056830dd6f203e8fe9691 MirantisOpenStack-9.0.iso

    Expected results:
    Deployment succeeds :)

    Actual result:
    Deployment fails with the above errors :)

    Reproducibility:
    Deployments are occasionally successful, but once a deployment is successful it is not possible to add a new controller node as adding a controller fails 100% of the time.

    Workaround:
    None known

    Impact:
    Fatal; but also preventing the validation of the XenServer plugin for MOS 9 as this issue also occurs with the plugin installed.

    Fuel snapshot is attached, with a second snapshot (with the XenServer plugin enabled) at https://citrix.sharefile.com/d-s4a809f3542947818
  New value:
    Detailed bug description:
    MOS 9 environment cannot deploy potentially due to mysql crashing.
    (Edit: It seems that the mysql crash below could be unrelated to the failures to deploy)

    Puppet logs for the failed controller say:
    (/Stage[main]/Cluster::Mysql/Exec[wait-initial-sync]) mysql -uclustercheck -pOObsCqCTtkLkRHK52n0H0N8O -Nbe "show status like 'wsrep_local_state_comment'" | grep -q -e Synced && sleep 10 returned 1 instead of one of [0]
    (/Stage[main]/Cluster::Mysql/Exec[wait-initial-sync]) Failed to call refresh: mysql -uclustercheck -pOObsCqCTtkLkRHK52n0H0N8O -Nbe "show status like 'wsrep_local_state_comment'" | grep -q -e Synced && sleep 10 returned 1 instead of one of [0]
    (/Stage[main]/Cluster::Mysql/Exec[wait-initial-sync]/returns) ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (111)

    Logging in to failed controller node shows that, indeed, mysql is not running:
    root@node-4:~# service mysql status
    mysql stop/waiting

    /var/log/mysql/error.log is attached, and shows a segfault occurring, possibly from the wsrep post commit function:
    14:22:43 UTC - mysqld got signal 11 ;
    stack_bottom = 7fbebb747e88 thread_stack 0x30000
    /usr/sbin/mysqld(my_print_stacktrace+0x2c)[0x7fbebbf81b7c]
    /usr/sbin/mysqld(handle_fatal_signal+0x3c2)[0x7fbebbcd25c2]
    /lib/x86_64-linux-gnu/libpthread.so.0(+0x10330)[0x7fbeba9c7330]
    /usr/sbin/mysqld(thd_get_ha_data+0xc)[0x7fbebbd1f54c]
    /usr/sbin/mysqld(_Z20thd_binlog_trx_resetP3THD+0x2e)[0x7fbebbf2c79e]
    /usr/sbin/mysqld(_Z17wsrep_post_commitP3THDb+0xcc)[0x7fbebbe0c32c]
    /usr/sbin/mysqld(_Z12trans_commitP3THD+0x6f)[0x7fbebbdf2dcf]
    /usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x38d1)[0x7fbebbd60851]
    /usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x3c8)[0x7fbebbd649d8]
    /usr/sbin/mysqld(+0x508c24)[0x7fbebbd64c24]
    /usr/sbin/mysqld(_Z19do_handle_bootstrapP3THD+0x111)[0x7fbebbd64ff1]
    /usr/sbin/mysqld(+0x509060)[0x7fbebbd65060]
    /lib/x86_64-linux-gnu/libpthread.so.0(+0x8184)[0x7fbeba9bf184]
    /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fbeba0e237d]

    Steps to reproduce:
    Not sure which steps are needed, but my environment has:
      3x Controller (4 CPU, 6GB RAM, 80GB HDD)
      3x Qemu Compute/Cinder/Ceph-OSD (2 CPU, 1GB RAM, 50GB HDD)
    Each host has two interfaces - PXE (eth0) and a VLAN network (eth1).
    Public network is on a VLAN over eth1, and Neutron is also configured to use VLANs
    This has been reproduced several times on different hardware and with different Fuel 9 installations, with different Ubuntu repositories and with the XenServer plugin disabled as well as enabled.
    MD5 sum of ISO has been confirmed:
    # md5sum MirantisOpenStack-9.0.iso
    07461ba42d5056830dd6f203e8fe9691 MirantisOpenStack-9.0.iso

    Expected results:
    Deployment succeeds :)

    Actual result:
    Deployment fails with the above errors :)

    Reproducibility:
    Deployments are occasionally successful, but once a deployment is successful it is not possible to add a new controller node as adding a controller fails 100% of the time.

    Workaround:
    None known

    Impact:
    Fatal; but also preventing the validation of the XenServer plugin for MOS 9 as this issue also occurs with the plugin installed.

    Fuel snapshot is attached, with a second snapshot (with the XenServer plugin enabled) at https://citrix.sharefile.com/d-s4a809f3542947818
2016-09-21 10:26:36 Bob Ball description
  Old value:
    Detailed bug description:
    MOS 9 environment cannot deploy potentially due to mysql crashing.
    (Edit: It seems that the mysql crash below could be unrelated to the failures to deploy)

    Puppet logs for the failed controller say:
    (/Stage[main]/Cluster::Mysql/Exec[wait-initial-sync]) mysql -uclustercheck -pOObsCqCTtkLkRHK52n0H0N8O -Nbe "show status like 'wsrep_local_state_comment'" | grep -q -e Synced && sleep 10 returned 1 instead of one of [0]
    (/Stage[main]/Cluster::Mysql/Exec[wait-initial-sync]) Failed to call refresh: mysql -uclustercheck -pOObsCqCTtkLkRHK52n0H0N8O -Nbe "show status like 'wsrep_local_state_comment'" | grep -q -e Synced && sleep 10 returned 1 instead of one of [0]
    (/Stage[main]/Cluster::Mysql/Exec[wait-initial-sync]/returns) ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (111)

    Logging in to failed controller node shows that, indeed, mysql is not running:
    root@node-4:~# service mysql status
    mysql stop/waiting

    /var/log/mysql/error.log is attached, and shows a segfault occurring, possibly from the wsrep post commit function:
    14:22:43 UTC - mysqld got signal 11 ;
    stack_bottom = 7fbebb747e88 thread_stack 0x30000
    /usr/sbin/mysqld(my_print_stacktrace+0x2c)[0x7fbebbf81b7c]
    /usr/sbin/mysqld(handle_fatal_signal+0x3c2)[0x7fbebbcd25c2]
    /lib/x86_64-linux-gnu/libpthread.so.0(+0x10330)[0x7fbeba9c7330]
    /usr/sbin/mysqld(thd_get_ha_data+0xc)[0x7fbebbd1f54c]
    /usr/sbin/mysqld(_Z20thd_binlog_trx_resetP3THD+0x2e)[0x7fbebbf2c79e]
    /usr/sbin/mysqld(_Z17wsrep_post_commitP3THDb+0xcc)[0x7fbebbe0c32c]
    /usr/sbin/mysqld(_Z12trans_commitP3THD+0x6f)[0x7fbebbdf2dcf]
    /usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x38d1)[0x7fbebbd60851]
    /usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x3c8)[0x7fbebbd649d8]
    /usr/sbin/mysqld(+0x508c24)[0x7fbebbd64c24]
    /usr/sbin/mysqld(_Z19do_handle_bootstrapP3THD+0x111)[0x7fbebbd64ff1]
    /usr/sbin/mysqld(+0x509060)[0x7fbebbd65060]
    /lib/x86_64-linux-gnu/libpthread.so.0(+0x8184)[0x7fbeba9bf184]
    /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fbeba0e237d]

    Steps to reproduce:
    Not sure which steps are needed, but my environment has:
      3x Controller (4 CPU, 6GB RAM, 80GB HDD)
      3x Qemu Compute/Cinder/Ceph-OSD (2 CPU, 1GB RAM, 50GB HDD)
    Each host has two interfaces - PXE (eth0) and a VLAN network (eth1).
    Public network is on a VLAN over eth1, and Neutron is also configured to use VLANs
    This has been reproduced several times on different hardware and with different Fuel 9 installations, with different Ubuntu repositories and with the XenServer plugin disabled as well as enabled.
    MD5 sum of ISO has been confirmed:
    # md5sum MirantisOpenStack-9.0.iso
    07461ba42d5056830dd6f203e8fe9691 MirantisOpenStack-9.0.iso

    Expected results:
    Deployment succeeds :)

    Actual result:
    Deployment fails with the above errors :)

    Reproducibility:
    Deployments are occasionally successful, but once a deployment is successful it is not possible to add a new controller node as adding a controller fails 100% of the time.

    Workaround:
    None known

    Impact:
    Fatal; but also preventing the validation of the XenServer plugin for MOS 9 as this issue also occurs with the plugin installed.

    Fuel snapshot is attached, with a second snapshot (with the XenServer plugin enabled) at https://citrix.sharefile.com/d-s4a809f3542947818
  New value:
    Detailed bug description:
    MOS 9 environment cannot deploy potentially due to mysql crashing.
    (Edit: It seems that the mysql crash from /var/log/mysql/error.log could be unrelated to the failures to deploy due to mysql being inaccessible)

    Puppet logs for the failed controller say:
    (/Stage[main]/Cluster::Mysql/Exec[wait-initial-sync]) mysql -uclustercheck -pOObsCqCTtkLkRHK52n0H0N8O -Nbe "show status like 'wsrep_local_state_comment'" | grep -q -e Synced && sleep 10 returned 1 instead of one of [0]
    (/Stage[main]/Cluster::Mysql/Exec[wait-initial-sync]) Failed to call refresh: mysql -uclustercheck -pOObsCqCTtkLkRHK52n0H0N8O -Nbe "show status like 'wsrep_local_state_comment'" | grep -q -e Synced && sleep 10 returned 1 instead of one of [0]
    (/Stage[main]/Cluster::Mysql/Exec[wait-initial-sync]/returns) ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (111)

    Logging in to failed controller node shows that, indeed, mysql is not running:
    root@node-4:~# service mysql status
    mysql stop/waiting

    /var/log/mysql/error.log is attached, and shows a segfault occurring, possibly from the wsrep post commit function:
    14:22:43 UTC - mysqld got signal 11 ;
    stack_bottom = 7fbebb747e88 thread_stack 0x30000
    /usr/sbin/mysqld(my_print_stacktrace+0x2c)[0x7fbebbf81b7c]
    /usr/sbin/mysqld(handle_fatal_signal+0x3c2)[0x7fbebbcd25c2]
    /lib/x86_64-linux-gnu/libpthread.so.0(+0x10330)[0x7fbeba9c7330]
    /usr/sbin/mysqld(thd_get_ha_data+0xc)[0x7fbebbd1f54c]
    /usr/sbin/mysqld(_Z20thd_binlog_trx_resetP3THD+0x2e)[0x7fbebbf2c79e]
    /usr/sbin/mysqld(_Z17wsrep_post_commitP3THDb+0xcc)[0x7fbebbe0c32c]
    /usr/sbin/mysqld(_Z12trans_commitP3THD+0x6f)[0x7fbebbdf2dcf]
    /usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x38d1)[0x7fbebbd60851]
    /usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x3c8)[0x7fbebbd649d8]
    /usr/sbin/mysqld(+0x508c24)[0x7fbebbd64c24]
    /usr/sbin/mysqld(_Z19do_handle_bootstrapP3THD+0x111)[0x7fbebbd64ff1]
    /usr/sbin/mysqld(+0x509060)[0x7fbebbd65060]
    /lib/x86_64-linux-gnu/libpthread.so.0(+0x8184)[0x7fbeba9bf184]
    /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fbeba0e237d]

    Steps to reproduce:
    Not sure which steps are needed, but my environment has:
      3x Controller (4 CPU, 6GB RAM, 80GB HDD)
      3x Qemu Compute/Cinder/Ceph-OSD (2 CPU, 1GB RAM, 50GB HDD)
    Each host has two interfaces - PXE (eth0) and a VLAN network (eth1).
    Public network is on a VLAN over eth1, and Neutron is also configured to use VLANs
    This has been reproduced several times on different hardware and with different Fuel 9 installations, with different Ubuntu repositories and with the XenServer plugin disabled as well as enabled.
    MD5 sum of ISO has been confirmed:
    # md5sum MirantisOpenStack-9.0.iso
    07461ba42d5056830dd6f203e8fe9691 MirantisOpenStack-9.0.iso

    Expected results:
    Deployment succeeds :)

    Actual result:
    Deployment fails with the above errors :)

    Reproducibility:
    Deployments are occasionally successful, but once a deployment is successful it is not possible to add a new controller node as adding a controller fails 100% of the time.

    Workaround:
    None known

    Impact:
    Fatal; but also preventing the validation of the XenServer plugin for MOS 9 as this issue also occurs with the plugin installed.

    Fuel snapshot is attached, with a second snapshot (with the XenServer plugin enabled) at https://citrix.sharefile.com/d-s4a809f3542947818
2016-09-29 08:00:09 Dmitry Teselkin fuel: status Confirmed Incomplete
2016-10-31 13:02:23 Vitaly Sedelnik fuel: status Incomplete Invalid