2016-03-21 20:33:41 |
Brad House |
bug |
|
|
added bug |
2016-03-21 20:33:41 |
Brad House |
attachment added |
|
Test case to reproduce failure https://bugs.launchpad.net/bugs/1560206/+attachment/4606801/+files/galera_test.c |
|
2016-03-22 13:32:52 |
Brad House |
attachment removed |
Test case to reproduce failure https://bugs.launchpad.net/percona-xtradb-cluster/+bug/1560206/+attachment/4606801/+files/galera_test.c |
|
|
2016-03-22 13:33:34 |
Brad House |
attachment added |
|
Test case to reproduce failure (updated for correctness) https://bugs.launchpad.net/percona-xtradb-cluster/+bug/1560206/+attachment/4607468/+files/galera_test.c |
|
2016-03-22 13:49:15 |
Brad House |
description |
My sequence of events is basically identical to this blog post; because this comes directly from Codership, I assume it is intended to be supported:
http://galeracluster.com/2015/09/support-for-mysql-transaction-isolation-levels-in-galera-cluster/
However, what is being experienced is:
1) Data inconsistency: dirty reads are occurring, so the record is not being updated properly (lost updates).
2) Connection lockup: the only way to unlock the client is to restart the DB node(s) for the locked connections.
The server version in use is Percona-XtraDB-Cluster-server-56-5.6.28-25.14.1.el7.x86_64 on CentOS 7.2. It is a 3-node cluster running over a local LAN connected via dual 1Gbps links, with a Linux IPVS load balancer in front doing round-robin for my application.
I have attached a test case that reproduces this issue consistently. This same test case works fine if pointing to only a single DB node in the cluster.
Config settings:
/etc/my.cnf:
[mysqld]
datadir = /var/lib/mysql
# move tmpdir because /tmp is a memory-backed tmpfs filesystem; mysql uses tmpdir for on-disk sorting
tmpdir = /var/lib/mysql/tmp
[mysqld_safe]
pid-file = /run/mysqld/mysql.pid
syslog
!includedir /etc/my.cnf.d
/etc/my.cnf.d/base.cnf:
[mysqld]
bind-address = 0.0.0.0
key_buffer = 256M
max_allowed_packet = 16M
max_connections = 256
# Some optimizations
thread_concurrency = 10
sort_buffer_size = 2M
query_cache_limit = 100M
query_cache_size = 256M
log_bin
binlog_format = ROW
gtid_mode = ON
log_slave_updates
enforce_gtid_consistency = 1
group_concat_max_len = 102400
innodb_buffer_pool_size = 10G
innodb_log_file_size = 64M
innodb_file_per_table = 1
innodb_file_format = barracuda
default_storage_engine = innodb
# SSD Tuning
innodb_flush_neighbors = 0
innodb_io_capacity = 6000
/etc/my.cnf.d/cluster.cnf:
# Galera cluster
[mysqld]
wsrep_provider = /usr/lib64/libgalera_smm.so
wsrep_sst_method = xtrabackup-v2
wsrep_sst_auth = "sstuser:s3cretPass"
wsrep_cluster_name = cluster
wsrep_slave_threads = 32
wsrep_max_ws_size = 2G
wsrep_provider_options = "gcache.size = 5G; pc.recovery = true"
wsrep_cluster_address = gcomm://10.30.30.11,10.30.30.12,10.30.30.13
wsrep_sync_wait = 0
innodb_autoinc_lock_mode = 2
innodb_locks_unsafe_for_binlog = 1
innodb_flush_log_at_trx_commit = 0
sync_binlog = 0
innodb_support_xa = 0
innodb_flush_method = ALL_O_DIRECT
[sst]
progress = 1
time = 1
streamfmt = xbstream |
My sequence of events is basically identical to this blog post (a transaction with SELECT FOR UPDATE, math performed on a record, then the record updated and committed); because this comes directly from Codership, I assume it is intended to be supported:
http://galeracluster.com/2015/09/support-for-mysql-transaction-isolation-levels-in-galera-cluster/
However, what is being experienced is:
1) Data inconsistency: dirty reads are occurring, so the calculation for the update is wrong; the conflicting updates from different nodes are not triggering deadlocks, so we end up with inconsistency from these lost updates.
2) Connection lockup: the only way to unlock the client is to restart the DB node(s) for the locked connections.
The server version in use is Percona-XtraDB-Cluster-server-56-5.6.28-25.14.1.el7.x86_64 on CentOS 7.2. It is a 3-node cluster running over a local LAN connected via dual 1Gbps links, with a Linux IPVS load balancer in front doing round-robin for my application.
I have attached a test case that reproduces this issue consistently. This same test case works fine if pointing to only a single DB node in the cluster.
Config settings:
/etc/my.cnf:
[mysqld]
datadir = /var/lib/mysql
# move tmpdir because /tmp is a memory-backed tmpfs filesystem; mysql uses tmpdir for on-disk sorting
tmpdir = /var/lib/mysql/tmp
[mysqld_safe]
pid-file = /run/mysqld/mysql.pid
syslog
!includedir /etc/my.cnf.d
/etc/my.cnf.d/base.cnf:
[mysqld]
bind-address = 0.0.0.0
key_buffer = 256M
max_allowed_packet = 16M
max_connections = 256
# Some optimizations
thread_concurrency = 10
sort_buffer_size = 2M
query_cache_limit = 100M
query_cache_size = 256M
log_bin
binlog_format = ROW
gtid_mode = ON
log_slave_updates
enforce_gtid_consistency = 1
group_concat_max_len = 102400
innodb_buffer_pool_size = 10G
innodb_log_file_size = 64M
innodb_file_per_table = 1
innodb_file_format = barracuda
default_storage_engine = innodb
# SSD Tuning
innodb_flush_neighbors = 0
innodb_io_capacity = 6000
/etc/my.cnf.d/cluster.cnf:
# Galera cluster
[mysqld]
wsrep_provider = /usr/lib64/libgalera_smm.so
wsrep_sst_method = xtrabackup-v2
wsrep_sst_auth = "sstuser:s3cretPass"
wsrep_cluster_name = cluster
wsrep_slave_threads = 32
wsrep_max_ws_size = 2G
wsrep_provider_options = "gcache.size = 5G; pc.recovery = true"
wsrep_cluster_address = gcomm://10.30.30.11,10.30.30.12,10.30.30.13
wsrep_sync_wait = 0
innodb_autoinc_lock_mode = 2
innodb_locks_unsafe_for_binlog = 1
innodb_flush_log_at_trx_commit = 0
sync_binlog = 0
innodb_support_xa = 0
innodb_flush_method = ALL_O_DIRECT
[sst]
progress = 1
time = 1
streamfmt = xbstream |
|
2016-03-22 13:50:59 |
Brad House |
description |
My sequence of events is basically identical to this blog post (a transaction with SELECT FOR UPDATE, math performed on a record, then the record updated and committed); because this comes directly from Codership, I assume it is intended to be supported:
http://galeracluster.com/2015/09/support-for-mysql-transaction-isolation-levels-in-galera-cluster/
However, what is being experienced is:
1) Data inconsistency: dirty reads are occurring, so the calculation for the update is wrong; the conflicting updates from different nodes are not triggering deadlocks, so we end up with inconsistency due to these lost updates.
2) Connection lockup: the only way to unlock the client is to restart the DB node(s) for the locked connections.
The server version in use is Percona-XtraDB-Cluster-server-56-5.6.28-25.14.1.el7.x86_64 on CentOS 7.2. It is a 3-node cluster running over a local LAN connected via dual 1Gbps links, with a Linux IPVS load balancer in front doing round-robin for my application.
I have attached a test case that reproduces this issue consistently. This same test case works fine if pointing to only a single DB node in the cluster.
Config settings:
/etc/my.cnf:
[mysqld]
datadir = /var/lib/mysql
# move tmpdir because /tmp is a memory-backed tmpfs filesystem; mysql uses tmpdir for on-disk sorting
tmpdir = /var/lib/mysql/tmp
[mysqld_safe]
pid-file = /run/mysqld/mysql.pid
syslog
!includedir /etc/my.cnf.d
/etc/my.cnf.d/base.cnf:
[mysqld]
bind-address = 0.0.0.0
key_buffer = 256M
max_allowed_packet = 16M
max_connections = 256
# Some optimizations
thread_concurrency = 10
sort_buffer_size = 2M
query_cache_limit = 100M
query_cache_size = 256M
log_bin
binlog_format = ROW
gtid_mode = ON
log_slave_updates
enforce_gtid_consistency = 1
group_concat_max_len = 102400
innodb_buffer_pool_size = 10G
innodb_log_file_size = 64M
innodb_file_per_table = 1
innodb_file_format = barracuda
default_storage_engine = innodb
# SSD Tuning
innodb_flush_neighbors = 0
innodb_io_capacity = 6000
/etc/my.cnf.d/cluster.cnf:
# Galera cluster
[mysqld]
wsrep_provider = /usr/lib64/libgalera_smm.so
wsrep_sst_method = xtrabackup-v2
wsrep_sst_auth = "sstuser:s3cretPass"
wsrep_cluster_name = cluster
wsrep_slave_threads = 32
wsrep_max_ws_size = 2G
wsrep_provider_options = "gcache.size = 5G; pc.recovery = true"
wsrep_cluster_address = gcomm://10.30.30.11,10.30.30.12,10.30.30.13
wsrep_sync_wait = 0
innodb_autoinc_lock_mode = 2
innodb_locks_unsafe_for_binlog = 1
innodb_flush_log_at_trx_commit = 0
sync_binlog = 0
innodb_support_xa = 0
innodb_flush_method = ALL_O_DIRECT
[sst]
progress = 1
time = 1
streamfmt = xbstream |
My sequence of events is basically identical to this blog post (a transaction with SELECT FOR UPDATE, math performed on a record, then the record updated and committed); because this comes directly from Codership, I assume it is intended to be supported:
http://galeracluster.com/2015/09/support-for-mysql-transaction-isolation-levels-in-galera-cluster/
However, what is being experienced is:
1) Data inconsistency: dirty reads are occurring, so the calculation for the update is wrong; the conflicting updates from different nodes are not triggering deadlocks, so we end up with inconsistency due to these lost updates.
2) Connection lockup: the only way to unlock the client is to restart the DB node(s) for the locked connections. Performing a
"SHOW PROCESSLIST;" shows all connections from the application in a
Sleep state, even though they did NOT receive responses.
The server version in use is Percona-XtraDB-Cluster-server-56-5.6.28-25.14.1.el7.x86_64 on CentOS 7.2. It is a 3-node cluster running over a local LAN connected via dual 1Gbps links, with a Linux IPVS load balancer in front doing round-robin for my application.
I have attached a test case that reproduces this issue consistently. This same test case works fine if pointing to only a single DB node in the cluster.
Config settings:
/etc/my.cnf:
[mysqld]
datadir = /var/lib/mysql
# move tmpdir because /tmp is a memory-backed tmpfs filesystem; mysql uses tmpdir for on-disk sorting
tmpdir = /var/lib/mysql/tmp
[mysqld_safe]
pid-file = /run/mysqld/mysql.pid
syslog
!includedir /etc/my.cnf.d
/etc/my.cnf.d/base.cnf:
[mysqld]
bind-address = 0.0.0.0
key_buffer = 256M
max_allowed_packet = 16M
max_connections = 256
# Some optimizations
thread_concurrency = 10
sort_buffer_size = 2M
query_cache_limit = 100M
query_cache_size = 256M
log_bin
binlog_format = ROW
gtid_mode = ON
log_slave_updates
enforce_gtid_consistency = 1
group_concat_max_len = 102400
innodb_buffer_pool_size = 10G
innodb_log_file_size = 64M
innodb_file_per_table = 1
innodb_file_format = barracuda
default_storage_engine = innodb
# SSD Tuning
innodb_flush_neighbors = 0
innodb_io_capacity = 6000
/etc/my.cnf.d/cluster.cnf:
# Galera cluster
[mysqld]
wsrep_provider = /usr/lib64/libgalera_smm.so
wsrep_sst_method = xtrabackup-v2
wsrep_sst_auth = "sstuser:s3cretPass"
wsrep_cluster_name = cluster
wsrep_slave_threads = 32
wsrep_max_ws_size = 2G
wsrep_provider_options = "gcache.size = 5G; pc.recovery = true"
wsrep_cluster_address = gcomm://10.30.30.11,10.30.30.12,10.30.30.13
wsrep_sync_wait = 0
innodb_autoinc_lock_mode = 2
innodb_locks_unsafe_for_binlog = 1
innodb_flush_log_at_trx_commit = 0
sync_binlog = 0
innodb_support_xa = 0
innodb_flush_method = ALL_O_DIRECT
[sst]
progress = 1
time = 1
streamfmt = xbstream |
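
For reference, the read-modify-write pattern described in the report (and exercised by the attached galera_test.c) can be sketched in SQL. The table and column names here are hypothetical illustrations, not taken from the test case:

```sql
-- Hypothetical schema: counters(id INT PRIMARY KEY, value INT)
START TRANSACTION;
-- Locking read: concurrent writers on the SAME node block here
SELECT @v := value FROM counters WHERE id = 1 FOR UPDATE;
-- Application performs math on the value it read, then writes it back
UPDATE counters SET value = @v + 1 WHERE id = 1;
COMMIT;
-- On Galera, FOR UPDATE locks are not replicated across nodes; a
-- conflicting commit from another node should surface at COMMIT as a
-- deadlock error (to be retried), never as a silent lost update.
```

The reported bug is that, through the load balancer, the conflict is neither blocked nor rejected: both transactions commit and one update is silently lost.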
|
2016-03-22 17:54:36 |
Brad House |
attachment removed |
Test case to reproduce failure (updated for correctness) https://bugs.launchpad.net/percona-xtradb-cluster/+bug/1560206/+attachment/4607468/+files/galera_test.c |
|
|
2016-03-22 17:55:03 |
Brad House |
attachment added |
|
Test case to reproduce failure (updated to allow rollback on prepare) https://bugs.launchpad.net/percona-xtradb-cluster/+bug/1560206/+attachment/4607697/+files/galera_test.c |
|
2016-03-23 20:34:12 |
Brad House |
attachment removed |
Test case to reproduce failure (updated to allow rollback on prepare) https://bugs.launchpad.net/percona-xtradb-cluster/+bug/1560206/+attachment/4607697/+files/galera_test.c |
|
|
2016-03-23 20:34:52 |
Brad House |
attachment added |
|
Test case to reproduce failure (allow to specify list of servers bypassing load balancer) https://bugs.launchpad.net/percona-xtradb-cluster/+bug/1560206/+attachment/4608968/+files/galera_test.c |
|
2016-03-24 03:07:09 |
Krunal Bauskar |
percona-xtradb-cluster: assignee |
|
Kenn Takara (kenn-takara) |
|
2017-05-15 01:57:50 |
Kenn Takara |
nominated for series |
|
percona-xtradb-cluster/5.6 |
|
2017-05-15 01:57:50 |
Kenn Takara |
bug task added |
|
percona-xtradb-cluster/5.6 |
|
2017-05-15 01:57:50 |
Kenn Takara |
nominated for series |
|
percona-xtradb-cluster/5.7 |
|
2017-05-15 01:57:50 |
Kenn Takara |
bug task added |
|
percona-xtradb-cluster/5.7 |
|
2017-05-15 01:58:07 |
Kenn Takara |
percona-xtradb-cluster/5.7: assignee |
|
Kenn Takara (kenn-takara) |
|
2017-05-15 01:58:15 |
Kenn Takara |
percona-xtradb-cluster/5.7: status |
New |
Fix Released |
|
2017-05-15 02:12:36 |
Kenn Takara |
percona-xtradb-cluster/5.6: status |
New |
Confirmed |
|