Excessive Memory usage

Bug #1078759 reported by Kari Lehtinen on 2012-11-14
This bug affects 5 people

Affects                                Status        Importance  Assigned to
Galera                                 In Progress   Low         Alex Yurchenko
Percona XtraDB Cluster                 (moved to https://jira.percona.com/projects/PXC; status tracked in 5.6)
  5.5                                  Confirmed     Medium      Unassigned
  5.6                                  New           Medium      Unassigned

Bug Description

I'm evaluating Percona XtraDB Cluster in a small two-node environment. Replication between the nodes works fine, but even under light utilization the mysqld process grows continuously without releasing memory, unlike a standalone server.

OS: Centos 6.3 x64

With the following packages installed:

Percona-Server-shared-51-5.1.66-rel14.1.495.rhel6.x86_64
percona-release-0.0-1.x86_64
percona-xtrabackup-2.0.3-470.rhel6.x86_64
Percona-XtraDB-Cluster-client-5.5.27-23.6.356.rhel6.x86_64
Percona-XtraDB-Cluster-server-5.5.27-23.6.356.rhel6.x86_64

Updated Galera to the latest available:
galera-23.2.2-1.rhel5.x86_64

Node 1
my.cnf:

[mysqld_safe]
wsrep_urls=gcomm://10.0.0.101:4567,gcomm://10.0.0.102:4567,gcomm://

[mysqld]
datadir=/var/lib/mysql/mysql_data
user=mysql
log_slave_updates = 1
binlog_format=ROW
max_allowed_packet = 200M
default_storage_engine=InnoDB

#wsrep_provider=/usr/lib64/libgalera_smm.so
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_slave_threads=2
wsrep_cluster_name=mycluster
wsrep_sst_method=xtrabackup
wsrep_sst_auth=root:
wsrep_node_name=test-web01

innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2
innodb_buffer_pool_size = 256M
innodb_additional_mem_pool_size = 4M
innodb_log_file_size = 64M
innodb_log_buffer_size = 8M
innodb_flush_method = O_DIRECT
innodb_flush_log_at_trx_commit = 1

Running the following test:

create database ptest;
use ptest;
create table ti2(c1 int auto_increment primary key, c2 char(255)) engine=InnoDB;
insert into ti2(c2) values('abc');
insert into ti2(c2) select c2 from ti2;
insert into ti2(c2) select c2 from ti2;
insert into ti2(c2) select c2 from ti2;
insert into ti2(c2) select c2 from ti2;
insert into ti2(c2) select c2 from ti2;
insert into ti2(c2) select c2 from ti2;
insert into ti2(c2) select c2 from ti2;
insert into ti2(c2) select c2 from ti2;
insert into ti2(c2) select c2 from ti2;
insert into ti2(c2) select c2 from ti2;
insert into ti2(c2) select c2 from ti2;
insert into ti2(c2) select c2 from ti2;
insert into ti2(c2) select c2 from ti2;
insert into ti2(c2) select c2 from ti2;
insert into ti2(c2) select c2 from ti2;
insert into ti2(c2) select c2 from ti2;
insert into ti2(c2) select c2 from ti2;
insert into ti2(c2) select c2 from ti2;
insert into ti2(c2) select c2 from ti2;
insert into ti2(c2) select c2 from ti2;
insert into ti2(c2) select c2 from ti2;
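Each `insert ... select` statement above copies every existing row in a single transaction, so the table doubles each time: one seed row followed by 20 doubling statements yields 2^20 rows, and the final statement alone is a ~0.5M-row transaction. A quick sanity check of those numbers:

```python
# Sanity-check the row counts produced by the doubling-insert test above.
# Assumes: 1 seed row, then 20 "insert ... select c2 from ti2" statements,
# each copying all current rows in one transaction.
def doubling_insert_rows(seed_rows=1, doublings=20):
    total = seed_rows
    txn_sizes = []  # rows written by each doubling statement
    for _ in range(doublings):
        txn_sizes.append(total)  # the statement copies every current row
        total += txn_sizes[-1]
    return total, txn_sizes

total, txn_sizes = doubling_insert_rows()
print(total)          # 1048576 rows in the table at the end
print(txn_sizes[-1])  # 524288 rows in the last, largest transaction
```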

Results of Percona-XtraDB-Cluster-server-5.5.27

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
  8951 mysql 20 0 1101m 83m 7160 S 0.0 5.1 0:00.29 mysqld

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
 8951 mysql 20 0 1741m 846m 2508 S 0.0 51.1 1:19.67 mysqld

Results of Percona-Server-server-55-5.5.28-rel29.1.335

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
 7077 mysql 20 0 756m 55m 5540 S 0.0 5.5 0:00.12 mysqld

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
 7077 mysql 20 0 756m 308m 5788 S 0.3 31.0 0:25.45 mysqld

When the inserts are continued, the Galera cluster eventually runs out of memory, while on the standalone server memory usage does not grow. The comparison above is against a different MySQL version, but I see a similar result when I disable replication by removing the wsrep parameters from the XtraDB Cluster my.cnf.

Bernhard Schmidt (berni) wrote :

I am seeing the same problem with 5.5.28-23.7-369.squeeze on Debian Squeeze on a 3-node Galera cluster.

Each node has 1 GB of RAM, which should be plenty, given that a plain-text dump of the data in question is only 150 MB. There are a few tables with some hundred thousand entries each, mostly appended to and seldom read (mail logs). Performance is just fine. However, when I start a cleaning job that removes just 100,000 entries from a table, memory usage on all nodes goes through the roof. It is usually enough to push the nodes deep into swap, which makes recovery a mess.
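Since each statement replicates as a single writeset, a common workaround (not stated in this thread as the resolution) is to split a bulk delete into bounded batches so no one transaction balloons. A minimal sketch that generates such statements, assuming a hypothetical table name and an `expired` marker column:

```python
# Generate bounded DELETE statements so no single replicated transaction
# exceeds batch_size rows. Table name, column, and sizes are illustrative.
def batched_deletes(table, total_rows, batch_size=10000):
    stmts = []
    remaining = total_rows
    while remaining > 0:
        n = min(batch_size, remaining)
        # LIMIT on DELETE caps the rows (and hence the writeset) per statement
        stmts.append(f"DELETE FROM {table} WHERE expired = 1 LIMIT {n};")
        remaining -= n
    return stmts

stmts = batched_deletes("maillog", 100000)
print(len(stmts))  # 10 small transactions instead of one 100,000-row delete
```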

[mysqld]
datadir=/var/lib/mysql
binlog_format=ROW

thread_cache_size=4
query_cache_size=8M

wsrep_provider=/usr/lib64/libgalera_smm.so

wsrep_slave_threads=4
wsrep_cluster_name=something
wsrep_sst_method=xtrabackup

innodb_buffer_pool_size=128M
innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2
innodb_flush_method=O_DIRECT
innodb_file_per_table

Bernhard Schmidt (berni) wrote :

From https://mariadb.atlassian.net/browse/MDEV-3848

This test runs larger and larger transactions, and by the end a single transaction is ~1M rows. Galera does not currently support arbitrarily large transactions. The configuration variables 'wsrep_max_ws_rows' and 'wsrep_max_ws_size' can abort a huge transaction before an OOM would happen.
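For reference, those caps go in the [mysqld] section of my.cnf alongside the other wsrep settings shown above; the values below are illustrative examples, not the server defaults:

```
[mysqld]
# Abort any transaction whose writeset would exceed these bounds
# (example values, not defaults)
wsrep_max_ws_rows = 131072       # max rows per writeset
wsrep_max_ws_size = 1073741824   # max writeset size in bytes (1 GiB)
```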

Huge-transaction support is in the design phase and will be part of a future release.

Mark Rose (markrose) wrote :

Are you using temporary tables? If so, this may be a duplicate of Bug #1112514.

Bernhard Schmidt (berni) wrote :

No temporary tables in use here

As per my discussion with the Galera team, the fix for transaction fragmentation should be available in future Galera releases, and optimum bounds for wsrep_max_ws_size and wsrep_max_ws_rows are being implemented.

Changed in percona-xtradb-cluster:
status: New → Confirmed
no longer affects: codership-mysql
Alex Yurchenko (ayurchen) wrote :

This is a work in progress. With the same test current 3.x tree shows:
3.1 wsrep OFF
Rows     Seconds  RSS (KB)
1 0 82116
2 0 82116
4 0 82116
8 0 82116
16 0 82116
32 0 82116
64 0 82116
128 0 82376
256 0 82376
512 0 82640
1024 0 83232
2048 0 83860
4096 0 85116
8192 0 87600
16384 0 92616
32768 1 102664
65536 0 122724
131072 2 163204
262144 2 243008
524288 5 342216
1048576 11 342216

3.1 wsrep ON
Rows     Seconds  RSS (KB)
1 0 82428
2 0 82428
4 0 82428
8 0 82596
16 0 82624
32 0 82704
64 0 82704
128 0 82704
256 0 82964
512 0 83228
1024 0 83856
2048 0 84604
4096 0 86912
8192 0 91548
16384 0 100100
32768 1 117856
65536 1 140960
131072 2 199612
262144 4 317012
524288 9 484000
1048576 18 617764

This effectively halves the memory overhead compared to the 2.x branch.
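Working the numbers from the two tables above gives the approximate per-row memory cost: subtract the baseline RSS from the final RSS and divide by the row count. A rough back-of-the-envelope calculation, assuming the RSS column is in KB:

```python
# Per-row RSS overhead at 1,048,576 rows, from the 3.1 tables above (RSS in KB).
ROWS = 1048576

def per_row_overhead_bytes(rss_final_kb, rss_baseline_kb, rows=ROWS):
    return (rss_final_kb - rss_baseline_kb) * 1024 / rows

off = per_row_overhead_bytes(342216, 82116)  # 3.1, wsrep OFF
on = per_row_overhead_bytes(617764, 82428)   # 3.1, wsrep ON
print(f"wsrep OFF: {off:.0f} B/row, wsrep ON: {on:.0f} B/row")
# roughly 254 B/row without replication vs 523 B/row with it,
# i.e. about 269 B/row of additional wsrep bookkeeping
```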

Changed in galera:
assignee: nobody → Alex Yurchenko (ayurchen)
importance: Undecided → Medium
status: New → In Progress
importance: Medium → Low

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXC-1044
