Galera/MySQL doesn't delete old gcache.page files

Bug #1794514 reported by Alexander Rubtsov
Affects: Mirantis OpenStack
Status: Fix Released
Importance: Critical
Assigned to: Oleksiy Molchanov

Bug Description

MOS: 9.2 (build 603)
Packages:
mysql-client-5.6 5.6.39-0~u14.04+mos0
mysql-common 5.5.59-0ubuntu0.14.04.1
mysql-server-wsrep-5.6 5.6.39-0~u14.04+mos0
mysql-server-wsrep-core-5.6 5.6.33-0~u14.04+mos1
mysql-wsrep-common-5.6 5.6.39-0~u14.04+mos0
mysql-wsrep-libmysqlclient18:amd64 5.6.39-0~u14.04+mos0
galera-3 25.3.10-1~u14.04+mos3
(the following fix is in place: https://review.fuel-infra.org/#/c/28287/)

A large number of gcache.page files were created over several days, and none of them were removed.
As a result, 100% of the disk space on the cic node was consumed, and the database therefore stopped operating properly.
root@cic-2:/var/lib/mysql# ls -lhrt
total 41G
-rw-rw---- 1 mysql mysql 45 Sep 11 12:48 xtrabackup_galera_info
-rw-rw---- 1 mysql mysql 574 Sep 11 12:48 xtrabackup_info
drwx------ 2 mysql mysql 4.0K Sep 11 12:49 performance_schema
drwx------ 2 mysql mysql 4.0K Sep 11 12:49 cinder
drwx------ 2 mysql mysql 12K Sep 11 12:49 zabbix
drwx------ 2 mysql mysql 4.0K Sep 11 12:49 keystone
drwx------ 2 mysql mysql 4.0K Sep 11 12:49 watchmen
drwx------ 2 mysql mysql 28K Sep 11 12:49 neutron
drwx------ 2 mysql mysql 4.0K Sep 11 12:49 lost+found
drwx------ 2 mysql mysql 4.0K Sep 11 12:49 glance
drwx------ 2 mysql mysql 4.0K Sep 11 12:49 cmha
drwx------ 2 mysql mysql 4.0K Sep 11 12:49 nova_api
drwx------ 2 mysql mysql 4.0K Sep 11 12:49 aodh
drwx------ 2 mysql mysql 4.0K Sep 11 12:49 mysql
drwx------ 2 mysql mysql 16K Sep 11 12:49 nova
-rw-rw---- 1 mysql mysql 56 Sep 11 12:49 auto.cnf
-rw-rw---- 1 mysql mysql 264 Sep 11 12:49 gvwstate.dat
-rw-rw---- 1 mysql mysql 111 Sep 11 12:49 grastate.dat
-rw------- 1 mysql mysql 513M Sep 11 14:37 galera.cache
-rw------- 1 mysql mysql 128M Sep 11 14:55 gcache.page.000000
-rw------- 1 mysql mysql 128M Sep 11 15:25 gcache.page.000001
-rw------- 1 mysql mysql 128M Sep 11 15:53 gcache.page.000002
-rw------- 1 mysql mysql 128M Sep 11 16:13 gcache.page.000003
-rw------- 1 mysql mysql 128M Sep 11 16:44 gcache.page.000004
-rw------- 1 mysql mysql 128M Sep 11 17:02 gcache.page.000005
...omitted...
-rw------- 1 mysql mysql 128M Sep 16 17:01 gcache.page.000300
-rw------- 1 mysql mysql 128M Sep 16 17:29 gcache.page.000301
-rw------- 1 mysql mysql 128M Sep 16 17:30 gcache.page.000302
-rw-rw---- 1 mysql mysql 886M Sep 21 11:47 ib_logfile0
-rw-rw---- 1 mysql mysql 886M Sep 16 16:56 ib_logfile1
-rw-rw---- 1 mysql mysql 204M Sep 21 11:47 ibdata1
-rw------- 1 mysql root 0 Sep 21 11:47 wsrep_recovery.fail
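
To put the listing above in numbers, here is a rough estimate of how much of the disk the page files account for. It is only a sketch: it assumes the page files are numbered contiguously from 000000 to 000302 (the listing elides the middle) and that every page is exactly 128 MiB, as the visible entries suggest.

```python
# Rough disk-usage estimate based on the directory listing in this report.
# Assumption: pages 000000..000302 all exist and are 128 MiB each.
first_page, last_page = 0, 302
page_size_mib = 128

page_count = last_page - first_page + 1
pages_total_mib = page_count * page_size_mib

# Other large files shown in the listing, sizes in MiB.
other_files_mib = {
    "galera.cache": 513,
    "ib_logfile0": 886,
    "ib_logfile1": 886,
    "ibdata1": 204,
}

total_mib = pages_total_mib + sum(other_files_mib.values())

print(f"page files: {page_count} x {page_size_mib} MiB = {pages_total_mib / 1024:.1f} GiB")
print(f"with other large files: {total_mib / 1024:.1f} GiB")
```

Under these assumptions the page files alone come to about 37.9 GiB, and about 40.3 GiB together with the other large files, which is roughly in line with the `total 41G` reported by `ls`.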

The current settings related to gcache (according to mysql -e "SHOW VARIABLES LIKE 'wsrep_provider_options';"):
gcache.dir = /var/lib/mysql/
gcache.keep_pages_size = 0
gcache.mem_size = 0
gcache.name = /var/lib/mysql//galera.cache
gcache.page_size = 128M
gcache.size = 512M
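
The gcache values above come out of one long `wsrep_provider_options` string. A minimal sketch of pulling them out programmatically is shown below; the helper name `parse_provider_options` is hypothetical, and the sample string is assembled from the values quoted in this report rather than captured from a live server.

```python
def parse_provider_options(raw):
    """Split a 'key = value; key = value' string into a dict of strings."""
    options = {}
    for item in raw.split(";"):
        if "=" not in item:
            continue  # skip empty or truncated fragments
        key, _, value = item.partition("=")
        options[key.strip()] = value.strip()
    return options


# Sample assembled from the settings listed in this report (an assumption,
# not verbatim server output).
sample = (
    "gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; "
    "gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; "
    "gcache.page_size = 128M; gcache.size = 512M"
)

parsed = parse_provider_options(sample)
gcache = {k: v for k, v in parsed.items() if k.startswith("gcache.")}

for key in sorted(gcache):
    print(f"{key} = {gcache[key]}")
```

The same function could be fed the full output of the `SHOW VARIABLES` query; fragments without an `=` sign, such as a truncated tail, are skipped.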

The log files are from the customer's environment, so I can provide them directly.

Alexander Rubtsov (arubtsov) wrote :

sla1 for 9.0-updates

Changed in mos:
importance: Undecided → Critical
assignee: nobody → MOS Maintenance (mos-maintenance)
milestone: none → 9.x-updates
tags: added: customer-found sla1
Changed in mos:
milestone: 9.x-updates → 9.2-mu-9
status: New → Confirmed
Changed in mos:
assignee: MOS Maintenance (mos-maintenance) → Oleksiy Molchanov (omolchanov)
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix proposed to openstack/fuel-library (9.0/mitaka)

Fix proposed to branch: 9.0/mitaka
Change author: Oleksiy Molchanov <email address hidden>
Review: https://review.fuel-infra.org/39371

Changed in mos:
status: Confirmed → In Progress
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix merged to openstack/fuel-library (9.0/mitaka)

Reviewed: https://review.fuel-infra.org/39371
Submitter: Pkgs Jenkins <email address hidden>
Branch: 9.0/mitaka

Commit: 3d5234c43572a29169ce85a925834c346d45993b
Author: Oleksiy Molchanov <email address hidden>
Date: Tue Oct 9 10:29:09 2018

Set gcache.keep_pages_size to 5120M

Change-Id: I5279b207ba03b54eeb7c9b4e843b7efaf31f6df0
Closes-Bug: 1794514

Changed in mos:
status: In Progress → Fix Committed
Mikhail Samoylov (msamoylov) wrote :

Verified
[root@nailgun ~]# fuel nodes
id | status | name | cluster | ip | mac | roles | pending_roles | online | group_id
---+----------+------------------+---------+------------+-------------------+-----------------+---------------+--------+---------
 1 | ready | Untitled (bb:48) | 1 | 10.109.0.5 | 12:2d:93:94:e6:80 | controller | | 1 | 1
 2 | ready | Untitled (39:57) | 1 | 10.109.0.4 | 02:76:1b:18:ff:d9 | cinder, compute | | 1 | 1
 3 | discover | Untitled (93:b2) | | 10.109.0.3 | 64:71:7a:95:93:b2 | | | 1 |
[root@nailgun ~]# fuel fuel-version
api: '1'
auth_required: true
feature_groups: []
openstack_version: mitaka-9.0
release: '9.2'

[root@nailgun ~]# ssh node-1
Warning: Permanently added 'node-1' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 14.04.5 LTS (GNU/Linux 4.4.0-139-generic x86_64)

 * Documentation: https://help.ubuntu.com/
Last login: Tue Nov 20 14:38:09 2018 from 10.109.0.2
root@node-1:~# grep -r wsrep_provider_options /etc/puppet/modules/osnailyfacter/manifests/database/database.pp
    $wsrep_provider_options = "\"gcache.size=${galera_gcache_size}; gcache.keep_pages_size=${galera_gcache_keep_pages_size}; gmcast.listen_addr=tcp://${galera_node_address}:${wsrep_group_comm_port}\""
        'wsrep_provider_options' => $wsrep_provider_options,
root@node-1:~# mysql -e "SHOW VARIABLES LIKE 'wsrep_provider_options';" | grep "keep_pages_size"
wsrep_provider_options base_dir = /var/lib/mysql/; base_host = 10.109.1.6; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.causal_keepalive_period = PT1S; evs.debug_log_mask = 0x1; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.info_log_mask = 0; evs.install_timeout = PT7.5S; evs.join_retrans_period = PT1S; evs.keepalive_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.use_aggregate = true; evs.user_send_window = 2; evs.version = 0; evs.view_forget_timeout = P1D; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 5120M; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 512M; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.listen_addr = tcp://10.109.1.6:4567; gmcast.mcast_addr = ; gmcast.mcast_ttl = 1; gmcast.peer_timeout = PT3S; gmcast.segment = 0; gmcast.time_wait = PT5S; gmcast.version = 0; ist.recv_addr = 10.109.1.6; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.linger = PT20S; pc.npvo = false; pc.recovery = true; pc.version = 0; pc.wait_prim = true; pc.wait_prim_timeout = P30S; pc.weight = 1; protonet.backend = asio; protonet.version = 0; repl.causal_read_timeout = PT30S; repl.commit_order = 3; repl.key_format = FLAT8;...
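
For a sense of scale, the verified values can be turned into a page count. This is a sketch under two assumptions: that the `M` suffix denotes binary megabytes, and that, as the option name suggests, `gcache.keep_pages_size` limits the total size of page files that are retained. The helper `size_to_bytes` is hypothetical.

```python
def size_to_bytes(text):
    """Convert a size such as '128M' or '0' to bytes (binary multiples assumed)."""
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    text = text.strip()
    if text and text[-1].upper() in units:
        return int(text[:-1]) * units[text[-1].upper()]
    return int(text)


# Values taken from the verification output in this report.
keep_pages_size = size_to_bytes("5120M")
page_size = size_to_bytes("128M")

# Number of full pages that fit within the configured limit.
pages_kept = keep_pages_size // page_size

print(f"keep_pages_size = {keep_pages_size / 1024 ** 3:.0f} GiB")
print(f"pages of {page_size // 1024 ** 2} MiB that fit: {pages_kept}")
```

With these inputs the limit works out to 5 GiB, or 40 pages of 128 MiB, compared with the 303 pages seen in the original report.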


Changed in mos:
status: Fix Committed → Fix Released