mysql service holds deleted log files

Bug #2051987 reported by Anna Savchenko
18
This bug affects 3 people
Affects Status Importance Assigned to Milestone
MySQL InnoDB Cluster Charm
Triaged
High
Unassigned

Bug Description

I've observed that two very large files were held by mysql process causing the disk to be 95% utilized.

ubuntu@ubuntu-infra-13-01:~$ juju ssh mysql-innodb-cluster/0
Last login: Thu Feb 1 17:30:08 2024 from 10.131.8.4

ubuntu@juju-c3cb23-0-lxd-8:~$ sudo lsof +L1
lsof: WARNING: can't stat() tracefs file system /sys/kernel/debug/tracing
      Output information may be incomplete.
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NLINK NODE NAME
unattende 332 root 3w REG 8,34 113 0 301747 /var/log/unattended-upgrades/unattended-upgrades-shutdown.log.1 (deleted)
mysqld 348 mysql 1w REG 8,34 104091260620 0 1750078 /var/log/mysql/error.log.1 (deleted)
mysqld 348 mysql 2w REG 8,34 104091260620 0 1750078 /var/log/mysql/error.log.1 (deleted)
mysqld 348 mysql 6u REG 8,34 0 0 134193 /tmp/#134193 (deleted)
mysqld 348 mysql 7u REG 8,34 142 0 134226 /tmp/#134226 (deleted)
mysqld 348 mysql 8u REG 8,34 64710 0 136080 /tmp/#136080 (deleted)
mysqld 348 mysql 13u REG 8,34 2174 0 136916 /tmp/#136916 (deleted)
landscape 1118 root 6w REG 8,34 361 0 304942 /var/log/landscape/watchdog.log.1 (deleted)
landscape 1120 landscape 6w REG 8,34 2207957 0 305187 /var/log/landscape/broker.log.1 (deleted)
landscape 1121 landscape 6w REG 8,34 1690223 0 305186 /var/log/landscape/monitor.log.1 (deleted)
landscape 1122 root 6w REG 8,34 536214 0 305185 /var/log/landscape/manager.log.1 (deleted)

Two /var/log/mysql/error.log.1 files 104091260620 bytes each.

After restarting mysql service, the files got released and disk utilization went down.

ubuntu@juju-c3cb23-0-lxd-8:~$ sudo systemctl restart mysql.service

ubuntu@juju-c3cb23-0-lxd-8:~$ sudo lsof +L1
lsof: WARNING: can't stat() tracefs file system /sys/kernel/debug/tracing
      Output information may be incomplete.
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NLINK NODE NAME
unattende 332 root 3w REG 8,34 113 0 301747 /var/log/unattended-upgrades/unattended-upgrades-shutdown.log.1 (deleted)
landscape 1118 root 6w REG 8,34 361 0 304942 /var/log/landscape/watchdog.log.1 (deleted)
landscape 1120 landscape 6w REG 8,34 2207957 0 305187 /var/log/landscape/broker.log.1 (deleted)
landscape 1121 landscape 6w REG 8,34 1690223 0 305186 /var/log/landscape/monitor.log.1 (deleted)
landscape 1122 root 6w REG 8,34 536214 0 305185 /var/log/landscape/manager.log.1 (deleted)
mysqld 1773156 mysql 6u REG 8,34 0 0 134193 /tmp/#134193 (deleted)
mysqld 1773156 mysql 7u REG 8,34 0 0 134226 /tmp/#134226 (deleted)
mysqld 1773156 mysql 8u REG 8,34 0 0 136080 /tmp/#136080 (deleted)
mysqld 1773156 mysql 13u REG 8,34 0 0 136916 /tmp/#136916 (deleted)

ubuntu@juju-c3cb23-0-lxd-8:~$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sdc2 229G 108G 109G 50% /
none 492K 4.0K 488K 1% /dev
tmpfs 100K 0 100K 0% /dev/lxd
tmpfs 100K 0 100K 0% /dev/.lxd-mounts
tmpfs 189G 0 189G 0% /dev/shm
tmpfs 76G 328K 76G 1% /run
tmpfs 5.0M 0 5.0M 0% /run/lock
snapfuse 56M 56M 0 100% /snap/core18/2812
snapfuse 75M 75M 0 100% /snap/core22/1033
snapfuse 64M 64M 0 100% /snap/core20/1891
snapfuse 112M 112M 0 100% /snap/lxd/24322
snapfuse 94M 94M 0 100% /snap/mysql-shell/29
snapfuse 6.2M 6.2M 0 100% /snap/prometheus-grok-exporter/1
snapfuse 41M 41M 0 100% /snap/snapd/20290
snapfuse 64M 64M 0 100% /snap/core20/2105
snapfuse 41M 41M 0 100% /snap/snapd/20671
snapfuse 115M 115M 0 100% /snap/lxd/26741
snapfuse 270M 270M 0 100% /snap/grafana-agent/16
tmpfs 38G 4.0K 38G 1% /run/user/1000

Ubuntu 22.04.3 LTS
Openstack Yoga
Juju 3.1.6
LXD 5.0.2
MAAS 3.2.10

How to reproduce: monitor disk utilization and the output of "sudo lsof +L1"

Noticed this error in the journalctl output: "Feb 01 20:17:01 juju-c3cb23-0-lxd-2 sudo[356539]: root : command not allowed ; PWD=/ ; USER=root ; COMMAND=/usr/bin/juju-run logrotated/13 '/var/lib/juju/agents/unit-logrotated-13/.venv/bin/python3 /var/lib/juju/agents/unit-logrotated-13/charm/lib/lib_cron.py'"

The error was repeated many times.

Revision history for this message
Anna Savchenko (annsavchenko) wrote :

Found this in journalctl output:

Feb 02 00:00:09 juju-c3cb23-7-lxd-8 logrotate[1754765]: error: error running shared postrotate script for '/var/log/mysql.log /var/log/mysql/*log '
Feb 02 00:00:09 juju-c3cb23-7-lxd-8 systemd[1]: logrotate.service: Main process exited, code=exited, status=1/FAILURE
Feb 02 00:00:09 juju-c3cb23-7-lxd-8 systemd[1]: logrotate.service: Failed with result 'exit-code'.
Feb 02 00:00:09 juju-c3cb23-7-lxd-8 systemd[1]: Failed to start Rotate log files.

When I try to execute mysql-server job (/etc/logrotate.d/mysql-server), I get this error:

root@juju-c3cb23-7-lxd-8:/etc/logrotate.d# MYADMIN="/usr/bin/mysqladmin --defaults-file=/etc/mysql/debian.cnf"

root@juju-c3cb23-7-lxd-8:/etc/logrotate.d# $MYADMIN ping
mysqladmin: connect to server at 'localhost' failed

Revision history for this message
Anna Savchenko (annsavchenko) wrote :

root@juju-c3cb23-7-lxd-8:/etc/logrotate.d# $MYADMIN ping
mysqladmin: connect to server at 'localhost' failed
error: 'Access denied for user 'debian-sys-maint'@'localhost' (using password: YES)'

Revision history for this message
Anna Savchenko (annsavchenko) wrote :

subscribed ~field-high

Revision history for this message
Anna Savchenko (annsavchenko) wrote :

Subscribed Field High as this might affect the customer at any point.

Revision history for this message
Alex Kavanagh (ajkavanagh) wrote :

Anna, can you have a look and see if the 'debian-sys-maint' user exists in the database (mysql table). Also you could check if the debian-sys-maint user can login with the password in /etc/mysql/debian.cnf

The mysql-server-9.0 package's postinst should create the user in the DB and the file in /etc/mysql/debian.cnf. If the user is missing, then perhaps it never got created (packaging bug), or the user has been removed at some point in the installation (charm bug? -- but I can't find any evidence), or some locking down/security process that's been done to the mysql instances?

Changed in charm-mysql-innodb-cluster:
status: New → Incomplete
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for MySQL InnoDB Cluster Charm because there has been no activity for 60 days.]

Changed in charm-mysql-innodb-cluster:
status: Incomplete → Expired
Revision history for this message
Anna Savchenko (annsavchenko) wrote :

Hey Alex, apologies for the late response!

I tried to login to mysql with the creds in /etc/mysql/debian.cnf and got "access denied"

$ sudo mysql -u debian-sys-maint -p
Enter password:
ERROR 1045 (28000): Access denied for user 'debian-sys-maint'@'localhost' (using password: YES)

I tried to login with the mysql user in order to check the database but got "access denied" as well.

Changed in charm-mysql-innodb-cluster:
status: Expired → Incomplete
Revision history for this message
Victor Vavan (victorvavan) wrote (last edit ):

Hi all,

I'm dealing with a case that is suffering from this bug.

$ sudo lsof +L1
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NLINK NODE NAME
mysqld 328 mysql 1w REG 8,3 99624508241 0 23331627 /var/log/mysql/error.log.1 (deleted)
mysqld 328 mysql 2w REG 8,3 99624508241 0 23331627 /var/log/mysql/error.log.1 (deleted)
mysqld 328 mysql 6u REG 8,3 0 0 21794716 /tmp/#21794716 (deleted)
mysqld 328 mysql 7u REG 8,3 142 0 21794718 /tmp/#21794718 (deleted)
mysqld 328 mysql 8u REG 8,3 0 0 21794720 /tmp/#21794720 (deleted)
mysqld 328 mysql 13u REG 8,3 2175 0 21794755 /tmp/#21794755 (deleted)
filebeat 572 root 18r REG 8,3 99624508701 0 23331627 /var/log/mysql/error.log.1 (deleted)
landscape 575 root 6w REG 8,3 353 0 21794792 /var/log/landscape/watchdog.log.1 (deleted)
landscape 842 landscape 6w REG 8,3 2166444 0 21784887 /var/log/landscape/broker.log.1 (deleted)
landscape 843 landscape 6w REG 8,3 2737262 0 21794803 /var/log/landscape/monitor.log.1 (deleted)
landscape 844 root 6w REG 8,3 600319 0 21794805 /var/log/landscape/manager.log.1 (deleted)

# /usr/bin/mysqladmin --defaults-file=/etc/mysql/debian.cnf ping
mysqladmin: connect to server at 'localhost' failed
error: 'Access denied for user 'debian-sys-maint'@'localhost' (using password: YES)'

Workaround: restart MySQL application manually.

The username/password combination in /etc/mysql/debian.cnf for user debian-sys-maint do NOT work.

We logged into MySQL:

juju exec --unit mysql/leader leader-get mysql.passwd
juju ssh mysql/leader
sudo mysql -u root -p"<password>"

The user is present in the DB:

+-----------+------------------+
| Host | User |
+-----------+------------------+
| localhost | debian-sys-maint |
+-----------+------------------+

Ubuntu Focal
Openstack Yoga
Juju 2.9.43
mysql-innodb-cluster 8.0/stable

I changed the password for user debian-sys-maint to reflect what is in /etc/mysql/debian.cnf

ALTER user 'debian-sys-maint'@'localhost' IDENTIFIED by 'LMfSBaA9l2z4EdjD1';

I could then log in as the user and the section that was failing in /etc/logrotate.d/mysql-server succeeds:

sudo /usr/bin/mysqladmin --defaults-file=/etc/mysql/debian.cnf ping
mysqld is alive

The changes persist through reboot of the MySQL units.

I'm not sure what triggered the password change in debian.cnf (or MySQL?) however.

Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for MySQL InnoDB Cluster Charm because there has been no activity for 60 days.]

Changed in charm-mysql-innodb-cluster:
status: Incomplete → Expired
Changed in charm-mysql-innodb-cluster:
status: Expired → Triaged
importance: Undecided → High
Revision history for this message
Victor Vavan (victorvavan) wrote :

An update to my previous comment,

All units in the cluster will have different passwords set in /etc/mysql/debian.cnf, and only one of the units will have the correct password set for the debian-sys-maint user account in MySQL.

This issue only occurs when the unit that has the matching password is NOT the same as the primary R/W unit.

This behavior has been observed across multiple clouds.

Revision history for this message
Alex Kavanagh (ajkavanagh) wrote :

I suspect what may be going on here is that the mysql.user db is being replicated in the cluster, and thus the main DB is being synced to the followers which then overwrites the mysql.user "debian-sys-maint" password. This is because the cluster is setup with group replication, and MySQL 8.0 replicates MyIASM tables which includes the mysql db.

I think that comments #8 and #10 confirm this behaviour.

In order to fix this, I imagine that a solution would be for the charm to record the leader's debian-sys-maint password in the leader/cluster storage, this would be replicated between the charms, and then for the charm to write the debian-sys-maint password into /etc/mysql/debian.cnf and simultaneously ensure that the password is set that way. This would then mean that every unit would have the same debian-sys-maint password configured.

I don't think this raises an additional security hole, as the root password is already in the relation data.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.