Comment 0 for bug 1970737

StorPool Storage (storpool) wrote :

---Problem Description---
In a virtual machine, during MySQL performance tests with sysbench, IO operations freeze, and the virtual disk does not respond. The data of MySQL is on a virtual drive, backed by a host's local NVMe, attached to VM as a raw virtio-block device. The test runs smoothly for a few minutes. After a while, the IO operations freeze, and any attempt to read or write to the virtual drive remains to wait. Also, after the problem occurs, every read operation to the affected drive (e.g. ls, cat, etc.) stays waiting forever.
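A quick way to confirm the hang inside the guest (not part of the original test procedure, just an illustration): processes touching the frozen drive end up in uninterruptible sleep, and the kernel eventually logs hung-task warnings.

  # processes stuck in uninterruptible sleep (D state)
  ps -eo pid,stat,wchan:32,cmd | awk '$2 ~ /D/'
  # kernel hung-task warnings (logged if kernel.hung_task_timeout_secs > 0)
  dmesg | grep -i 'blocked for more than'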

---Host Hardware---
CPU: AMD EPYC 7302P 16-Core Processor (32 threads)
RAM: 128 GB
OS Drive: Toshiba KXG60ZNV256G M.2 NVMe PCI-E SSD (256 GB)
Data Drive: Samsung PM983 MZQLB960HAJR-00007 U.2 (960 GB)

---Host Software---
OS: Ubuntu 22.04 LTS
Kernel: 5.15.0-27-generic
Qemu: 1:6.2+dfsg-2ubuntu6
Libvirt: 8.0.0-1ubuntu7

---VM Hardware---
vCPU: <vcpu placement='static'>8</vcpu>
CPU Mode: <cpu mode='host-passthrough' check='none' migratable='on'/>
RAM: 64 GB
OS Type: <type arch='x86_64' machine='pc-q35-6.2'>hvm</type>
OS Drive (64 GB):
  <disk type='file' device='disk'>
    <driver name='qemu' type='qcow2' cache='none' io='native' discard='unmap'/>
    <target dev='vda' bus='virtio'/>
Block Data Drive:
  <disk type="block" device="disk">
    <driver name="qemu" type="raw" cache="none" io="native" discard="unmap"/>
    <target dev="vdb" bus="virtio"/>
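For context, a complete data-drive stanza would look roughly as follows; the source path is a placeholder (the real path is in the attached XML):

  <disk type="block" device="disk">
    <driver name="qemu" type="raw" cache="none" io="native" discard="unmap"/>
    <source dev="/dev/nvme1n1"/> <!-- placeholder device path -->
    <target dev="vdb" bus="virtio"/>
  </disk>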

---VM Software & Configuration---
OS: Ubuntu 22.04 LTS (minimized)
Kernel: 5.15.0-27-generic
Swap: disabled
OS Drive: /dev/vda2; file-system: ext4; mount-options: defaults; mount-point: /
Data Drive: /dev/vdb
MySQL: 8.0.28-0ubuntu4
Sysbench: 1.0.20+ds-2

---Prepare the VM---

1. Install Ubuntu 22.04 LTS (minimized) as the VM OS
2. Boot the VM & log in as root
3. apt-get install mysql-server mysql-common sysbench apparmor-utils
4. systemctl disable --now mysql.service
5. aa-complain /usr/sbin/mysqld
6. systemctl restart apparmor

---Reproduction---
1. Reboot the VM & log in as root
2. mkdir -p /data
3. mkfs.ext4 /dev/vdb
4. mount /dev/vdb /data
5. mkdir /data/mysql
6. mkdir /var/run/mysqld
7. /usr/sbin/mysqld --no-defaults --datadir=/data/mysql --lc-messages-dir=/usr/share/mysql/english --log-error --max_connections=256 --socket=/var/run/mysqld/mysqld.sock --table_open_cache=512 --tmpdir=/var/tmp --innodb_buffer_pool_size=1024M --innodb_data_file_path=ibdata1:32M:autoextend --innodb_data_home_dir=/data/mysql --innodb_doublewrite=0 --innodb_flush_log_at_trx_commit=1 --innodb_flush_method=O_DIRECT --innodb_lock_wait_timeout=50 --innodb_log_buffer_size=16M --innodb_log_file_size=256M --innodb_log_group_home_dir=/data/mysql --innodb_max_dirty_pages_pct=80 --innodb_thread_concurrency=0 --user=root --initialize-insecure
8. /usr/sbin/mysqld --no-defaults --datadir=/data/mysql --lc-messages-dir=/usr/share/mysql/english --log-error --max_connections=256 --socket=/var/run/mysqld/mysqld.sock --table_open_cache=512 --tmpdir=/var/tmp --innodb_buffer_pool_size=1024M --innodb_data_file_path=ibdata1:32M:autoextend --innodb_data_home_dir=/data/mysql --innodb_doublewrite=0 --innodb_flush_log_at_trx_commit=1 --innodb_flush_method=O_DIRECT --innodb_lock_wait_timeout=50 --innodb_log_buffer_size=16M --innodb_log_file_size=256M --innodb_log_group_home_dir=/data/mysql --innodb_max_dirty_pages_pct=80 --innodb_thread_concurrency=0 --user=root &
9. echo 'status' | mysql -uroot # verify that the MySQL server is up
10. echo 'drop database if exists test1m' | mysql -uroot
11. echo 'create database test1m' | mysql -uroot
12. /usr/share/sysbench/oltp_read_write.lua --threads=10 --table-size=20000000 --events=0 --time=900 --mysql-user=root --tables=10 --delete_inserts=10 --index_updates=10 --non_index_updates=10 --db-ps-mode=disable --report-interval=1 --db-driver=mysql --mysql-db=test1m --max-requests=0 --rand-seed=303 prepare
/usr/share/sysbench/oltp_read_write.lua --threads=6 --table-size=20000000 --events=0 --time=900 --mysql-user=root --tables=10 --delete_inserts=10 --index_updates=10 --non_index_updates=10 --db-ps-mode=disable --report-interval=1 --db-driver=mysql --mysql-db=test1m --max-requests=0 --rand-seed=303 run

---Resulting Output---
...
[ 620s ] thds: 6 tps: 327.00 qps: 18348.00 (r/w/o: 4578.00/13116.00/654.00) lat (ms,95%): 30.81 err/s: 0.00 reconn/s: 0.00
[ 621s ] thds: 6 tps: 320.00 qps: 17930.85 (r/w/o: 4479.96/12810.89/639.99) lat (ms,95%): 39.65 err/s: 0.00 reconn/s: 0.00
[ 622s ] thds: 6 tps: 317.00 qps: 17670.96 (r/w/o: 4432.99/12603.97/634.00) lat (ms,95%): 30.81 err/s: 0.00 reconn/s: 0.00
[ 623s ] thds: 6 tps: 299.83 qps: 16896.41 (r/w/o: 4202.61/12094.14/599.66) lat (ms,95%): 25.28 err/s: 0.00 reconn/s: 0.00
[ 624s ] thds: 6 tps: 0.00 qps: 6.00 (r/w/o: 0.00/6.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 625s ] thds: 6 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 626s ] thds: 6 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
[ 627s ] thds: 6 tps: 0.00 qps: 0.00 (r/w/o: 0.00/0.00/0.00) lat (ms,95%): 0.00 err/s: 0.00 reconn/s: 0.00
...

---Expected Behavior---
The run should not produce lines with "tps: 0.00 qps: 0.00", like the last four lines in the output above.

---Additional Notes---
1. The problem does not occur on every run; some test iterations complete successfully.
2. The same happens with larger numbers of sysbench threads (e.g. 8, 16, 24, 32).
3. The problem does not occur if the io policy of the data drive is changed from io="native" to io="io_uring" (at least for 7 hours of continuous testing); see the driver sketch after this list.
4. While IO operations in the VM are frozen, the NVMe device still responds to requests from the host (e.g. dd if=/dev/nvme1n1 of=/dev/null bs=512 count=1 iflag=direct).
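For reference, the io_uring workaround mentioned in note 3 is a single-attribute change in the data drive's <driver> element; everything else in the stanza stays the same:

  <disk type="block" device="disk">
    <driver name="qemu" type="raw" cache="none" io="io_uring" discard="unmap"/>
    <target dev="vdb" bus="virtio"/>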

Please find attached the libvirt XML configuration of the example VM.

Best regards,
Nikolay Tenev