Comment 9 for bug 1653764

Michael Shield (mike.shield) wrote :

Update

It appears possible that, despite running on a 4-SSD RAID 10 array, the DB fell victim to a vm.dirty_ratio setting that is too high (the CentOS 7 default) for a machine with large memory (128 GB). Chasing the following errors through Red Hat suggests that this is a possible cause of the underlying 120-second timeout.

Nov 21 02:01:01 systemd[1]: Stopping User Slice of root.
Nov 21 02:05:58 kernel: INFO: task mysqld:84452 blocked for more than 120 seconds.
Nov 21 02:05:58 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 21 02:05:58 kernel: mysqld D ffff88103aeaa080 0 84452 1 0x00000080
Nov 21 02:05:58 kernel: ffff880d8e6279a0 0000000000000082 ffff880dc4fd8fd0 ffff880d8e627fd8
Nov 21 02:05:58 kernel: ffff880d8e627fd8 ffff880d8e627fd8 ffff880dc4fd8fd0 ffff88103e9d6cc0
Nov 21 02:05:58 kernel: 0000000000000000 7fffffffffffffff 0000000000000000 ffff88103aeaa080
Nov 21 02:05:58 kernel: Call Trace:
Nov 21 02:05:58 kernel: [<ffffffff816a9589>] schedule+0x29/0x70
Nov 21 02:05:58 kernel: [<ffffffff816a7099>] schedule_timeout+0x239/0x2c0
Nov 21 02:05:58 kernel: [<ffffffffc0002df7>] ? dm_make_request+0x127/0x190 [dm_mod]
Nov 21 02:05:58 kernel: [<ffffffff812f8c05>] ? generic_make_request+0x105/0x310
Nov 21 02:05:58 kernel: [<ffffffff810e93ac>] ? ktime_get_ts64+0x4c/0xf0
Nov 21 02:05:58 kernel: [<ffffffff816a8c0d>] io_schedule_timeout+0xad/0x130
Nov 21 02:05:58 kernel: [<ffffffff816a8ca8>] io_schedule+0x18/0x20
Nov 21 02:05:58 kernel: [<ffffffff8124227d>] do_blockdev_direct_IO+0x1bdd/0x2050
Nov 21 02:05:58 kernel: [<ffffffffc02ffa00>] ? xfs_find_bdev_for_inode+0x20/0x20 [xfs]
Nov 21 02:05:58 kernel: [<ffffffff81242745>] __blockdev_direct_IO+0x55/0x60
Nov 21 02:05:58 kernel: [<ffffffffc02ffa00>] ? xfs_find_bdev_for_inode+0x20/0x20 [xfs]
Nov 21 02:05:58 kernel: [<ffffffffc02ffa40>] ? xfs_get_blocks_dax_fault+0x20/0x20 [xfs]
Nov 21 02:05:58 kernel: [<ffffffffc030b758>] xfs_file_dio_aio_write+0x188/0x390 [xfs]
Nov 21 02:05:58 kernel: [<ffffffffc02ffa00>] ? xfs_find_bdev_for_inode+0x20/0x20 [xfs]
Nov 21 02:05:58 kernel: [<ffffffffc02ffa40>] ? xfs_get_blocks_dax_fault+0x20/0x20 [xfs]
Nov 21 02:05:58 kernel: [<ffffffffc030bd22>] xfs_file_aio_write+0x102/0x1b0 [xfs]
Nov 21 02:05:58 kernel: [<ffffffff812001ed>] do_sync_write+0x8d/0xd0
Nov 21 02:05:58 kernel: [<ffffffff81200cad>] vfs_write+0xbd/0x1e0
Nov 21 02:05:58 kernel: [<ffffffff81201c72>] SyS_pwrite64+0x92/0xc0
Nov 21 02:05:58 kernel: [<ffffffff816b5089>] system_call_fastpath+0x16/0x1b
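
To put the vm.dirty_ratio concern in numbers, here is a minimal sketch that estimates how much dirty page cache the current settings would allow before writers get throttled. It reads the standard /proc/sys/vm/dirty_ratio, /proc/sys/vm/dirty_background_ratio and /proc/meminfo files. The helper names (read_meminfo_kb, dirty_limit_bytes) are made up for illustration, and using MemTotal is an assumption that gives only an upper estimate: the kernel applies the ratio to dirtyable memory, which is smaller than total RAM. If vm.dirty_bytes is set instead, dirty_ratio reads as 0 and this estimate does not apply.

```python
from pathlib import Path


def read_meminfo_kb(field, meminfo_text):
    """Return the value (in kB) of one field from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        if name.strip() == field:
            return int(rest.split()[0])
    raise KeyError(field)


def dirty_limit_bytes(mem_kb, ratio_percent):
    """Rough ceiling on dirty pages: ratio_percent of mem_kb, in bytes.

    This is an approximation; the kernel uses dirtyable memory, not MemTotal.
    """
    return mem_kb * 1024 * ratio_percent // 100


def main():
    vm = Path("/proc/sys/vm")
    meminfo = Path("/proc/meminfo")
    if not (vm / "dirty_ratio").exists() or not meminfo.exists():
        print("Linux /proc tunables not found; nothing to report.")
        return

    mem_kb = read_meminfo_kb("MemTotal", meminfo.read_text())
    for name in ("dirty_background_ratio", "dirty_ratio"):
        ratio = int((vm / name).read_text().strip())
        limit_gib = dirty_limit_bytes(mem_kb, ratio) / 1024 ** 3
        print(f"vm.{name} = {ratio}% -> up to ~{limit_gib:.1f} GiB of dirty pages")


if __name__ == "__main__":
    main()
```

As an illustration only, dirty_limit_bytes(128 * 1024 * 1024, 20) works out to roughly 25.6 GiB. The 20 here is a placeholder ratio, not a value taken from the affected host. A backlog of that size is what would then have to be flushed to the array in one burst.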