------- Comment From <email address hidden> 2017-08-11 07:44 EDT-------
Testing shows that this commit appears to fix the problem. After 20 hours, no evidence of stalled I/Os.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5be6b75610cefd1e21b98a218211922c2feb6e08
This fixes a problem introduced by cfq having two modes of operation that each use a different timebase, without separate settings for the scheduling delay (the time limit before an I/O submit is forced). What appears to be the default mode, "iops", ended up using a delay that allowed I/Os to be postponed for up to 200000000 jiffies (which is hundreds of hours).
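As a rough sanity check on the "hundreds of hours" figure, the conversion from jiffies to wall-clock time can be sketched as below. The HZ values (100, 250, 1000) are assumptions, chosen because they are common kernel tick-rate settings; the HZ of the affected system is not stated in this comment, and the helper name is made up for illustration.

```python
# Figure quoted in the comment above.
DELAY_JIFFIES = 200_000_000


def jiffies_to_hours(jiffies, hz):
    """Convert a jiffies count to hours, given a tick rate in ticks per second."""
    seconds = jiffies / hz
    return seconds / 3600


# Assumed tick rates; the real value depends on the kernel configuration.
for hz in (100, 250, 1000):
    hours = jiffies_to_hours(DELAY_JIFFIES, hz)
    print(f"HZ={hz}: {hours:.1f} hours")
```

Under these assumptions the delay comes to roughly 556 hours at HZ=100 and 222 hours at HZ=250, which matches "hundreds of hours"; at HZ=1000 it would be closer to 56 hours, still far longer than any reasonable scheduling delay.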