[Feature] Break up long walk of wait queue during wake up
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
intel | Fix Released | Undecided | Unassigned |
linux (Ubuntu) | Fix Released | Undecided | Unassigned |
Bionic | Fix Released | Undecided | Unassigned |
Bug Description
Description
During Fujitsu's acceptance testing of 8-socket SKX systems, the NMI watchdog timer was triggered in the wake_up_page_bit function. The cause is that the wait queue lock is held for a long time while the wait queue is traversed and the waiters are woken up. (See LCK-4265.)
We created a patch series to break up the long traversal of the wait queue in wake_up_page_bit. After a threshold number of wake-ups is reached, we bookmark the current position in the wait queue and release the wait queue lock, giving other tasks blocked on the lock a chance to run (e.g. an asynchronously woken task removing itself from the wait queue in finish_wait). We then retake the lock and resume from the bookmarked location. This fixes the long spinlock hold time that triggered the watchdog timer.
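The bookmark scheme described above can be sketched in user space as follows. This is a minimal illustration and not the kernel patch itself: the names (`struct node`, `wake_batch`, `wake_all`), the `WAKE_BATCH` value of 64, and the counter that stands in for the spinlock are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define WAKE_BATCH 64            /* hypothetical wake-up threshold per lock hold */

struct node {
    struct node *prev, *next;
    bool is_bookmark;            /* bookmark entries are skipped by walkers */
    bool woken;
};

struct wait_queue {
    struct node head;            /* sentinel of a circular doubly linked list */
    bool locked;                 /* stand-in for the wait queue spinlock */
    int lock_acquisitions;       /* how many times the "lock" was taken */
};

static void wq_init(struct wait_queue *q)
{
    q->head.prev = q->head.next = &q->head;
    q->head.is_bookmark = false;
    q->head.woken = false;
    q->locked = false;
    q->lock_acquisitions = 0;
}

static void wq_lock(struct wait_queue *q)
{
    assert(!q->locked);
    q->locked = true;
    q->lock_acquisitions++;
}

static void wq_unlock(struct wait_queue *q)
{
    assert(q->locked);
    q->locked = false;
}

static void list_insert_before(struct node *n, struct node *pos)
{
    n->prev = pos->prev;
    n->next = pos;
    pos->prev->next = n;
    pos->prev = n;
}

static void list_remove(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = NULL;    /* NULL links mean "not in the list" */
}

static void wq_add_waiter(struct wait_queue *q, struct node *w)
{
    w->is_bookmark = false;
    w->woken = false;
    list_insert_before(w, &q->head);   /* append at the tail */
}

/* Called with the lock held. Wakes at most WAKE_BATCH waiters, then parks
 * the bookmark in front of the next waiter. Returns true if work remains. */
static bool wake_batch(struct wait_queue *q, struct node *bookmark)
{
    struct node *cur = q->head.next;
    int woken = 0;

    if (bookmark->next != NULL) {      /* resume from the saved position */
        cur = bookmark->next;
        list_remove(bookmark);
    }
    for (; cur != &q->head; cur = cur->next) {
        if (cur->is_bookmark)
            continue;                  /* another walker's bookmark */
        if (woken == WAKE_BATCH) {
            list_insert_before(bookmark, cur);
            return true;
        }
        cur->woken = true;
        woken++;
    }
    return false;
}

/* Wake every waiter, dropping the lock between batches. */
static void wake_all(struct wait_queue *q)
{
    struct node bookmark = { NULL, NULL, true, false };
    bool more;

    do {
        wq_lock(q);
        more = wake_batch(q, &bookmark);
        wq_unlock(q);                  /* other lock waiters can run here */
    } while (more);
}
```

With 200 queued waiters, `wake_all` in this sketch takes the lock four times (three full batches of 64 and one of 8) instead of holding it once for the whole walk. The bookmark sits in the list itself, so the resume point stays valid even if other entries are added or removed while the lock is dropped.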
Target Kernel: 4.14
Target Release: 18.04
Merged in kernel v4.14-rc1. 00a8dec52bfbb84 75e89b6745 8ce93e6f74a12fd 7fe430a004
These are commit ids:
11a19c7b099f96d
2554db916586b22