Comment 0 for bug 2045503

Andrea Righi (arighi) wrote : apply sched-ext patch set to linux-unstable

[Impact]

sched-ext is a new scheduling class introduced in the Linux kernel that provides a mechanism to implement scheduling policies as eBPF programs (https://lwn.net/Articles/922405/). Such programs can also be connected to user-space counterparts to defer scheduling decisions to regular user-space processes.

The idea of "pluggable" schedulers is not new: it was initially proposed in 2004 (https://lwn.net/Articles/109458/), but at that time it was strongly rejected in favor of creating a single generic scheduler (one to rule them all), which ended up being the “completely fair scheduler” (CFS).

However, with BPF and the sched-ext scheduling class, we now have the possibility to easily and quickly implement and test scheduling policies, making the “pluggable” approach an effective tool for easy experimentation.

The ability to implement custom scheduling policies via BPF greatly lowers the difficulty of testing new scheduling ideas (it is much easier than changing CFS or replacing it with a different scheduler). With this feature, researchers and developers can test their own scheduler safely, without even needing to reboot the system.

Shipping this feature in the Ubuntu kernel can provide a significant benefit to researchers and companies that want to experiment with (or ship) their own scheduling policy, implemented as an eBPF/user-space program.

[Test case]

Basic test cases for this feature are provided by the sched-ext patch set. Tests are available in tools/sched_ext.

[Fix]

Apply this patch set as SAUCE:
https://<email address hidden>/T/

Soon there'll be a branch against any kernel that we need here (we will only need 6.7 for now):
https://github.com/sched-ext/sched_ext

[Regression potential]

This feature is not going to be merged upstream in the near future: some upstream maintainers are worried that allowing a custom scheduler to be injected into the kernel can introduce performance regressions that are hard to track down.

For this reason we should apply this feature only to linux-unstable for now, making sure that the patch is unapplied or reverted when linux-unstable becomes linux.

In the meantime we can also figure out a reasonable way to detect when a custom scheduler is in use (e.g., by tainting the kernel?), so that any potential performance regression introduced by a custom sched-ext scheduler can be easily identified.
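
Separately from any kernel-side taint flag, a user-space check could look roughly like the sketch below. It assumes the kernel exposes the sched-ext state as a text file containing "enabled" while a BPF scheduler is loaded. The default path is taken from later upstream sched-ext documentation and may not exist with the patch set applied here, so it is an assumption and the path is left as a parameter. The function names are hypothetical.

```python
from pathlib import Path

# Assumption: location of the sched-ext state file. This path comes from later
# upstream documentation and may differ (or be absent) with this patch set.
DEFAULT_STATE_PATH = Path("/sys/kernel/sched_ext/state")


def sched_ext_state(state_path=DEFAULT_STATE_PATH):
    """Return the stripped contents of the state file, or None if unreadable."""
    try:
        return Path(state_path).read_text().strip()
    except OSError:
        # File missing or not readable: sched-ext is likely not available.
        return None


def custom_scheduler_active(state_path=DEFAULT_STATE_PATH):
    """Return True only if the state file reports "enabled" (assumed format)."""
    return sched_ext_state(state_path) == "enabled"


if __name__ == "__main__":
    print("custom sched-ext scheduler active:", custom_scheduler_active())
```

A check like this could be run when collecting data for a performance bug report, to record whether a custom scheduler was loaded at the time.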

From a maintenance perspective, having this patch set applied may also be problematic (potential conflicts) when we apply new stable updates. However, the upstream maintainers of sched-ext have expressed interest in helping us maintain the patch set against the target kernel(s) that we need. Targeting linux-unstable only should also significantly mitigate the maintenance problem (since we won't have the urgency to apply critical security fixes to linux-unstable).