Comment 7 for bug 1833322

Revision history for this message
Christian Ehrhardt  (paelzer) wrote : Re: Consider removing irqbalance from default install on desktop images

# Summary

This decision is seemingly easier to make the more dedicated you are to a
single use case, as you then have fewer "but what if" cases to consider.
Ubuntu's wide range of uses is great, but it sometimes delays decisions.

List of reasons to remove it from the default dependencies:
- Seems to cause issues more often on Desktop environments
- cpufreq, thermald and similar struggle to save energy
- Impacts due to unexpected throttling
- Conflicts with enabling/disabling threads/cores
- Problematic in virtual environments
- It is mostly an x86 thing but we pull it in everywhere
- It conflicts with manually fine-tuned IRQ affinity, e.g. in
  ultra low latency setups
- It is less useful on CPUs with large and wide shared caches
  as well as in virtual environments without fixed pinning

List of reasons to keep it in the set of default dependencies:
- Benefits seem mostly for large scale servers
- Lacking irqbalance can degrade performance in some
  large scale high traffic cases

I think from all I've found - old and new - it seems it still has its purpose
in some scenarios, but the HW/SW world evolved and it is nowadays less often
useful and more often harmful than it was in the past.
On the other hand there is almost no clear cut "it is bad and that is why",
most issues were individual issues and special cases, nothing that would
apply to everyone.

And irqbalance still has its purpose, so we should surely keep it around.

In a perfect world this would have half a year of time or more and two people
to run all kinds of workloads on all kinds of HW to compare. But let us be
honest, that will not happen, and it would also not be worth the effort.
We'll have to decide with what we have.
Whether the others that switched had more time to evaluate in depth, I do not
know. But usually, once a significant part of the ecosystem has changed and you
lack better data, it is better to follow as well; otherwise common hints and
optimizations will no longer apply because you are the one outlier in regard
to behavior.

To me this seems to be a perfect case for having only the few special
images/deployments known to match the workload profile that needs this enable it.
It is also more likely that a professional admin of such a large scale machine
(or cluster thereof) can make the opt-in decision and evaluation better than
expecting every user of Ubuntu to think about an opt-out.
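
As a starting point for such an evaluation, one could look at how interrupts
are spread across CPUs with and without irqbalance running. The following is
only a minimal sketch under assumptions: the names parse_interrupts and SAMPLE
are made up for illustration, and the sample text merely imitates the usual
/proc/interrupts layout (a header of CPU names, then one row per IRQ).

```python
# Hypothetical helper, not part of irqbalance or any Ubuntu tooling.
def parse_interrupts(text):
    """Sum interrupt counts per CPU from /proc/interrupts-style text."""
    lines = text.strip().splitlines()
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    totals = {cpu: 0 for cpu in cpus}
    for line in lines[1:]:
        fields = line.split()
        if not fields or not fields[0].endswith(":"):
            continue
        counts = fields[1:1 + len(cpus)]
        # Skip rows that do not carry one numeric column per CPU
        # (on multi-CPU systems this drops rows such as ERR or MIS).
        if len(counts) < len(cpus) or not all(c.isdigit() for c in counts):
            continue
        for cpu, count in zip(cpus, counts):
            totals[cpu] += int(count)
    return totals


# Made-up sample data in the usual layout, for illustration only.
SAMPLE = """\
           CPU0       CPU1
  0:         10          0   IO-APIC   2-edge      timer
 24:        500       1500   PCI-MSI 512000-edge   eth0
NMI:          3          4   Non-maskable interrupts
ERR:          0
"""

print(parse_interrupts(SAMPLE))
```

On a real system the text would come from reading /proc/interrupts instead of
SAMPLE. Comparing the totals over time for the same workload, once with and
once without irqbalance, gives a rough first impression but is no replacement
for proper benchmarking.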

---

Options IMHO:
A) Change it from an opt-out to an opt-in and remove the dependency
   from ubuntu-standard
B) Remove it from ubuntu-standard to get rid of it in Desktops and images
   used in virtual environments. But try to keep it in a place that is mostly
   used for bare metal, which tends to be closer to the kind of system that
   benefits more
C) Do nothing, keep it as is

D) Any of the above, but let us not touch Noble now that we are more than
halfway through the cycle; instead do it early in 24.10 to have enough
exposure before it is released in an LTS.

My gut feeling (and it can't be much more without much more time for much
deeper investigations) would be (A).