Comment 27 for bug 1833322

Christian Ehrhardt  (paelzer) wrote : Re: Consider removing irqbalance from default install on desktop images

Hi Etanay,

I realize I may have written too much :-/
So I'll start with a TL;DR:
AFAICS you are right in all you say, but I think there cannot be "one right answer" anyway. Hence I'm trying to leave all parties the freedom to define what is important to them, and to learn from them what impact irqbalance has on that.

> Yes I was not arguing strictly against irqbalance, just trying
> to ascertain some discussion parameters as well as parameters for data
> collection.

Yeah, I see that, and I didn't intend to rebut your statements either.
I just wanted to put them into the potential context and POV of others.

> I have not yet seen a coherent philosophy on what it means to "optimize
> performance" with default settings that serve the greatest capacity of
> server or desktop scenarios.

That is true, but the reason is that you can only optimize for
something specific, like a workload or particular HW.

The defaults usually try to be not too crappy for anything that
might happen on, e.g., Ubuntu, which is quite a wide scope.

> In my humble opinion, data collection is useless without this
> framework of understanding what it is we are trying to achieve
> and why in terms of system performance. To me this is the deeper
> unresolved issue, perhaps.

I can see your point and would not even argue against it. But
(this is opinion and a bit of experience, not scientifically proven
truth) this is only a problem if we try to answer the single, global
and always valid "is irqbalance good or bad" question.

Thinking about it, I think I'm even of the same opinion as you,
but instead of standardizing exactly what we are trying to achieve
(which to me feels like selecting a workload or HW as optimization
target) I was trying to reach out to as many groups as possible
so we can see what HW/workloads are important to them and how
irqbalance might help or interfere with that.

A bit like the old case where some clouds reported that it
conflicts with virtio-net on their substrate and should be disabled
by default there (see Debian and also some Ubuntu cloud images).

I personally have no hope of reaching a general "this is good / bad"
verdict without considering it per workload or HW environment.

Hence my hope is that if we first collect this variety of preferences
from the different parties, and only then look at the impact of
irqbalance on them, we can make compartmentalized decisions.
For example, as some suggested, no longer making it the default on
Desktop, but keeping it in other cases.

And this is just me trying to be helpful and drive this from being
a dormant case to something useful. I do not pretend to have the
master plan or the solution yet :-)

> I fear that systems are currently optimized by default for throughput. For
> users, responsiveness (which can include but is not limited to throughput)
> and latency may be more important psychologically

Can I just say yes here? You go to great lengths explaining (thanks),
but I already agree :-)

Yet - as true as that is - it holds for a set of workloads and hardware,
but not for everything Ubuntu can be (as I outlined above, neither
decision could be right for all).

> And power saving is important in global terms, as even small gains
> multiplied over hundreds or thousands of deployments can have a
> significant impact

True as well, yet - again - servers are often split up by some virt
solution so that they pay off their price by running at high utilization.
There, to reach density, people are often OK with forfeiting some latency
for overall throughput and thereby density, which saves power by
having x% fewer systems active at all.

P.S. I'm now waiting for further input from all of you who have found the thread so far, and hopefully also
from some of the teams, hardware manufacturers and clouds that I have contacted and asked to think about this question.

P.P.S. I'm drifting off, seeing a big déjà vu of my decade of
Linux on mainframe performance work - of density and performance and
interfering workloads that invalidated all you knew when looking
at just one ... and you know what the answer always was and still is:
"it depends", as any performance engineer will love to tell you :-)