Comment 37 for bug 1480349

Henrique de Moraes Holschuh (hmh) wrote : Re: [Bug 1480349] Re: Intel Microcode Breaks frequency scaling in Xeon® E5-2687W v3 & E5-1650 v3

On Thu, Jan 21, 2016, at 03:18, Doug Smythies wrote:
> @Philipp: The way I read them, your comment #35 and your comment #27
> contradict each other.
>
> @Xiong: The way I read all of this stuff, it has not been proven that
> the issue is in the microcode itself. However, the work done by Sharar
> in earlier postings, in my opinion, narrows it down to either the
> microcode itself or some loading issue when it is updated during boot.

To me, so far it looks like the issue is caused by a three-way
interaction: microcode - kernel - platform/BIOS.

There are several users of Xeon E5v3 with up-to-date microcode (0x36)
that do not observe the issue. Since not every box with the same
processor, microcode and kernel (but not the same *mainboard*) suffers
from the issue, some platform component must be required to trigger it.
Maybe the firmware is not doing everything it should, or the
Linux driver is not doing everything it should: the microcode update
might not be the root cause.
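
When comparing boxes, it helps to record the microcode revision that is
actually running on each one. A minimal sketch, assuming an x86 Linux
kernel that exposes a "microcode" field in /proc/cpuinfo; the function
names are hypothetical and only for illustration:

```python
def microcode_revisions(cpuinfo_text):
    """Map logical CPU number -> microcode revision (as an int).

    Expects the text of /proc/cpuinfo as exposed by x86 Linux kernels,
    where each CPU stanza has "processor : N" and "microcode : 0x..".
    """
    revisions = {}
    cpu = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator line between CPU stanzas
        key = key.strip()
        value = value.strip()
        if key == "processor":
            cpu = int(value)
        elif key == "microcode" and cpu is not None:
            revisions[cpu] = int(value, 16)  # accepts the "0x" prefix
    return revisions


def distinct_revisions(cpuinfo_text):
    """Return the sorted distinct revisions as hex strings."""
    revs = microcode_revisions(cpuinfo_text)
    return [hex(r) for r in sorted(set(revs.values()))]


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the cpuinfo text; the path is the usual Linux location."""
    with open(path) as f:
        return f.read()
```

Calling distinct_revisions(read_cpuinfo()) on a box running the
up-to-date microcode mentioned above should list 0x36; more than one
entry would mean the cores are not all on the same revision.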

> I had an idea to use "iucode_tool" to further isolate the issue, by
> going back and forth between microcodes, but now I see that one can not
> downgrade the microcode on the fly, it only works for upgrading. So now

That's untested, unspecified, unsupported territory. Avoid it unless
someone at Intel who knows better tells you to do it: just because a
microcode downgrade looks like it worked fine doesn't mean it did.
The issue in this bug report might well be related to insufficient
new-state sanitization after a microcode update, in which case
downgrading the microcode might invalidate the testing entirely.

It is much better to start from the microcode that works (the one in
the BIOS), and test whether you can still reproduce the issue with
normal microcode updates (i.e. no downgrading).
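
To keep test notes honest, each boot can be labelled by how its running
revision relates to the BIOS baseline. A rough sketch only: the function
name and the labels are made up for illustration, and both revisions are
assumed to be plain non-negative integers, recorded by hand (the
baseline from a boot with OS microcode updates disabled):

```python
def classify_run(bios_revision, running_revision):
    """Label a test boot relative to the BIOS-provided microcode.

    "upgrade"   -> normal update path, a usable data point
    "baseline"  -> no update was applied on top of the BIOS microcode
    "downgrade" -> unsupported territory, discard the result
    """
    if running_revision > bios_revision:
        return "upgrade"
    if running_revision == bios_revision:
        return "baseline"
    return "downgrade"


def usable_for_testing(bios_revision, running_revision):
    """Only baseline and upgrade boots count as valid test runs."""
    return classify_run(bios_revision, running_revision) != "downgrade"
```

With the baseline recorded as "bios", a boot that ends up on 0x36 is a
valid data point only if usable_for_testing(bios, 0x36) is true.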

> I am thinking maybe it would be possible to make the same microcode with
> a newer version number to "trick" the system into loading it. I.E. if
> upgrading to the fake "newer" microcode during boot also caused issues,
> then the root issue would be the load procedure.

Yes, it is possible. That's the main component of an Intel microcode
downgrade attack, which works pretty much everywhere (and not just in
Linux).

--
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique de Moraes Holschuh <email address hidden>