CVE-2020-4788: Speculation on incompletely validated data on IBM Power9

Bug #1899573 reported by Daniel Axtens
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
linux (Ubuntu) | Fix Released | Undecided | Unassigned |

Bug Description

Hi,

IBM Power9 processors can speculatively operate on data in the L1 cache before it has been completely validated, via a way-prediction mechanism. It is not possible for an attacker to determine the contents of impermissible memory using this method, since these systems implement a combination of hardware and software security measures to prevent scenarios where protected data could be leaked.

However, these measures don't address the scenario where an attacker induces the operating system to speculatively execute instructions using data that the attacker controls. This can be used, for example, to speculatively bypass "kernel user access prevention" techniques, as discovered by Anthony Steinhauser of Google's Safeside Project. This is not an attack by itself, but there is a possibility it could be used in conjunction with side-channels or other weaknesses in the privileged code to construct an attack.

This issue can be mitigated by flushing the L1 cache between privilege boundaries of concern.

CVEID: CVE-2020-4788

Current description/CVSS info (subject to revision)
---------------------------------------------------

Description: IBM Power9 processors could allow a local user to obtain sensitive information from the data in the L1 cache under extenuating circumstances.
CVSS Base Score: 2.9
CVSS Temporal Score: See https://exchange.xforce.ibmcloud.com/vulnerabilities/189296 for more information
CVSS Vector: (CVSS:3.0/AV:L/AC:H/PR:N/UI:N/S:U/C:L/I:N/A:N)

Embargo details
---------------

Please do not disclose any of this information prior to 20 November 2020, as that is the coordinated date for CVE details, AIX and IBM i fixes to be released. We will confirm the precise time of day shortly.

Fix details
-----------

In general, this issue is mitigated by flushing the L1D cache when entering the kernel and after any in-kernel userspace memory accesses.

Please find attached patches against Focal, Bionic and Xenial. Let me know if you want backports to Groovy done at this stage.

Daniel Axtens (daxtens) wrote :
description: updated
Daniel Axtens (daxtens) wrote :

We've detected an issue where these patches will cause the `rfi_flush` self-test to fail. This failure does *not* indicate a failure of the RFI flush (kernel exit flush) mechanism. The test measures the number of L1D cache misses with RFI flushing enabled and disabled, and expects the number of L1D misses to fall when RFI flush is disabled. However, the entry flush still flushes the L1D on every kernel entry, so the number of misses remains high and the test fails. If entry flushes are disabled before running the test, it will pass.

We are fixing this and will supply you with an additional patch for supported kernel trees that fixes the test.
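
To see why the entry flush breaks the check, here is a toy model in C of the test logic described above. It is only a sketch: the function names, the line and iteration counts, and the all-or-nothing miss model are assumptions made for illustration, not code from the selftest.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sizes: cache lines touched per iteration, and iterations. */
enum { LINES_PER_ITER = 512, ITERATIONS = 10 };

/* Toy model: if any L1D flush happens between two iterations, every line
 * misses again; with no flush at all, the lines stay cached. */
long model_l1d_misses(bool rfi_flush, bool entry_flush)
{
    bool flushed_each_iter = rfi_flush || entry_flush;
    return flushed_each_iter ? (long)LINES_PER_ITER * ITERATIONS : 0;
}

/* The check as described: misses must fall when RFI flush is disabled. */
bool rfi_flush_test_passes(bool entry_flush)
{
    long with_rfi = model_l1d_misses(true, entry_flush);
    long without_rfi = model_l1d_misses(false, entry_flush);
    return without_rfi < with_rfi;
}

/* Self-check: the test only passes once the entry flush is out of the way. */
void check_rfi_model(void)
{
    assert(rfi_flush_test_passes(false));  /* entry flush disabled: passes */
    assert(!rfi_flush_test_passes(true));  /* entry flush enabled: fails */
}
```

In this model the RFI flush itself behaves identically in both runs; the failure comes purely from the second flush masking the drop in misses.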

Thadeu Lima de Souza Cascardo (cascardo) wrote :

Hi, Daniel.

I assume the select patch is only a performance fix, and there is no specific vulnerability there. Is that right? We would rather postpone applying that performance fix following our public process.

And I see that the backports of the uaccess mitigation are a little more complicated on bionic and xenial, as they lack support for UAP. Since those backports are more controversial (they are not complete backports of the upstream patches), I would rather have them also follow our public process for review, unless they are strictly necessary for the mitigation.

So, my question is whether the uaccess mitigation is necessary if the kernel entry mitigation is in place. Could you describe an attack that would not be mitigated by the kernel entry L1D flush only?

Thank you very much.
Cascardo.

Daniel Axtens (daxtens) wrote :

Hi,

> I assume the select patch is only a performance fix, and there is no specific vulnerability there. Is that right? We would rather postpone applying that performance fix following our public process.

Yes, that's correct. It can go in later - the only downside will be a performance degradation in the poll/select family of syscalls until it goes in.

With regards to your other question, here's my best attempt at an answer.

An attack that would be mitigated by the entry flush would go something like this:

1) Identify a kernel load to target.
2) Using the way predictor, load a userspace address into the L1D cache that will 'collide' with the kernel load.
3) When execution reaches the targeted load, the userspace value will be used for speculative execution.
4) Do something useful with the speculative execution before the CPU realises that the speculated value isn't right.

If you block entry flushes and not user access flushes, you can still do something like this:

0) Identify a place the kernel accesses data from userspace.
1) Identify a kernel load - subsequent to the userspace access - that you want to target.
2) Organise your userspace address space such that when the kernel accesses the data, it pulls in a cache line containing a userspace address that will 'collide' with the kernel load.
3) When execution reaches the targeted load, the userspace value will be used for speculative execution.
4) Do something useful with the speculative execution before the CPU realises that the speculated value isn't right.

As you can see:

 - Step 4 is a bit ambiguous and open-ended: we aren't aware of any way to make use of this issue without something like a specialised kernel module. (This is how the PoC that Google shared with us worked.)

 - It's (at least theoretically) more complex to find a useful 'gadget' for speculation if you have to get your data in via user-access rather than being able to prime the cache prior to kernel entry.

I think it is fair to claim that the entry flush, while not a complete fix, is a credible mitigation for this issue. It's ultimately a risk-management decision as to how you would like to proceed here.
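
The two scenarios above can be modelled with a small C sketch. It is a deliberately crude abstraction: the struct and function names are made up for illustration, and the L1D is reduced to a single flag saying whether an attacker-chosen line that collides with the targeted load is currently cached. It says nothing about whether a usable gadget (step 4) exists.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mitigation switches, named after the flushes discussed above. */
struct mitigations {
    bool entry_flush;    /* flush L1D on kernel entry */
    bool uaccess_flush;  /* flush L1D after in-kernel user accesses */
};

/* Returns true if the targeted kernel load could speculate on attacker data.
 * prime_before_entry:  attacker loads the colliding line before the syscall.
 * kernel_touches_user: kernel reads attacker-arranged user memory before
 *                      reaching the targeted load. */
bool target_load_speculates(struct mitigations m,
                            bool prime_before_entry,
                            bool kernel_touches_user)
{
    bool attacker_line_in_l1d = prime_before_entry;

    /* Kernel entry: the entry flush discards anything primed in userspace. */
    if (m.entry_flush)
        attacker_line_in_l1d = false;

    /* The user access pulls the attacker-arranged line back into the L1D,
     * unless the uaccess flush discards it again afterwards. */
    if (kernel_touches_user)
        attacker_line_in_l1d = !m.uaccess_flush;

    /* Targeted kernel load. */
    return attacker_line_in_l1d;
}

/* Self-check of the cases discussed in the comment above. */
void check_cases(void)
{
    struct mitigations none = { false, false };
    struct mitigations entry_only = { true, false };
    struct mitigations both = { true, true };

    assert(target_load_speculates(none, true, false));        /* unmitigated */
    assert(!target_load_speculates(entry_only, true, false)); /* first attack blocked */
    assert(target_load_speculates(entry_only, true, true));   /* second attack remains */
    assert(!target_load_speculates(both, true, true));        /* both flushes */
}
```

With only the entry flush enabled, the model still reports exposure whenever the kernel touches user memory before the targeted load, which is the residual case the uaccess flush is meant to close.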

For your reference, we've also been looking at other contexts/users where even older kernels are still in support. In those kernels, there is even less of the modern user-access infrastructure and the user-access code is spread all over the kernel. Patching all those sites would involve what we believe to be an unacceptable risk of causing a regression, so we have proposed that they only take the entry-flush patch. Anyone who needs the higher level of security could then decide to move up to a more recent kernel.

I hope that helps - I'm sorry I can't be more definitive on anything! Please let me know if you have any further questions.

I will also post the fixes for the rfi_flush test shortly; they are just going through some internal review.

Kind regards,
Daniel

Thadeu Lima de Souza Cascardo (cascardo) wrote :

Hi, Daniel.

Thanks a lot for the explanations. They make sense, and it seems that we should mitigate the uaccess attack as well. As the changes apply only to arch/powerpc/, we at least have less chance of regressions on other architectures.

Can you provide backports to groovy/5.8 kernels too?

And do the fixes for the rfi_flush test apply only to the test code, or do they apply to the kernel code?

Thanks a lot.
Cascardo.

Daniel Axtens (daxtens) wrote :

Hi,

Yes, I will provide a backport to 5.8 shortly - I'll send you a new tarball with groovy and the fixed tests for all releases.

The fixes to the rfi_flush test are confined to tools/testing/selftests/powerpc/security.

Kind regards,
Daniel

Daniel Axtens (daxtens) wrote :

Hi,

Please find attached an updated set of patches.

The following changes have been made:

 - Add backports for groovy

 - Focal, Groovy: fix the failing rfi_flush test, and provide an entry_flush test that validates the entry flush. (The rfi_flush test is, as far as I can see, not included in any other source trees.)

 - all trees:
    * apply some cleanups identified by checkpatch.pl,
    * fix a bug where feature flags from the hypervisor or firmware would not be honoured,
    * if running bare-metal on a machine that is not a P9, do not apply the flushes as they are not required.

Kind regards,
Daniel

Daniel Axtens (daxtens) wrote :

Hi,

Just to update you on our embargo date: we expect to be posting to oss-security and linuxppc-dev at:

 - 20 November at 10am UTC+11 (AEDT), which is
 - 19 November at 11pm UTC+0, which is
 - 19 November at 5pm UTC-6 (CST)
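
As a quick sanity check on the three times above, the following C sketch converts each local time to minutes since 00:00 UTC on 1 November and confirms they name the same instant. The helper name and the choice of reference point are assumptions made for illustration only.

```c
#include <assert.h>

/* Minutes since 00:00 UTC on 1 November for a local wall-clock time.
 * utc_offset_hours is the zone's offset from UTC, e.g. +11 for AEDT. */
int utc_minutes(int day, int hour, int minute, int utc_offset_hours)
{
    assert(day >= 1 && hour >= 0 && hour < 24 && minute >= 0 && minute < 60);
    return ((day - 1) * 24 + hour - utc_offset_hours) * 60 + minute;
}

/* The three announced times should all map to the same UTC instant. */
void check_embargo_times(void)
{
    int aedt = utc_minutes(20, 10, 0, +11); /* 20 November, 10am UTC+11 */
    int utc  = utc_minutes(19, 23, 0, 0);   /* 19 November, 11pm UTC+0 */
    int cst  = utc_minutes(19, 17, 0, -6);  /* 19 November, 5pm UTC-6 */

    assert(aedt == utc);
    assert(utc == cst);
}
```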

Kind regards,
Daniel

Steve Beattie (sbeattie)
Changed in linux (Ubuntu):
status: New → Confirmed
Thadeu Lima de Souza Cascardo (cascardo) wrote :

Hi, Daniel.

Thanks a lot for the CRD information and the new set of patches.

I am attaching comments from Juerg Haefliger; they may provide opportunities for some improvements.

I also worked on a test for uaccess_flush, but as the backports mention a cleanup to unify the RFI and entry flush tests, I am pretty sure it would be beneficial to build on top of that. Would you be able to provide the original patches that will go into linuxppc-dev?

Thank you very much.
Cascardo.

Thadeu Lima de Souza Cascardo (cascardo) wrote :
Daniel Axtens (daxtens) wrote :

Hi Cascardo,

Please find attached the patches for upstream (based on linux-next) and all the upstream stable trees (stable/linux-*.y) we're targeting.

The mbox you attached contains gpg encrypted mail, and I don't think I'm able to decrypt it. I'm certainly happy to revise patches for upstream or (more likely at this stage) send a v2 or a fixup to the list incorporating the improvements. We have also sent the patches to some other distros based on their kernel trees, so coordinating improvements across them will be challenging -- we'll see what we can do.

Kind regards,
Daniel

Thadeu Lima de Souza Cascardo (cascardo) wrote :

Hi, Daniel.

Sorry about sending encrypted emails. I will go through some of the comments myself, see which ones apply to upstream fixes and send you the ones I find appropriate. They are mostly cosmetic. Some of the other comments are specific to the backports. We can sort them out locally if necessary.

In the meantime, I am attaching a test for the uaccess_flush knob. There is still a lot of shared code between the 3 tests, rfi_flush, entry_flush and uaccess_flush.

Regards.
Cascardo.

Thadeu Lima de Souza Cascardo (cascardo) wrote :

Hi, Daniel.

For the xenial backport of "powerpc: Add a framework for user access tracking", is there any reason barrier_nospec is called after allow_user_access in the following hunk? That is not the case in upstream or in the bionic backport.

Thanks.
Cascardo.

@@ -328,9 +333,14 @@ extern unsigned long __copy_tofrom_user(void __user *to,
 static inline unsigned long copy_from_user(void *to,
                const void __user *from, unsigned long n)
 {
+       unsigned long ret;
+
        if (likely(access_ok(VERIFY_READ, from, n))) {
+               allow_user_access(to, from, n);
                barrier_nospec();
-               return __copy_tofrom_user((__force void __user *)to, from, n);
+               ret = __copy_tofrom_user((__force void __user *)to, from, n);
+               prevent_user_access(to, from, n);
+               return ret;
        }
        memset(to, 0, n);
        return n;

Daniel Axtens (daxtens) wrote :

Hi Cascardo,

As I understand it:

 - the barrier_nospec() in copy_from_user() was added to the xenial tree in commit bafab44334fc ("powerpc: Use barrier_nospec in copy_from_user()") (LP: #1830176, pulling in a stable commit based on ddf35cf3764b upstream). It exists to prevent some forms of speculation on a pointer provided by userspace.

 - the entire copy_from_user() routine was removed from arch/powerpc/include/asm/uaccess.h in upstream commit 3448890c32c3 ("powerpc: get rid of zeroing, switch to RAW_COPY_USER"), where Al Viro - in his trademark terse style - switched powerpc away from providing its own copy_to|from_user routines to the generic ones in include/linux/uaccess.h. This has meant that since v4.12 there hasn't been a copy_from_user() in arch/powerpc/include/asm/uaccess.h. Instead, the barrier_nospec() calls now live in raw_copy_from_user.

Does that answer your question?

Kind regards,
Daniel

Thadeu Lima de Souza Cascardo (cascardo) wrote :

Hi, Daniel.

My question was more about allow_user_access being called before barrier_nospec in xenial, whereas in bionic barrier_nospec is called before allow_user_access and allow_read_from_user. See the bionic backport below:

@@ -297,16 +302,22 @@ extern unsigned long __copy_tofrom_user(void __user *to,
 static inline unsigned long
 raw_copy_in_user(void __user *to, const void __user *from, unsigned long n)
 {
+       unsigned long ret;
+
        barrier_nospec();
-       return __copy_tofrom_user(to, from, n);
+       allow_user_access(to, from, n);
+       ret = __copy_tofrom_user(to, from, n);
+       prevent_user_access(to, from, n);
+       return ret;
 }
 #endif /* __powerpc64__ */

 static inline unsigned long raw_copy_from_user(void *to,
                const void __user *from, unsigned long n)
 {
+       unsigned long ret;
        if (__builtin_constant_p(n) && (n <= 8)) {
-               unsigned long ret = 1;
+               ret = 1;

                switch (n) {
                case 1:
@@ -331,14 +342,18 @@ static inline unsigned long raw_copy_from_user(void *to,
        }

        barrier_nospec();
-       return __copy_tofrom_user((__force void __user *)to, from, n);
+       allow_read_from_user(from, n);
+       ret = __copy_tofrom_user((__force void __user *)to, from, n);
+       prevent_read_from_user(from, n);
+       return ret;
 }

On focal or 5.4 upstream, the same happens: barrier_nospec is called before allow_user_access or allow_read_from_user.

Thanks.
Cascardo.

Steve Beattie (sbeattie)
Changed in linux (Ubuntu):
status: Confirmed → Fix Committed
Daniel Axtens (daxtens) wrote :

Ah, I see.

That's probably me just not being sufficiently consistent on the backport. I don't think it will make any difference as allow_user_access/read_from_user are no-ops in the Xenial backport. So there's nothing that uses the user pointer such that the ordering of the flush is critical. But if I'm wrong I'm happy to be corrected.

Thanks for your attention to detail: it's really exemplary work you and the broader kernel team are doing here.

Kind regards,
Daniel

Daniel Axtens (daxtens) wrote :

Hi,

A small update: our PSIRT released the CVE with a different vector and score. Because the kernel maps all the physical memory, an attack making use of this vulnerability could, in theory and with an appropriate gadget, access highly sensitive data. Therefore, PSIRT ultimately classified the confidentiality impact as high, rather than low. This makes the score 5.1 rather than 2.9.

We don't believe this scoring reflects any greater risk to your users, but I want to be transparent, explain the discrepancy, and hopefully alleviate any confusion. Apologies for the lack of communication on this before the end of the embargo - we are working to improve our internal processes here.
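
For reference, the arithmetic behind the two scores can be reproduced with the CVSS v3.0 base score equations. The C sketch below assumes that only the confidentiality metric changed (C:L to C:H) and that the rest of the vector stayed at AV:L/AC:H/PR:N/UI:N/S:U/I:N/A:N; the function names are illustrative. The numeric weights are the ones published in the CVSS v3.0 specification.

```c
#include <assert.h>

/* CVSS v3.0 "round up to one decimal place", done without libm. */
double cvss_roundup(double x)
{
    long tenths = (long)(x * 10.0);
    if ((double)tenths < x * 10.0)
        tenths++;
    return (double)tenths / 10.0;
}

/* Base score for AV:L/AC:H/PR:N/UI:N/S:U with I:N and A:N. Only the
 * confidentiality weight varies: 0.22 for Low, 0.56 for High. */
double base_score(double conf_weight)
{
    const double av_local = 0.55, ac_high = 0.44;
    const double pr_none = 0.85, ui_none = 0.85;
    double iss, impact, exploitability, sum;

    assert(conf_weight >= 0.0 && conf_weight <= 1.0);

    /* Impact sub-score; integrity and availability weights are 0 (None). */
    iss = 1.0 - (1.0 - conf_weight) * (1.0 - 0.0) * (1.0 - 0.0);
    impact = 6.42 * iss;  /* scope unchanged */
    exploitability = 8.22 * av_local * ac_high * pr_none * ui_none;

    if (impact <= 0.0)
        return 0.0;
    sum = impact + exploitability;
    return cvss_roundup(sum < 10.0 ? sum : 10.0);
}
```

Under these assumptions, base_score(0.22) gives 2.9 and base_score(0.56) gives 5.1, matching the two figures quoted above. The exploitability term is identical in both cases, so the whole difference comes from the impact sub-score.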

Kind regards,
Daniel

Daniel Axtens (daxtens) wrote :

Hi,

Would you mind if I forward-ported your uaccess flush test to mainline and proposed it for inclusion in the powerpc tree?

Kind regards,
Daniel

Thadeu Lima de Souza Cascardo (cascardo) wrote :

Hi, Daniel.

Sure, go ahead with the forward-port.

Thanks.
Cascardo.

Steve Beattie (sbeattie) wrote :

Hey Daniel, is there any reason for this bug report to stay private, now that we're long past the embargo date?

Thanks.

Daniel Axtens (daxtens) wrote :

Hi Steve,

It should be fine to make this public now.

Kind regards,
Daniel

Steve Beattie (sbeattie) wrote :
information type: Private Security → Public Security
Changed in linux (Ubuntu):
status: Fix Committed → Fix Released