Bug #1899573 “CVE-2020-4788: Speculation on incompletely validat...” : Bugs : linux package : Ubuntu

Daniel Axtens (daxtens) wrote on 2020-10-13:

#1

Patches for Focal, Bionic and Xenial (190.0 KiB, application/x-tar)

description:	updated

Daniel Axtens (daxtens) wrote on 2020-10-30:

#2

We've detected an issue where these patches cause the `rfi_flush` self-test to fail. This failure does *not* indicate a failure of the RFI flush (kernel exit flush) mechanism. The test measures the number of L1D cache misses with RFI flushing enabled and disabled, and expects the number of misses to fall when RFI flush is disabled. However, the entry flush keeps the number of L1D misses high, so the test fails. If entry flushes are disabled before running the test, it passes.
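
The failure mode described above can be sketched with a toy model. This is not the selftest's code: the function names, the miss counts and the threshold are all hypothetical, and the model simply assumes that any active flush on the syscall path evicts every line the test touches.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model (assumption): if either flush is active, every line touched
 * after a syscall round trip misses in the L1D; otherwise nothing misses.
 */
static long modelled_l1d_misses(bool rfi_flush, bool entry_flush,
                                long syscalls, long lines_per_syscall)
{
    assert(syscalls >= 0 && lines_per_syscall >= 0);

    if (rfi_flush || entry_flush)
        return syscalls * lines_per_syscall;
    return 0;
}

/*
 * Hypothetical pass criterion: misses must be high with the RFI flush
 * enabled and must fall below a threshold once it is disabled.
 */
static bool rfi_flush_check_passes(bool entry_flush)
{
    const long syscalls = 10;
    const long lines = 128;
    const long threshold = syscalls * lines / 2;

    long with_rfi = modelled_l1d_misses(true, entry_flush, syscalls, lines);
    long without_rfi = modelled_l1d_misses(false, entry_flush, syscalls, lines);

    return with_rfi >= threshold && without_rfi < threshold;
}
```

In this model `rfi_flush_check_passes(true)` is false, because the entry flush keeps the miss count high even with the RFI flush off, while `rfi_flush_check_passes(false)` is true.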

We are fixing this and will supply you with an additional patch for supported kernel trees that fixes the test.

Thadeu Lima de Souza Cascardo (cascardo) wrote on 2020-11-10:

#3

Hi, Daniel.

I assume the select patch is only a performance fix, and there is no specific vulnerability there. Is that right? We would rather postpone applying that performance fix following our public process.

And I see that the backports of the uaccess mitigation are a little more complicated on bionic and xenial, as they lack support for UAP. Since those backports are more controversial, not being complete backports of the upstream patches, I would rather have them also follow our public process for review, unless they are strictly necessary for the mitigation.

So, my question is whether the uaccess mitigation is necessary if the kernel entry mitigation is in place. Could you describe an attack that would not be mitigated by the kernel entry L1D flush only?

Thank you very much.
Cascardo.

Daniel Axtens (daxtens) wrote on 2020-11-12:

#4

Hi,

> I assume the select patch is only a performance fix, and there is no specific vulnerability there. Is that right? We would rather postpone applying that performance fix following our public process.

Yes, that's correct. It can go in later - the only downside will be a performance degradation in the poll/select family of syscalls until it goes in.

With regard to your other question, here's my best attempt at an answer.

An attack that would be mitigated by the entry flush would go something like this:

1) Identify a kernel load to target.
2) Using the way predictor, load a userspace address into the L1D cache that will 'collide' with the kernel load.
3) When execution reaches the targeted load, the userspace value will be used for speculative execution.
4) Do something useful with the speculative execution before the CPU realises that the speculated value isn't right.

If you apply entry flushes but not user-access flushes, an attacker can still do something like this:

0) Identify a place the kernel accesses data from userspace.
1) Identify a kernel load - subsequent to the userspace access - that you want to target.
2) Organise your userspace address space such that when the kernel accesses the data, it pulls in a cache line containing a userspace address that will 'collide' with the kernel load.
3) When execution reaches the targeted load, the userspace value will be used for speculative execution.
4) Do something useful with the speculative execution before the CPU realises that the speculated value isn't right.

As you can see:

- Step 4 is a bit ambiguous and open-ended: we aren't aware of any way to make use of this issue without something like a specialised kernel module. (This is how the PoC that Google shared with us worked.)

- It's (at least theoretically) more complex to find a useful 'gadget' for speculation if you have to get your data in via user-access rather than being able to prime the cache prior to kernel entry.

I think it is fair to claim that the entry flush, while not a complete fix, is a credible mitigation for this issue. It's ultimately a risk-management decision as to how you would like to proceed here.
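
To make the difference between the two scenarios concrete, here is a minimal sketch that tracks only whether an attacker-influenced line is in the L1D when the targeted kernel load runs. It is a toy state machine, not kernel code: the event names and helper functions are hypothetical, and it assumes the user-access flush takes effect right after the kernel's access to user data.

```c
#include <assert.h>
#include <stdbool.h>

enum event {
    EV_USER_PRIMES_CACHE,  /* userspace loads a colliding line */
    EV_KERNEL_ENTRY,       /* syscall or interrupt entry */
    EV_KERNEL_UACCESS,     /* kernel reads data from userspace */
    EV_TARGET_LOAD         /* the kernel load being attacked */
};

/* Returns true if the target load can see an attacker-influenced line. */
static bool target_load_exposed(const enum event *events, int count,
                                bool entry_flush, bool uaccess_flush)
{
    bool attacker_line_in_l1d = false;

    assert(count >= 0);
    for (int i = 0; i < count; i++) {
        switch (events[i]) {
        case EV_USER_PRIMES_CACHE:
            attacker_line_in_l1d = true;
            break;
        case EV_KERNEL_ENTRY:
            if (entry_flush)
                attacker_line_in_l1d = false;
            break;
        case EV_KERNEL_UACCESS:
            /* The access pulls in a user line; the flush removes it. */
            attacker_line_in_l1d = !uaccess_flush;
            break;
        case EV_TARGET_LOAD:
            if (attacker_line_in_l1d)
                return true;
            break;
        }
    }
    return false;
}

/* First scenario: prime the cache before entering the kernel. */
static bool prime_before_entry_exposed(bool entry_flush, bool uaccess_flush)
{
    const enum event ev[] = {
        EV_USER_PRIMES_CACHE, EV_KERNEL_ENTRY, EV_TARGET_LOAD
    };
    return target_load_exposed(ev, 3, entry_flush, uaccess_flush);
}

/* Second scenario: get the line in via a kernel user access. */
static bool via_uaccess_exposed(bool entry_flush, bool uaccess_flush)
{
    const enum event ev[] = {
        EV_KERNEL_ENTRY, EV_KERNEL_UACCESS, EV_TARGET_LOAD
    };
    return target_load_exposed(ev, 3, entry_flush, uaccess_flush);
}
```

With only the entry flush enabled, the first scenario is blocked in this model but the second is not; enabling both flushes blocks both.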

For your reference, we've also been looking at other contexts/users where even older kernels are still in support. In those kernels, there is even less of the modern user-access infrastructure and the user-access code is spread all over the kernel. Patching all those sites would involve what we believe to be an unacceptable risk of causing a regression, so we have proposed that they only take the entry-flush patch. Anyone who needs the higher level of security could then decide to move up to a more recent kernel.

I hope that helps - I'm sorry I can't be more definitive on anything! Please let me know if you have any further questions.

I will also post the fixes to the rfi_flush patch shortly; it's just going through some internal review.

Kind regards,
Daniel

Thadeu Lima de Souza Cascardo (cascardo) wrote on 2020-11-12:

#5

Hi, Daniel.

Thanks a lot for the explanations. They make sense, and it seems that we should mitigate the uaccess attack as well. As the changes apply only to arch/powerpc/, we at least have fewer chances for regressions on other architectures.

Can you provide backports to groovy/5.8 kernels too?

And do the fixes for the rfi_flush test apply only to the test code, or do they apply to the kernel code?

Thanks a lot.
Cascardo.

Daniel Axtens (daxtens) wrote on 2020-11-12:

#6

Hi,

Yes, I will provide a backport to 5.8 shortly - I'll send you a new tarball with groovy and the fixed tests for all releases.

The fixes to the rfi_flush test are contained to tools/testing/selftests/powerpc/security.

Kind regards,
Daniel

Daniel Axtens (daxtens) wrote on 2020-11-16:

#7

updated set of patches (250.0 KiB, application/x-tar)

Hi,

Please find attached an updated set of patches.

The following changes have been made:

- Add backports for groovy

- Focal, Groovy: fix the failing rfi_flush test, and provide an entry_flush test that validates the entry flush. (The rfi_flush test is, as far as I can see, not included in any other source trees.)

- all trees:
    * apply some cleanups identified by checkpatch.pl,
    * fix a bug where feature flags from the hypervisor or firmware would not be honoured,
    * if running bare-metal on a machine that is not a P9, do not apply the flushes as they are not required.

Kind regards,
Daniel

Daniel Axtens (daxtens) wrote on 2020-11-17:

#8

Hi,

Just to update you on our embargo date: we expect to be posting to oss-security and linuxppc-dev at:

- 20 November at 10am UTC+11 (AEDT), which is
- 19 November at 11pm UTC+0, which is
- 19 November at 5pm UTC-6 (CST)

Kind regards,
Daniel

Steve Beattie (sbeattie) on 2020-11-17

Changed in linux (Ubuntu):
status:	New → Confirmed

Thadeu Lima de Souza Cascardo (cascardo) wrote on 2020-11-18:

#9

Hi, Daniel.

Thanks a lot for the CRD information and the new set of patches.

I am attaching comments from Juerg Haefliger, which may provide opportunities for some improvements.

I also worked on a test for uaccess_flush, but as the backports mention a cleanup to unify the RFI and entry flush tests, I am pretty sure it would be beneficial to build on top of that. Would you be able to provide the original patches that will go into linuxppc-dev?

Thank you very much.
Cascardo.

Thadeu Lima de Souza Cascardo (cascardo) wrote on 2020-11-18:

#10

Comments on latest backports (43.5 KiB, application/mbox)

Daniel Axtens (daxtens) wrote on 2020-11-18:

#11

upstream and stable patches (450.0 KiB, application/x-tar)

Hi Cascardo,

Please find attached the patches for upstream (based on linux-next) and all the upstream stable trees (stable/linux-*.y) we're targeting.

The mbox you attached contains gpg encrypted mail, and I don't think I'm able to decrypt it. I'm certainly happy to revise patches for upstream or (more likely at this stage) send a v2 or a fixup to the list incorporating the improvements. We have also sent the patches to some other distros based on their kernel trees, so coordinating improvements across them will be challenging -- we'll see what we can do.

Kind regards,
Daniel

Thadeu Lima de Souza Cascardo (cascardo) wrote on 2020-11-19:

#12

Adds support for testing uaccess_flush (7.4 KiB, text/plain)

Hi, Daniel.

Sorry about sending encrypted emails. I will go through some of the comments myself, see which ones apply to upstream fixes and send you the ones I find appropriate. They are mostly cosmetic. Some of the other comments are specific to the backports. We can sort them out locally if necessary.

In the meantime, I am attaching a test for the uaccess_flush knob. There is still a lot of shared code between the 3 tests, rfi_flush, entry_flush and uaccess_flush.

Regards.
Cascardo.

Thadeu Lima de Souza Cascardo (cascardo) wrote on 2020-11-23:

#13

Hi, Daniel.

For the xenial backport of "powerpc: Add a framework for user access tracking", is there any reason barrier_nospec is called after allow_user_access in the following hunk? That is not the case in upstream or in the bionic backport.

Thanks.
Cascardo.

@@ -328,9 +333,14 @@ extern unsigned long __copy_tofrom_user(void __user *to,
 static inline unsigned long copy_from_user(void *to,
                 const void __user *from, unsigned long n)
 {
+        unsigned long ret;
+
         if (likely(access_ok(VERIFY_READ, from, n))) {
+                allow_user_access(to, from, n);
                 barrier_nospec();
-                return __copy_tofrom_user((__force void __user *)to, from, n);
+                ret = __copy_tofrom_user((__force void __user *)to, from, n);
+                prevent_user_access(to, from, n);
+                return ret;
         }
         memset(to, 0, n);
         return n;

Daniel Axtens (daxtens) wrote on 2020-11-23:

#14

Hi Cascardo,

As I understand it:

- the barrier_nospec() in copy_from_user() was added to the xenial tree in commit bafab44334fc ("powerpc: Use barrier_nospec in copy_from_user()") (LP: #1830176, pulling in a stable commit based on ddf35cf3764b upstream). It exists to prevent some forms of speculation on a pointer provided by userspace.

- the entire copy_from_user() routine was removed from arch/powerpc/include/asm/uaccess.h in upstream commit 3448890c32c3 ("powerpc: get rid of zeroing, switch to RAW_COPY_USER"), where Al Viro - in his trademark terse style - switched powerpc away from providing its own copy_to|from_user routines to the generic ones in include/linux/uaccess.h. This has meant that since v4.12 there hasn't been a copy_from_user() in arch/powerpc/include/asm/uaccess.h. Instead, the barrier_nospec() calls now live in raw_copy_from_user.

Does that answer your question?

Kind regards,
Daniel

Thadeu Lima de Souza Cascardo (cascardo) wrote on 2020-11-23:

#15

Hi, Daniel.

My question was more about allow_user_access being called before barrier_nospec in xenial, whereas barrier_nospec is called before allow_user_access and allow_read_from_user in bionic, as in the bionic backport below:

@@ -297,16 +302,22 @@ extern unsigned long __copy_tofrom_user(void __user *to,
 static inline unsigned long
 raw_copy_in_user(void __user *to, const void __user *from, unsigned long n)
 {
+        unsigned long ret;
+
         barrier_nospec();
-        return __copy_tofrom_user(to, from, n);
+        allow_user_access(to, from, n);
+        ret = __copy_tofrom_user(to, from, n);
+        prevent_user_access(to, from, n);
+        return ret;
 }
 #endif /* __powerpc64__ */

 static inline unsigned long raw_copy_from_user(void *to,
                 const void __user *from, unsigned long n)
 {
+        unsigned long ret;
         if (__builtin_constant_p(n) && (n <= 8)) {
-                unsigned long ret = 1;
+                ret = 1;

                 switch (n) {
                 case 1:
@@ -331,14 +342,18 @@ static inline unsigned long raw_copy_from_user(void *to,
         }

         barrier_nospec();
-        return __copy_tofrom_user((__force void __user *)to, from, n);
+        allow_read_from_user(from, n);
+        ret = __copy_tofrom_user((__force void __user *)to, from, n);
+        prevent_read_from_user(from, n);
+        return ret;
 }

On focal or 5.4 upstream, the same happens. barrier_nospec is used before allow_user_access or allow_read_from_user are called.

Thanks.
Cascardo.

Steve Beattie (sbeattie) on 2020-11-29

Changed in linux (Ubuntu):
status:	Confirmed → Fix Committed

Daniel Axtens (daxtens) wrote on 2020-11-30:

#16

Ah, I see.

That's probably me just not being sufficiently consistent on the backport. I don't think it will make any difference as allow_user_access/read_from_user are no-ops in the Xenial backport. So there's nothing that uses the user pointer such that the ordering of the flush is critical. But if I'm wrong I'm happy to be corrected.
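
The argument that the ordering does not matter when the allow step is a no-op can be sketched as follows. This is only an illustration under that assumption: the trace letters and every function name here are hypothetical stand-ins, not the kernel's helpers, and the model records only which steps have an effect and in what order.

```c
#include <assert.h>
#include <string.h>

#define TRACE_MAX 16

/* Records, in order, the steps that actually do something. */
struct trace {
    char ops[TRACE_MAX];
};

static void record(struct trace *t, char op)
{
    size_t len = strlen(t->ops);

    assert(len + 1 < TRACE_MAX);
    t->ops[len] = op;
    t->ops[len + 1] = '\0';
}

typedef void (*allow_fn)(struct trace *t);

/* Stand-in for an allow step that compiles to nothing. */
static void allow_noop(struct trace *t) { (void)t; }

/* Stand-in for an allow step with a real effect, for contrast. */
static void allow_real(struct trace *t) { record(t, 'A'); }

/* Xenial-style ordering: allow, barrier, copy, prevent. */
static void xenial_order(struct trace *t, allow_fn allow)
{
    allow(t);
    record(t, 'B');  /* speculation barrier */
    record(t, 'C');  /* the copy itself */
    record(t, 'P');  /* prevent step */
}

/* Bionic/upstream-style ordering: barrier, allow, copy, prevent. */
static void bionic_order(struct trace *t, allow_fn allow)
{
    record(t, 'B');
    allow(t);
    record(t, 'C');
    record(t, 'P');
}

/* Returns 1 if both orderings produce the same effective sequence. */
static int orderings_equivalent(allow_fn allow)
{
    struct trace x = { "" };
    struct trace b = { "" };

    xenial_order(&x, allow);
    bionic_order(&b, allow);
    return strcmp(x.ops, b.ops) == 0;
}
```

With `allow_noop` both orderings reduce to the same effective sequence (barrier, copy, prevent); with `allow_real` they differ, which is the case where the order would need a closer look.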

Thanks for your attention to detail: it's really exemplary work you and the broader kernel team are doing here.

Kind regards,
Daniel

Daniel Axtens (daxtens) wrote on 2020-12-02:

#17

Hi,

A small update: our PSIRT released the CVE with a different vector and score. Because the kernel maps all the physical memory, an attack making use of this vulnerability could, in theory and with an appropriate gadget, access highly sensitive data. Therefore, PSIRT ultimately classified the confidentiality impact as high, rather than low. This makes the score 5.1 rather than 2.9.
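
As a worked check of those two figures, the sketch below applies the CVSS v3 base score equations for an unchanged scope. The vector is an assumption: AV:L/AC:H/PR:N/UI:N/S:U/I:N/A:N is not stated in this thread, but it reproduces both scores when only the confidentiality metric changes. Function and macro names are hypothetical, and scores are returned in tenths (51 means 5.1) to avoid floating-point comparisons.

```c
#include <assert.h>

/* CVSS v3 metric weights used by the assumed vector. */
#define AV_LOCAL     0.55
#define AC_HIGH      0.44
#define PR_NONE      0.85
#define UI_NONE      0.85
#define IMPACT_HIGH  0.56
#define IMPACT_LOW   0.22
#define IMPACT_NONE  0.0

/* CVSS "Roundup": round up to one decimal place, returned in tenths. */
static int roundup_tenths(double x)
{
    long scaled;

    assert(x >= 0.0);
    scaled = (long)(x * 100000.0 + 0.5);  /* round to 5 decimals */
    if (scaled % 10000 == 0)
        return (int)(scaled / 10000);
    return (int)(scaled / 10000) + 1;
}

/* Base score for scope unchanged, in tenths. */
static int base_score_tenths(double av, double ac, double pr, double ui,
                             double c, double i, double a)
{
    double iss = 1.0 - (1.0 - c) * (1.0 - i) * (1.0 - a);
    double impact = 6.42 * iss;
    double exploitability = 8.22 * av * ac * pr * ui;
    double total = impact + exploitability;

    if (impact <= 0.0)
        return 0;
    if (total > 10.0)
        total = 10.0;
    return roundup_tenths(total);
}

/* Score for the assumed vector with a given confidentiality weight. */
static int assumed_vector_score_tenths(double confidentiality)
{
    return base_score_tenths(AV_LOCAL, AC_HIGH, PR_NONE, UI_NONE,
                             confidentiality, IMPACT_NONE, IMPACT_NONE);
}
```

With high confidentiality impact the sum is 3.5952 + 1.4372 = 5.0324, which rounds up to 5.1; with low it is 1.4124 + 1.4372 = 2.8496, which rounds up to 2.9.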

We don't believe this scoring reflects any greater risk to your users, but I want to be transparent, explain the discrepancy, and hopefully alleviate any confusion. Apologies for the lack of communication on this before the end of the embargo - we are working to improve our internal processes here.

Kind regards,
Daniel

Daniel Axtens (daxtens) wrote on 2021-02-22:

#18

Hi,

Would you mind if I forward-ported your uaccess flush test to mainline and proposed it for inclusion in the powerpc tree?

Kind regards,
Daniel

Thadeu Lima de Souza Cascardo (cascardo) wrote on 2021-02-22:

#19

Hi, Daniel.

Sure, go ahead with the forward-port.

Thanks.
Cascardo.

Steve Beattie (sbeattie) wrote on 2021-02-22:

#20

Hey Daniel, is there any reason for this bug report to stay private, now that we're long past the embargo date?

Thanks.

Daniel Axtens (daxtens) wrote on 2021-02-23:

#21

Hi Steve,

It should be fine to make this public now.

Kind regards,
Daniel

Steve Beattie (sbeattie) wrote on 2021-02-23:

#22

Oh, this was fixed in https://usn.ubuntu.com/usn/usn-4657-1, https://usn.ubuntu.com/usn/usn-4658-1, https://usn.ubuntu.com/usn/usn-4659-1, and https://usn.ubuntu.com/usn/usn-4660-1 . Marking fix released.

Thanks.

information type:	Private Security → Public Security
Changed in linux (Ubuntu):
status:	Fix Committed → Fix Released

CVE-2020-4788: Speculation on incompletely validated data on IBM Power9
