Azure: TDX-enabled hypervisors cause segfault
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux-azure (Ubuntu) | Fix Released | High | Tim Gardner |
Bug Description
SRU Justification
[Impact]
TDX-enabled Microsoft hypervisors cause segfaults in guests because of an upstream glibc bug. This can be worked around with a kernel patch.
Issue Description:
When I start an Intel TDX Ubuntu 22.04 (or RHEL 9.0) guest on Hyper-V, the guest always hits segfaults and can’t boot up. Here the kernel running in the guest is the upstream kernel + my TDX patchset, or the 5.19.0-azure kernel + the same TDX patchset:
[Fix]
We confirmed the segfault also happens to TDX guests on the KVM hypervisor. After I checked with more Intel folks, it turns out this is indeed a glibc bug (https:/
I got a temporary kernel-side workaround from Intel: https:/
[ 21.081453] Run /init as init process
[ 21.086896] with arguments:
[ 21.095790] /init
[ 21.100982] with environment:
[ 21.106611] HOME=/
[ 21.112463] TERM=linux
[ 21.119850] BOOT_IMAGE=
Loading, please wait...
Starting version 249.11-0ubuntu3.6
[ 21.253908] udevadm[144]: segfault at 56538d61e0c0 ip 00007f8f5899efeb sp 00007ffd08fb7648 error 6 in libc.so.
[ 21.316549] Code: 07 62 e1 7d 48 e7 4f 01 62 e1 7d 48 e7 67 40 62 e1 7d 48 e7 6f 41 62 61 7d 48 e7 87 00 20 00 00 62 61 7d 48 e7 8f 40 20 00 00 <62> 61 7d 48 e7 a7 00 30 00 00 62 61 7d 48 e7 af 40 30 00 00 48 83
Segmentation fault
[ 22.499317] setfont[153]: segfault at 55ef3b91b000 ip 00007f5899899fa4 sp 00007ffc8008f628 error 4 in libc.so.
[ 22.602677] Code: 06 62 e1 fe 48 6f 4e 01 62 e1 fe 48 6f 66 40 62 e1 fe 48 6f 6e 41 62 61 fe 48 6f 86 00 20 00 00 62 61 fe 48 6f 8e 40 20 00 00 <62> 61 fe 48 6f a6 00 30 00 00 62 61 fe 48 6f ae 40 30 00 00 48 83
[ 22.732413] loadkeys[156]: segfault at 563ffe292000 ip 00007fbff957afa4 sp 00007ffe31453808 error 4 in libc.so.
[ 22.833061] Code: 06 62 e1 fe 48 6f 4e 01 62 e1 fe 48 6f 66 40 62 e1 fe 48 6f 6e 41 62 61 fe 48 6f 86 00 20 00 00 62 61 fe 48 6f 8e 40 20 00 00 <62> 61 fe 48 6f a6 00 30 00 00 62 61 fe 48 6f ae 40 30 00 00 48 83
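The segfault lines above can be picked apart mechanically. The following is a minimal sketch, assuming the usual x86 kernel message layout seen in this log and the standard x86 page-fault error-code bits (bit 0: protection violation, bit 1: write access, bit 2: user mode). The function name `parse_segfault` is hypothetical and only for illustration.

```python
import re

# Matches the user-space segfault message format seen in the log above.
# The fault address, ip, sp and error code are all printed in hex.
SEGFAULT_RE = re.compile(
    r"(?P<comm>[^\s\[\]]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<error>[0-9a-f]+)"
)


def parse_segfault(line):
    """Return the fields of a segfault log line, or None if it is not one."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("error"), 16)
    return {
        "comm": m.group("comm"),
        "pid": int(m.group("pid")),
        "addr": int(m.group("addr"), 16),
        "ip": int(m.group("ip"), 16),
        "sp": int(m.group("sp"), 16),
        "error": err,
        # x86 page-fault error code bits
        "protection_violation": bool(err & 1),  # 0 means the page was not present
        "write": bool(err & 2),                 # 0 means a read access
        "user_mode": bool(err & 4),
    }


if __name__ == "__main__":
    sample = ("[ 21.253908] udevadm[144]: segfault at 56538d61e0c0 "
              "ip 00007f8f5899efeb sp 00007ffd08fb7648 error 6 in libc.so.")
    print(parse_segfault(sample))
```

Read this way, "error 6" (udevadm) is a user-mode write to a page that is not present, and "error 4" (setfont, loadkeys) is a user-mode read from a page that is not present.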
The segfault only happens with recent glibc versions (e.g. v2.35 in Ubuntu 22.04, and v2.34 in RHEL 9.0). It doesn’t happen with v2.31 in Ubuntu 20.04, or v2.32 in Ubuntu 20.10. So something in glibc must have changed between v2.32 (good) and v2.34+ (not working for TDX). The oddity is: when I run the same Ubuntu 22.04/RHEL 9.0 image as a regular non-TDX guest, the segfault never happens.
If I boot up an Ubuntu 20.04 TDX guest (which works fine), mount an Ubuntu 22.04 VHD image (“mount /dev/sdd1 /mnt”) and try to run “chroot /mnt”, I hit the same segfault:
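To check a guest image against the versions named above, a small lookup is enough. This is only a sketch: the table holds nothing beyond the four versions reported in this bug, any other version is returned as "not reported", and the names `REPORTED` and `classify_glibc` are hypothetical.

```python
import platform

# Observations from this bug report only; other versions were not tested here.
REPORTED = {
    (2, 31): "works",      # Ubuntu 20.04
    (2, 32): "works",      # Ubuntu 20.10
    (2, 34): "segfaults",  # RHEL 9.0
    (2, 35): "segfaults",  # Ubuntu 22.04
}


def classify_glibc(version):
    """Map a glibc version string such as '2.35' to the behaviour reported here."""
    parts = version.split(".")
    key = (int(parts[0]), int(parts[1]))
    return REPORTED.get(key, "not reported")


if __name__ == "__main__":
    # platform.libc_ver() returns a (name, version) pair; the version is
    # empty when the C library cannot be identified.
    lib, version = platform.libc_ver()
    if lib == "glibc" and version:
        print(version, classify_glibc(version))
    else:
        print("glibc version could not be determined")
```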
[ 109.478556] EXT4-fs (sdd1): mounted filesystem with ordered data mode. Quota mode: none.
[ 129.224444] bash[2112]: segfault at 556987854000 ip 00007f88468c4ea4 sp 00007ffc22ecf158 error 6 in libc.so.
[ 129.242434] Code: e7 bf 30 10 00 00 66 44 0f e7 87 00 20 00 00 66 44 0f e7 8f 10 20 00 00 66 44 0f e7 97 20 20 00 00 66 44 0f e7 9f 30 20 00 00 <66> 44 0f e7 a7 00 30 00 00 66 44 0f e7 af 10 30 00 00 66 44 0f e7
It looks like the application is referencing a memory location that somehow triggers a page fault, which is converted to a SIGSEGV signal that terminates the application (I’m not sure where the “movntdq” instructions below come from):
root@decui-
Code: e7 bf 30 10 00 00 66 44 0f e7 87 00 20 00 00 66 44 0f e7 8f 10 20 00 00 66 44 0f e7 97 20 20 00 00 66 44 0f e7 9f 30 20 00 00 <66> 44 0f e7 a7 00 30 00 00 66 44 0f e7 af 10 30 00 00 66 44 0f e7
All code
========
0: e7 bf out %eax,$0xbf
2: 30 10 xor %dl,(%rax)
4: 00 00 add %al,(%rax)
6: 66 44 0f e7 87 00 20 movntdq %xmm8,0x2000(%rdi)
d: 00 00
f: 66 44 0f e7 8f 10 20 movntdq %xmm9,0x2010(%rdi)
16: 00 00
18: 66 44 0f e7 97 20 20 movntdq %xmm10,0x2020(%rdi)
1f: 00 00
21: 66 44 0f e7 9f 30 20 movntdq %xmm11,0x2030(%rdi)
28: 00 00
2a:* 66 44 0f e7 a7 00 30 movntdq %xmm12,0x3000(%rdi) <-- trapping instruction
31: 00 00
33: 66 44 0f e7 af 10 30 movntdq %xmm13,0x3010(%rdi)
3a: 00 00
3c: 66 data16
3d: 44 rex.R
3e: 0f .byte 0xf
3f: e7 .byte 0xe7
Code starting with the faulting instruction
=======
0: 66 44 0f e7 a7 00 30 movntdq %xmm12,0x3000(%rdi)
7: 00 00
9: 66 44 0f e7 af 10 30 movntdq %xmm13,0x3010(%rdi)
10: 00 00
12: 66 data16
13: 44 rex.R
14: 0f .byte 0xf
15: e7 .byte 0xe7
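The offset of the trapping instruction in the disassembly above can be cross-checked directly from the "Code:" line, where the kernel wraps the first byte of the faulting instruction in angle brackets. The sketch below locates that byte and tests whether the instruction starts with the legacy SSE2 `movntdq` encoding (`66 [REX] 0f e7`). The helper names are hypothetical, and the check deliberately ignores the EVEX-encoded (`62 ...`) variants seen in the Ubuntu 22.04 boot log.

```python
def parse_code_line(code_line):
    """Return (code_bytes, trap_offset) for a kernel 'Code:' line.

    trap_offset is the index of the byte the kernel wrapped in <..>,
    or None if no byte is marked.
    """
    tokens = code_line.split("Code:", 1)[-1].split()
    trap = None
    data = []
    for i, tok in enumerate(tokens):
        if tok.startswith("<") and tok.endswith(">"):
            trap = i
            tok = tok[1:-1]
        data.append(int(tok, 16))
    return bytes(data), trap


def is_sse2_movntdq(insn):
    """True if insn starts with 66 [REX] 0f e7, the legacy movntdq encoding."""
    if len(insn) < 3 or insn[0] != 0x66:
        return False
    rest = insn[1:]
    if 0x40 <= rest[0] <= 0x4F:  # optional REX prefix
        rest = rest[1:]
    return rest[:2] == b"\x0f\xe7"


if __name__ == "__main__":
    line = ("Code: e7 bf 30 10 00 00 66 44 0f e7 87 00 20 00 00 66 44 0f e7 8f "
            "10 20 00 00 66 44 0f e7 97 20 20 00 00 66 44 0f e7 9f 30 20 00 00 "
            "<66> 44 0f e7 a7 00 30 00 00 66 44 0f e7 af 10 30 00 00 66 44 0f e7")
    code, trap = parse_code_line(line)
    print(hex(trap), is_sse2_movntdq(code[trap:]))
```

For the bash segfault, the marked byte sits at offset 0x2a, which agrees with the `2a:*` line in the disassembly.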
After I add a delay of “sleep 2 minutes” in the kernel’s arch/x86/
[ 129.224444] bash[2112]: segfault at 556987854000 ip 00007f88468c4ea4 sp 00007ffc22ecf158 error 6 in libc.so.
root@decui-
5569874a9000-
5569874d8000-
5569875b7000-
5569875f2000-
5569875f6000-
5569875ff000-
556987833000-
[heap]
7f8846400000-
7f8846800000-
7f8846828000-
7f88469bd000-
7f8846a15000-
7f8846a19000-
7f8846a1b000-
7f8846b09000-
7f8846b10000-
7f8846b13000-
7f8846b21000-
7f8846b32000-
7f8846b40000-
7f8846b44000-
7f8846b4b000-
7f8846b4d000-
7f8846b4f000-
7f8846b79000-
7f8846b85000-
7f8846b87000-
7ffc22eb1000-
[stack]
7ffc22fcd000-
7ffc22fd1000-
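With a complete listing, the fault address can be matched against the mappings. The sketch below assumes the listing is in the `/proc/<pid>/maps` format (`start-end perms offset dev inode [path]`); since the end addresses are cut off in this report, the sample ranges in the usage part are made up and do not come from the guest. The only values taken from the report are the fault address and the start of the `[heap]` mapping.

```python
def parse_maps_line(line):
    """Split one /proc/<pid>/maps line into (start, end, perms, name)."""
    fields = line.split()
    start_s, end_s = fields[0].split("-")
    name = fields[5] if len(fields) > 5 else ""
    return int(start_s, 16), int(end_s, 16), fields[1], name


def find_mapping(addr, maps_text):
    """Return the mapping containing addr, or None if addr is unmapped."""
    for line in maps_text.splitlines():
        if not line.strip():
            continue
        start, end, perms, name = parse_maps_line(line)
        if start <= addr < end:  # the end address is exclusive
            return start, end, perms, name
    return None


# Values taken from the report: the faulting address and the [heap] start.
FAULT_ADDR = 0x556987854000
HEAP_START = 0x556987833000

if __name__ == "__main__":
    # Hypothetical ranges, for illustration only.
    sample = (
        "00001000-00003000 rw-p 00000000 00:00 0 [heap]\n"
        "00007000-00008000 r-xp 00000000 08:01 42 /lib/example.so\n"
    )
    print(find_mapping(0x2000, sample))
    print(hex(FAULT_ADDR - HEAP_START))
```

The fault address lies 0x21000 bytes above the start of the `[heap]` mapping. Whether that is inside or just past the heap cannot be told from the truncated listing.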
[Test Plan]
Microsoft tested
[Where things could go wrong]
TDX is a new feature and is unlikely to have regressions.
affects: | linux (Ubuntu) → linux-azure (Ubuntu) |
Changed in linux-azure (Ubuntu): | |
assignee: | nobody → Tim Gardner (timg-tpi) |
importance: | Undecided → High |
status: | New → In Progress |
description: | updated |
tags: | added: verification-done-kinetic removed: verification-needed-kinetic |
tags: | added: verification-done-lunar removed: verification-needed-lunar |
FYI, the glibc bug is not https://sourceware.org/bugzilla/show_bug.cgi?id=28784; instead, it's Bug 30037 - glibc 2.34 and newer segfault if CPUID leaf 0x2 reports zero (https://sourceware.org/bugzilla/show_bug.cgi?id=30037)