gcc-9 in Ubuntu generates unbootable Xen hypervisor

Bug #1863260 reported by Stefan Bader on 2020-02-14
This bug affects 1 person
Affects Status Importance Assigned to Milestone
gcc-9 (Ubuntu)

Bug Description

This problem was first observed in Ubuntu 19.10 (Eoan) and persists in the current Focal development release. The results are the same for the old Xen version (4.9.2) and the fresh 4.11 merge. As a test, I took the xen-hypervisor binary from Debian Sid (which was also compiled with gcc-9), and it boots Xen successfully (libraries and dom0 kernel are from Ubuntu Focal).

There seems to be one major difference between build environments. The Xen hypervisor build is done with an external retpoline mitigation:

# Compile with thunk-extern, indirect-branch-register if avaiable.
ifneq ($(call cc-option,$(CC),-mindirect-branch-register,n),n)
CFLAGS += -mindirect-branch=thunk-extern -mindirect-branch-register

The use of -mindirect-branch is incompatible with -fcf-protection. To be able to build the hypervisor in Ubuntu, I have to add the following modification to the makefile:

--- xen.orig/xen/arch/x86/Rules.mk
+++ xen/xen/arch/x86/Rules.mk
@@ -42,6 +42,10 @@ endif

 # Compile with thunk-extern, indirect-branch-register if avaiable.
 ifneq ($(call cc-option,$(CC),-mindirect-branch-register,n),n)
+ifneq ($(call cc-option,$(CC),-fcf-protection,n),n)
+CFLAGS += -fcf-protection=none
+CXXFLAGS += -fcf-protection=none
 CFLAGS += -mindirect-branch=thunk-extern -mindirect-branch-register

I am wondering whether -fcf-protection=none is only partially effective and causes the binary to be unbootable (basically it causes an immediate reset when started).
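
One way to test that suspicion is to disassemble the built objects and look for the ENDBR32/ENDBR64 instructions that GCC emits when -fcf-protection is set to branch or full. The following is only a minimal sketch, assuming GNU objdump and grep are available; the helper name count_endbr and the file name in the usage comment are illustrative and not taken from this report. Older binutils releases may not decode these mnemonics, in which case the count would be misleading.

```shell
#!/bin/sh
# Count ENDBR32/ENDBR64 instructions in a disassembly read from stdin.
# count_endbr is a hypothetical helper name, not part of Xen or binutils.
count_endbr() {
    # grep -c prints the number of matching lines. It exits non-zero when
    # the count is 0, so "|| true" keeps the helper usable under "set -e".
    grep -c -E '[[:space:]]endbr(32|64)([[:space:]]|$)' || true
}

# Usage (the object file name is an assumption):
#   objdump -d some-object.o | count_endbr
```

A non-zero count for any object that goes into the hypervisor would suggest that some part of the build did not receive -fcf-protection=none.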

Stefan Bader (smb) on 2020-02-28
tags: added: aeppelwoi
Stefan Bader (smb) wrote :

I have re-compiled gcc-9 from Focal after adding Focal to the list of releases which do not enable cf-protection by default, then compiled Xen from Focal with that compiler. And it does boot!

# xen info
host : argabuthon
release : 5.4.0-14-generic
version : #17-Ubuntu SMP Thu Feb 6 22:47:59 UTC 2020
machine : x86_64
nr_cpus : 8
max_cpu_id : 15
nr_nodes : 2
cores_per_socket : 8
threads_per_core : 1
cpu_mhz : 2000.014
hw_caps : 178bf3ff:80802001:ee500800:000837ff:00000000:00000000:00000000:00000100
virt_caps : hvm
total_memory : 32766
free_memory : 29333
sharing_freed_memory : 0
sharing_used_memory : 0
outstanding_claims : 0
free_cpus : 0
xen_major : 4
xen_minor : 11
xen_extra : .4-pre
xen_version : 4.11.4-pre
xen_caps : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler : credit
xen_pagesize : 4096
platform_params : virt_start=0xffff800000000000
xen_changeset :
xen_commandline : placeholder dom0_max_vcpus=4 dom0_vcpus_pin=true dom0_mem=3072M,max:3072M loglvl=all guest_loglvl=all hvm_debug=0 com2=115200,8n1 console=com2,vga
cc_compiler : gcc (Ubuntu 9.2.1-30ubuntu2+nocfprot1) 9.2.1 20200228
cc_compile_by : ubuntu-devel-di
cc_compile_domain : lists.ubuntu.com
cc_compile_date : Tue Mar 3 07:31:57 UTC 2020
build_id : e9eeb5132ed2948c175c17be4b1c47edded6d7be
xend_config_format : 4

Stefan Bader (smb) wrote :

Advice on how to disable cf-protection when it is enabled by default was taken from: https://wiki.ubuntu.com/ToolChain/CompilerFlags#A-fcf-protection

Stefan Bader (smb) wrote :

While modifying gcc I had the feeling that maybe -fno-cf-protection might be a valid flag, too. Should that be the same as -fcf-protection=none?

Stefan Bader (smb) wrote :

We have now found a fix for the issue. One part of the hypervisor build uses completely different flags. In particular, it does not use -mindirect-branch, so it does not fail if cf-protection is turned on (this is under xen/arch/x86/boot and uses a build32.mk for setting flags). After adding -fcf-protection=none to build32.mk as well, the resulting hypervisor binary boots.

@doko, just one additional observation: the documentation sounds a bit as if -mindirect-branch=thunk-extern is only incompatible with -fcf-protection=full|branch, but in practice it seems only none is accepted. That might be correct or not, I am not sure.
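
The observation above could be checked empirically by asking the compiler which -fcf-protection values it accepts next to the retpoline flags. This is a rough sketch for a POSIX shell, not the method used in this report: cc_accepts and probe_cf_protection are hypothetical helper names, and the probe compiles a one-function translation unit on the assumption that GCC may perform the compatibility check per function rather than at option-parsing time.

```shell
#!/bin/sh
# cc_accepts CC FLAG...: succeed if CC compiles a trivial C function
# with the given flags. Hypothetical helper, similar in spirit to the
# cc-option check used in the Xen makefiles.
cc_accepts() {
    cc=$1
    shift
    printf 'void f(void) {}\n' |
        "$cc" -Werror "$@" -x c -c - -o /dev/null >/dev/null 2>&1
}

# Try each -fcf-protection value together with the retpoline flags.
probe_cf_protection() {
    cc=$1
    for mode in none branch return full; do
        if cc_accepts "$cc" -mindirect-branch=thunk-extern \
                -mindirect-branch-register "-fcf-protection=$mode"; then
            echo "$mode: accepted"
        else
            echo "$mode: rejected"
        fi
    done
}

# Usage: probe_cf_protection gcc-9
```

The result depends on the exact compiler build, so the output is best compared across the Ubuntu and Debian gcc-9 packages rather than read in isolation.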

I think, after all, we can close this as invalid (for gcc-9).

Matthias Klose (doko) on 2020-03-11
Changed in gcc-9 (Ubuntu):
status: New → Invalid