Ubuntu 15.10: QEMU VM hang if memory >= 1T

Bug #1562653 reported by changlimin
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
QEMU
Expired
Undecided
Unassigned
qemu (Ubuntu)
Expired
Undecided
Unassigned

Bug Description

1. Ubuntu 15.10 x86_64 installed on HP SuperDome X with 8CPUs and 4T memory.

2. Create a VM, install Ubuntu 15.10, if memory >= 1T , VM hang when start. If memory < 1T, it is good.
<domain type='kvm'>
  <name>u1510-1</name>
  <uuid>39eefe1e-4829-4843-b892-026d143f3ec7</uuid>
  <memory unit='KiB'>1073741824</memory>
  <currentMemory unit='KiB'>1073741824</currentMemory>
  <vcpu placement='static'>16</vcpu>
  <os>
    <type arch='x86_64' machine='pc-i440fx-2.3'>hvm</type>
    <boot dev='hd'/>
    <boot dev='cdrom'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/bin/kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='directsync'/>
      <source file='/vms/images/u1510-1.img'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <target dev='hdc' bus='ide'/>
      <readonly/>
      <address type='drive' controller='0' bus='1' target='0' unit='0'/>
    </disk>
    <controller type='pci' index='0' model='pci-root'/>
    <controller type='ide' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='usb' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <interface type='bridge'>
      <mac address='0c:da:41:1d:ae:f1'/>
      <source bridge='vswitch0'/>
      <model type='virtio'/>
      <driver name='vhost'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='vnc' port='-1' autoport='yes' listen='0.0.0.0'>
      <listen type='address' address='0.0.0.0'/>
    </graphics>
    <video>
      <model type='cirrus' vram='16384' heads='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </memballoon>
  </devices>
</domain>

3. The panic stack is
  ... cannot show
  async_page_fault+0x28
  ioread32_rep+0x38
  ata_sff_data_xfer32+0x8a
  ata_pio_sector+0x93
  ata_pio_sectors+0x34
  ata_sff_hsm_move+0x226
  RIP: kthread_data+0x10
  CR2: FFFFFFFF_FFFFFFD8

4. Change the host os to Redhat 7.2 , the vm is good even memory >=3.8T.

Revision history for this message
changlimin (changlimin) wrote :

I delete cdrom and IDE controller, the vm start sucessfully.

But when I increate memory to 1100G, vm hang at hpet_enable when start.

The panic is page_fault when execute hpet_period = hpet_readl(HPET_PERIOD);

It seems that ioremap_nocache does not works correctly.

hpet_virt_address = ioremap_nocache(hpet_address, HPET_MMAP_SIZE);

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Hi,

just to be sure, if you run

kvm -vnc :1 -m 1.5G
kvm -vnc :1 -m 1.5G --no-hpet

do those also crash?

Can you please show the contents of

/var/log/libvirt/qemu/u1510-1.log

affects: kvm (Ubuntu) → qemu (Ubuntu)
Changed in qemu (Ubuntu):
status: New → Incomplete
Revision history for this message
changlimin (changlimin) wrote :

I mean vm hang when memory >= 1100G (1024G when enable ide cdrom) instead of 1.5G.

If disable hpet, the vm will hang at acpi_ex_system_memory_space_handler when memory >= 1100G

If disable kvm, vm is good when memory <= 1020G, but also hang when memory >= 1024G.

There is no critical information in vm.log:

Revision history for this message
changlimin (changlimin) wrote :

After I changed PHYS_ADDR_MASK, qemu vm can start when memory >=1024G , but kvm vm still hang.

-# define PHYS_ADDR_MASK 0xffffffffffLL
+# define PHYS_ADDR_MASK 0xfffffffffffLL

Revision history for this message
changlimin (changlimin) wrote :

The issue is sloved after change cpuid[80000008];

--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -2547,7 +2547,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
         if (env->features[FEAT_8000_0001_EDX] & CPUID_EXT2_LM) {
             /* 64 bit processor */
 /* XXX: The physical address space is limited to 42 bits in exec.c. */
- *eax = 0x00003028; /* 48 bits virtual, 40 bits physical */
+ *eax = 0x00003029; /* 48 bits virtual, 41 bits physical */
         } else {
             if (env->features[FEAT_1_EDX] & CPUID_PSE36) {
                 *eax = 0x00000024; /* 36 bits physical */

Revision history for this message
changlimin (changlimin) wrote :

For cpus which have not EPT feature, should change CR3_L_MODE_RESERVED_BITS in arch/x86/include/asm/kvm_host.h:

-#define CR3_L_MODE_RESERVED_BITS 0xFFFFFF0000000000ULL
+#define CR3_L_MODE_RESERVED_BITS 0xFFFFFE0000000000ULL

Revision history for this message
Thomas Huth (th-huth) wrote :

Can you still reproduce this problem with the latest version of upstream QEMU?

Changed in qemu:
status: New → Incomplete
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for QEMU because there has been no activity for 60 days.]

Changed in qemu:
status: Incomplete → Expired
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for qemu (Ubuntu) because there has been no activity for 60 days.]

Changed in qemu (Ubuntu):
status: Incomplete → Expired
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

I only saw this because it expired now :-/

Anyone affected by this might want to take a look at bug 1776189 where Ubuntu added a special machine type to more easily set "host-phys-bits" which is the qemu flag to have more (usually the host has more) available (at the cost of migratability). That allows <1TB as the default bits in qemu are chosen on the base of TCG (to be able to emulate what is virtualized) and that is limited to 1TB.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.