Unable to create virtual machine with large amounts of memory / CPU
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
libvirt (Ubuntu) | Invalid | Undecided | Unassigned |
Bionic | Invalid | Undecided | Unassigned |
linux (Ubuntu) | Invalid | Undecided | Unassigned |
Bionic | Invalid | Undecided | Unassigned |
Bug Description
System: Intel SDP, Xeon(R) Gold 6252 CPU, 96 Core, 1708.5 GiB Memory (DCPMM Optane Memory)
Series: Bionic
Kernel: 4.15.0-51-generic #55
Arch: AMD64
Libvirt: 4.0.0-1ubuntu8.10
**Update**
Disco does not appear to be affected by this problem: using the same steps to reproduce, I was able to successfully create a virtual machine with 50 cores and 1 TB of memory. Logs are attached to this bug.
Problem:
We have discovered that virtual machines created via libvirt on Bionic fail to start when assigned 50 cores and 1 terabyte of memory.
The following tests were done using the disco cloud image, attempting to boot a disco VM with 50 cores and 1 TB of memory. The full console log is attached to this bug.
[ 15.229175] NET: Registered protocol family 10
[ 15.231941] Segment Routing with IPv6
[ 15.232523] NET: Registered protocol family 17
[ 15.233141] BUG: unable to handle kernel paging request at ffff9d35c5a16880
[ 15.233392] Key type dns_resolver registered
[ 15.235863] #PF error: [PROT] [WRITE] [RSVD]
[ 15.235863] PGD fcf1e05067 P4D fcf1e05067 PUD 10788a6c063 PMD 8000010785a000e3
[ 15.236967] Oops: 000b [#1] SMP PTI
[ 15.242373] CPU: 26 PID: 456 Comm: kworker/26:1 Not tainted 5.0.0-16-generic #17-Ubuntu
[ 15.242373] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[ 15.242373] Workqueue: ata_sff ata_sff_pio_task
[ 15.242373] RIP: 0010:ioread32_
[ 15.242373] Code: 48 8d 54 8e 04 8b 07 89 06 48 83 c6 04 48 39 d6 75 f3 5d c3 48 81 ff 00 00 01 00 76 27 0f b7 c7 89 c2 0f 1f 44 00 00 48 89 f7 <f3> 6d 5d c3 31 ff 48 85 c9 74 dd ed 89 04 be 48 83 c7 01 48 39 f9
[ 15.242373] RSP: 0000:ffffa9bbda
[ 15.255328] RAX: 00000000000001f0 RBX: 0000000000000200 RCX: 0000000000000080
[ 15.255328] RDX: 00000000000001f0 RSI: ffff9d35c5a16880 RDI: ffff9d35c5a16880
[ 15.255328] RBP: ffffa9bbda423d48 R08: 0000000000000000 R09: 006666735f617461
[ 15.255328] R10: 8080808080808080 R11: 0000000000000001 R12: ffff9d35c5a16880
[ 15.255328] R13: 00000000000101f0 R14: 0000000000000000 R15: ffff9d35c5a16368
[ 15.255328] FS: 000000000000000
[ 15.255328] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 15.255328] CR2: ffff9d35c5a16880 CR3: 000000fcf140e001 CR4: 00000000003606e0
[ 15.255328] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 15.255328] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 15.255328] Call Trace:
[ 15.255328] ata_sff_
[ 15.255328] ? __switch_
[ 15.255328] ata_pio_
[ 15.255328] ata_pio_
[ 15.255328] ata_sff_
[ 15.255328] ? __switch_
[ 15.255328] ? __switch_
[ 15.255328] ? __switch_
[ 15.255328] ? __switch_
[ 15.255328] ata_sff_
[ 15.255328] process_
[ 15.255328] worker_
[ 15.255328] kthread+0x120/0x140
[ 15.255328] ? process_
[ 15.255328] ? __kthread_
[ 15.255328] ret_from_
[ 15.255328] Modules linked in:
[ 15.255328] CR2: ffff9d35c5a16880
[ 15.255328] ---[ end trace 047af05ecf201244 ]---
[ 15.255328] RIP: 0010:ioread32_
[ 15.255328] Code: 48 8d 54 8e 04 8b 07 89 06 48 83 c6 04 48 39 d6 75 f3 5d c3 48 81 ff 00 00 01 00 76 27 0f b7 c7 89 c2 0f 1f 44 00 00 48 89 f7 <f3> 6d 5d c3 31 ff 48 85 c9 74 dd ed 89 04 be 48 83 c7 01 48 39 f9
[ 15.255328] RSP: 0000:ffffa9bbda
[ 15.283402] RAX: 00000000000001f0 RBX: 0000000000000200 RCX: 0000000000000080
[ 15.283402] RDX: 00000000000001f0 RSI: ffff9d35c5a16880 RDI: ffff9d35c5a16880
[ 15.283402] RBP: ffffa9bbda423d48 R08: 0000000000000000 R09: 006666735f617461
[ 15.283402] R10: 8080808080808080 R11: 0000000000000001 R12: ffff9d35c5a16880
[ 15.283402] R13: 00000000000101f0 R14: 0000000000000000 R15: ffff9d35c5a16368
[ 15.283402] FS: 000000000000000
[ 15.283402] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 15.283402] CR2: ffff9d35c5a16880 CR3: 000000fcf140e001 CR4: 00000000003606e0
[ 15.283402] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 15.283402] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
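The fault flags in the trace can be cross-checked against the numeric code on the "Oops: 000b" line. The following is a minimal sketch, not part of the original report: it decodes an x86 page-fault error code into flag names. The function name `decode_pf_error` is hypothetical, and the short labels are chosen to resemble the bracketed tags in the "#PF error" line.

```python
# Decode an x86 page-fault error code, e.g. the "000b" in "Oops: 000b".
# Bit meanings follow the x86 definition of the #PF error code; the short
# labels are illustrative and chosen to resemble the kernel's log tags.
PF_FLAGS = [
    (0x01, "PROT"),   # bit 0: fault on a present page (protection violation)
    (0x02, "WRITE"),  # bit 1: the access was a write
    (0x04, "USER"),   # bit 2: the access came from user mode
    (0x08, "RSVD"),   # bit 3: a reserved bit was set in a paging entry
    (0x10, "INSTR"),  # bit 4: the access was an instruction fetch
    (0x20, "PK"),     # bit 5: protection-key violation
]


def decode_pf_error(code_hex):
    """Return the flag names set in a hexadecimal page-fault error code."""
    code = int(code_hex, 16)
    return [name for mask, name in PF_FLAGS if code & mask]


if __name__ == "__main__":
    print(decode_pf_error("000b"))  # ['PROT', 'WRITE', 'RSVD']
```

For the code 000b this yields PROT, WRITE and RSVD, which agrees with the "#PF error: [PROT] [WRITE] [RSVD]" line in the trace above.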
Steps to Reproduce:
1.) Download latest cloud image.
2.) qemu-img convert -O qcow2 <Cloud-IMG>
3.) qemu-img resize <IMG.QCOW2> +5GB
4.) Generate Cloud Config to allow login
cat > config <<EOF
#cloud-config
password: ubuntu
chpasswd: { expire: False }
ssh_pwauth: True
EOF
5.) cloud-localds config.img config
6.) sudo virt-install --connect=
Alternatively, using the uvtool binaries, uvt-kvm yields a similar but not identical outcome.
Steps to Reproduce with uvtool:
1.) $uvt-simplestre
2.) $uvt-kvm create test release=disco arch=amd64 --memory 1096000 --cpu 50
The console shows that the VM boots slightly further, but then appears to hang at this point:
[ 18.085051] Magic number: 7:675:581
[ 18.086091] memory memory6696: hash matches
[ 18.086973] memory memory6015: hash matches
[ 18.087924] memory memory4574: hash matches
[ 18.088821] memory memory3738: hash matches
[ 18.089746] memory memory2647: hash matches
[ 18.090690] memory memory1361: hash matches
[ 18.091573] memory memory574: hash matches
[ 18.092660] rtc_cmos 00:00: setting system clock to 2019-06-05T12:34:17 UTC (1559738057)
[ 18.097307] Freeing unused decrypted memory: 2040K
[ 18.099181] Freeing unused kernel image memory: 2576K
[ 18.109790] Write protecting the kernel read-only data: 22528k
[ 18.112213] Freeing unused kernel image memory: 2016K
[ 18.114252] Freeing unused kernel image memory: 1852K
[ 18.135085] x86/mm: Checked W+X mappings: passed, no W+X pages found.
[ 18.136415] x86/mm: Checking user space page tables
[ 18.148461] x86/mm: Checked W+X mappings: passed, no W+X pages found.
[ 18.149546] Run /init as init process
Loading, please wait...
Starting version 240
No stack trace appears when using uvtool, unlike when creating the VM manually by downloading the cloud image and using virt-install.
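As a sanity check on the sizes involved, the sketch below converts the `--memory 1096000` value from the uvt-kvm step into larger units. It assumes the value is interpreted as MiB; that unit is an assumption here, not something stated in this report. The helper name `mib_to_units` is hypothetical.

```python
# Convert a memory size given in MiB into larger units for comparison.
# Assumption: uvt-kvm's --memory value is treated as MiB (not confirmed here).
MIB = 1024 ** 2


def mib_to_units(mib):
    """Return the size expressed in GiB, TiB (binary) and TB (decimal)."""
    total_bytes = mib * MIB
    return {
        "GiB": total_bytes / 1024 ** 3,
        "TiB": total_bytes / 1024 ** 4,
        "TB": total_bytes / 1000 ** 4,
    }


if __name__ == "__main__":
    sizes = mib_to_units(1096000)
    for unit, value in sizes.items():
        print(f"{value:.3f} {unit}")
```

Under that assumption the guest is assigned roughly 1070 GiB, which is slightly more than 1 TiB and well within the 1708.5 GiB reported for the host.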
---
ProblemType: Bug
AlsaDevices:
total 0
crw-rw---- 1 root audio 116, 1 Jun 5 14:43 seq
crw-rw---- 1 root audio 116, 33 Jun 5 14:43 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
ApportVersion: 2.20.9-0ubuntu7.6
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
DistroRelease: Ubuntu 18.04
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
Lsusb:
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 003: ID 0b1f:03e9 Insyde Software Corp.
Bus 001 Device 002: ID 0000:0001
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: Intel Corporation S2600WFD
Package: linux (not installed)
PciMultimedia:
ProcEnviron:
TERM=xterm-
PATH=(custom, no user)
XDG_RUNTIME_
LANG=C.UTF-8
SHELL=/bin/bash
ProcFB: 0 astdrmfb
ProcKernelCmdLine: BOOT_IMAGE=
ProcVersionSign
RelatedPackageV
linux-
linux-
linux-firmware 1.173.6
RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
Tags: bionic uec-images
Uname: Linux 4.15.0-51-generic x86_64
UnreportableReason: This report is about a package that is not installed.
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm audio cdrom dialout dip floppy libvirt lxd netdev plugdev sudo video
_MarkForUpload: False
dmi.bios.date: 02/27/2019
dmi.bios.vendor: Intel Corporation
dmi.bios.version: SE5C620.
dmi.board.
dmi.board.name: S2600WFD
dmi.board.vendor: Intel Corporation
dmi.board.version: J46732-610
dmi.chassis.
dmi.chassis.type: 23
dmi.chassis.vendor: .......
dmi.chassis.
dmi.modalias: dmi:bvnIntelCor
dmi.product.family: Family
dmi.product.name: S2600WFD
dmi.product.
dmi.sys.vendor: Intel Corporation
This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:
apport-collect 1831763
and then change the status of the bug to 'Confirmed'.
If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.
This change has been made by an automated script, maintained by the Ubuntu Kernel Team.