[ppc64el] Hugepages w/ split-core freezes cpu

Bug #1698846 reported by Rafael Folco
This bug affects 1 person
Affects        Status  Importance  Assigned to  Milestone
kvm (Ubuntu)   New     Undecided   Unassigned

Bug Description

With hugepages + split-core + NUMA + pinned CPUs, a simple apt-get install of multiple packages hangs with the CPU 100% busy. Killing the process and re-running the command works (the packages get installed). Installing a single package also works fine, so the hang only happens when multiple packages are installed in one command.

Reproduce
=========
- reboot host
- ppc64_cpu --smt=on
- ppc64_cpu --sub-cores-per-core=4
- ppc64_cpu --smt=off
- for i in 0 1 16 17; do echo 1300 > /sys/devices/system/node/node$i/hugepages/hugepages-16384kB/nr_hugepages; done
- for s in libvirt-bin nova-compute nova-api; do service $s restart; done
- create a VM with the following config:
...
  <vcpu placement='static'>8</vcpu>
  <cputune>
    <shares>8192</shares>
    <vcpupin vcpu='0' cpuset='56'/>
    <vcpupin vcpu='1' cpuset='58'/>
    <vcpupin vcpu='2' cpuset='60'/>
    <vcpupin vcpu='3' cpuset='62'/>
    <vcpupin vcpu='4' cpuset='72'/>
    <vcpupin vcpu='5' cpuset='74'/>
    <vcpupin vcpu='6' cpuset='76'/>
    <vcpupin vcpu='7' cpuset='78'/>
    <emulatorpin cpuset='56,58,60,62,72,74,76,78'/>
  </cputune>
  <numatune>
    <memory mode='strict' nodeset='16'/>
    <memnode cellid='0' mode='strict' nodeset='16'/>
  </numatune>
...
  <cpu mode='host-model'>
    <model fallback='allow'/>
    <topology sockets='1' cores='4' threads='2'/>
    <numa>
      <cell id='0' cpus='0-7' memory='10485760' unit='KiB' memAccess='shared'/>
    </numa>
  </cpu>
...
- apt-get install <pkg1> <pkg2> <pkg3>...
- cpu hangs, 100% busy
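
Writing to nr_hugepages can reserve fewer pages than requested when a node lacks enough contiguous free memory, so it may be worth confirming the reservation before starting the VM. A minimal sketch, assuming the same nodes (0, 1, 16, 17) and page size as the step above; check_hugepages is a hypothetical helper name, and the sysfs root is a parameter only so the function can be exercised against a scratch directory:

```shell
#!/bin/sh
# check_hugepages [SYSFS_NODE_ROOT] [WANTED_PAGES]
# Hypothetical helper: reports the 16 MiB hugepage reservation on each
# NUMA node used in the reproduce steps and returns non-zero if any node
# holds a different count than requested.
check_hugepages() {
    sysfs_root=${1:-/sys/devices/system/node}
    want=${2:-1300}
    rc=0
    for n in 0 1 16 17; do
        f="$sysfs_root/node$n/hugepages/hugepages-16384kB/nr_hugepages"
        # A missing file is reported rather than treated as zero pages.
        got=$(cat "$f" 2>/dev/null || echo missing)
        echo "node$n: requested=$want reserved=$got"
        [ "$got" = "$want" ] || rc=1
    done
    return "$rc"
}
```

Calling check_hugepages with no arguments right after the echo loop reads the live sysfs values; a non-zero return means at least one node did not get the full 1300 pages.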

apt-get
=======

sudo DEBIAN_FRONTEND=noninteractive http_proxy= https_proxy= no_proxy= apt-get --option Dpkg::Options::=--force-confold --assume-yes install apache2 apache2-dev bc bridge-utils bsdmainutils curl g++ gcc gettext git graphviz iputils-ping libapache2-mod-proxy-uwsgi libffi-dev libjpeg-dev libmysqlclient-dev libpq-dev libssl-dev libsystemd-dev libxml2-dev libxslt1-dev libyaml-dev lsof openssh-server openssl pkg-config psmisc python2.7 python-dev python-gdbm screen tar tcpdump unzip uuid-runtime wget wget zlib1g-dev lvm2 open-iscsi qemu-utils thin-provisioning-tools lvm2 open-iscsi qemu-utils thin-provisioning-tools dstat libkrb5-dev libldap2-dev libsasl2-dev memcached python-mysqldb sqlite3 conntrack curl dnsmasq-base dnsmasq-utils ebtables gawk genisoimage iptables iputils-arping kpartx libjs-jquery-tablesorter libmysqlclient-dev parted pm-utils python-mysqldb socat sqlite3 sudo vlan cryptsetup genisoimage gir1.2-libosinfo-1.0 netcat-openbsd open-iscsi qemu-utils sg3-utils sysfsutils ipset acl dnsmasq-base ebtables haproxy iptables iputils-arping iputils-ping libmysqlclient-dev postgresql-server-dev-all python-mysqldb sqlite3 sudo vlan conntrack conntrackd keepalived curl liberasurecode-dev make memcached sqlite3 xfsprogs
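
For unattended runs, the hang can be detected instead of waiting on it by wrapping the command in coreutils timeout, which exits with status 124 when the time limit expires. This is only a sketch: run_or_report_hang is a hypothetical name, and any limit passed to it is an assumption rather than a measured install time.

```shell
#!/bin/sh
# run_or_report_hang SECONDS CMD [ARGS...]
# Hypothetical wrapper: runs CMD under timeout(1) and prints a message
# when the limit is hit (timeout exits with 124 in that case).
run_or_report_hang() {
    limit=$1
    shift
    timeout "$limit" "$@"
    rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "HANG: '$*' still running after ${limit}s" >&2
    fi
    return "$rc"
}
```

For example, `run_or_report_hang 600 sudo apt-get --assume-yes install fakeroot make openvswitch-switch` would give the install ten minutes (an arbitrary choice) before flagging it as hung.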

top
===

 PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
13231 root 20 0 40192 33792 6848 R 100.0 0.3 1:02.68 apt-get
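
top shows that the process is busy but not whether the time is spent in user or kernel mode. One way to tell is to sample the utime and stime counters (fields 14 and 15 of /proc/PID/stat, in clock ticks) twice and see which one grows. A minimal sketch; cpu_ticks is a hypothetical helper name:

```shell
#!/bin/sh
# cpu_ticks PID
# Hypothetical helper: prints the user-mode and kernel-mode CPU time
# (clock ticks) accumulated by PID, read from /proc/PID/stat.
cpu_ticks() {
    pid=$1
    stat=$(cat "/proc/$pid/stat") || return 1
    # Drop "pid (comm) " so that a command name containing spaces cannot
    # shift the fields; what remains starts at field 3 (state).
    rest=${stat##*") "}
    set -- $rest
    # Field 14 (utime) is now $12 and field 15 (stime) is now $13.
    echo "utime=${12} stime=${13}"
}
```

Running `cpu_ticks 13231; sleep 5; cpu_ticks 13231` against the apt-get PID from the top output above would show which counter advances while the process spins.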

strace
======
... <truncated>
                    = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=7883, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
close(16) = 0
waitpid(7883, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0) = 7883
stat("/var/lib/dpkg/status", {st_mode=S_IFREG|0644, st_size=818492, ...}) = 0
stat("/var/lib/dpkg/status", {st_mode=S_IFREG|0644, st_size=818492, ...}) = 0
open("/var/lib/dpkg/status", O_RDONLY) = 16
fcntl(16, F_SETFD, FD_CLOEXEC) = 0
read(16, "Package: python-apt-common\nStatus: install ok installed\nPriority: optional\nSection: python\nInstalled-Size: 244\nMaintainer: Ubuntu Developers <email address hidden>\nArchitecture: all\nSource: python-apt\nVersion: 1.1.0~beta1build1\nReplaces: python-apt (<< 0.7.98+nmu1)\nDepends: python | python3\nBreaks: python-apt (<< 0.7.98+nmu1)\nEnhances: python-apt, python3-apt\nDescription: Python interface to libapt-pkg (locales)\n The apt_pkg Python interface will provide full access to the internal\n libapt-pkg structures allowing Python programs to easily perform a\n variety of functions.\n .\n This package contains locales.\nOriginal-Maintainer: APT Development Team <email address hidden>\n\nPackage: node-lockfile\nStatus: install ok installed\nPriority: extra\nSection: web\nInstalled-Size: 43\nMaintainer: Ubuntu Developers <email address hidden>\nArchitecture: all\nVersion: 0.4.1-1\nDepends: nodejs\nDescription: Asynchronous file lock module for Node.js\n This module provides asynchronous and synchronous concurrent file\n locking. It supports timeouts, expirations, and retrying upon failure.\n .\n Node.js is an event-based server-side javascript engine.\nOriginal-Maintainer: Debian Javascript Maintainers <email address hidden>\nHomepage: https://github.com/isaacs/lockfile\n\nPackage: libgtk2.0-bin\nStatus: install ok installed\nPriority: optional\nSection: misc\nInstalled-Size: 80\nMaintainer: Ubuntu Desktop Team <email address hidden>\nArchitecture: ppc64el\nMulti-Arch: foreign\nSource: gtk+2.0\nVersion: 2.24.30-1ubuntu1\nDepends: libgtk2.0-0 (= 2.24.30-1ubuntu1), libgtk2.0-common\nDescription: programs for the GTK+ graphical user interface library\n GTK+ is a multi-platform toolkit for creating graphical user\n interfaces. Offering a complete set of widgets, GTK+ is suitable\n for projects ranging from small one-off tools to complete application\n suites.\n .\n This package contains the utilities which are used by the libraries\n and other packages.\nHomepage: http://www.gtk.org/\nOriginal-Maintain"..., 32771) = 32771
fstat(16, {st_mode=S_IFREG|0644, st_size=818492, ...}) = 0
fstat(16, {st_mode=S_IFREG|0644, st_size=818492, ...}) = 0

libvirt xml
===========

<domain type='kvm' id='13'>
  <name>instance-0002fd88</name>
  <uuid>36c7dc00-b230-4cb2-a28d-43f81ba5fa80</uuid>
  <metadata>
    <nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.0">
      <nova:package version="13.1.2"/>
      <nova:name>devstack-xenial-smt-ppc64-smt-151554</nova:name>
      <nova:creationTime>2017-06-14 16:41:33</nova:creationTime>
      <nova:flavor name="numa-hugepages">
        <nova:memory>10240</nova:memory>
        <nova:disk>80</nova:disk>
        <nova:swap>10240</nova:swap>
        <nova:ephemeral>0</nova:ephemeral>
        <nova:vcpus>8</nova:vcpus>
      </nova:flavor>
      <nova:owner>
        <nova:user uuid="d17cb3d4398f4c8c84f96d78e3ff50e8">pkvmci</nova:user>
        <nova:project uuid="7b62ae2fea4545a2b6b68c9024aafbf1">pkvmci</nova:project>
      </nova:owner>
      <nova:root type="image" uuid="6b6f077d-1860-4864-838d-84ea1085afee"/>
    </nova:instance>
  </metadata>
  <memory unit='KiB'>10485760</memory>
  <currentMemory unit='KiB'>10485760</currentMemory>
  <memoryBacking>
    <hugepages>
      <page size='16384' unit='KiB' nodeset='0'/>
    </hugepages>
  </memoryBacking>
  <vcpu placement='static'>8</vcpu>
  <cputune>
    <shares>8192</shares>
    <vcpupin vcpu='0' cpuset='56'/>
    <vcpupin vcpu='1' cpuset='58'/>
    <vcpupin vcpu='2' cpuset='60'/>
    <vcpupin vcpu='3' cpuset='62'/>
    <vcpupin vcpu='4' cpuset='72'/>
    <vcpupin vcpu='5' cpuset='74'/>
    <vcpupin vcpu='6' cpuset='76'/>
    <vcpupin vcpu='7' cpuset='78'/>
    <emulatorpin cpuset='56,58,60,62,72,74,76,78'/>
  </cputune>
  <numatune>
    <memory mode='strict' nodeset='16'/>
    <memnode cellid='0' mode='strict' nodeset='16'/>
  </numatune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='ppc64le' machine='pseries-2.6'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu mode='host-model'>
    <model fallback='allow'/>
    <topology sockets='1' cores='4' threads='2'/>
    <numa>
      <cell id='0' cpus='0-7' memory='10485760' unit='KiB' memAccess='shared'/>
    </numa>
  </cpu>
  <clock offset='utc'>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='rtc' tickpolicy='catchup'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/bin/kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='writeback'/>
      <source file='/raid-data/instances/36c7dc00-b230-4cb2-a28d-43f81ba5fa80/disk'/>
      <backingStore type='file' index='1'>
        <format type='raw'/>
        <source file='/raid-data/instances/_base/8b0287808ddb3011d589c92ff4e9474a4efa4700'/>
        <backingStore/>
      </backingStore>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='writeback'/>
      <source file='/raid-data/instances/36c7dc00-b230-4cb2-a28d-43f81ba5fa80/disk.swap'/>
      <backingStore type='file' index='1'>
        <format type='raw'/>
        <source file='/raid-data/instances/_base/swap_10240'/>
        <backingStore/>
      </backingStore>
      <target dev='vdb' bus='virtio'/>
      <alias name='virtio-disk1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw' cache='writeback'/>
      <source file='/raid-data/instances/36c7dc00-b230-4cb2-a28d-43f81ba5fa80/disk.config'/>
      <backingStore/>
      <target dev='sdz' bus='scsi'/>
      <readonly/>
      <alias name='scsi3-0-0-4'/>
      <address type='drive' controller='3' bus='0' target='0' unit='4'/>
    </disk>
    <controller type='usb' index='0'>
      <alias name='usb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'>
      <alias name='pci.0'/>
    </controller>
    <controller type='scsi' index='0'>
      <alias name='scsi0'/>
      <address type='spapr-vio' reg='0x2000'/>
    </controller>
    <controller type='scsi' index='1'>
      <alias name='scsi1'/>
      <address type='spapr-vio' reg='0x3000'/>
    </controller>
    <controller type='scsi' index='2'>
      <alias name='scsi2'/>
      <address type='spapr-vio' reg='0x4000'/>
    </controller>
    <controller type='scsi' index='3'>
      <alias name='scsi3'/>
      <address type='spapr-vio' reg='0x5000'/>
    </controller>
    <interface type='bridge'>
      <mac address='fa:16:3e:10:d5:b4'/>
      <source bridge='brq47fb4423-56'/>
      <target dev='tapaf9bc01f-18'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
    </interface>
    <serial type='file'>
      <source path='/raid-data/instances/36c7dc00-b230-4cb2-a28d-43f81ba5fa80/console.log'/>
      <target port='0'/>
      <alias name='serial0'/>
      <address type='spapr-vio' reg='0x30000000'/>
    </serial>
    <serial type='pty'>
      <source path='/dev/pts/6'/>
      <target port='1'/>
      <alias name='serial1'/>
      <address type='spapr-vio' reg='0x30001000'/>
    </serial>
    <console type='file'>
      <source path='/raid-data/instances/36c7dc00-b230-4cb2-a28d-43f81ba5fa80/console.log'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
      <address type='spapr-vio' reg='0x30000000'/>
    </console>
    <input type='tablet' bus='usb'>
      <alias name='input0'/>
    </input>
    <input type='keyboard' bus='usb'>
      <alias name='input1'/>
    </input>
    <input type='mouse' bus='usb'>
      <alias name='input2'/>
    </input>
    <graphics type='vnc' port='5905' autoport='yes' listen='0.0.0.0' keymap='en-us'>
      <listen type='address' address='0.0.0.0'/>
    </graphics>
    <video>
      <model type='vga' vram='16384' heads='1'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <stats period='10'/>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </memballoon>
    <panic model='pseries'/>
  </devices>
</domain>
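
Since numatune binds guest memory to host node 16, it helps to check that every pinned host CPU also belongs to that node. The sketch below only extracts the vcpu-to-host-CPU mapping from a domain XML dump on stdin; list_vcpu_pins is a hypothetical helper name, and it assumes one vcpupin element per line with single-quoted attributes, as in the dump above:

```shell
#!/bin/sh
# list_vcpu_pins < domain.xml
# Hypothetical helper: prints one line per <vcpupin> element found on
# stdin. <emulatorpin> lines are ignored because they do not match.
list_vcpu_pins() {
    sed -n "s/.*<vcpupin vcpu='\([0-9]*\)' cpuset='\([0-9,-]*\)'.*/vcpu \1 -> host cpu \2/p"
}
```

On the host, `virsh dumpxml instance-0002fd88 | list_vcpu_pins` could then be compared by eye against `cat /sys/devices/system/node/node16/cpulist`. Whether the pinned CPUs (56-62, 72-78) fall on node 16 with split-core enabled is not stated in this report.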

More apt-get install examples
=============================

sudo DEBIAN_FRONTEND=noninteractive http_proxy= https_proxy= no_proxy= apt-get --option Dpkg::Options::=--force-confold --assume-yes install libvirt-bin libvirt-dev

Processing triggers for libc-bin (2.23-0ubuntu7) ...
Processing triggers for systemd (229-4ubuntu17) ...

^C

$ sudo DEBIAN_FRONTEND=noninteractive http_proxy= https_proxy= no_proxy= apt-get --option Dpkg::Options::=--force-confold --assume-yes install fakeroot make openvswitch-switch
Reading package lists... 0%

^C
rfolco@devstack-xenial-smt-ppc64-smt-151554:~$ sudo DEBIAN_FRONTEND=noninteractive http_proxy= https_proxy= no_proxy= apt-get --option Dpkg::Options::=--force-confold --assume-yes install fakeroot make openvswitch-switch
Reading package lists... 0%

c^C
rfolco@devstack-xenial-smt-ppc64-smt-151554:~$ ^C
rfolco@devstack-xenial-smt-ppc64-smt-151554:~$ ^C
rfolco@devstack-xenial-smt-ppc64-smt-151554:~$ sudo DEBIAN_FRONTEND=noninteractive http_proxy= https_proxy= no_proxy= apt-get --option Dpkg::Options::=--force-confold --assume-yes install fakeroot make openvswitch-switch
Reading package lists... 0%
