Heavy system slowdown with 4.8.0 kernel under XenServer

Bug #1637666 reported by Vihai
This bug affects 1 person
Affects          Status     Importance  Assigned to  Milestone
linux (Ubuntu)   Confirmed  High        Unassigned
Yakkety          Won't Fix  High        Unassigned

Bug Description

Hello,

I have been experiencing a noticeable system slowdown since I upgraded from Xenial to Yakkety on several VMs under XenServer 6.5 (fully patched).

I understand it will be difficult for me to describe what is happening and for you to troubleshoot this issue, but I'll do my best.

Just after booting, the machine is pretty responsive, but after some time it starts responding very slowly.

Bash autocompletion for file paths takes 1-2 seconds to respond, and commands such as "host" take hundreds of milliseconds to run while consuming hundreds of milliseconds of CPU time. Compiling is an order of magnitude slower, etc.

I wasn't able to detect what is causing the slowness. I ran strace on slow processes but found no evident cause. It does not seem to be I/O bound.

I ran "host" in an infinite loop. XenServer reports 100% CPU usage for the VM and so does the summary line in top, but the per-process CPU usage sums to under 10%:

top - 00:51:48 up 1 day, 4:49, 3 users, load average: 1.47, 1.02, 0.59
Tasks: 144 total, 2 running, 142 sleeping, 0 stopped, 0 zombie
%Cpu(s): 74.4 us, 24.4 sy, 0.0 ni, 1.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 4044600 total, 1719740 free, 843100 used, 1481760 buff/cache
KiB Swap: 2095100 total, 2095100 free, 0 used. 3139664 avail Mem

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
30968 yggdra 20 0 13460 696 616 R 5.8 0.0 0:00.18 host
28790 root 20 0 42020 3748 3104 R 0.6 0.1 0:02.47 top
[...]
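
For anyone comparing the two figures, the arithmetic can be sketched as follows. This is a minimal Python sketch and was not run on the affected VM: the function names `busy_percent` and `process_sum` are hypothetical, and the input strings are copied from the top output above, so only the two visible processes are counted (the rest are elided by "[...]").

```python
import re

# Summary line and visible process rows copied from the top output above.
CPU_LINE = "%Cpu(s): 74.4 us, 24.4 sy, 0.0 ni, 1.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st"
PROCESS_ROWS = [
    "30968 yggdra 20 0 13460 696 616 R 5.8 0.0 0:00.18 host",
    "28790 root 20 0 42020 3748 3104 R 0.6 0.1 0:02.47 top",
]


def busy_percent(cpu_line):
    """Return 100 minus the idle ('id') percentage from top's %Cpu(s) line."""
    pairs = re.findall(r"([\d.]+)\s+([a-z]{2})", cpu_line)
    fields = {name: float(value) for value, name in pairs}
    return 100.0 - fields["id"]


def process_sum(rows):
    """Sum the %CPU column (9th field) over the given process rows."""
    return sum(float(row.split()[8]) for row in rows)


if __name__ == "__main__":
    print(f"system-wide busy: {busy_percent(CPU_LINE):.1f}%")
    print(f"visible processes: {process_sum(PROCESS_ROWS):.1f}%")
```

With the figures above, the system-wide number is close to 99% busy while the listed processes account for well under 10%, which is the gap described in this report.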

I downgraded to the last Xenial kernel and the issue disappeared.

I now have two VMs, with the same configuration, both yakkety, one with kernel 4.8.0-26-generic and one with 4.4.0-45-generic, on the same XenServer host. Here is what I see running host:

-------- 4.4.0-45 ----------------
$ time host www.google.com
www.google.com has address 216.58.198.4
www.google.com has IPv6 address 2a00:1450:4002:801::2004

real 0m0.051s
user 0m0.032s
sys 0m0.016s

-------- 4.8.0-26 ---------------
$ time host www.google.com
www.google.com has address 216.58.198.4
www.google.com has IPv6 address 2a00:1450:4002:801::2004

real 0m0.412s
user 0m0.296s
sys 0m0.096s
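
The size of the slowdown can be read off the two runs above. The following is a minimal Python sketch of that arithmetic, not a benchmark: the names `to_seconds` and `slowdown` are hypothetical, the timings are copied from the output above, and a single run per kernel is of course a small sample.

```python
import re

# Timings copied from the two `time host www.google.com` runs above.
TIMES_4_4 = {"real": "0m0.051s", "user": "0m0.032s", "sys": "0m0.016s"}
TIMES_4_8 = {"real": "0m0.412s", "user": "0m0.296s", "sys": "0m0.096s"}


def to_seconds(stamp):
    """Convert bash `time` output such as '0m0.412s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time format: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def slowdown(old, new):
    """Return the new/old ratio for each of real, user and sys."""
    return {key: to_seconds(new[key]) / to_seconds(old[key]) for key in old}


if __name__ == "__main__":
    for key, ratio in slowdown(TIMES_4_4, TIMES_4_8).items():
        print(f"{key}: {ratio:.1f}x slower on 4.8.0-26")
```

On these numbers the ratio is roughly 8x for real, 9x for user and 6x for sys, so the extra time is spent in user space as well as in the kernel.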

I have other yakkety hosts on VMware and even on another XenServer 7.0 with no noticeable issues; it only seems to happen on this cluster.

In the next few days I will upgrade this cluster to XenServer 7.0 and update the bug to say whether the hypervisor version is a factor.

ProblemType: Bug
DistroRelease: Ubuntu 16.10
Package: linux-image-4.8.0-26-generic 4.8.0-26.28
ProcVersionSignature: Ubuntu 4.8.0-26.28-generic 4.8.0
Uname: Linux 4.8.0-26-generic x86_64
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Oct 27 20:02 seq
 crw-rw---- 1 root audio 116, 33 Oct 27 20:02 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.20.3-0ubuntu8
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
Date: Sat Oct 29 00:37:00 2016
HibernationDevice: RESUME=UUID=4ab77eb2-d96a-42d5-be31-b4d499646269
InstallationDate: Installed on 2011-09-19 (1866 days ago)
InstallationMedia: Ubuntu-Server 11.10 "Oneiric Ocelot" - Beta amd64 (20110901)
IwConfig:
 lo no wireless extensions.

 eth1 no wireless extensions.

 eth0 no wireless extensions.
JournalErrors:
 Error: command ['journalctl', '-b', '--priority=warning', '--lines=1000'] failed with exit code 1: Hint: You are currently not seeing messages from other users and the system.
       Users in the 'systemd-journal' group can see all messages. Pass -q to
       turn off this notice.
 No journal files were opened due to insufficient permissions.
Lsusb:
 Bus 001 Device 002: ID 0627:0001 Adomax Technology Co., Ltd
 Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
MachineType: Xen HVM domU
PciMultimedia:

ProcFB: 0 cirrusdrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.8.0-26-generic root=UUID=0c869ff8-8fe7-41b9-89f3-9da4882b4e87 ro init=/lib/systemd/systemd
RelatedPackageVersions:
 linux-restricted-modules-4.8.0-26-generic N/A
 linux-backports-modules-4.8.0-26-generic N/A
 linux-firmware 1.161
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: Upgraded to yakkety on 2016-10-14 (14 days ago)
WifiSyslog:

dmi.bios.date: 09/30/2016
dmi.bios.vendor: Xen
dmi.bios.version: 4.4.1-xs129783
dmi.chassis.type: 1
dmi.chassis.vendor: Xen
dmi.modalias: dmi:bvnXen:bvr4.4.1-xs129783:bd09/30/2016:svnXen:pnHVMdomU:pvr4.4.1-xs129783:cvnXen:ct1:cvr:
dmi.product.name: HVM domU
dmi.product.version: 4.4.1-xs129783
dmi.sys.vendor: Xen

Vihai (daniele-orlandi) wrote :
Brad Figg (brad-figg) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Daniel van Vugt (vanvugt) wrote :

Maybe related: bug 1638278

Daniel van Vugt (vanvugt) wrote :

Try this kernel (which fixes scheduling issues for other people having trouble on 4.8):

http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.9-rc3

Changed in linux (Ubuntu):
importance: Undecided → High
Changed in linux (Ubuntu Yakkety):
status: New → Confirmed
importance: Undecided → High
tags: added: kernel-da-key
Joseph Salisbury (jsalisbury) wrote :

Can you also give this kernel a test to see if it's the same sched bug:
http://kernel.ubuntu.com/~jsalisbury/lp1627108/upstream/

Vihai (daniele-orlandi) wrote :

Sorry for the delay but I needed physical access to the host box.

I upgraded XenServer to 7.0 (fully patched) and the problem persists.

I tried with kernel 4.9-rc3 and the issue is still there.

I reverted to 4.4.0-45 and the issue disappears.

Andy Whitcroft (apw) wrote : Closing unsupported series nomination.

This bug was nominated against a series that is no longer supported, i.e. yakkety. The bug task representing the yakkety nomination is being closed as Won't Fix.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu Yakkety):
status: Confirmed → Won't Fix