virsh start of virtual guest domain fails with internal error due to low default aio-max-nr sysctl value

Bug #1717224 reported by bugproxy
Affects / Status / Importance / Assigned to:
- Ubuntu on IBM z Systems: Fix Released, Medium, Canonical Server
- kvm (Ubuntu): Won't Fix, Undecided, Skipper Bug Screeners
- linux (Ubuntu): Fix Released, Medium, Joseph Salisbury
  - Xenial: Fix Released, Medium, Joseph Salisbury
  - Artful: Fix Released, Medium, Joseph Salisbury
- procps (Ubuntu): Won't Fix, Undecided, Unassigned
  - Xenial: Won't Fix, Undecided, Unassigned
  - Artful: Won't Fix, Undecided, Unassigned

Bug Description

Starting virtual guests via virsh on Ubuntu 16.04.2 LTS, installed with its KVM hypervisor in an LPAR on an IBM z14 system, fails on the 18th guest with the following error:

root@zm93k8:/rawimages/ubu1604qcow2# virsh start zs93kag70038
error: Failed to start domain zs93kag70038
error: internal error: process exited while connecting to monitor: 2017-07-26T01:48:26.352534Z qemu-kvm: -drive file=/guestimages/data1/zs93kag70038.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none,aio=native: Could not open backing file: Could not set AIO state: Inappropriate ioctl for device

The previous 17 guests started fine:

root@zm93k8# virsh start zs93kag70020
Domain zs93kag70020 started

root@zm93k8# virsh start zs93kag70021
Domain zs93kag70021 started

.
.

root@zm93k8:/rawimages/ubu1604qcow2# virsh start zs93kag70036
Domain zs93kag70036 started

We ended up fixing the issue by adding the following line to /etc/sysctl.conf :

fs.aio-max-nr = 4194304

... then, reload the sysctl config file:

root@zm93k8:/etc# sysctl -p /etc/sysctl.conf
fs.aio-max-nr = 4194304

Now, we're able to start more guests...

root@zm93k8:/etc# virsh start zs93kag70036
Domain zs93kag70036 started

The default value was originally set to 65536:

root@zm93k8:/rawimages/ubu1604qcow2# cat /proc/sys/fs/aio-max-nr
65536

Note: we chose the value 4194304 because this is what our KVM on System z hypervisor ships as its default value, e.g. on our zKVM system:

[root@zs93ka ~]# cat /proc/sys/fs/aio-max-nr
4194304
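For a persistent setting, the same line can also live in a drop-in file under /etc/sysctl.d/ rather than in /etc/sysctl.conf itself; a sketch (the file name below is illustrative, not a packaged default):

```
# /etc/sysctl.d/60-aio-max-nr.conf  (illustrative name)
fs.aio-max-nr = 4194304
```

After creating the file, `sudo sysctl --system` reloads every sysctl configuration file, and /proc/sys/fs/aio-max-nr should then report the new value.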

ubuntu@zm93k8:/etc$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.2 LTS
Release: 16.04
Codename: xenial
ubuntu@zm93k8:/etc$

ubuntu@zm93k8:/etc$ dpkg -s qemu-kvm |grep Version
Version: 1:2.5+dfsg-5ubuntu10.8

Is anything already documented for Ubuntu KVM users warning them about the low default value, with guidance on how to select an appropriate one? Also, would you consider raising the default aio-max-nr to something much higher, to accommodate significantly more virtual guests?

Thanks!

---uname output---
ubuntu@zm93k8:/etc$ uname -a
Linux zm93k8 4.4.0-62-generic #83-Ubuntu SMP Wed Jan 18 14:12:54 UTC 2017 s390x s390x s390x GNU/Linux

Machine Type = z14

---Debugger---
A debugger is not configured

---Steps to Reproduce---
 See Problem Description.

The problem occurred a week ago, so this data may not reflect that activity.

This file was collected on Aug 7, one week after we were hitting the problem. If I need to reproduce the problem and get fresh data, please let me know.

/var/log/messages doesn't exist on this system, so I provided syslog output instead.

All data were collected over a week after the problem was observed. If you need me to reproduce the problem and get new data, please let me know. That's not a problem.

Also, we would have to make special arrangements for login access to these systems. I'm happy to run traces and data collection for you as needed. If that's not sufficient, then we'll explore log in access for you.

Thanks... - Scott G.

I was able to successfully recreate the problem and captured / attached new debug docs.

Recreate procedure:

# Started out with no virtual guests running.

ubuntu@zm93k8:/home/scottg$ virsh list
 Id Name State
----------------------------------------------------

# Set fs.aio-max-nr back to original Ubuntu "out of the box" value in /etc/sysctl.conf

ubuntu@zm93k8:~$ tail -1 /etc/sysctl.conf
fs.aio-max-nr = 65536

## Before reloading, sysctl -a still shows the old runtime value:

fs.aio-max-nr = 4194304

## Reload sysctl.

ubuntu@zm93k8:~$ sudo sysctl -p /etc/sysctl.conf
fs.aio-max-nr = 65536
ubuntu@zm93k8:~$

ubuntu@zm93k8:~$ sudo sysctl -a |grep fs.aio-max-nr
fs.aio-max-nr = 65536

ubuntu@zm93k8:~$ cat /proc/sys/fs/aio-max-nr
65536

# Attempt to start more than 17 qcow2 virtual guests on the Ubuntu host. Fails on the 18th guest.

Script used to start guests:

ubuntu@zm93k8:/home/scottg$ date;./start_privs.sh
Wed Aug 23 13:21:25 EDT 2017
virsh start zs93kag70015
Domain zs93kag70015 started

Started zs93kag70015 successfully ...

virsh start zs93kag70020
Domain zs93kag70020 started

Started zs93kag70020 successfully ...

virsh start zs93kag70021
Domain zs93kag70021 started

Started zs93kag70021 successfully ...

virsh start zs93kag70022
Domain zs93kag70022 started

Started zs93kag70022 successfully ...

virsh start zs93kag70023
Domain zs93kag70023 started

Started zs93kag70023 successfully ...

virsh start zs93kag70024
Domain zs93kag70024 started

Started zs93kag70024 successfully ...

virsh start zs93kag70025
Domain zs93kag70025 started

Started zs93kag70025 successfully ...

virsh start zs93kag70026
Domain zs93kag70026 started

Started zs93kag70026 successfully ...

virsh start zs93kag70027
Domain zs93kag70027 started

Started zs93kag70027 successfully ...

virsh start zs93kag70028
Domain zs93kag70028 started

Started zs93kag70028 successfully ...

virsh start zs93kag70029
Domain zs93kag70029 started

Started zs93kag70029 successfully ...

virsh start zs93kag70030
Domain zs93kag70030 started

Started zs93kag70030 successfully ...

virsh start zs93kag70031
Domain zs93kag70031 started

Started zs93kag70031 successfully ...

virsh start zs93kag70032
Domain zs93kag70032 started

Started zs93kag70032 successfully ...

virsh start zs93kag70033
Domain zs93kag70033 started

Started zs93kag70033 successfully ...

virsh start zs93kag70034
Domain zs93kag70034 started

Started zs93kag70034 successfully ...

virsh start zs93kag70035
Domain zs93kag70035 started

Started zs93kag70035 successfully ...

virsh start zs93kag70036
error: Failed to start domain zs93kag70036
error: internal error: process exited while connecting to monitor: 2017-08-23T17:21:47.131809Z qemu-kvm: -drive file=/guestimages/data1/zs93kag70036.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none,aio=native: Could not open backing file: Could not set AIO state: Inappropriate ioctl for device

Exiting script ... start zs93kag70036 failed
ubuntu@zm93k8:/home/scottg$

# Show that there are only 17 running guests.

ubuntu@zm93k8:/home/scottg$ virsh list |grep run |wc -l
17

ubuntu@zm93k8:/home/scottg$ virsh list
 Id Name State
----------------------------------------------------
 25 zs93kag70015 running
 26 zs93kag70020 running
 27 zs93kag70021 running
 28 zs93kag70022 running
 29 zs93kag70023 running
 30 zs93kag70024 running
 31 zs93kag70025 running
 32 zs93kag70026 running
 33 zs93kag70027 running
 34 zs93kag70028 running
 35 zs93kag70029 running
 36 zs93kag70030 running
 37 zs93kag70031 running
 38 zs93kag70032 running
 39 zs93kag70033 running
 40 zs93kag70034 running
 41 zs93kag70035 running

# For fun, try starting zs93kag70036 again manually.

ubuntu@zm93k8:/home/scottg$ date;virsh start zs93kag70036
Wed Aug 23 13:27:28 EDT 2017
error: Failed to start domain zs93kag70036
error: internal error: process exited while connecting to monitor: 2017-08-23T17:27:30.031782Z qemu-kvm: -drive file=/guestimages/data1/zs93kag70036.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none,aio=native: Could not open backing file: Could not set AIO state: Inappropriate ioctl for device

# Show the XML (they're all basically the same)...

ubuntu@zm93k8:/home/scottg$ cat zs93kag70036.xml
<domain type='kvm'>
  <name>zs93kag70036</name>
  <memory unit='MiB'>4096</memory>
  <currentMemory unit='MiB'>2048</currentMemory>
  <vcpu placement='static'>2</vcpu>
  <os>
    <type arch='s390x' machine='s390-ccw-virtio'>hvm</type>
  </os>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>preserve</on_crash>
  <devices>
    <emulator>/usr/bin/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none' io='native'/>
      <source file='/guestimages/data1/zs93kag70036.qcow2'/>
      <target dev='vda' bus='virtio'/>
      <address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0000'/>
      <boot order='1'/>
    </disk>
    <interface type='network'>
      <source network='privnet1'/>
      <model type='virtio'/>
      <mac address='52:54:00:70:d0:36'/>
      <address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0001'/>
    </interface>
<!--
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source dev='/dev/disk/by-id/dm-uuid-mpath-36005076802810e5540000000000006e4'/>
      <target dev='vde' bus='virtio'/>
      <address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0005'/>
      <readonly/>
    </disk>
-->
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native'/>
      <source file='/guestimages/data1/zs93kag70036.prm'/>
      <target dev='vdf' bus='virtio'/>
      <address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0006'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/guestimages/data1/zs93kag70036.iso'/>
      <target dev='sda' bus='scsi'/>
      <readonly/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <controller type='usb' index='0' model='none'/>
    <memballoon model='none'/>
    <console type='pty'>
      <target type='sclp' port='0'/>
    </console>
  </devices>
</domain>

This condition is very easy to replicate. However, we may be losing this system in the next day or two, so please let me know ASAP if you need any more data. Thank you...

- Scott G.

== Comment: #11 - Viktor Mihajlovski <email address hidden> - 2017-09-14
In order to support many KVM guests it is advisable to raise aio-max-nr as suggested in the problem description; see also http://kvmonz.blogspot.co.uk/p/blog-page_7.html. I would also suggest that the system default setting be increased.

Revision history for this message
bugproxy (bugproxy) wrote : dmesg output from zm93k8

Default Comment by Bridge

tags: added: architecture-s39064 bugnameltc-157241 severity-medium targetmilestone-inin16042
Revision history for this message
bugproxy (bugproxy) wrote : sosreport captured Aug 7 2017

Default Comment by Bridge

Revision history for this message
bugproxy (bugproxy) wrote : var/log/syslog collected Aug 7 2017

Default Comment by Bridge

Revision history for this message
bugproxy (bugproxy) wrote : New dmesg output captured Aug 23

Default Comment by Bridge

Revision history for this message
bugproxy (bugproxy) wrote : New sosreport captured Aug 23

Default Comment by Bridge

Revision history for this message
bugproxy (bugproxy) wrote : New syslog captured Aug 23

Default Comment by Bridge

Changed in ubuntu:
assignee: nobody → Skipper Bug Screeners (skipper-screen-team)
affects: ubuntu → linux (Ubuntu)
affects: linux (Ubuntu) → kvm (Ubuntu)
Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
assignee: nobody → Canonical Server Team (canonical-server)
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Hi,
I think this is more a question of "is it time to rethink the default fs.aio-max-nr?" than a qemu problem.
In general, pick any limit and you will be able to create a testcase breaking it.
Are all of those bugs? Do we need to set all limits to unlimited? I'm sure you agree not.

Anyway, I agree with Viktor that "I would also suggest that the system default setting is increased", as this could be hit by many other aio workloads.
But that also means it has to be a broader discussion.

Kernel default:
fs/aio.c:194:unsigned long aio_max_nr = 0x10000; /* system wide maximum number of aio requests */

Options:
1. minor kernel patch as ubuntu delta
2. a file in /etc/sysctl.d/ in package procps
3. a file in /etc/sysctl.d/ in another package
4. deny

As it is a kernel limit and a kernel default value, I'll add a kernel and a procps task for discussion with the Kernel and Foundations teams. They may also have experience with these "limit raising" cases and how they were handled.

Related docs:
https://www.kernel.org/doc/Documentation/sysctl/fs.txt
http://man7.org/linux/man-pages/man2/io_setup.2.html

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

OTOH I wonder to some extent how you exceeded the 65k with "just" 18 guests.
Not strictly required, but it might be very interesting, before your system becomes unavailable, to track:
  $ sysctl fs.aio-nr
while starting the guests. How much gets added per guest, and does it settle down after the guest is done with its initial ramp-up?
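That sampling could be scripted along these lines; a sketch only (the guest names are illustrative, and the virsh call is commented out so the loop is safe to run anywhere; on hosts without /proc/sys/fs/aio-nr it prints 0):

```shell
#!/bin/sh
# Print fs.aio-nr after each guest start to see the per-guest increment.
prev=0
for g in zs93kag70020 zs93kag70021 zs93kag70022; do
    # virsh start "$g"    # uncomment on the actual KVM host
    cur=$(cat /proc/sys/fs/aio-nr 2>/dev/null || echo 0)
    echo "$g: fs.aio-nr=$cur (delta $((cur - prev)))"
    prev=$cur
done
```

If the delta per guest is stable, multiplying it by the planned guest count gives a rough lower bound for fs.aio-max-nr.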

Changed in kvm (Ubuntu):
status: New → Confirmed
Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
status: New → Confirmed
Revision history for this message
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2017-09-14 10:08 EDT-------
(In reply to comment #17)
> OTOH I wonder to some extend how you exceded the 65k with "just" 18 guests.
> Not strictly required, but it might be very interesting before your system
> becomes unavailable to track:
> $ sysctl fs.aio-nr
> while starting the guests. How much get added per guest, would the settle
> down after the guest is done with initial rampup.

Thank you all for investigating this issue...

Unfortunately, my Ubuntu KVM hosts have been shutdown temporarily to use those LPARs for some other testing. I'll provide the fs-aio-nr data just as soon as those resources free up again.

Revision history for this message
Dimitri John Ledkov (xnox) wrote :

Any updates? Have you been able to monitor fs.aio-nr exhaustion?

Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-10-16 11:27 EDT-------
(In reply to comment #19)
> Any updates? Have you been able to monitor fs.aio-nr exhaustion?

Sorry, no. Is this data critical for debug? I would have to disrupt our current configuration in order to bring up the Ubuntu KVM host. If so, I'll see what I can do. Thanks...

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Yeah, such a change needs some reasoning, and since (at least to me) it is not yet clear how so few guests consume so much aio resource, we need to check that.

If you can provide:
1. the increase per guest spawned
2. do all your guests have more or less the same XML, or do some have more disks or more iothreads?

Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-10-17 07:09 EDT-------
FWIW, I checked on my system and aio-nr only increases for disks with io='native' (libvirt) or aio='native' (qemu command line).

QEMU does an io_setup of 128 events per disk, but with current kernels this increases aio-nr by 2048 on my system.

Looks like we need

commit 2a8a98673c13cb2a61a6476153acf8344adfa992
Author: Mauricio Faria de Oliveira <email address hidden>
AuthorDate: Wed Jul 5 10:53:16 2017 -0300
Commit: Benjamin LaHaise <email address hidden>
CommitDate: Thu Sep 7 12:28:28 2017 -0400

fs: aio: fix the increment of aio-nr and counting against aio-max-nr

to fix the accounting.

This will still result in a limit of 64k/128 = 512 disks for ALL guests if nothing else uses aio contexts.
Since aio contexts do not preallocate any resources, what about
- applying the above fix
- increasing the default to 128k to have enough capacity.
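The arithmetic behind those numbers can be sketched as follows (128 and 2048 are the per-disk figures quoted in the comment above; treat them as observed on that system, not as fixed constants):

```shell
#!/bin/sh
# With fs.aio-max-nr at the old default of 65536 (0x10000):
max=65536
requested=128   # io_setup() size QEMU asks for per aio=native disk
charged=2048    # what an unfixed kernel actually counts against fs.aio-nr
echo "disks before the accounting fix: $((max / charged))"    # 32
echo "disks after the accounting fix:  $((max / requested))"  # 512
```

32 disks at two aio=native disks per guest lines up with the observed failure on the 18th guest.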

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Thanks for checking cborntra!

The referred patch is in 4.14-rc1.
I'm not so sure on backports of this, but at least for the Artful and HWE kernel of 4.13 this would be good to have.

@Kernel Team - could you take a look and consider 2a8a9867 for our 4.13 kernels?

@Xnox - changing the global default is a Foundations thing, would you support the 128k change and would you ask in the Team for opinions?

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

I'll build an Artful test kernel with commit 2a8a98673c13 and post it shortly.

Changed in linux (Ubuntu):
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Joseph Salisbury (jsalisbury)
Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

Actually, Artful already has commit 2a8a98673c13. It was pulled in for bug 1718397. It looks like Xenial and Zesty will also get that commit via that bug.

Can you test Artful and/or X/Y -proposed to see if this bug is resolved with that commit?

Changed in linux (Ubuntu Zesty):
assignee: nobody → Joseph Salisbury (jsalisbury)
importance: Undecided → Medium
status: New → In Progress
Changed in linux (Ubuntu Xenial):
assignee: nobody → Joseph Salisbury (jsalisbury)
importance: Undecided → Medium
status: New → In Progress
Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
status: Confirmed → In Progress
Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-10-17 18:34 EDT-------
Hi folks.

Good news! We got a test window on the Ubuntu KVM host today.

We provisioned a collection of 24 new virtual Ubuntu guests for this test. Each virtual domain uses a single qcow2 virtual boot volume. All guests are configured exactly the same (except guests zs93kag100080, zs93kag100081 and zs93kag100082 are on a macvtap interface. Otherwise, identical.).

Here's a sample of one (running) guest's XML:

ubuntu@zm93k8:/home/scottg$ virsh dumpxml zs93kag100080
<domain type='kvm' id='65'>
  <name>zs93kag100080</name>
  <uuid>6bd4ebad-414b-4e1e-9995-7d061331ec01</uuid>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
  <vcpu placement='static'>2</vcpu>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='s390x' machine='s390-ccw-virtio-xenial'>hvm</type>
  </os>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>preserve</on_crash>
  <devices>
    <emulator>/usr/bin/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none' io='native'/>
      <source file='/guestimages/data1/zs93kag100080.qcow2'/>
      <backingStore type='file' index='1'>
        <format type='raw'/>
        <source file='/rawimages/ubu1604qcow2/ubuntu.1604-1.20161206.v1.raw.backing'/>
        <backingStore/>
      </backingStore>
      <target dev='vda' bus='virtio'/>
      <boot order='1'/>
      <alias name='virtio-disk0'/>
      <address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0000'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native'/>
      <source file='/guestimages/data1/zs93kag100080.prm'/>
      <backingStore/>
      <target dev='vdc' bus='virtio'/>
      <alias name='virtio-disk2'/>
      <address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0006'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <backingStore/>
      <target dev='sda' bus='scsi'/>
      <readonly/>
      <alias name='scsi0-0-0-0'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <controller type='usb' index='0' model='none'>
      <alias name='usb'/>
    </controller>
    <controller type='scsi' index='0' model='virtio-scsi'>
      <alias name='scsi0'/>
      <address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0002'/>
    </controller>
    <interface type='bridge'>
      <mac address='02:00:00:00:40:80'/>
      <source bridge='ovsbridge1'/>
      <vlan>
        <tag id='1297'/>
      </vlan>
      <virtualport type='openvswitch'>
        <parameters interfaceid='cd58c548-0b1f-47e7-9ed5-ad4a1bc8b8e0'/>
      </virtualport>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0001'/>
    </interface>
    <console type='pty' tty='/dev/pts/3'>
      <source path='/dev/pts/3'/>
      <target type='sclp' port='0'/>
      <alias name='console0'/>
    </console>
    <memballoon model='none'>
      <alias name='balloon0'/>
    </memballoon>
  </devices>
  <seclabel type='dynamic' model='apparmor' relabel='yes'>
    <label>libvirt-6bd4ebad-414b-4e1e-9995-7d061331ec01</label>
    <imagelabel>libvirt-6bd4ebad-414b-4e1e-9995-7d061331ec01</imagelabel>
  </seclabel>
</domain>

To set up the test, we shutdown all virtual domains, and then ran a script which simply starts the guests, one at a tim...

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

Commit 2a8a98673c13 is in Xenial -proposed, kernel version: Ubuntu-4.4.0-98.121. It looks like you tested with version 4.4.0-62-generic. Can you try testing with the -proposed Xenial kernel:

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed.

Another option is to test with the current -updates Artful kernel.

Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-11-09 11:30 EDT-------
(In reply to comment #27)
> Commit 2a8a98673c13 is in Xenial -proposed, kernel version:
> Ubuntu-4.4.0-98.121. It looks like you tested with version
> 4.4.0-62-generic. Can you try testing with the -proposed Xenial kernel:
>
> See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to
> enable and use -proposed.
>
> Another option is to test with the current -updates Artful kernel.

Heinz - I should be able to get time on the Ubuntu test system soon. I'll try the proposed kernel level and report results ASAP. Thanks... - Scott

Revision history for this message
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2017-11-09 16:56 EDT-------
Heinz,

I'm trying to pick up the kernel update using the -proposed method at the URL you provided.

I added this line to the sources.list ...

root@zm93k8:/home/scottg# grep proposed /etc/apt/sources.list
deb http://ports.ubuntu.com/ubuntu-ports xenial-proposed restricted main multiverse universe

Created proposed-updates file ...

root@zm93k8:/home/scottg# cat /etc/apt/preferences.d/proposed-updates
Package: *
Pin: release a=xenial-proposed
Pin-Priority: 400

The simulated update doesn't show any available updates (as expected / desired):

root@zm93k8:/home/scottg# sudo apt-get upgrade -s
Reading package lists... Done
Building dependency tree
Reading state information... Done
Calculating upgrade... Done
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.

However, I can't figure out how to actually install the specific kernel package ... I tried a simulation and I get:

root@zm93k8:/home/scottg# sudo apt-get install Ubuntu-4.4.0-98.121/xenial-proposed -s
Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Unable to locate package Ubuntu-4.4.0-98.121
E: Couldn't find any package by glob 'Ubuntu-4.4.0-98.121'
E: Couldn't find any package by regex 'Ubuntu-4.4.0-98.121'
root@zm93k8:/home/scottg#

Do I have incorrect syntax? Incorrect package name? Not sure why I'm not finding the update.

Note, I also had to comment out these URLs to prevent from finding updates:

#deb http://ports.ubuntu.com/ xenial main restricted universe multiverse
#deb http://ports.ubuntu.com/ xenial-updates main restricted universe multiverse
#deb http://ports.ubuntu.com/ xenial-security main universe

This is unfamiliar territory... thanks for your help.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Hi Scott,
the howto mixes Desktop users, Server users and selective upgrades.
For your case you only need the simplest path. Essentially you want to:

# Check - all other updates done (to clear the view)
$ apt list --upgradable
Listing... Done

# Enable proposed for z on Server
$ echo "deb http://ports.ubuntu.com/ubuntu-ports/ xenial-proposed main restricted universe multiverse" | sudo tee /etc/apt/sources.list.d/enable-proposed.list
$ sudo apt update
$ apt list --upgradable
[...]
linux-headers-generic/xenial-proposed 4.4.0.100.105 s390x [upgradable from: 4.4.0.98.103]
linux-headers-virtual/xenial-proposed 4.4.0.100.105 s390x [upgradable from: 4.4.0.98.103]
linux-image-virtual/xenial-proposed 4.4.0.100.105 s390x [upgradable from: 4.4.0.98.103]

# Install just the kernels from proposed
$ sudo apt install linux-generic

No need to set apt prefs if you only do a selective install.
If you'd do a global "sudo apt upgrade" you'd get all, but that is likely not what you want in your case. After you have done so you can just enable/disable the line in /etc/apt/sources.list.d/enable-proposed.list as needed.

Hope that helps

Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-11-10 14:39 EDT-------
(In reply to comment #31)
> Hi Scott,
> the howto is mixed for Desktop users, Server users and selective upgrades.
> For your case you only need the most simple case which would be:
>
> Essentially you want to:
>
> # Check - all other updates done (to clear the view)
> $ apt list --upgradable
> Listing... Done
>
> # Enable proposed for z on Server
> $ echo "deb http://ports.ubuntu.com/ubuntu-ports/ xenial-proposed main
> restricted universe multiverse" | sudo tee
> /etc/apt/sources.list.d/enable-proposed.list
> $ sudo apt update
> $ apt list --upgradable
> [...]
> linux-headers-generic/xenial-proposed 4.4.0.100.105 s390x [upgradable from:
> 4.4.0.98.103]
> linux-headers-virtual/xenial-proposed 4.4.0.100.105 s390x [upgradable from:
> 4.4.0.98.103]
> linux-image-virtual/xenial-proposed 4.4.0.100.105 s390x [upgradable from:
> 4.4.0.98.103]
>
> # Install just the kernels from proposed
> $ sudo apt install linux-generic
>
> No need to set apt prefs if you only do a selective install.
> If you'd do a global "sudo apt upgrade" you'd get all, but that is likely
> not what you want in your case. After you have done so you can just
> enable/disable the line in /etc/apt/sources.list.d/enable-proposed.list as
> needed.
>
> Hope that helps

Yes, your instructions were immensely useful, thanks for the explanation.

With the proposed fix applied, I am now able to start over 100 virtual guests, even with aio-max-nr set to 64K:

root@zm93k8:~# cat /proc/sys/fs/aio-max-nr
65535

root@zm93k8:/tmp# virsh list |grep running
86 zs93kag70041 running
87 zs93kag70042 running
88 zs93kag70055 running
89 zs93kag70056 running
90 zs93kag70057 running
91 zs93kag70058 running
92 zs93kag70059 running
93 zs93kag70060 running
94 zs93kag70061 running
95 zs93kag70062 running
96 zs93kag70063 running
97 zs93kag70064 running
98 zs93kag70065 running
99 zs93kag70066 running
100 zs93kag70067 running
101 zs93kag70068 running
102 zs93kag70069 running
103 zs93kag70070 running
104 zs93kag70071 running
105 zs93kag70072 running
106 zs93kag70073 running
107 zs93kag70074 running
108 zs93kag70075 running
109 zs93kag70077 running
110 zs93kag70078 running
111 zs93kag70079 running
112 zs93kag70080 running
113 zs93kag70081 running
114 zs93kag70082 running
115 zs93kag70083 running
116 zs93kag70084 running
117 zs93kag70085 running
118 zs93kag70086 running
119 zs93kag70087 running
120 zs93kag70088 running
121 zs93kag70089 ...

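A quick sanity check makes that result plausible: each guest in this setup has two aio=native disks (per the dumpxml above), and with the accounting fix each disk costs the 128 events QEMU actually requests (the per-disk figure from the earlier comment):

```shell
#!/bin/sh
# Rough capacity check with corrected accounting.
guests=100
disks_per_guest=2
per_disk=128    # io_setup() events QEMU requests per aio=native disk
needed=$((guests * disks_per_guest * per_disk))
echo "aio events needed for $guests guests: $needed (limit 65536)"
```

That comes to well under the old 64K limit, consistent with starting over 100 guests without raising aio-max-nr.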

Revision history for this message
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2018-01-26 03:21 EDT-------
Canonical, any updates available for this LP ?

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Hi Heinz-Werner,
thanks for the ping.

I think we agreed back then that the config itself is an Ubuntu-wide thing - but I haven't seen any discussion on procps or in general about lifting it. But then, as I outlined in comment #7, any limit can be too low. I'll set the KVM task to Won't Fix to clearly mark it as "not a kvm task".

The fix that you need, though, is the kernel change to account correctly - so that part of the question goes to the Kernel Team: was the change that was proposed back then actually released?
@Joseph - we clearly passed the number, but was the change released with it?

Changed in kvm (Ubuntu):
status: Confirmed → Won't Fix
no longer affects: kvm (Ubuntu Xenial)
no longer affects: kvm (Ubuntu Zesty)
no longer affects: kvm (Ubuntu Artful)
Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

Commit 2a8a98673c13 has been in Xenial since Ubuntu-4.4.0-98.121. If you can confirm the bug is now fixed since this kernel version, I'll mark the bug as Fix Released.

Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
importance: Undecided → Medium
Revision history for this message
Dimitri John Ledkov (xnox) wrote :

zesty is End Of Life

Changed in linux (Ubuntu Zesty):
status: In Progress → Won't Fix
Revision history for this message
Dimitri John Ledkov (xnox) wrote :

Zesty is end of life.

Changed in procps (Ubuntu Zesty):
status: New → Won't Fix
Seth Forshee (sforshee)
Changed in linux (Ubuntu):
status: In Progress → Fix Released
no longer affects: linux (Ubuntu Zesty)
no longer affects: procps (Ubuntu Zesty)
Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

Commit 2a8a98673c13 has been in Xenial and Artful for some time now. Can you confirm this bug is now resolved with the latest updates?

Changed in linux (Ubuntu Artful):
status: In Progress → Fix Released
Changed in linux (Ubuntu Xenial):
status: In Progress → Fix Released
Revision history for this message
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2018-02-27 04:03 EDT-------
I can confirm that this is fixed in
4.13.0-36-generic
and
4.4.0-116-generic

------- Comment From <email address hidden> 2018-02-27 04:04 EDT-------
Tested HWE and LTS kernels, both contain this fix.

Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
status: In Progress → Fix Released
Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-02-27 05:21 EDT-------
IBM bugzilla status -> closed, Fix Released and verified

Revision history for this message
Balint Reczey (rbalint) wrote :

@xnox marking procps as Won't Fix since the kernel fix seems to be sufficient.

Changed in procps (Ubuntu):
status: New → Won't Fix
Changed in procps (Ubuntu Xenial):
status: New → Won't Fix
Changed in procps (Ubuntu Artful):
status: New → Won't Fix