Ubuntu Openstack Mitaka can only deploy up to 52 instances
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Ubuntu on IBM z Systems | Invalid | Critical | Billy Olsen |
openstack (Ubuntu) | Invalid | Undecided | Skipper Bug Screeners |
Bug Description
Ubuntu OpenStack Mitaka can only deploy 52 instances on Cinder LVM volumes
---uname output---
Linux ub01 4.4.0-45-generic #66-Ubuntu SMP Wed Oct 19 14:14:35 UTC 2016 s390x s390x s390x GNU/Linux
Machine Type = LinuxONE
---Debugger---
A debugger is not configured
---Steps to Reproduce---
Deploy instances via horizon portal.
1. Select deploying to a volume.
2. Specify the volume size and select "yes" for deleting the volume when the instance is deleted.
3. After deploying 52 instances, the 53rd deployment reports errors and fails.
bugproxy (bugproxy) wrote : Detailed Problem Description | #1 |
tags: | added: architecture-s39064 bugnameltc-149089 severity-high targetmilestone-inin--- |
Changed in ubuntu: | |
assignee: | nobody → Skipper Bug Screeners (skipper-screen-team) |
affects: | ubuntu → openstack (Ubuntu) |
tags: |
added: targetmilestone-inin16041 removed: targetmilestone-inin--- |
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla | #2 |
------- Comment From <email address hidden> 2016-11-28 09:37 EDT-------
I found something new today. Just for your reference.
When the deployment fails, I can see the info below in the syslog. It shows io_setup failing in multipathd.
Nov 27 22:24:31 ub01 kernel: [561221.664513] scsi host153: iSCSI Initiator over TCP/IP
Nov 27 22:24:32 ub01 kernel: [561222.677294] scsi 153:0:0:0: RAID IET Controller 0001 PQ: 0 ANSI: 5
Nov 27 22:24:32 ub01 kernel: [561222.677684] scsi 153:0:0:0: Attached scsi generic sg142 type 12
Nov 27 22:24:32 ub01 kernel: [561222.678596] scsi 153:0:0:1: Direct-Access IET VIRTUAL-DISK 0001 PQ: 0 ANSI: 5
Nov 27 22:24:32 ub01 kernel: [561222.679067] sd 153:0:0:1: Attached scsi generic sg143 type 0
Nov 27 22:24:32 ub01 kernel: [561222.679105] sd 153:0:0:1: [sdcm] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB)
Nov 27 22:24:32 ub01 kernel: [561222.679305] sd 153:0:0:1: [sdcm] Write Protect is off
Nov 27 22:24:32 ub01 kernel: [561222.679308] sd 153:0:0:1: [sdcm] Mode Sense: 69 00 10 08
Nov 27 22:24:32 ub01 kernel: [561222.679393] sd 153:0:0:1: [sdcm] Write cache: enabled, read cache: enabled, supports DPO and FUA
Nov 27 22:24:32 ub01 kernel: [561222.680868] sdcm: sdcm1
Nov 27 22:24:32 ub01 kernel: [561222.681803] sd 153:0:0:1: [sdcm] Attached SCSI disk
Nov 27 22:24:32 ub01 multipathd[179286]: sdcm: add path (uevent)
Nov 27 22:24:32 ub01 multipathd[179286]: io_setup failed
Nov 27 22:24:32 ub01 multipathd[179286]: uevent trigger error
Nov 27 22:24:32 ub01 systemd-
Nov 27 22:24:33 ub01 systemd-
Nov 27 22:24:33 ub01 systemd-
Nov 27 22:24:33 ub01 iscsid: Connection150:0 to [target: iqn.2010-
Nov 27 22:24:33 ub01 kernel: [561223.395321] IPv6: ADDRCONF(
Nov 27 22:24:33 ub01 kernel: [561223.465740] device qvbb21abf90-f6 entered promiscuous mode
Nov 27 22:24:33 ub01 kernel: [561223.613720] IPv6: ADDRCONF(
Nov 27 22:24:33 ub01 kernel: [561223.687587] device qvob21abf90-f6 entered promiscuous mode
Nov 27 22:24:33 ub01 kernel: [561223.919008] qbrb21abf90-f6: port 1(qvbb21abf90-f6) entered forwarding state
Nov 27 22:24:33 ub01 kernel: [561223.919014] qbrb21abf90-f6: port 1(qvbb21abf90-f6) entered forwarding state
Nov 27 22:24:33 ub01 ovs-vsctl: ovs|00001|
And in the multipath -ll output, I see some messages like the ones below.
Nov 28 09:20:4...
tags: | added: openstack-ibm |
tags: | added: s390x uosci |
Ryan Beisner (1chb1n) wrote : | #4 |
Can we please get juju status output from this deployment? That, plus the bundle used to deploy, will be helpful in understanding which applications are where, and how they are configured. Thank you.
Frank Heimes (fheimes) wrote : | #5 |
... and please also the status about the disk utilization before and after you hit that situation?
Thx
bugproxy (bugproxy) wrote : | #6 |
------- Comment From <email address hidden> 2016-11-29 20:16 EDT-------
(In reply to comment #10)
> Can we please get juju status output from this deployment? That, plus the
> bundle used to deploy, will be helpful in understanding which services are
> where, and how they are configured. Thank you.
>
OpenStack was not installed via Juju; we installed the OpenStack components manually.
> Can we please get juju status output from this deployment? That, plus the
> bundle used to deploy, will be helpful in understanding which applications
> are where, and how they are configured. Thank you.
>
> ... and please also the status about the disk utilization before and after
> you hit that situation?
> Thx
Before I hit the situation, the disk utilization is shown below. cloudvg is the LVM volume group used by Cinder.
VG #PV #LV #SN Attr VSize VFree
cloudvg 10 48 0 wz--n- 10.00t 8.83t
ub01-vg 2 2 0 wz--n- 199.52g 1020.00m
When I hit the problem, the disk utilization is as follows.
VG #PV #LV #SN Attr VSize VFree
cloudvg 10 57 0 wz--n- 10.00t 8.65t
ub01-vg 2 2 0 wz--n- 199.52g 1020.00m
bugproxy (bugproxy) wrote : | #7 |
------- Comment From <email address hidden> 2016-11-30 04:17 EDT-------
Raised Priority to "Block" to highlight the urgency!
tags: |
added: severity-critical removed: severity-high |
Changed in ubuntu-z-systems: | |
status: | New → Triaged |
importance: | Undecided → Critical |
assignee: | nobody → Ryan Beisner (1chb1n) |
Ryan Beisner (1chb1n) wrote : | #8 |
Before I can do much with this, I need more information.
How were the applications/
What are the specs of the machine(s)?
Logs from api and compute services should be indicative. Are those available or can those be analyzed, then post relevant errors and tracebacks?
Thank you.
Changed in ubuntu-z-systems: | |
status: | Triaged → Incomplete |
Changed in openstack (Ubuntu): | |
status: | New → Incomplete |
Ryan Beisner (1chb1n) wrote : | #9 |
Apologies, I missed the comment which stated that the deployment was done manually without Juju. Unfortunately, that will make it very difficult to try to reproduce, and reinforces the need for logs to triage.
bugproxy (bugproxy) wrote : cinder log | #10 |
- cinder log Edit (20.6 KiB, application/x-tar)
------- Comment (attachment only) From <email address hidden> 2016-11-30 08:38 EDT-------
bugproxy (bugproxy) wrote : nova log | #11 |
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla | #12 |
------- Comment From <email address hidden> 2016-11-30 08:49 EDT-------
(In reply to comment #14)
> Before I can do much with this, I need more information.
>
> How were the applications/
> and Juju status output, sanitized.
>
> What are the specs of the machine(s)?
>
> Logs from api and compute services should be indicative. Are those
> available or can those be analyzed, then post relevant errors and tracebacks?
>
> Thank you.
I have uploaded the log files. Please check the volume id b794c66f-
bugproxy (bugproxy) wrote : syslog | #13 |
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla | #14 |
------- Comment From <email address hidden> 2016-11-30 08:58 EDT-------
Some info for your reference.
When the deployment failed, I noticed that the volume had been created successfully and attached to the instance. But somehow the VM could not be booted due to the disk issue, and the volume was detached from the instance. On that volume (20GB), I saw that a 2.2GB partition had been created, which is not supposed to happen; it should be a 20GB partition. After the failure, I manually created the iSCSI target and connected it to an existing Ubuntu VM. I can see the file contents on this 2.2GB partition.
Ryan Beisner (1chb1n) wrote : | #18 |
Thanks for the logs. Nova conductor is reporting "NoValidHost: No valid host was found. There are not enough hosts available." when attempting to schedule and spawn an instance. That is a somewhat generic failure message. But the specific failure seems to be a backing storage issue:
2016-11-
...
2016-11-23 03:17:54.654 6858 WARNING nova.scheduler.
2016-11-23 03:17:58.965 6833 ERROR nova.scheduler.
2016-11-23 03:17:59.011 6833 WARNING nova.scheduler.
Traceback (most recent call last):
File "/usr/lib/
return func(*args, **kwargs)
File "/usr/lib/
dests = self.driver.
File "/usr/lib/
raise exception.
NoValidHost: No valid host was found. There are not enough hosts available.
Ryan Beisner (1chb1n) wrote : | #19 |
I wonder if that is caused by one of the quotas/limits in place. One thing to check would be Cinder. It has a default of 1000 GB for maxTotalVolumeG
$ cinder absolute-limits
+------
| Name | Value |
+------
| maxTotalBackupG
| maxTotalBackups | 10 |
| maxTotalSnapshots | 10 |
| maxTotalVolumeG
| maxTotalVolumes | 10 |
| totalBackupGiga
| totalBackupsUsed | 0 |
| totalGigabytesUsed | 0 |
| totalSnapshotsUsed | 0 |
| totalVolumesUsed | 0 |
+------
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla | #20 |
------- Comment From <email address hidden> 2016-11-30 18:34 EDT-------
(In reply to comment #21)
> Thanks for the logs. Nova conductor is reporting "NoValidHost: No valid
> host was found. There are not enough hosts available." when attempting to
> schedule and spawn an instance. That is a somewhat generic failure message.
> But the specific failure seems to be a backing storage issue:
>
> 2016-11-
> file=/dev/
> volume-
> id=drive-
> aio=native: The device is not writable: Bad file descriptor
>
> ...
>
> 2016-11-23 03:17:54.654 6858 WARNING nova.scheduler.
> [req-ed481e89-
> 63958815625d410
> 89240ffa-
> 2016-11-23 03:17:58.965 6833 ERROR nova.scheduler.
> [req-ed481e89-
> 63958815625d410
> 0cb365ea-
> ub01.marist.edu): [u'Traceback (most recent call last):\n', u' File
> "/usr/lib/
> _do_build_
> "/usr/lib/
> _build_
> reason=
> 0cb365ea-
> process exited while connecting to monitor: 2016-11-
> qemu-system-s390x: -drive
> file=/dev/
> volume-
> id=drive-
> aio=native: The device is not writable: Bad file descriptor\n']
> 2016-11-23 03:17:59.011 6833 WARNING nova.scheduler.
> [req-ed481e89-
> 63958815625d410
> compute_
> hosts available.
> Traceback (most recent call last):
>
> File "/usr/lib/
> 150, in inner
> return func(*args, **kwargs)
>
> File "/usr/lib/
> in select_destinations
> dests = self.driver.
>
> File "/usr/lib/
> line 74, in select_destinations
> raise exception.
>
> NoValidHost: No valid host was found. There are not enough hosts available.
>
> I wonder if that is caused by one of the quota/limits in place. One thing
> to check would be Cinder. It has a default of 1000MB for
> maxTotalVolumeG
>
> $ cinder abs...
Frank Heimes (fheimes) wrote : | #21 |
Just as a side note: the background for asking about the quota was that this issue may have something to do with crossing some kind of 1 TB boundary.
Given that storage is partly calculated using different factors (e.g. 1000 or 1024), and 51 * 20480 MB is surprisingly close to 1 TB ...
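Frank's back-of-envelope figure checks out (a quick sketch; 20480 MB is the 20 GB volume size seen in the kernel log above):

```python
# 51 volumes of 20480 MB each, compared against a "1 TB" boundary computed
# with the two common factors (1000 vs 1024).
total_mb = 51 * 20480       # 1,044,480 MB
tb_1000 = 1000 ** 2         # 1 TB  (factor 1000) = 1,000,000 MB
tb_1024 = 1024 ** 2         # 1 TiB (factor 1024) = 1,048,576 MB

print(total_mb)             # 1044480
print(total_mb > tb_1000)   # True -- already past 1 TB with factor 1000
print(total_mb < tb_1024)   # True -- still under 1 TiB with factor 1024
```

So the total lands exactly between the two boundaries, which is what made the 1 TB hypothesis plausible at this point in the thread.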
Billy Olsen (billy-olsen) wrote : | #22 |
As Ryan indicated in his previous comment, the reason that the instances are unable to launch is due to the libvirt error:
libvirtError: internal error: process exited while connecting to monitor: 2016-11-
This indicates that the block device mapped by the file path /dev/disk/by-path ... has a bad file descriptor. "Bad file descriptor" suggests to me either that the device came and went away, that the device is (for some reason) visible as read-only, or that the device hasn't yet been fully attached. This error message is produced by the qemu code and is only logged when checking that the block device's file descriptor is writable.
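The check qemu performs can be approximated in a few lines (a rough analogy, not qemu's actual code: try to obtain a write-only file descriptor for the path and see whether that succeeds):

```python
import os

def fd_is_writable(path):
    """Return True if a writable file descriptor can be obtained for path.

    Roughly analogous to qemu's pre-flight check: a device node that has
    vanished, or is visible read-only, fails here, matching the bug's
    "The device is not writable" symptom.
    """
    try:
        fd = os.open(path, os.O_WRONLY)
    except OSError:
        return False
    os.close(fd)
    return True

# A by-path device node that has already gone away (hypothetical path):
print(fd_is_writable("/dev/disk/by-path/no-such-device"))  # False
```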
I think it'd be useful to get a bit more data from the compute and cinder nodes. Can you collect data about the system using a tool called sosreport (available in the xenial archives)? It will collect various logs, metrics, and system configuration, which is useful for diagnostics. Just make sure to run the sosreport tool with the -a option to ensure that all of the information is captured (more efficient to get it all the first go 'round).
Before collecting the sosreport, it probably makes sense to increase the debug logging for the nova and qemu services prior to collecting the data. Setting logging levels for both to debug would provide lots of useful information.
Also note: the 2.2 GB partition isn't an issue AIUI. The volume is created by:
1. downloading the image from glance
2. expanding and writing the block-level content to the cinder volume (where content is expanded from 320 MB to 2.2 GB).
3. When cloud-init runs on startup of the VM, the code detects the underlying disk is bigger than what is currently seen and will attempt to expand the partition and filesystem to consume the full content of the disk.
bugproxy (bugproxy) wrote : | #23 |
------- Comment From <email address hidden> 2016-12-01 21:35 EDT-------
(In reply to comment #24)
Thank you for looking into this. Please see my comments below.
> As Ryan indicated in his previous comment, the reason that the instances are
> unable to launch is due to the libvirt error:
>
> libvirtError: internal error: process exited while connecting to monitor:
> 2016-11-
> file=/dev/
> volume-
> id=drive-
> aio=native: The device is not writable: Bad file descriptor
>
> This indicates that the block device mapped by the file path
> /dev/disk/by-path ... has a bad file descriptor. Bad file descriptor
> suggests to me either that the device came and went away, the device is (for
> some reason) visible as read-only, or that the device hasn't yet been fully
> attached. This error messages is provided by the qemu code and only logged
> when checking to ensure the block-device's file descriptor is writable.
>
Right. As I mentioned in the previous reply, the volume was created and attached to the instance, and then detached when it got this error. It happened during instance boot-up, after the volume had been attached.
> I think it'd be useful to get a bit more data from the compute and cinder
> nodes. Can you collect a bunch of data regarding the system using a tool
> called sosreport (in the xenial archives)? It will collect various logs,
> metrics, and system configuration which is useful to perform diagnostics.
> Just make sure to run the sosreport tool with the -a command to ensure that
> all of the information is captured (more efficient to get it the first go
> 'round).
>
> Before collecting the sosreport, it probably makes sense to increase the
> debug logging for the nova and qemu services prior to collecting the data.
> Setting logging levels for both to debug would provide lots of useful
> information.
>
> Also note: the 2.2 GB partition isn't an issue AIUI. The volume is created
> by:
> 1. downloading the image from glance
> 2. expanding and writing the block-level content to the cinder volume (where
> content is expanded from 320 MB to 2.2 GB).
> 3. When cloud-init runs on startup of the VM, the code detects the
> underlying disk is bigger than what is currently seen and will attempt to
> expand the partition and filesystem to consume the full content of the disk.
Understood. I manually checked the volume after the deployment failed. The 2.2GB partition contains a full Ubuntu Linux file structure, so I just don't know why it could not be booted at that time.
I have attached the sos file.
bugproxy (bugproxy) wrote : sosreport-part-aa | #24 |
- sosreport-part-aa Edit (20.0 MiB, application/octet-stream)
------- Comment on attachment From <email address hidden> 2016-12-01 21:56 EDT-------
As the file is too big, I used the split command to divide it into parts.
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla | #25 |
------- Comment From <email address hidden> 2016-12-01 22:02 EDT-------
Comment on attachment 114452
sosreport-part-aa
Please ignore this as it is still too big..
bugproxy (bugproxy) wrote : sosreport-part-01 | #26 |
- sosreport-part-01 Edit (8.0 MiB, application/octet-stream)
------- Comment on attachment From <email address hidden> 2016-12-01 22:06 EDT-------
using split command to split the sosreport file.
bugproxy (bugproxy) wrote : sosreport-part-02 | #27 |
- sosreport-part-02 Edit (8.0 MiB, application/octet-stream)
------- Comment (attachment only) From <email address hidden> 2016-12-01 22:07 EDT-------
bugproxy (bugproxy) wrote : sosreport-part-03 | #28 |
- sosreport-part-03 Edit (8.0 MiB, application/octet-stream)
------- Comment (attachment only) From <email address hidden> 2016-12-01 22:08 EDT-------
bugproxy (bugproxy) wrote : sosreport-part-04 | #29 |
- sosreport-part-04 Edit (8.0 MiB, application/octet-stream)
------- Comment (attachment only) From <email address hidden> 2016-12-01 22:08 EDT-------
bugproxy (bugproxy) wrote : sosreport-part-05 | #30 |
- sosreport-part-05 Edit (8.0 MiB, application/octet-stream)
------- Comment (attachment only) From <email address hidden> 2016-12-01 22:09 EDT-------
bugproxy (bugproxy) wrote : sosreport-part-06 | #31 |
- sosreport-part-06 Edit (8.0 MiB, application/octet-stream)
------- Comment (attachment only) From <email address hidden> 2016-12-01 22:10 EDT-------
bugproxy (bugproxy) wrote : sosreport-part-07 | #32 |
- sosreport-part-07 Edit (8.0 MiB, application/octet-stream)
------- Comment (attachment only) From <email address hidden> 2016-12-01 22:10 EDT-------
bugproxy (bugproxy) wrote : sosreport-part-08 | #33 |
- sosreport-part-08 Edit (8.0 MiB, application/octet-stream)
------- Comment (attachment only) From <email address hidden> 2016-12-01 22:11 EDT-------
bugproxy (bugproxy) wrote : sosreport-part-09 | #34 |
- sosreport-part-09 Edit (8.0 MiB, application/octet-stream)
------- Comment (attachment only) From <email address hidden> 2016-12-01 22:12 EDT-------
bugproxy (bugproxy) wrote : sosreport-part-10 | #35 |
- sosreport-part-10 Edit (8.0 MiB, application/octet-stream)
------- Comment (attachment only) From <email address hidden> 2016-12-01 22:13 EDT-------
bugproxy (bugproxy) wrote : sosreport-part-11 | #36 |
- sosreport-part-11 Edit (8.0 MiB, application/octet-stream)
------- Comment (attachment only) From <email address hidden> 2016-12-01 22:14 EDT-------
bugproxy (bugproxy) wrote : sosreport-part-12 | #37 |
- sosreport-part-12 Edit (8.0 MiB, application/octet-stream)
------- Comment (attachment only) From <email address hidden> 2016-12-01 22:14 EDT-------
bugproxy (bugproxy) wrote : sosreport-part-13 | #38 |
- sosreport-part-13 Edit (8.0 MiB, application/octet-stream)
------- Comment (attachment only) From <email address hidden> 2016-12-01 22:15 EDT-------
bugproxy (bugproxy) wrote : sosreport-part-14 | #39 |
- sosreport-part-14 Edit (8.0 MiB, application/octet-stream)
------- Comment (attachment only) From <email address hidden> 2016-12-01 22:16 EDT-------
bugproxy (bugproxy) wrote : sosreport-part-15 | #40 |
- sosreport-part-15 Edit (3.0 MiB, application/octet-stream)
------- Comment (attachment only) From <email address hidden> 2016-12-01 22:16 EDT-------
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla | #42 |
------- Comment From <email address hidden> 2016-12-15 12:23 EDT-------
An observation:
Having Cinder and Nova-compute on the same host, and using LVM and iSCSI backends might be causing some race conditions in multipath.
The syslog is littered with messages like this:
Nov 23 03:17:47 ub01 multipathd[179286]: io_setup failed
Nov 23 03:17:47 ub01 multipathd[179286]: uevent trigger error
Nov 23 03:17:47 ub01 systemd[188085]: dev-disk-
Nov 23 03:17:47 ub01 systemd[1]: dev-disk-
Nov 23 03:17:47 ub01 systemd[188085]: dev-disk-
Nov 23 03:17:47 ub01 systemd[1]: dev-disk-
No idea if this is the reason you're seeing the problem or not..
Changed in ubuntu-z-systems: | |
assignee: | Ryan Beisner (1chb1n) → nobody |
Changed in ubuntu-z-systems: | |
assignee: | nobody → Billy Olsen (billy-olsen) |
Billy Olsen (billy-olsen) wrote : | #43 |
I too noticed the:
Nov 23 03:17:47 ub01 multipathd[179286]: io_setup failed
Nov 23 03:17:47 ub01 multipathd[179286]: uevent trigger error
errors, and they persist in recent recreate attempts. I think it would be good to configure multipath so that the devices provided by Cinder are excluded. Likely the best approach is a general blacklist of all devices, with exceptions for the known backends you use for other storage devices.
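A sketch of that approach in /etc/multipath.conf (the exception pattern is a placeholder, not taken from this system; it must be adapted to whatever multipath-backed storage the host really uses):

```
blacklist {
    # Ignore everything by default, including the Cinder/iSCSI volumes.
    devnode ".*"
}
blacklist_exceptions {
    # Re-admit only known multipath-backed devices; the WWID below is a
    # placeholder for illustration.
    wwid "<wwid-of-real-multipath-storage>"
}
```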
Billy Olsen (billy-olsen) wrote : | #55 |
As I'm looking at the data some more, I see in the sos reports provided that there may be heavy pressure on async I/O, and that may actually be what's causing the problems here.
In the syslog, I see the following events reported by multipathd soon after the device is discovered (it's part of the udev event hooks, I believe):
Jan 12 22:02:49 ub01 multipathd[179286]: sdch: add path (uevent)
Jan 12 22:02:49 ub01 multipathd[179286]: io_setup failed
Jan 12 22:02:49 ub01 multipathd[179286]: uevent trigger error
io_setup is a system call into the kernel which can fail for a few different reasons; one of them is that the number of async I/O requests exceeds the currently allowed amount, in which case it errors out with EAGAIN. There is explicit documentation in multipath noting that io_setup may fail in this case due to the setting of fs.aio-max-nr, which is essentially the maximum number of events across all currently active AIO contexts.
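A simplified model of that accounting (not the kernel's real implementation; it just captures the fs.aio-nr / fs.aio-max-nr relationship described above):

```python
import errno

def io_setup_result(aio_nr, aio_max_nr, nr_events):
    """Return the errno a new io_setup(nr_events) would hit under this
    simplified model, or None if the AIO context would be admitted."""
    if aio_nr + nr_events > aio_max_nr:
        return errno.EAGAIN
    return None

# With the values from this report, every new io_setup() fails:
print(io_setup_result(aio_nr=124928, aio_max_nr=65536, nr_events=128) == errno.EAGAIN)  # True
# With a raised limit, the same call would be admitted:
print(io_setup_result(aio_nr=124928, aio_max_nr=1048576, nr_events=128))  # None
```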
Checking the value collected I see these values:
# Quick check on the box today
fs.aio-max-nr = 65536
fs.aio-nr = 124928
# From the sosreport provided on 12/01
fs.aio-max-nr = 65536
fs.aio-nr = 131072
As you can see, fs.aio-nr is well above fs.aio-max-nr. I'm not entirely sure what is causing this, but there are a large number of async I/O events queued up.
I think the plan of action should be to:
a) configure multipath to ignore the iSCSI devices presented between cinder and nova;
b) clean up any errors in the current paths used by multipath, so that we know disk I/O to the actual backend is not an issue;
c) possibly consider increasing fs.aio-max-nr.
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla | #56 |
------- Comment From <email address hidden> 2017-01-16 14:38 EDT-------
Thanks for looking into this!
Regarding the multipath error, I changed the multipath configuration to put the Linux instance volumes in the multipath blacklist. After that I no longer see the multipath error, and I was able to deploy a couple more instances successfully in this environment. Unfortunately, when I tried to deploy more, I hit the same problem: the message shows the disk is not writable during the deployment. And I noticed that the async I/O count is still very high (see the numbers below).
fs.aio-max-nr = 65536
fs.aio-nr = 131072
Then I also increased fs.aio-max-nr to a bigger number, and I can deploy more instances successfully. But the deployment time is also getting much longer! I think it's related to the io waiting/
fs.aio-max-nr = 1048576
fs.aio-nr = 133120
So I'm thinking the problem may tie to the KVM/QEMU. Maybe the qemu couldn't handle the async io properly on s390x? Or need some changes in the qemu/libvirt configuration?
bugproxy (bugproxy) wrote : | #57 |
------- Comment From <email address hidden> 2017-01-17 06:46 EDT-------
KVM for IBM z uses the following sysctl defaults:
fs.inotify.
fs.aio-max-nr = 4194304
Can you try with these values?
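To apply such a value persistently, a sysctl drop-in along these lines could be used (the file name is illustrative; the truncated fs.inotify line above is left out because its key and value are not fully visible in this report):

```
# /etc/sysctl.d/90-kvm-aio.conf -- illustrative path; load with `sysctl --system`
fs.aio-max-nr = 4194304
```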
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla | #72 |
------- Comment From <email address hidden> 2017-01-25 20:08 EDT-------
(In reply to comment #49)
> KVM for IBM z uses the following sysctl defaults:
>
> fs.inotify.
> fs.aio-max-nr = 4194304
>
> Can you try with these values?
Thank you for the suggestion!
Having set the system with these values, I can deploy way more than 52 instances; I tested up to 200+ instances. As expected, the deployment time increases linearly. I think it's related to the increase in asynchronous I/O in the KVM hypervisor.
Dimitri John Ledkov (xnox) wrote : Re: [Bug 1644785] Comment bridged from LTC Bugzilla | #73 |
On 26 January 2017 at 01:17, bugproxy <email address hidden> wrote:
> ------- Comment From <email address hidden> 2017-01-25 20:08 EDT-------
> (In reply to comment #49)
>> KVM for IBM z uses the following sysctl defaults:
>>
>> fs.inotify.
>> fs.aio-max-nr = 4194304
>>
>> Can you try with these values?
>
> Thank you for the suggestion!
>
> Having set the system with these values, I can deploy way more 52
> instances.. I tested up to 200+ instances.. As expected, the deployment
> time is increasing linearly. I think it's related to the asynchronized
> io increase in KVM hypervisor.
>
Should these sysctl defaults be shipped in Ubuntu by default for s390x?
On my amd64 xenial desktop my max_user_watches is a lot higher (~524k vs ~8k) than what I see on s390x zesty; fs.aio-max-nr is set to ~65k on both. Why is the default so low, or (vice versa) why is such a high number required?
--
Regards,
Dimitri.
Frank Heimes (fheimes) wrote : | #74 |
... after discussing with IBM
Changed in openstack (Ubuntu): | |
status: | Incomplete → Invalid |
Changed in ubuntu-z-systems: | |
status: | Incomplete → Invalid |