aarch64: logfile not supported in this QEMU binary

Bug #1697610 reported by ChristianEhrhardt on 2017-06-13
This bug affects 1 person
Affects                  Importance  Assigned to
Ubuntu Cloud Archive (status tracked in Pike)
  Ocata                  Medium      Unassigned
  Pike                   Medium      Unassigned
libvirt (Ubuntu) (status tracked in Artful)
  Zesty                  Medium      Unassigned
  Artful                 Medium      Unassigned
nova (Ubuntu) (status tracked in Artful)
  Zesty                  Medium      James Page
  Artful                 Medium      James Page

Bug Description

[Impact]
arm64 based openstack clouds can't boot instances with OpenStack Ocata or later.

[Test Case]
Deploy OpenStack
Boot instance
Instance fails to boot with "logfile not supported in this QEMU binary" error message

[Regression Potential]
Low; the proposed patch reverts to using the pre-ocata code path, skipping
use of virtlogd for arm based architectures.

[Original Bug Report]
This is a spin-off to bug 1673467 as it is a different issue:

Got this today via Mail, linking here:

none:

https://pastebin.canonical.com/190574/

host-model:

https://pastebin.canonical.com/190578/

host-passthrough

https://pastebin.canonical.com/190579/

@admcleod - While my system is preparing to test this I think the logs you added are already kind of proving that the issue this bug was reported about is kind of solved.

In regard to your logs - the related error:
none:
-> Passes the initialization but then breaks on logfile

host-model:
-> Fails due to host-model being broken

host-passthrough
-> Passes the initialization but then breaks on logfile

That said it seems to me the config overall is broken in regard to the logfile in some sort.
When host-model is selected it fails earlier on init (this is the actual bug that was discussed in comments #1-#20), if called without host-model the init goes on.
But then in general this seems to have issues around the logfile in some way.
"libvirtError: unsupported configuration: logfile not supported in this QEMU binary"

To reproduce I took the recommended "host-passthrough" case and made the following modifications to run without a real openstack around it:
#0 packages that drag in all dependencies
sudo apt install uvtool-libvirt nova-compute

#1 create nvram vars from template to match XML
sudo cp /usr/share/AAVMF/AAVMF_VARS.fd /var/lib/libvirt/qemu/nvram/instance-00000010_VARS.fd
sudo chown libvirt-qemu:kvm /var/lib/libvirt/qemu/nvram/instance-00000010_VARS.fd

#2 Replace openstack disks with something local that boots
wget https://cloud-images.ubuntu.com/zesty/current/zesty-server-cloudimg-arm64.img
<disk type='file' device='disk'>
  <driver name='qemu' type='raw'/>
  <source file='/home/ubuntu/zesty-server-cloudimg-arm64.img'/>
  <target dev='hdc' bus='virtio'/>
  <address type='virtio-mmio'/>
</disk>

#3 since we don't have the OS created net, replace with the default network
<interface type='network'>
  <mac address='52:54:00:af:8f:2f'/>
  <source network='default'/>
  <model type='virtio'/>
</interface>

#4 Create the logdir that nova specified in the "real" case
sudo mkdir /var/lib/nova/instances/5f488b37-8906-4006-b736-70860856f290/
sudo chown nova:nova /var/lib/nova/instances/5f488b37-8906-4006-b736-70860856f290/

With the above I was able to get your new bug around "logfile not supported in this QEMU binary".
Ok, that certainly is a different bug - I can switch between host-model (old issue) and host-passthrough and be good.
The logfile issue is a different one, so we track it in a new bug = Here.

ChristianEhrhardt (paelzer) wrote :

Query Qemu the way Libvirt does to detect QEMU_CAPS_CHARDEV_LOGFILE:

virsh qemu-monitor-command instance-00000010 --pretty '{ "execute": "query-command-line-options"}'

Has:
      "parameters": [
[...]
        {
          "name": "logfile",
          "type": "string"
        },
      ],
      "option": "chardev"
    },

So the chardev should have the logfile option as it always had IMHO.
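
The same check can be scripted against the monitor reply. This is only a minimal sketch: it assumes the reply is the usual QMP object with the option list under a "return" key, the sample reply is hand-written and abridged (only the logfile entry is taken from the output above), and the helper name option_has_parameter is made up for illustration.

```python
import json

# Abridged, hand-written sample in the shape of a query-command-line-options
# reply. Only the "logfile" entry is taken from the output quoted above.
SAMPLE_REPLY = json.dumps({
    "return": [
        {
            "parameters": [
                {"name": "logfile", "type": "string"},
            ],
            "option": "chardev",
        },
    ],
})


def option_has_parameter(qmp_reply: str, option: str, parameter: str) -> bool:
    """Return True if `option` lists `parameter` in the given QMP reply."""
    reply = json.loads(qmp_reply)
    for entry in reply.get("return", []):
        if entry.get("option") == option:
            names = [p.get("name") for p in entry.get("parameters", [])]
            return parameter in names
    return False


if __name__ == "__main__":
    print(option_has_parameter(SAMPLE_REPLY, "chardev", "logfile"))
```

Feeding it the real output of the virsh command above (instead of SAMPLE_REPLY) would answer the same question libvirt asks when it probes for QEMU_CAPS_CHARDEV_LOGFILE.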

ChristianEhrhardt (paelzer) wrote :

The tests in tests/qemucapabilitiesdata/ confirm that this should be available cross arch since >=2.6.
And as seen above it is, yet the error suggests that libvirt thinks it isn't and aborts then.

ChristianEhrhardt (paelzer) wrote :

In qemuBuildChrChardevStr which builds the "-chardev [..." command it checks the cap QEMU_CAPS_CHARDEV_LOGFILE which we have to assume to be true given the monitor checks I've made.

But in qemuBuildChrArgStr there is an unconditional fail with the same error message.
Just the existence of a logfile definition causes this - so the question is when exactly qemuBuildChrArgStr is used.

Most code paths insist on the newer qemuBuildChrChardevStr, but there are three that can fall back to qemuBuildChrArgStr if they find no chardev support.
1. qemuBuildMonitorCommandLine - virQEMUCapsGet(qemuCaps, QEMU_CAPS_CHARDEV)
2. qemuBuildSerialCommandLine - virQEMUCapsSupportsChardev(def, qemuCaps, serial)
3. qemuBuildParallelsCommandLine - virQEMUCapsGet(qemuCaps, QEMU_CAPS_CHARDEV)

Since QEMU_CAPS_CHARDEV has been enabled for a long time, the cause might be in virQEMUCapsSupportsChardev, related to the serial.
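
To make the two code paths easier to follow, here is a rough Python model of the behaviour described above. It is not libvirt code: the function name, the exception class and the argument strings are hypothetical, and only the error text and the "new path honours logfile, old path rejects it" logic come from this report.

```python
from typing import Optional


class UnsupportedConfiguration(Exception):
    """Stand-in for libvirt's 'unsupported configuration' error."""


def build_serial_args(supports_chardev: bool,
                      has_logfile_cap: bool,
                      logfile: Optional[str]) -> str:
    """Model which builder is used and when a logfile is rejected."""
    if supports_chardev:
        # Newer path (qemuBuildChrChardevStr in the report): a logfile is
        # accepted as long as the logfile capability was detected.
        if logfile is not None and not has_logfile_cap:
            raise UnsupportedConfiguration(
                "logfile not supported in this QEMU binary")
        arg = "-chardev pty,id=charserial0"  # illustrative string only
        if logfile is not None:
            arg += ",logfile=" + logfile
        return arg
    # Older fallback path (qemuBuildChrArgStr in the report): any logfile
    # definition fails unconditionally, whatever the capabilities are.
    if logfile is not None:
        raise UnsupportedConfiguration(
            "logfile not supported in this QEMU binary")
    return "-serial pty"  # illustrative string only


if __name__ == "__main__":
    print(build_serial_args(True, True, "/tmp/console.log"))
    try:
        build_serial_args(False, True, "/tmp/console.log")
    except UnsupportedConfiguration as exc:
        print("error:", exc)
```

The second call shows the situation in this bug: the capability is present, yet the fallback path still refuses the logfile.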

With that known I could already verify that the log= on the pty works just fine.
It is just the log= on the serial that fails.
@admcleod - could that be an artifact of enabling debugging everywhere and not a regression at all?

Changed in libvirt (Ubuntu):
status: New → Incomplete
ChristianEhrhardt (paelzer) wrote :

There are arm specific tweaks in virQEMUCapsSupportsChardev that can lead to "false"; in that case it has to fall back to the older code, which does not support log= attributes (and thereby the logfile argument).

Those cases are:
if ((def->os.arch != VIR_ARCH_ARMV7L) && (def->os.arch != VIR_ARCH_AARCH64))
    return true;

Ok, we are VIR_ARCH_AARCH64 so this does not trigger - which it would for all non-arms.
So this is already arm special for sure, the next check is:

/* This may not be true for all ARM machine types, but at least
 * the only supported non-virtio serial devices of vexpress and versatile
 * don't have the -chardev property wired up. */
return (chr->info.type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_MMIO ||
       (chr->deviceType == VIR_DOMAIN_CHR_DEVICE_TYPE_CONSOLE &&
        chr->targetType == VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_VIRTIO));

The chr is the serial object.
So it has to be either on a virtio-mmio address or a console with a virtio target - all others would fall back and cause the issue.
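
The two checks quoted above can be restated as a small Python predicate, purely as a reading aid. The plain strings stand in for the VIR_* enum values, the function name is made up, and any other checks the real function may perform are not modelled here.

```python
ARM_ARCHES = ("armv7l", "aarch64")


def supports_chardev(arch: str,
                     address_type: str,
                     device_type: str,
                     target_type: str) -> bool:
    """Model of the arch-specific part of virQEMUCapsSupportsChardev."""
    # Non-ARM guests always take the newer -chardev path.
    if arch not in ARM_ARCHES:
        return True
    # On ARM only virtio-mmio devices and virtio consoles qualify.
    return (address_type == "virtio-mmio"
            or (device_type == "console" and target_type == "virtio"))


if __name__ == "__main__":
    # A plain serial device on aarch64, as in the failing case here.
    print(supports_chardev("aarch64", "none", "serial", "system"))
    # The same device on x86_64.
    print(supports_chardev("x86_64", "none", "serial", "isa-serial"))
```

With this model a plain aarch64 serial device lands on the fallback path, which is where the logfile gets rejected.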

I checked that it has been this way since at least Xenial (or earlier).
So this is really not a bug/regression - it is a constraint of the arm platform: its serials can not represent the logfile, so the check returns false and, because of that, your logfile argument is rejected.

@admcleod - in regard to the initial bug this really is unrelated.

@admcleod - might this be an artifact of your "enabling debugging everywhere" or does nova set log= everywhere&all-the-time. Do you have an old case that worked with this or do we have to consider this just as an open issue that never was solved before?

ChristianEhrhardt (paelzer) wrote :

@admcleod - Assigning to you, since I had to report for the split of the bugs, but I want to make it more clear that I wait on you for now.

Changed in libvirt (Ubuntu):
assignee: nobody → Andrew McLeod (admcleod)
Andrew McLeod (admcleod) wrote :

When I set debug=False, I still get the xml output and the error "logfile not supported in this QEMU binary", so the debug switch is not relevant

ChristianEhrhardt (paelzer) wrote :

Interesting - I outlined above that this is arm specific and should have occurred >=libvirt 2.6.
Do you happen to know (or are you able to recreate) whether this worked on Newton (which came with Xenial's libvirt 1.3.1, which didn't have that check)?
If so did it have the log= in the XML content?

If it did, it submitted something that didn't work properly, which is now detected and refused by libvirt.

Andrew McLeod (admcleod) wrote :

Deploying instances works in newton - with debug enabled we can see that there is no "log file=" passed in the xml

https://pastebin.canonical.com/190844/

ChristianEhrhardt (paelzer) wrote :

Ok, so by that for now the approach is to fix that in Openstack (or Charms) then.
I'll keep it assigned to you, but incomplete and you can decide.

Ryan Beisner (1chb1n) on 2017-06-14
tags: added: arm64 uosci
Andrew McLeod (admcleod) wrote :

I've tested this on x86 (xenial ocata) - cpu-mode "none" works, as expected. Neither host-model nor host-passthrough work - the error, however, is not the same - in fact the xml to create the instance doesn't get passed through at all. It is a generic openstack instance creation error.

Error creating server: xenial-SERVERNAME
Error creating server

ChristianEhrhardt (paelzer) wrote :

This would be a comment for "the other" bug I think.
From libvirt/qemu alone, host-passthrough works for sure; host-model, as outlined before, needs newer qemu/libvirt versions to work as it was supposed to.

Andrew McLeod (admcleod) wrote :

Sorry, it was intended for this bug - the idea was to double-check these other modes on x86 for xenial ocata just to ensure we didn't get the same errors.

ChristianEhrhardt (paelzer) wrote :

FYI - there is discussion around the topic on https://www.redhat.com/archives/libvir-list/2017-June/msg00423.html

Ryan Beisner (1chb1n) wrote :

We are working to re-confirm this issue exists on Pike-B3, since that is where any upstream nova patches will be initially proposed.

James Page (james-page) wrote :

Digging on this today - I think we can summarise that use of virtlogd on arm* in the 2.5.0 version of libvirt in zesty and in the Ocata UCA is non-functional.

I'm poking at validating with the newer Pike UCA version from Artful.

James Page (james-page) wrote :

and the associated Linaro bug:

https://bugs.linaro.org/show_bug.cgi?id=2777

James Page (james-page) wrote :

Tested back-ported 3.5.0 binaries from Artful on Xenial; still has the same issue but that would be as expected as:

  https://github.com/libvirt/libvirt/commit/426dc5eb28bade109bf27bdd10d7305a040b4a3e

is not included in the 3.5.0 release.

James Page (james-page) wrote :

(FTR no changes in Nova codebase in this area - the Linaro team took the approach of fixing libvirt, rather than working around the underlying problem).

Changed in charm-nova-compute:
status: New → Invalid
Changed in libvirt (Ubuntu):
status: Incomplete → In Progress
assignee: Andrew McLeod (admcleod) → James Page (james-page)
summary: - logfile not supported in this QEMU binary
+ aarch64: logfile not supported in this QEMU binary
Changed in libvirt (Ubuntu):
importance: Undecided → Medium
Changed in nova (Ubuntu):
importance: Undecided → Medium
assignee: nobody → James Page (james-page)
status: New → In Progress
James Page (james-page) wrote :
Changed in nova (Ubuntu Zesty):
status: New → Triaged
Changed in libvirt (Ubuntu Zesty):
status: New → Triaged
importance: Undecided → Medium
Changed in nova (Ubuntu Zesty):
importance: Undecided → Medium
no longer affects: charm-nova-compute
Changed in libvirt (Ubuntu Artful):
status: In Progress → Triaged
assignee: James Page (james-page) → nobody
James Page (james-page) wrote :

I've made what at best can be called a compatibility patch for nova for use with libvirt 2.5.0 or 3.5.0; it switches the type of the console to virtio, which allows machines to boot, but no kernel boot messages are captured - the first output is the login prompt.

Ideally we would backport the fixes in development to both 2.5.x and 3.5.x in artful and zesty.

James Page (james-page) wrote :

Reviewing patchsets for libvirt; looks like 3.6.0 will have the required fixes for logfile support on aarch64, and the patchset to pick is approx 3 patches to 3.5.0.

The patchset for 2.5.0 is much larger so is probably not feasible as an SRU (Christian - would be good if you can confirm this opinion or not).

James Page (james-page) wrote :

3.5.0 cherry-picks:

git cherry-pick -x ca5c5b997b348e650a086965c5e975c7101ee40e
git cherry-pick -x 56540950e73d331fc04443409c578e4354337309
git cherry-pick -x 426dc5eb28bade109bf27bdd10d7305a040b4a3e

James Page (james-page) wrote :

OK - I've worked a slightly nicer patch to switch back to the pre-ocata method of device creation for console log capture; this skips the use of virtlogd (which is what the ocata changes make use of for pty devices) on arm64.
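
A rough sketch of the idea, not the patch itself: the function name and the returned dictionary layout are hypothetical and nova's real driver code is structured differently. It only illustrates the decision described above, i.e. a file-backed console device on ARM and a pty device with a log file (handled by virtlogd) everywhere else.

```python
ARM_ARCHES = {"armv7l", "aarch64"}


def console_log_device(arch: str, console_log_path: str) -> dict:
    """Pick how the console log is captured for a guest of this arch."""
    if arch in ARM_ARCHES:
        # Pre-ocata style: the hypervisor writes straight into the file,
        # no virtlogd involved, so no logfile option is needed.
        return {"type": "file", "source_path": console_log_path}
    # Ocata style: pty device whose output is logged via virtlogd.
    return {"type": "pty", "log_file": console_log_path}


if __name__ == "__main__":
    path = "/var/lib/nova/instances/<uuid>/console.log"  # placeholder path
    print(console_log_device("aarch64", path))
    print(console_log_device("x86_64", path))
```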

James Page (james-page) on 2017-07-27
Changed in nova (Ubuntu Zesty):
status: Triaged → In Progress
assignee: nobody → James Page (james-page)
description: updated
Raghuram Kota (rkota) wrote :

@James : Thanks for the new patch (comm #24)! Does this allow kernel boot messages to be also captured and address the limitation mentioned in comm #21 ? Thx

James Page (james-page) wrote :

@rkota

Reference: https://git.launchpad.net/~ubuntu-server-dev/ubuntu/+source/nova/tree/debian/patches/aarch64-libvirt-compat.patch

basically we revert to the pre-ocata behaviour from the driver; so you get as much as you got with OpenStack Newton.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nova - 2:16.0.0~b2-0ubuntu2

---------------
nova (2:16.0.0~b2-0ubuntu2) artful; urgency=medium

  * d/tests/*: Drop nova-cert from DEP-8 tests.
  * d/p/aarch64-libvirt-compat.patch: Compatibility shim to resolve
    issues on aarch64 architecture (LP: #1697610).

 -- James Page <email address hidden> Thu, 27 Jul 2017 13:29:38 +0100

Changed in nova (Ubuntu Artful):
status: In Progress → Fix Released

Hello ChristianEhrhardt, or anyone else affected,

Accepted nova into zesty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/nova/2:15.0.6-0ubuntu1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-zesty to verification-done-zesty. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-zesty. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in nova (Ubuntu Zesty):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-zesty
ChristianEhrhardt (paelzer) wrote :

Hi James,
thanks for your work on this.
I agree that the backport to 2.5 will likely be too complex.

For 3.5 I didn't see more dependencies in the code, but might find some when trying to do so.
I'm currently considering to move to libvirt 3.6 instead if that is working smoothly as I have a few other things that would need that as well. But for now I need to move up qemu before going back to libvirt.

Do I understand correctly that the nova fix, which uses the old console generation, works for now (but lacks the early messages), and once/if a fixed libvirt is in artful we can consider dropping the change from nova again?

tags: added: libvirt-3.6
James Page (james-page) wrote :

Hi Christian

With regards to the nova fix - yes the patch reverts the behaviour on ARM to use the old-style, functional on ARM way of doing console logs; we can drop it as/when libvirt on ARM64 dtrt again.

James Page (james-page) wrote :

Hello ChristianEhrhardt, or anyone else affected,

Accepted nova into ocata-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository.

Please help us by testing this new package. To enable the -proposed repository:

  sudo add-apt-repository cloud-archive:ocata-proposed
  sudo apt-get update

Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-ocata-needed to verification-ocata-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-ocata-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: added: verification-ocata-needed
James Page (james-page) wrote :

xenial/ocata/proposed

Confirmed able to boot instances using the patched packages - console and serial XML snippets LGTM:

    <serial type='file'>
      <source path='/var/lib/nova/instances/d47e9996-d567-48ad-89e7-f8df8482e375/console.log'/>
      <target port='0'/>
      <alias name='serial0'/>
    </serial>
    <console type='file'>
      <source path='/var/lib/nova/instances/d47e9996-d567-48ad-89e7-f8df8482e375/console.log'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
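
For anyone repeating the verification, a check like the following could confirm that the generated devices are file-backed and carry no log element. It is a minimal sketch using only the Python standard library; the snippet is the one above wrapped in a devices element so that it parses on its own, and the helper name is made up.

```python
import xml.etree.ElementTree as ET

# The serial/console snippet from above, wrapped in <devices> so it is a
# well-formed document for the parser.
DEVICES_XML = """
<devices>
  <serial type='file'>
    <source path='/var/lib/nova/instances/d47e9996-d567-48ad-89e7-f8df8482e375/console.log'/>
    <target port='0'/>
    <alias name='serial0'/>
  </serial>
  <console type='file'>
    <source path='/var/lib/nova/instances/d47e9996-d567-48ad-89e7-f8df8482e375/console.log'/>
    <target type='serial' port='0'/>
    <alias name='serial0'/>
  </console>
</devices>
"""


def summarize_char_devices(devices_xml: str) -> list:
    """Return (tag, type, has_log_element) for each serial/console device."""
    root = ET.fromstring(devices_xml)
    summary = []
    for tag in ("serial", "console"):
        for dev in root.findall(tag):
            summary.append((tag, dev.get("type"), dev.find("log") is not None))
    return summary


if __name__ == "__main__":
    for entry in summarize_char_devices(DEVICES_XML):
        print(entry)
```

On a live system the input would come from `virsh dumpxml <instance>` instead, with the search rooted at the devices element of the domain.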

$ apt-cache policy nova-compute
nova-compute:
  Installed: 2:15.0.6-0ubuntu1~cloud0
  Candidate: 2:15.0.6-0ubuntu1~cloud0
  Version table:
 *** 2:15.0.6-0ubuntu1~cloud0 500
        500 http://ubuntu-cloud.archive.canonical.com/ubuntu xenial-proposed/ocata/main arm64 Packages
        100 /var/lib/dpkg/status
     2:15.0.5-0ubuntu1~cloud0 500
        500 http://ubuntu-cloud.archive.canonical.com/ubuntu xenial-updates/ocata/main arm64 Packages
     2:13.1.4-0ubuntu1 500
        500 http://ports.ubuntu.com/ubuntu-ports xenial-updates/main arm64 Packages
     2:13.0.0-0ubuntu2 500
        500 http://ports.ubuntu.com/ubuntu-ports xenial/main arm64 Packages

tags: added: verification-ocata-done
removed: verification-ocata-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package libvirt - 3.6.0-1ubuntu1

---------------
libvirt (3.6.0-1ubuntu1) artful; urgency=medium

  * Merged with Debian unstable (3.6)
    This closes several bugs:
    - aarch64: improved chardev handling (LP: #1697610)
    - Forbid locking memory without memtune (LP: #1708305)
  * Remaining changes:
    - Disable sheepdog (universe dependency)
    - Disable libssh2 support (universe dependency)
    - Disable firewalld support (universe dependency)
    - Disable selinux
    - Set qemu-group to kvm (for compat with older ubuntu)
    - Regularly clear AppArmor profiles for vms that no longer exist
    - Additional apport package-hook
    - Modifications to adapt for our delayed switch away from libvirt-bin (can
      be dropped >18.04).
      + d/p/ubuntu/libvirtd-service-add-bin-alias.patch: systemd: define alias
        to old service name so that old references work
      + d/p/ubuntu/libvirtd-init-add-bin-alias.patch: sysv init: define alias
        to old service name so that old references work
      + d/control: transitional package with the old name and maintainer
        scripts to handle the transition
    - Backwards compatible handling of group rename (can be dropped >18.04).
    - config details and autostart of default bridged network. Creating that is
      now the default in general, yet our solution provides the following on
      top as of today:
      + nat only on some ports <port start='1024' end='65535'/>
      + autostart the default network by default
      + do not autostart if 192.168.122.0 is already taken (e.g. in containers)
    - d/p/ubuntu/Allow-libvirt-group-to-access-the-socket.patch: This is
      the group based access to libvirt functions as it was used in Ubuntu
      for quite long.
      + d/p/ubuntu/daemon-augeas-fix-expected.patch fix some related tests
        due to the group access change.
    - ubuntu/parallel-shutdown.patch: set parallel shutdown by default.
    - d/p/ubuntu/enable-kvm-spice.patch: compat with older Ubuntu qemu/kvm
      which provided a separate kvm-spice.
    - d/p/ubuntu/storage-disable-gluster-test: gluster not enabled, skip test
    - d/p/ubuntu/ubuntu-libxl-qemu-path.patch: this change was split. The
      section that adapts the path of the emulator to the Debian/Ubuntu
      packaging is kept.
    - d/p/ubuntu/ubuntu-libxl-Fix-up-VRAM-to-minimum-requirements.patch: auto
      set VRAM to minimum requirements
    - d/p/ubuntu/xen-default-uri.patch: set default URI on xen hosts
    - Add libxl log directory
    - libvirt-uri.sh: Automatically switch default libvirt URI for users on
      Xen dom0 via user profile (was missing on changelogs before)
    - d/p/ubuntu/apibuild-skip-libvirt-common.h: drop libvirt-common.h from
      included_files to avoid build failures due to duplicate definitions.
    - Update README.Debian with Ubuntu changes
    - Convert libvirt0, libnss_libvirt and libvirt-dev to multi-arch.
    - Enable some additional features on ppc64el and s390x (for arch parity)
      + systemtap, zfs, numa and numad on s390x.
      + systemtap on ppc64el.
    - fix conffile upgrade handling to avoid obsolete files
      and inactive duplica...


Changed in libvirt (Ubuntu Artful):
status: Triaged → Fix Released
ChristianEhrhardt (paelzer) wrote :

Libvirt 3.6 is in Artful now. If anyone wants to retest this for Artful/Pike and, if possible, drop the change that goes back to old-style console generation, please feel free.
