periodic-tempest-dsvm-neutron-full-ssh-master fails on the gate - libguestfs installed but not usable (/usr/bin/supermin exited with error status 1.

Bug #1646002 reported by Ken'ichi Ohmichi
This bug affects 1 person
Affects                   Status        Importance  Assigned to      Milestone
OpenStack Compute (nova)  Invalid       Undecided   Unassigned
devstack                  Fix Released  High        Andrea Frittoli
tempest                   Fix Released  High        Andrea Frittoli

Bug Description

The log is http://logs.openstack.org/periodic/periodic-tempest-dsvm-neutron-full-ssh-master/14ef08a/logs/

test_create_server_with_personality failed with the following traceback:

Traceback (most recent call last):
  File "tempest/api/compute/servers/test_server_personality.py", line 63, in test_create_server_with_personality
    validatable=True)
  File "tempest/api/compute/base.py", line 233, in create_test_server
    **kwargs)
  File "tempest/common/compute.py", line 167, in create_test_server
    % server['id'])
  File "/opt/stack/new/tempest/.tox/tempest/local/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/opt/stack/new/tempest/.tox/tempest/local/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "tempest/common/compute.py", line 149, in create_test_server
    clients.servers_client, server['id'], wait_until)
  File "tempest/common/waiters.py", line 75, in wait_for_server_status
    server_id=server_id)
tempest.exceptions.BuildErrorException: Server 55df9d1c-3316-43a5-81fe-63ff10216b5e failed to build and is in ERROR status
Details: {u'message': u'No valid host was found. There are not enough hosts available.', u'code': 500, u'created': u'2016-11-29T06:28:57Z'}

Revision history for this message
Ken'ichi Ohmichi (oomichi) wrote :

3 of 4 tests in tempest.api.compute.servers.test_server_personality.ServerPersonalityTestJSON failed.

I am curious why only these personality-related tests failed while the other create-server tests did not fail at all.

Changed in tempest:
importance: Undecided → Medium
Revision history for this message
Masayuki Igawa (igawa) wrote :
Revision history for this message
Ken'ichi Ohmichi (oomichi) wrote :

nova-sched.log:

2016-11-29 06:28:56.980 13254 DEBUG nova.filters [req-6e0f5671-b892-4ba3-a5f7-8055760c9887 tempest-ServerPersonalityTestJSON-701780416 tempest-ServerPersonalityTestJSON-701780416] Starting with 1 host(s) get_filtered_objects /opt/stack/new/nova/nova/filters.py:70
2016-11-29 06:28:56.980 13254 INFO nova.scheduler.filters.retry_filter [req-6e0f5671-b892-4ba3-a5f7-8055760c9887 tempest-ServerPersonalityTestJSON-701780416 tempest-ServerPersonalityTestJSON-701780416] Host [u'ubuntu-xenial-osic-cloud1-disk-5852273', u'ubuntu-xenial-osic-cloud1-disk-5852273'] fails. Previously tried hosts: [[u'ubuntu-xenial-osic-cloud1-disk-5852273', u'ubuntu-xenial-osic-cloud1-disk-5852273']]
2016-11-29 06:28:56.981 13254 INFO nova.filters [req-6e0f5671-b892-4ba3-a5f7-8055760c9887 tempest-ServerPersonalityTestJSON-701780416 tempest-ServerPersonalityTestJSON-701780416] Filter RetryFilter returned 0 hosts
2016-11-29 06:28:56.981 13254 DEBUG nova.filters [req-6e0f5671-b892-4ba3-a5f7-8055760c9887 tempest-ServerPersonalityTestJSON-701780416 tempest-ServerPersonalityTestJSON-701780416] Filtering removed all hosts for the request with instance ID '55df9d1c-3316-43a5-81fe-63ff10216b5e'. Filter results: [('RetryFilter', None)] get_filtered_objects /opt/stack/new/nova/nova/filters.py:129
2016-11-29 06:28:56.981 13254 INFO nova.filters [req-6e0f5671-b892-4ba3-a5f7-8055760c9887 tempest-ServerPersonalityTestJSON-701780416 tempest-ServerPersonalityTestJSON-701780416] Filtering removed all hosts for the request with instance ID '55df9d1c-3316-43a5-81fe-63ff10216b5e'. Filter results: ['RetryFilter: (start: 1, end: 0)']
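
One way to read the scheduler log above: the only candidate host was already in the list of previously tried hosts for this request, so the retry filter rejected it and nothing was left to schedule on. The following is a minimal Python sketch of that behaviour, assuming a simplified filter that only compares [host, node] pairs. The function names are hypothetical and this is not nova's actual RetryFilter code.

```python
def host_passes(host, node, previously_tried):
    """Reject a host that was already tried for this request."""
    return [host, node] not in previously_tried


def filter_hosts(candidates, previously_tried):
    """Keep only the (host, node) candidates that have not been tried yet."""
    return [
        (host, node)
        for host, node in candidates
        if host_passes(host, node, previously_tried)
    ]


# Host name taken from the log lines above.
HOST = "ubuntu-xenial-osic-cloud1-disk-5852273"

# One candidate, and that same candidate is already in the tried list,
# so the filter leaves no hosts.
remaining = filter_hosts([(HOST, HOST)], [[HOST, HOST]])
print(remaining)
```

With a single-node deployment like this gate job, any earlier failure on the host therefore surfaces to the API user as "No valid host was found".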

Revision history for this message
Ken'ichi Ohmichi (oomichi) wrote :

These tests should be skipped, per https://review.openstack.org/#/c/336953/

So this problem seems to be a misconfiguration of the periodic job.

Revision history for this message
Ken'ichi Ohmichi (oomichi) wrote :

The project-config change https://review.openstack.org/#/c/336955/ enables it for this job.

This seems intentional, so we need to get feedback from Matt.

Revision history for this message
Matt Riedemann (mriedem) wrote :

It's a libguestfs packaging issue on the xenial nodes:

http://logs.openstack.org/periodic/periodic-tempest-dsvm-neutron-full-ssh-master/14ef08a/logs/screen-n-cpu.txt.gz?level=TRACE#_2016-11-29_06_28_51_129

2016-11-29 06:28:51.130 13891 ERROR nova.compute.manager [req-b94c54fd-74aa-4c33-a1c1-db7862b72072 tempest-ServerPersonalityTestJSON-701780416 tempest-ServerPersonalityTestJSON-701780416] [instance: eae07c9b-2fb7-4ca3-921d-55545e154911] Instance failed to spawn
2016-11-29 06:28:51.130 13891 ERROR nova.compute.manager [instance: eae07c9b-2fb7-4ca3-921d-55545e154911] Traceback (most recent call last):
2016-11-29 06:28:51.130 13891 ERROR nova.compute.manager [instance: eae07c9b-2fb7-4ca3-921d-55545e154911] File "/opt/stack/new/nova/nova/compute/manager.py", line 2117, in _build_resources
2016-11-29 06:28:51.130 13891 ERROR nova.compute.manager [instance: eae07c9b-2fb7-4ca3-921d-55545e154911] yield resources
2016-11-29 06:28:51.130 13891 ERROR nova.compute.manager [instance: eae07c9b-2fb7-4ca3-921d-55545e154911] File "/opt/stack/new/nova/nova/compute/manager.py", line 1924, in _build_and_run_instance
2016-11-29 06:28:51.130 13891 ERROR nova.compute.manager [instance: eae07c9b-2fb7-4ca3-921d-55545e154911] block_device_info=block_device_info)
2016-11-29 06:28:51.130 13891 ERROR nova.compute.manager [instance: eae07c9b-2fb7-4ca3-921d-55545e154911] File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 2634, in spawn
2016-11-29 06:28:51.130 13891 ERROR nova.compute.manager [instance: eae07c9b-2fb7-4ca3-921d-55545e154911] admin_pass=admin_password)
2016-11-29 06:28:51.130 13891 ERROR nova.compute.manager [instance: eae07c9b-2fb7-4ca3-921d-55545e154911] File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 3040, in _create_image
2016-11-29 06:28:51.130 13891 ERROR nova.compute.manager [instance: eae07c9b-2fb7-4ca3-921d-55545e154911] fallback_from_host)
2016-11-29 06:28:51.130 13891 ERROR nova.compute.manager [instance: eae07c9b-2fb7-4ca3-921d-55545e154911] File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 3157, in _create_and_inject_local_root
2016-11-29 06:28:51.130 13891 ERROR nova.compute.manager [instance: eae07c9b-2fb7-4ca3-921d-55545e154911] files)
2016-11-29 06:28:51.130 13891 ERROR nova.compute.manager [instance: eae07c9b-2fb7-4ca3-921d-55545e154911] File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 2988, in _inject_data
2016-11-29 06:28:51.130 13891 ERROR nova.compute.manager [instance: eae07c9b-2fb7-4ca3-921d-55545e154911] instance=instance)
2016-11-29 06:28:51.130 13891 ERROR nova.compute.manager [instance: eae07c9b-2fb7-4ca3-921d-55545e154911] File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2016-11-29 06:28:51.130 13891 ERROR nova.compute.manager [instance: eae07c9b-2fb7-4ca3-921d-55545e154911] self.force_reraise()
2016-11-29 06:28:51.130 13891 ERROR nova.compute.manager [instance: eae07c9b-2fb7-4ca3-921d-55545e154911] File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2016-11-29 06:28:51.130 13891 ERROR nova.compute.manager [instance: eae07c9b-2fb7-4ca3-921d-55545e15491...


summary: - periodic-tempest-dsvm-neutron-full-ssh-master fails on the gate
+ periodic-tempest-dsvm-neutron-full-ssh-master fails on the gate -
+ libguestfs installed but not usable (/usr/bin/supermin exited with error
+ status 1.
Revision history for this message
Matt Riedemann (mriedem) wrote :

There is an old ubuntu bug for the same thing:

https://bugs.launchpad.net/ubuntu/+source/libguestfs/+bug/1086974

But that says it was fixed long ago.

Revision history for this message
Matt Riedemann (mriedem) wrote :
Revision history for this message
Matt Riedemann (mriedem) wrote :

Marked this invalid for tempest as it's not a tempest issue; it's a setup issue in devstack, or maybe something we can work around in nova.

Changed in tempest:
status: New → Confirmed
status: Confirmed → Invalid
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tempest (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/405076

Changed in devstack:
assignee: nobody → Andrea Frittoli (andrea-frittoli)
Changed in devstack:
status: New → In Progress
Revision history for this message
Andrea Frittoli (andrea-frittoli) wrote :

I proposed a fix on the devstack side. I think it would be good to document in nova that, to enable file injection on Ubuntu with libvirt, one must loosen the ACL on the kernel image.
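
To see whether a host is affected by the kernel permission problem described here, something like the following Python sketch could be used. It assumes kernel images live under /boot with names starting with "vmlinuz-"; that naming and the helper names are assumptions for illustration, and this is not the devstack fix itself.

```python
import os
import stat


def is_world_readable(path):
    """Return True if the 'other' read permission bit is set on path."""
    return bool(os.stat(path).st_mode & stat.S_IROTH)


def unreadable_kernels(boot_dir="/boot"):
    """List vmlinuz-* files in boot_dir that are not world readable."""
    if not os.path.isdir(boot_dir):
        return []
    return sorted(
        os.path.join(boot_dir, name)
        for name in os.listdir(boot_dir)
        if name.startswith("vmlinuz-")
        and not is_world_readable(os.path.join(boot_dir, name))
    )


if __name__ == "__main__":
    for kernel in unreadable_kernels():
        print("not world readable: %s" % kernel)
```

Any path it prints is a kernel image that an unprivileged libguestfs process would likely fail to read.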

Revision history for this message
Masayuki Igawa (igawa) wrote :

To avoid job failures, I made a skip patch for tempest:

https://review.openstack.org/#/c/405076/

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Related fix proposed to branch: master
Review: https://review.openstack.org/405283

Revision history for this message
Andrea Frittoli (andrea-frittoli) wrote :

Even after fixing the kernel permissions, it doesn't work. Attaching the log file from nova with libguestfs tracing on.

Revision history for this message
Andrea Frittoli (andrea-frittoli) wrote :
Revision history for this message
Andrea Frittoli (andrea-frittoli) wrote :

Apparently libguestfs looks for /etc in the disk image, and it fails when it doesn't find it:

2016-12-01 17:02:53.644 12488 DEBUG nova.virt.disk.vfs.guestfs [-] event=trace eh=0 buf='aug_get "/files/etc/passwd/root/uid"' array=[] log_callback /opt/stack/new/nova/nova/virt/disk/vfs/guestfs.py:105
2016-12-01 17:02:53.645 12488 DEBUG nova.virt.disk.vfs.guestfs [-] event=appliance eh=0 buf='guestfsd: main_loop: proc 33 (mkdir_p) took 0.01 seconds' array=[] log_callback /opt/stack/new/nova/nova/virt/disk/vfs/guestfs.py:105
2016-12-01 17:02:53.648 12488 DEBUG nova.virt.disk.vfs.guestfs [-] event=appliance eh=0 buf='guestfsd: main_loop: new request, len 0x48^M
guestfsd: error: no matching node^M
guestfsd: main_loop: proc 19 (aug_get) took 0.00 seconds' array=[] log_callback /opt/stack/new/nova/nova/virt/disk/vfs/guestfs.py:105
2016-12-01 17:02:53.648 12488 DEBUG nova.virt.disk.vfs.guestfs [-] event=trace eh=0 buf='aug_get = NULL (error)' array=[] log_callback /opt/stack/new/nova/nova/virt/disk/vfs/guestfs.py:105

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.openstack.org/401366
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=e604946018fb46103d7684b93d8d97819229529c
Submitter: Jenkins
Branch: master

commit e604946018fb46103d7684b93d8d97819229529c
Author: Kashyap Chamarthy <email address hidden>
Date: Wed Nov 23 18:26:33 2016 +0100

    guestfs: Don't report exception if there's read access to kernel

    Commit 92ae0f1 ("libvirt - Add log if libguestfs can't read host
    kernel") reworks the logic of handling access to Kernel for libguestfs.
    In doing that, it erroneously raises an exception when libguestfs is
    _able_ to access the Kernel.

    Fix it by reporting exception only when libguestfs does _not_ have
    read access to the Kernel.

    This was first tried by Kevin Zhao here
    Ic6802650cb8f93e0d02c51e9014eb85a7e71f6fe, but is abandoned for some
    reason.

    This also adds a little more direction on what to do to fix this error.

    Related-Bug: #1646002

    Change-Id: Id6b4108e4e4af7c98b3e1bd9de3a09ad057e7b92
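
The corrected logic described in this commit message can be sketched roughly as below: an error is reported only when the kernel is not readable. This is a minimal Python illustration, not the nova code; the exception class and function name are hypothetical.

```python
import os


class KernelNotReadableError(Exception):
    """Hypothetical exception type used only for this illustration."""


def check_kernel_readable(kernel_path):
    """Raise only when the current user cannot read kernel_path."""
    if os.access(kernel_path, os.R_OK):
        # Readable: nothing to report. Raising in this branch would be
        # the inverted behaviour the commit message describes.
        return
    raise KernelNotReadableError(
        "libguestfs cannot read %s; check the file's permissions" % kernel_path
    )
```

Note that os.access also returns False for a path that does not exist, so this sketch treats a missing kernel file the same as an unreadable one.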

Revision history for this message
Matt Riedemann (mriedem) wrote :

Marked this invalid for nova as I don't think there is anything we can fix in nova for this; it seems to be a deployment issue.

Changed in nova:
status: New → Invalid
Changed in devstack:
importance: Undecided → High
no longer affects: tempest
Revision history for this message
Andrea Frittoli (andrea-frittoli) wrote :

The gate job has switched to xenial, and I'm seeing the same issue there as in my devstack:

RuntimeError: aug_get: no matching node

http://logs.openstack.org/83/405283/5/check/gate-tempest-dsvm-neutron-full-ssh/3b42bdb/logs/screen-n-cpu.txt.gz?level=ERROR#_2016-12-03_20_42_31_792

Revision history for this message
Andrea Frittoli (andrea-frittoli) wrote :

The root disk for the cirros image is blank before boot.

The boot process starts from the initrd. The file system in the initrd is then copied to /dev/vda and boot continues from there. Injection happens before boot, so the /etc directory does not exist yet.

The test should inject to / instead.
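
As an illustration of injecting to / instead, a personality entry for a create-server request could be built as in the Python sketch below. It assumes the usual personality format of a target path plus base64-encoded contents; the helper name, file name, and contents are made up for the example.

```python
import base64


def build_personality(path, contents):
    """Build one personality entry: a target path plus base64 contents."""
    encoded = base64.b64encode(contents.encode("utf-8")).decode("ascii")
    return [{"path": path, "contents": encoded}]


# Inject directly under "/" rather than under "/etc", which does not
# exist on the blank CirrOS root disk before boot.
personality = build_personality("/test.txt", "This is a test file.")
print(personality)
```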

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tempest (master)

Fix proposed to branch: master
Review: https://review.openstack.org/406914

Changed in tempest:
status: New → In Progress
Changed in tempest:
assignee: nobody → Andrea Frittoli (andrea-frittoli)
importance: Undecided → High
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on tempest (master)

Change abandoned by Andrea Frittoli (<email address hidden>) on branch: master
Review: https://review.openstack.org/405283

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to devstack (master)

Reviewed: https://review.openstack.org/404981
Committed: https://git.openstack.org/cgit/openstack-dev/devstack/commit/?id=1c442eebc8fe005af453bd610e750a1919a2b3ed
Submitter: Jenkins
Branch: master

commit 1c442eebc8fe005af453bd610e750a1919a2b3ed
Author: Andrea Frittoli <email address hidden>
Date: Wed Nov 30 20:44:44 2016 +0000

    Fix libguestfs on Ubuntu

    libguestfs does not work on ubuntu because the kernel is not
    world readable. This breaks file injection with libvirt.
    See https://bugs.launchpad.net/ubuntu/+source/linux/+bug/759725
    for more details.

    The workaround proposed by Ubuntu is to relax the kernel ACL
    if needed, so we need to do that in case file injection is
    enabled on an Ubuntu host running libvirt.

    Partial-bug: #1646002
    Change-Id: I405793b9e145308e51a08710d8e5df720aec6fde

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to tempest (master)

Reviewed: https://review.openstack.org/406914
Committed: https://git.openstack.org/cgit/openstack/tempest/commit/?id=9fb9d55ec55e4f5105de0cd6f19b530786ec91a2
Submitter: Jenkins
Branch: master

commit 9fb9d55ec55e4f5105de0cd6f19b530786ec91a2
Author: Andrea Frittoli <email address hidden>
Date: Mon Dec 5 12:22:25 2016 +0000

    Change personality inject path to /

    The CirrOS image root disk is empty, it's only populated during
    boot from the initrd image. So we can only safely inject files
    before boot into '/' directly.

    Closes-bug: #1646002

    Depends-on: I405793b9e145308e51a08710d8e5df720aec6fde
    Change-Id: I2092059acdeab0755215e7ae690e243b5b4df367

Changed in tempest:
status: In Progress → Fix Released
Changed in devstack:
status: In Progress → Fix Released
Changed in tempest:
status: Fix Released → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on tempest (master)

Change abandoned by Masayuki Igawa (<email address hidden>) on branch: master
Review: https://review.openstack.org/405076
Reason: Oh, Andrea's patch was merged already. Cool!

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.openstack.org/407037
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=ce577442b06372b37ba849a441ab3f4cb6353da8
Submitter: Jenkins
Branch: master

commit ce577442b06372b37ba849a441ab3f4cb6353da8
Author: Andrea Frittoli <email address hidden>
Date: Mon Dec 5 15:34:11 2016 +0000

    Guestfs handle no passwd or group in image

    When setting ownership of a file or directory, the guestfs driver
    looks for the /etc/passwd and/or /etc/group files. In case they
    are not found, the current driver lets the augeas RuntimeError
    through, which does not produce a very helpful error message.
    Fixing that by handling the original exception and raising a
    Nova exception with more details in it.

    Related-bug: #1646002

    Change-Id: I2d15865c8be13b938e10e67c1b1b160f2a80f0c0
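
The approach described in this commit message, catching the low-level RuntimeError and raising something more descriptive, might look roughly like the Python sketch below. It uses a fake handle so it runs without libguestfs installed; the fake class, the exception class, and lookup_uid are all hypothetical names, and the real nova change may differ in its details.

```python
class FileOwnershipLookupError(Exception):
    """Hypothetical stand-in for a more descriptive Nova exception."""


class FakeGuestHandle:
    """Fake handle that mimics aug_get failing on a missing node."""

    def __init__(self, nodes):
        self._nodes = dict(nodes)

    def aug_get(self, path):
        if path not in self._nodes:
            # Same message as the error seen in the gate logs.
            raise RuntimeError("aug_get: no matching node")
        return self._nodes[path]


def lookup_uid(handle, user):
    """Look up a user's uid, translating the low-level error."""
    path = "/files/etc/passwd/%s/uid" % user
    try:
        return int(handle.aug_get(path))
    except RuntimeError as exc:
        raise FileOwnershipLookupError(
            "cannot resolve user %r: /etc/passwd is missing from the image "
            "or has no entry for it (%s)" % (user, exc)
        ) from exc
```

With an image that has no /etc/passwd, such as the blank CirrOS root disk, the caller then sees an error naming the missing file rather than a bare "no matching node".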

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/newton)

Related fix proposed to branch: stable/newton
Review: https://review.openstack.org/409707

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/newton)

Reviewed: https://review.openstack.org/409707
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=b7e47f8a88728192f4035cf97c2b818a843c1742
Submitter: Jenkins
Branch: stable/newton

commit b7e47f8a88728192f4035cf97c2b818a843c1742
Author: Andrea Frittoli <email address hidden>
Date: Mon Dec 5 15:34:11 2016 +0000

    Guestfs handle no passwd or group in image

    When setting ownership of a file or directory, the guestfs driver
    looks for the /etc/passwd and/or /etc/group files. In case they
    are not found, the current driver lets the augeas RuntimeError
    through, which does not produce a very helpful error message.
    Fixing that by handling the original exception and raising a
    Nova exception with more details in it.

    Related-bug: #1646002

    Change-Id: I2d15865c8be13b938e10e67c1b1b160f2a80f0c0
    (cherry picked from commit ce577442b06372b37ba849a441ab3f4cb6353da8)

tags: added: in-stable-newton
Changed in tempest:
status: Fix Committed → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to devstack (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/476368

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to devstack (stable/newton)

Reviewed: https://review.openstack.org/476368
Committed: https://git.openstack.org/cgit/openstack-dev/devstack/commit/?id=5d76eaf9376e23f7da674b03fe41c823536595bb
Submitter: Jenkins
Branch: stable/newton

commit 5d76eaf9376e23f7da674b03fe41c823536595bb
Author: Andrea Frittoli <email address hidden>
Date: Wed Nov 30 20:44:44 2016 +0000

    Fix libguestfs on Ubuntu

    libguestfs does not work on ubuntu because the kernel is not
    world readable. This breaks file injection with libvirt.
    See https://bugs.launchpad.net/ubuntu/+source/linux/+bug/759725
    for more details.

    The workaround proposed by Ubuntu is to relax the kernel ACL
    if needed, so we need to do that in case file injection is
    enabled on an Ubuntu host running libvirt.

    Partial-bug: #1646002
    Change-Id: I405793b9e145308e51a08710d8e5df720aec6fde
    (cherry picked from commit 1c442eebc8fe005af453bd610e750a1919a2b3ed)
