/usr/bin/generate_vms.sh: Could not access KVM kernel module: Permission denied

Bug #1524978 reported by Maksym Strukov
22
This bug affects 3 people
Affects Status Importance Assigned to Milestone
Fuel for OpenStack
Fix Released
High
Michael Polenchuk
7.0.x
Fix Released
Critical
Alexey Stupnikov
8.0.x
Fix Released
High
Michael Polenchuk
Mitaka
Fix Released
High
Michael Polenchuk

Bug Description

Scenario:
1. Create cluster
2. Assign compute and virt roles to slave node
3. Upload configuration for one VM
4. Spawn VM
5. Wait till VM become available for allocation

Actual:
TimeoutError: Timeout waiting 2 available nodes, current nodes:
{0}Name: slave-01_compute_virt, status: ready, online: True

Nailgun app log:

'''
2015-12-09 21:33:37.585 INFO [7f33ad74f880] (manager) Trying to start deployment at cluster 'TestVirtRole'
2015-12-09 21:33:37.608 ERROR [7f33ad74f880] (helpers) create_action_log failed: 'spawn_vms'
2015-12-09 21:33:37.619 INFO [7f33ad74f880] (manager) ApplyChangesTask: execute async starting for task 1
2015-12-09 21:33:37.642 ERROR [7f33ad74f880] (helpers) Extracting of actor_id failed
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/nailgun/task/helpers.py", line 317, in prepare_action_log_kwargs
    actor_id = action_log.actor_id
AttributeError: 'NoneType' object has no attribute 'actor_id'
2015-12-09 21:33:37.693 INFO [7f33ad74f880] (provisioning_serializers) Node slave-01_compute_virt (id=1, mac=64:5f:68:69:46:df) seems booted with bootstrap image
2015-12-09 21:33:37.818 ERROR [7f33ad74f880] (helpers) Extracting of actor_id failed
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/nailgun/task/helpers.py", line 317, in prepare_action_log_kwargs
    actor_id = action_log.actor_id
AttributeError: 'NoneType' object has no attribute 'actor_id'
'''

Puppet log on node-1:

'''
2015-12-09 21:53:51 +0000 Exec[generate_vms](provider=posix) (debug): Executing '/usr/bin/generate_vms.sh /etc/libvirt/qemu /var/lib/nova'
2015-12-09 21:53:51 +0000 Puppet (debug): Executing '/usr/bin/generate_vms.sh /etc/libvirt/qemu /var/lib/nova'
2015-12-09 21:53:52 +0000 /Stage[main]/Main/Exec[generate_vms]/returns (notice): error: failed to get domain '1_vm'
2015-12-09 21:53:52 +0000 /Stage[main]/Main/Exec[generate_vms]/returns (notice): error: Domain not found: no domain with matching name '1_vm'
2015-12-09 21:53:52 +0000 /Stage[main]/Main/Exec[generate_vms]/returns (notice): Disks for 1_vm, total number 1
2015-12-09 21:53:52 +0000 /Stage[main]/Main/Exec[generate_vms]/returns (notice): Disk id: 1, disk type: file, disk format: qcow2, disk path: /var/lib/nova/1_vm.img, disk size: 60G
2015-12-09 21:53:52 +0000 /Stage[main]/Main/Exec[generate_vms]/returns (notice): Formatting '/var/lib/nova/1_vm.img', fmt=qcow2 size=64424509440 encryption=off cluster_size=65536 lazy_refcounts=off
2015-12-09 21:53:52 +0000 /Stage[main]/Main/Exec[generate_vms]/returns (notice): Domain 1_vm defined from /tmp/tmp.KJmNFJNVXm
2015-12-09 21:53:52 +0000 /Stage[main]/Main/Exec[generate_vms]/returns (notice):
2015-12-09 21:53:52 +0000 /Stage[main]/Main/Exec[generate_vms]/returns (notice): error: Failed to start domain 1_vm
2015-12-09 21:53:52 +0000 /Stage[main]/Main/Exec[generate_vms]/returns (notice): error: internal error: process exited while connecting to monitor: Could not access KVM kernel module: Permission denied
2015-12-09 21:53:52 +0000 /Stage[main]/Main/Exec[generate_vms]/returns (notice): failed to initialize KVM: Permission denied
2015-12-09 21:53:52 +0000 Puppet (err): /usr/bin/generate_vms.sh /etc/libvirt/qemu /var/lib/nova returned 1 instead of one of [0]
/usr/lib/ruby/vendor_ruby/puppet/util/errors.rb:106:in `fail'
/usr/lib/ruby/vendor_ruby/puppet/type/exec.rb:160:in `sync'
/usr/lib/ruby/vendor_ruby/puppet/transaction/resource_harness.rb:204:in `sync'
/usr/lib/ruby/vendor_ruby/puppet/transaction/resource_harness.rb:128:in `sync_if_needed'
/usr/lib/ruby/vendor_ruby/puppet/transaction/resource_harness.rb:89:in `block in perform_changes'
/usr/lib/ruby/vendor_ruby/puppet/transaction/resource_harness.rb:88:in `each'
/usr/lib/ruby/vendor_ruby/puppet/transaction/resource_harness.rb:88:in `perform_changes'
/usr/lib/ruby/vendor_ruby/puppet/transaction/resource_harness.rb:20:in `evaluate'
/usr/lib/ruby/vendor_ruby/puppet/transaction.rb:204:in `apply'
/usr/lib/ruby/vendor_ruby/puppet/transaction.rb:217:in `eval_resource'
/usr/lib/ruby/vendor_ruby/puppet/transaction.rb:147:in `call'
/usr/lib/ruby/vendor_ruby/puppet/transaction.rb:147:in `block (2 levels) in evaluate'
/usr/lib/ruby/vendor_ruby/puppet/util.rb:335:in `block in thinmark'
/usr/lib/ruby/1.9.1/benchmark.rb:295:in `realtime'
/usr/lib/ruby/vendor_ruby/puppet/util.rb:334:in `thinmark'
/usr/lib/ruby/vendor_ruby/puppet/transaction.rb:147:in `block in evaluate'
/usr/lib/ruby/vendor_ruby/puppet/graph/relationship_graph.rb:118:in `traverse'
/usr/lib/ruby/vendor_ruby/puppet/transaction.rb:138:in `evaluate'
/usr/lib/ruby/vendor_ruby/puppet/resource/catalog.rb:169:in `block in apply'
/usr/lib/ruby/vendor_ruby/puppet/util/log.rb:149:in `with_destination'
/usr/lib/ruby/vendor_ruby/puppet/transaction/report.rb:112:in `as_logging_destination'
/usr/lib/ruby/vendor_ruby/puppet/resource/catalog.rb:168:in `apply'
/usr/lib/ruby/vendor_ruby/puppet/configurer.rb:120:in `block in apply_catalog'
/usr/lib/ruby/vendor_ruby/puppet/util.rb:161:in `block in benchmark'
/usr/lib/ruby/1.9.1/benchmark.rb:295:in `realtime'
/usr/lib/ruby/vendor_ruby/puppet/util.rb:160:in `benchmark'
/usr/lib/ruby/vendor_ruby/puppet/configurer.rb:119:in `apply_catalog'
/usr/lib/ruby/vendor_ruby/puppet/configurer.rb:227:in `run_internal'
/usr/lib/ruby/vendor_ruby/puppet/configurer.rb:134:in `block in run'
/usr/lib/ruby/vendor_ruby/puppet/context.rb:64:in `override'
/usr/lib/ruby/vendor_ruby/puppet.rb:244:in `override'
/usr/lib/ruby/vendor_ruby/puppet/configurer.rb:133:in `run'
/usr/lib/ruby/vendor_ruby/puppet/application/apply.rb:302:in `apply_catalog'
/usr/lib/ruby/vendor_ruby/puppet/application/apply.rb:236:in `block in main'
/usr/lib/ruby/vendor_ruby/puppet/context.rb:64:in `override'
/usr/lib/ruby/vendor_ruby/puppet.rb:244:in `override'
/usr/lib/ruby/vendor_ruby/puppet/application/apply.rb:198:in `main'
/usr/lib/ruby/vendor_ruby/puppet/application/apply.rb:159:in `run_command'
/usr/lib/ruby/vendor_ruby/puppet/application.rb:381:in `block (2 levels) in run'
/usr/lib/ruby/vendor_ruby/puppet/application.rb:507:in `plugin_hook'
/usr/lib/ruby/vendor_ruby/puppet/application.rb:381:in `block in run'
/usr/lib/ruby/vendor_ruby/puppet/util.rb:496:in `exit_on_fail'
/usr/lib/ruby/vendor_ruby/puppet/application.rb:381:in `run'
/usr/lib/ruby/vendor_ruby/puppet/util/command_line.rb:146:in `run'
/usr/lib/ruby/vendor_ruby/puppet/util/command_line.rb:92:in `execute'
/usr/bin/puppet:8:in `<main>'
2015-12-09 21:53:52 +0000 /Stage[main]/Main/Exec[generate_vms]/returns (err): change from notrun to 0 failed: /usr/bin/generate_vms.sh /etc/libvirt/qemu /var/lib/nova returned 1 instead of one of [0]
'''

Env: 8.0-277

Revision history for this message
Maksym Strukov (unbelll) wrote :
Changed in fuel:
milestone: none → 8.0
status: New → Confirmed
tags: added: swarm-fail-driver
tags: added: area-library
Changed in fuel:
assignee: nobody → Fuel Library Team (fuel-library)
tags: added: team-bugfix
Changed in fuel:
assignee: Fuel Library Team (fuel-library) → Bartosz Kupidura (zynzel)
Revision history for this message
Dmitriy Kruglov (dkruglov) wrote :

At the moment the current issue is hidden by the following one: https://bugs.launchpad.net/fuel/+bug/1526448

Revision history for this message
Dmitriy Kruglov (dkruglov) wrote :

The issue is not reproduced (8.0, build 288) on local and swarm CI server environments (srv45-bud.infra.mirantis.net), so moving the ticket to invalid.

Changed in fuel:
status: Confirmed → Invalid
Revision history for this message
Nastya Urlapova (aurlapova) wrote :

@Dima, as usual we set Incomplete state for such features and after month move it to Invalid.

Changed in fuel:
status: Invalid → Incomplete
Revision history for this message
Sergey Shevorakov (sshevorakov) wrote :

Changing tag from swarm-fail-driver to swarm-blocker since this bug fails 3 test cases (0.9% of all).

tags: added: swarm-blocker
removed: swarm-fail-driver
Revision history for this message
Mike Scherbakov (mihgen) wrote :

How can it be Incomplete and fail 3 test cases at the same time... ? If issue is present, then it has to be confirmed. Engineers might not be able to triage it, i.e. identify root cause of an issue, but issue itself is present.

Please move to confirmed, if you still see tests failing. Attach new diagnostic snapshots / logs too, so it can be easier to identify an issue.

Revision history for this message
Sergey Shevorakov (sshevorakov) wrote :

Moved to "Confirmed", since the bug still affecting the test cases (3).
https://mirantis.testrail.com/index.php?/plans/view/5199

Changed in fuel:
status: Incomplete → Confirmed
Revision history for this message
Tatyanka (tatyana-leontovich) wrote :

Move to invalid, it is not blocked CI for now (test failed before) So I deploy case manually and issue is not reproduced:
VERSION:
  feature_groups:
    - advanced
    - mirantis
  production: "docker"
  release: "8.0"
  api: "1.0"
  build_number: "361"
  build_id: "361"
  fuel-nailgun_sha: "53c72a9600158bea873eec2af1322a716e079ea0"
  python-fuelclient_sha: "4f234669cfe88a9406f4e438b1e1f74f1ef484a5"
  fuel-agent_sha: "7463551bc74841d1049869aaee777634fb0e5149"
  fuel-nailgun-agent_sha: "92ebd5ade6fab60897761bfa084aefc320bff246"
  astute_sha: "c7ca63a49216744e0bfdfff5cb527556aad2e2a5"
  fuel-library_sha: "ba8063d34ff6419bddf2a82b1de1f37108d96082"
  fuel-ostf_sha: "889ddb0f1a4fa5f839fd4ea0c0017a3c181aa0c1"
  fuel-mirror_sha: "8adb10618bb72bb36bb018386d329b494b036573"
  fuelmenu_sha: "824f6d3ebdc10daf2f7195c82a8ca66da5abee99"
  shotgun_sha: "63645dea384a37dde5c01d4f8905566978e5d906"
  network-checker_sha: "9f0ba4577915ce1e77f5dc9c639a5ef66ca45896"
  fuel-upgrade_sha: "616a7490ec7199f69759e97e42f9b97dfc87e85b"
  fuelmain_sha: "07d5f1c3e1b352cb713852a3a96022ddb8fe2676"

Changed in fuel:
status: Confirmed → Invalid
Revision history for this message
Dmitry Kalashnik (dkalashnik) wrote :

Moving to New.
Issue is not 100% reproducible. I assume ~50% rate.

Changed in fuel:
status: Invalid → New
Revision history for this message
Bartosz Kupidura (zynzel) wrote :

On failed environment /dev/kvm have wrong permissions set after instalation:
crw------- 1 root root 10, 232 Jan 20 02:09 /dev/kvm

Should be:
crw-rw---- 1 root kvm 10, 232 Jan 20 09:48 /dev/kvm

I will ping mos-linux.

Revision history for this message
Bartosz Kupidura (zynzel) wrote :

Part of qemu-system-common.postinst:

# if we just installed the package, udev rules aren't picked up yet,
# so udev may have created the device (/dev/kvm) with default permissions.
# Fix it here, but only if the perms are like default.
# (See #607391)

if [ -c /dev/kvm -a ! -L /dev/kvm ] && [ .$(stat -c %u%g /dev/kvm) = .00 ]
then
  chgrp kvm /dev/kvm
  chmod 0660 /dev/kvm
fi

Fix:
apt-get install --reinstall qemu-system-common

qemu-system-common is installed as dependency for qemu-system which is dependency for qemu-kvm

Revision history for this message
Dmitry Teselkin (teselkin-d) wrote :

There is a bug in Ubuntu [1], but it should be fixed in qemu 2.0.0 (we use qemu from upstream)

[1] https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/1057024

tags: added: feature-reduced-footprint
Revision history for this message
Aleksander Mogylchenko (amogylchenko) wrote :

There is a race condition somewhere inside Puppet. It can be easily reproduced by:
1. apt-get purge qemu-keymaps qemu-kvm qemu-system-common qemu-system-x86 qemu-utils
2. rmmod kvm_intel
3. rmmod kvm
4. puppet apply /etc/puppet/modules/osnailyfacter/modular/generate_vms/generate_vms.pp

On the Jenkins environment this produces wrong permissions in 100% cases. Most likely it happens because puppet loads the module before udev rules are processed. If I make a small change:
root@node-2:~# diff -u /etc/puppet/modules/osnailyfacter/modular/generate_vms/generate_vms.pp.orig /etc/puppet/modules/osnailyfacter/modular/generate_vms/generate_vms.pp
--- /etc/puppet/modules/osnailyfacter/modular/generate_vms/generate_vms.pp.orig 2016-01-26 08:05:26.474700838 +0000
+++ /etc/puppet/modules/osnailyfacter/modular/generate_vms/generate_vms.pp 2016-01-26 08:03:49.097281480 +0000
@@ -2,7 +2,7 @@

 $libvirt_dir = '/etc/libvirt/qemu'
 $template_dir = '/var/lib/nova'
-$packages = ['qemu-utils', 'qemu-kvm', 'libvirt-bin', 'xmlstarlet']
+$packages = ['qemu-utils', 'qemu-kvm', 'libvirt-bin', 'xmlstarlet', 'qemu-system-common']
 $libvirt_service_name = 'libvirtd'

 $vms = hiera_array('vms_conf')

It will then produce correct /dev/kvm permissions in about 50% of cases.

Running apt-get install qemu-kvm always produces correct permissions.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/273443

Changed in fuel:
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/8.0)

Fix proposed to branch: stable/8.0
Review: https://review.openstack.org/274051

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/273443
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=6ace135380cbac51b05e10b1bb96f39ddb3dd7ee
Submitter: Jenkins
Branch: master

commit 6ace135380cbac51b05e10b1bb96f39ddb3dd7ee
Author: Michael Polenchuk <email address hidden>
Date: Thu Jan 28 13:17:19 2016 +0300

    Set permissions for /dev/kvm

    if we just installed the package, udev rules aren't picked up yet,
    so udev may have created the device (/dev/kvm) with default
    permissions which affects reduced footprint feature.

    Change-Id: I066852220ee75defb6cf1bc364bea32c098fa307
    Closes-Bug: #1524978

Changed in fuel:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/8.0)

Reviewed: https://review.openstack.org/274051
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=4e977279386d0af35f072390249256f3f65dda33
Submitter: Jenkins
Branch: stable/8.0

commit 4e977279386d0af35f072390249256f3f65dda33
Author: Michael Polenchuk <email address hidden>
Date: Thu Jan 28 13:17:19 2016 +0300

    Set permissions for /dev/kvm

    if we just installed the package, udev rules aren't picked up yet,
    so udev may have created the device (/dev/kvm) with default
    permissions which affects reduced footprint feature.

    Change-Id: I066852220ee75defb6cf1bc364bea32c098fa307
    Closes-Bug: #1524978

Revision history for this message
Tatyanka (tatyana-leontovich) wrote :
Revision history for this message
Ksenia Svechnikova (kdemina) wrote :

Verify on fuel-9.0-220-2016 ISO, swarm tests is green

Changed in fuel:
status: Fix Committed → Fix Released
Revision history for this message
Ksenia Svechnikova (kdemina) wrote :
tags: added: customer-found
Revision history for this message
Rene Soto (rsoto) wrote :

sla1 for 7.0-mu-5

tags: added: sla1
Revision history for this message
Alexey Stupnikov (astupnikov) wrote :

Steps to reproduce from comment #15 doesn't work for MOS 7. I will try to figure out something else.

Revision history for this message
Alexey Stupnikov (astupnikov) wrote :

I see two possible conditions to catch this bug:
1. There is no qemu-system-common package (it provides scripts to setup correct permissions on /dev/kvm) on compute nodes
2. puppets was changed and puppet mixed their relative order (puppet doesn't follow the order from original manifest file, but executes resources statements in a predictable manner).

So we don't need to verify this bug, it looks like a possible issue. We just have to fix it.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/7.0)

Fix proposed to branch: stable/7.0
Review: https://review.openstack.org/351632

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/7.0)

Reviewed: https://review.openstack.org/351632
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=8c581b776ec805d18207b73033d0f463f7789507
Submitter: Jenkins
Branch: stable/7.0

commit 8c581b776ec805d18207b73033d0f463f7789507
Author: Michael Polenchuk <email address hidden>
Date: Thu Jan 28 13:17:19 2016 +0300

    Set permissions for /dev/kvm

    if we just installed the package, udev rules aren't picked up yet,
    so udev may have created the device (/dev/kvm) with default
    permissions which affects reduced footprint feature.

    Change-Id: I066852220ee75defb6cf1bc364bea32c098fa307
    Closes-Bug: #1524978
    (cherry picked from commit 4e977279386d0af35f072390249256f3f65dda33)

Revision history for this message
TatyanaGladysheva (tgladysheva) wrote :

Fix is available in MOS 7.0 MU5 updates. Move to Fix Released.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.