fuel-devops failed to revert snapshot (libvirt.libvirtError: Requested operation is not valid: domain is not running)

Bug #1641568 reported by Ilya Bumarskov
Affects: Fuel for OpenStack
Status: Fix Released
Importance: High
Assigned to: Fuel QA Team

Bug Description

fuel-devops actions (reverting snapshots, erasing an environment) fail. The bug is intermittent and affects the PCE CI (vCenter/DVS/NSXv/NSX-t suites: http://jenkins-tpi.bud.mirantis.net:8080)

Observed behaviour:
http://jenkins-tpi.bud.mirantis.net:8080/job/9.x.vcenter.vcenter_dvs_smoke/31/console

dos.py erase 9.1.manila.343
Traceback (most recent call last):
  File "/home/jenkins/90-venv/bin/dos.py", line 22, in <module>
    main()
  File "/home/jenkins/90-venv/local/lib/python2.7/site-packages/devops/shell.py", line 610, in main
    Shell(args).execute()
  File "/home/jenkins/90-venv/local/lib/python2.7/site-packages/devops/shell.py", line 54, in execute
    self.commands.get(self.params.command)(self)
  File "/home/jenkins/90-venv/local/lib/python2.7/site-packages/devops/shell.py", line 91, in do_erase
    self.env.erase()
  File "/home/jenkins/90-venv/local/lib/python2.7/site-packages/devops/models/environment.py", line 155, in erase
    node.erase()
  File "/home/jenkins/90-venv/local/lib/python2.7/site-packages/devops/models/node.py", line 207, in erase
    self.remove(verbose=False)
  File "/home/jenkins/90-venv/local/lib/python2.7/site-packages/devops/models/node.py", line 214, in remove
    self.destroy(verbose=False)
  File "/home/jenkins/90-venv/local/lib/python2.7/site-packages/devops/models/node.py", line 204, in destroy
    self.driver.node_destroy(self)
  File "/home/jenkins/90-venv/local/lib/python2.7/site-packages/devops/helpers/retry.py", line 27, in wrapper
    return func(*args, **kwargs)
  File "/home/jenkins/90-venv/local/lib/python2.7/site-packages/devops/driver/libvirt/libvirt_driver.py", line 502, in node_destroy
    self.conn.lookupByUUIDString(node.uuid).destroy()
  File "/home/jenkins/90-venv/local/lib/python2.7/site-packages/libvirt.py", line 922, in destroy
    if ret == -1: raise libvirtError ('virDomainDestroy() failed', dom=self)
libvirt.libvirtError: Requested operation is not valid: domain is not running

http://jenkins-tpi.bud.mirantis.net:8080/job/9.x.nsxt.nsxt_bvt/188/console
2016-11-13 19:36:07 - ERROR decorators.py:126 -- Traceback (most recent call last):
  File "/home/jenkins/workspace/9.x.nsxt.nsxt_bvt/plugin_test/fuel-qa/fuelweb_test/helpers/decorators.py", line 120, in wrapper
    result = func(*args, **kwargs)
  File "/home/jenkins/workspace/9.x.nsxt.nsxt_bvt/plugin_test/tests/test_plugin_nsxt.py", line 169, in nsxt_bvt
    self.env.revert_snapshot("ready_with_5_slaves")
  File "/home/jenkins/workspace/9.x.nsxt.nsxt_bvt/plugin_test/fuel-qa/fuelweb_test/models/environment.py", line 305, in revert_snapshot
    self.d_env.revert(name)
  File "/home/jenkins/90-venv/local/lib/python2.7/site-packages/devops/models/environment.py", line 186, in revert
    node.destroy(verbose=False)
  File "/home/jenkins/90-venv/local/lib/python2.7/site-packages/devops/models/node.py", line 204, in destroy
    self.driver.node_destroy(self)
  File "/home/jenkins/90-venv/local/lib/python2.7/site-packages/devops/helpers/retry.py", line 27, in wrapper
    return func(*args, **kwargs)
  File "/home/jenkins/90-venv/local/lib/python2.7/site-packages/devops/driver/libvirt/libvirt_driver.py", line 502, in node_destroy
    self.conn.lookupByUUIDString(node.uuid).destroy()
  File "/home/jenkins/90-venv/local/lib/python2.7/site-packages/libvirt.py", line 1059, in destroy
    if ret == -1: raise libvirtError ('virDomainDestroy() failed', dom=self)
libvirtError: Requested operation is not valid: domain is not running

Tags: area-qa
Changed in fuel:
importance: Undecided → High
milestone: none → 9.2
Changed in fuel:
assignee: nobody → Fuel QA Team (fuel-qa)
tags: added: area-qa
Dennis Dmitriev (ddmitriev) wrote :

The issue is caused by very slow operations on network devices, because VMware is running on the same server.
Libvirt 1.2.12 raises this error after almost every VM is destroyed.

Recommendations:
1. Move the VMware workloads to another server, if possible.
2. Add workarounds for libvirt.

Dennis Dmitriev (ddmitriev) wrote :

Workaround for fuel-devops 2.9 (bugfix only): https://review.openstack.org/#/c/399558/
Workaround for fuel-devops 3.0: https://review.openstack.org/#/c/399555/

OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-devops (release/2.9)

Reviewed: https://review.openstack.org/399558
Committed: https://git.openstack.org/cgit/openstack/fuel-devops/commit/?id=125cdabc3492e7355183207dcd937eaf64ebf9f9
Submitter: Jenkins
Branch: release/2.9

commit 125cdabc3492e7355183207dcd937eaf64ebf9f9
Author: Dennis Dmitriev <email address hidden>
Date: Fri Nov 18 13:55:59 2016 +0200

    Workaround for LP#1641568

    If a server performs a very slow operations with network devices,
    libvirt can raise an exception when destroy a domain.
    However, domain is actually destroyed.

    Make a workaround to continue with other domains.

    Related-Bug:#1641568

    Change-Id: I0c38dd1977b410723a530724d34fe2ac231cf0b3
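
The workaround described in the commit message above can be sketched roughly as follows. This is not the merged change, only a minimal illustration under stated assumptions: `destroy_tolerant` and `FakeDomain` are hypothetical names, and the error class is passed in as a parameter so the sketch runs without libvirt installed. With the real Python bindings one would presumably pass `libvirt.libvirtError` and a domain object, whose `destroy()` and `isActive()` methods are the only calls used here.

```python
class FakeDomain(object):
    """Hypothetical stand-in for a libvirt domain, for illustration only."""

    def __init__(self, active, fail_on_destroy):
        self.active = active
        self.fail_on_destroy = fail_on_destroy

    def isActive(self):
        # libvirt's virDomain.isActive() returns 1 for running, 0 otherwise
        return 1 if self.active else 0

    def destroy(self):
        # Simulate the reported behaviour: the domain stops,
        # but the call still raises an error.
        self.active = False
        if self.fail_on_destroy:
            raise RuntimeError(
                "Requested operation is not valid: domain is not running")


def destroy_tolerant(domain, error_cls=Exception):
    """Destroy a domain and ignore the error if it is stopped afterwards.

    Returns True on a clean destroy, False if an error was swallowed.
    Re-raises when the domain is still active after the failed call.
    """
    try:
        domain.destroy()
    except error_cls:
        if domain.isActive():
            raise  # the domain really is still running
        return False
    return True


if __name__ == "__main__":
    nodes = [FakeDomain(True, True), FakeDomain(True, False)]
    # One failing destroy no longer aborts the loop over the other domains.
    print([destroy_tolerant(node, RuntimeError) for node in nodes])
```

Checking `isActive()` after the failure keeps the sketch from hiding a domain that could not be stopped at all.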

Dmitry Belyaninov (dbelyaninov) wrote :

@ibumarskov, please check whether the problem is solved.

Changed in fuel:
status: New → Fix Committed
Changed in fuel:
status: Fix Committed → Fix Released
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-devops (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/429720

OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-devops (master)

Reviewed: https://review.openstack.org/429720
Committed: https://git.openstack.org/cgit/openstack/fuel-devops/commit/?id=aa4679ce01901d57dc6a57dac42aa6b466bc6dbf
Submitter: Jenkins
Branch: master

commit aa4679ce01901d57dc6a57dac42aa6b466bc6dbf
Author: Dennis Dmitriev <email address hidden>
Date: Mon Feb 6 16:12:57 2017 +0200

    Fix unit tests for libvirt 2.5+

    In libvirt v2.5+, domain.destroy() raises an error in unit tests:
    "Requested operation is not valid: domain is not running"

    - check if the domain is active in the unit tests.

    Change-Id: I92420699738ea3e5ece4e9137ac797fbcbc015ea
    Related-bug:#1641568
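
The "check if the domain is active" idea from the commit message above might look like the sketch below. It is an assumption about the shape of such a guard, not the code from the review: `destroy_if_active` and `FakeDomain` are hypothetical names, and the fake raises on `destroy()` for an inactive domain to mimic the behaviour the commit message reports for libvirt 2.5+.

```python
class FakeDomain(object):
    """Hypothetical stand-in that rejects destroy() on an inactive domain."""

    def __init__(self, active):
        self.active = active
        self.destroy_calls = 0

    def isActive(self):
        return 1 if self.active else 0

    def destroy(self):
        if not self.active:
            raise RuntimeError(
                "Requested operation is not valid: domain is not running")
        self.destroy_calls += 1
        self.active = False


def destroy_if_active(domain):
    """Call destroy() only when the domain reports itself as running.

    Returns True if destroy() was called, False if it was skipped.
    """
    if domain.isActive():
        domain.destroy()
        return True
    return False


if __name__ == "__main__":
    node = FakeDomain(active=True)
    print(destroy_if_active(node))  # first call destroys the domain
    print(destroy_if_active(node))  # second call is skipped, no error
```

A guard like this still leaves a small window between the check and the call, so on a real hypervisor it may be combined with error handling around `destroy()`.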
