transient failures during lxc test during shutdown

Bug #1783198 reported by Scott Moser
Affects: cloud-init
Status: Won't Fix
Importance: Medium
Assigned to: Scott Moser

Bug Description

We have been seeing a lot of transient failures in the lxd integration test runs, for example:

https://jenkins.ubuntu.com/server/job/cloud-init-integration-lxd-c/72/consoleFull

The failures come with a stack trace like the one below.

I think we might be attempting to delete an instance twice, or shutting it down twice, but I am not sure.

2018-07-20 12:20:30,781 - tests.cloud_tests - DEBUG - executing "collect: instance-id"
2018-07-20 12:20:46,612 - tests.cloud_tests - ERROR - stage: collect test data for cosmic encountered error: not found
2018-07-20 12:20:46,614 - tests.cloud_tests - ERROR - traceback:
  File "/var/lib/jenkins/slaves/torkoal/workspace/cloud-init-integration-lxd-c/cloud-init/tests/cloud_tests/stage.py", line 97, in run_stage
    (call_res, call_failed) = call()
  File "/var/lib/jenkins/slaves/torkoal/workspace/cloud-init-integration-lxd-c/cloud-init/tests/cloud_tests/collect.py", line 111, in collect_test_data
    instance.shutdown()
  File "/var/lib/jenkins/slaves/torkoal/workspace/cloud-init-integration-lxd-c/cloud-init/tests/cloud_tests/platforms/lxd/instance.py", line 171, in shutdown
    self.pylxd_container.stop(wait=wait)
  File "/var/lib/jenkins/slaves/torkoal/workspace/cloud-init-integration-lxd-c/cloud-init/.tox/citest/lib/python3.5/site-packages/pylxd/models/container.py", line 316, in stop
    wait=wait)
  File "/var/lib/jenkins/slaves/torkoal/workspace/cloud-init-integration-lxd-c/cloud-init/.tox/citest/lib/python3.5/site-packages/pylxd/models/container.py", line 291, in _set_state
    response.json()['operation'])
  File "/var/lib/jenkins/slaves/torkoal/workspace/cloud-init-integration-lxd-c/cloud-init/.tox/citest/lib/python3.5/site-packages/pylxd/models/operation.py", line 33, in wait_for_operation
    return cls.get(client, operation.id)
  File "/var/lib/jenkins/slaves/torkoal/workspace/cloud-init-integration-lxd-c/cloud-init/.tox/citest/lib/python3.5/site-packages/pylxd/models/operation.py", line 40, in get
    response = client.api.operations[operation_id].get()
  File "/var/lib/jenkins/slaves/torkoal/workspace/cloud-init-integration-lxd-c/cloud-init/.tox/citest/lib/python3.5/site-packages/pylxd/client.py", line 148, in get
    is_api=is_api)
  File "/var/lib/jenkins/slaves/torkoal/workspace/cloud-init-integration-lxd-c/cloud-init/.tox/citest/lib/python3.5/site-packages/pylxd/client.py", line 103, in _assert_response
    raise exceptions.NotFound(response)
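
One way to test the "shutting it down twice" idea is to make the shutdown call tolerate a "not found" error and record when that happens. The following is only a minimal sketch under stated assumptions: `NotFound` and `FakeContainer` are hypothetical stand-ins so the example runs without pylxd, and `shutdown_tolerant` is not part of cloud-init. Treating "not found" as "already stopped" is itself an assumption; in the traceback above the lookup that fails is for the operation, so this would hide the symptom rather than explain it.

```python
class NotFound(Exception):
    """Stand-in for pylxd.exceptions.NotFound so the sketch runs without pylxd."""


def shutdown_tolerant(container, wait=True, not_found_exc=NotFound):
    """Stop a container, treating a 'not found' error as already stopped.

    Returns True if the stop call completed normally, and False if the
    'not found' exception was raised instead (assumption: that means the
    container or its stop operation is already gone).
    """
    try:
        container.stop(wait=wait)
        return True
    except not_found_exc:
        return False


class FakeContainer:
    """Hypothetical test double: the first stop() succeeds, later calls raise."""

    def __init__(self):
        self.stopped = False

    def stop(self, wait=True):
        if self.stopped:
            raise NotFound("not found")
        self.stopped = True
```

With the real library, the caller would pass `pylxd.exceptions.NotFound` as `not_found_exc` and log every `False` return, which would show whether the second call is what triggers the error.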

Scott Moser (smoser)
Changed in cloud-init:
status: New → Confirmed
importance: Undecided → Medium
assignee: nobody → Scott Moser (smoser)
Scott Moser (smoser) wrote :

I'm attaching all available console logs from lxd runs of the integration
tests. The interesting thing is that *most* of the time when we see
pylxd-related tracebacks, they occur in shutdown (there was one in start).
Also interesting is that most of the time the traceback occurs after
collection of the last file, about 25-30 seconds later.

So for shutdown specifically, it could be a result of the system failing
to shut down.

Example:
2018-07-15 12:20:11,234 - tests.cloud_tests - DEBUG - executing "collect: result.json"
2018-07-15 12:20:28,612 - tests.cloud_tests - ERROR - stage: collect test data for bionic encountered error: not found
2018-07-15 12:20:28,615 - tests.cloud_tests - ERROR - traceback:
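
To compare the delay between the last collect step and the error across the attached logs, the timestamps can be parsed directly. This is a rough sketch that assumes every line starts with a `YYYY-MM-DD HH:MM:SS,mmm` timestamp followed by ` - `, as in the lines quoted here; `parse_log_time` and `gap_seconds` are hypothetical helper names.

```python
from datetime import datetime

# Timestamp layout used at the start of the quoted log lines.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def parse_log_time(line):
    """Return the leading timestamp of a log line as a datetime."""
    stamp = line.split(" - ", 1)[0]
    return datetime.strptime(stamp, LOG_TIME_FORMAT)


def gap_seconds(earlier_line, later_line):
    """Return the number of seconds between two log lines."""
    delta = parse_log_time(later_line) - parse_log_time(earlier_line)
    return delta.total_seconds()


last_collect = ('2018-07-15 12:20:11,234 - tests.cloud_tests - DEBUG - '
                'executing "collect: result.json"')
first_error = ('2018-07-15 12:20:28,612 - tests.cloud_tests - ERROR - '
               'stage: collect test data for bionic encountered error: not found')

gap = gap_seconds(last_collect, first_error)
```

For the two lines in the example above, the gap comes out at roughly 17.4 seconds.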

James Falcon (falcojr) wrote :

These tests no longer exist.

Changed in cloud-init:
status: Confirmed → Invalid
status: Invalid → Won't Fix