Neutron Linuxbridge jobs failing with 'operation failed: failed to read XML'

Bug #1612281 reported by Ihar Hrachyshka
This bug affects 1 person
Affects                   Status        Importance  Assigned to      Milestone
OpenStack Compute (nova)  Fix Released  Critical    Daniel Berrange
neutron                   Invalid       Critical    Unassigned
os-vif                    Invalid       Undecided   Unassigned

Bug Description

Example: http://logs.openstack.org/64/353664/2/check/gate-tempest-dsvm-neutron-linuxbridge/591295c/console.html#_2016-08-11_12_57_42_191711

2016-08-11 12:57:42.191740 | Traceback (most recent call last):
2016-08-11 12:57:42.191760 |   File "tempest/test.py", line 106, in wrapper
2016-08-11 12:57:42.191779 |     return f(self, *func_args, **func_kwargs)
2016-08-11 12:57:42.191814 |   File "tempest/scenario/test_server_advanced_ops.py", line 90, in test_server_sequence_suspend_resume
2016-08-11 12:57:42.191825 |     'ACTIVE')
2016-08-11 12:57:42.191851 |   File "tempest/common/waiters.py", line 75, in wait_for_server_status
2016-08-11 12:57:42.191865 |     server_id=server_id)
2016-08-11 12:57:42.191905 | tempest.exceptions.BuildErrorException: Server 15dcd67e-dd8a-4805-b14b-798c6d2a6e87 failed to build and is in ERROR status
2016-08-11 12:57:42.191942 | Details: {u'message': u'operation failed: failed to read XML', u'created': u'2016-08-11T12:52:28Z', u'code': 500}

In the nova-compute (n-cpu) logs, we see the same error: http://logs.openstack.org/64/353664/2/check/gate-tempest-dsvm-neutron-linuxbridge/591295c/logs/screen-n-cpu.txt.gz?level=TRACE#_2016-08-11_12_55_21_025

Note that the log contains other suspicious errors, like:

- privsep unexpected errors: http://logs.openstack.org/64/353664/2/check/gate-tempest-dsvm-neutron-linuxbridge/591295c/logs/screen-n-cpu.txt.gz?level=TRACE#_2016-08-11_12_25_47_284
- libvirt failing to locate brq: http://logs.openstack.org/64/353664/2/check/gate-tempest-dsvm-neutron-linuxbridge/591295c/logs/screen-n-cpu.txt.gz?level=TRACE#_2016-08-11_12_26_13_346

Agent logs seem more or less clean.

Ihar Hrachyshka (ihar-hrachyshka) wrote :

As per grafana, the failures started at around Aug 11, 3am. Patches that were merged around that time and are related to linuxbridge are: https://review.openstack.org/298443 and https://review.openstack.org/353264

Ihar Hrachyshka (ihar-hrachyshka) wrote :

Also, Nova switched to OS-VIF, but that's probably not relevant (?) since it was merged a lot later: https://review.openstack.org/#/c/350595/5

tags: added: gate-failure linuxbridge
Changed in neutron:
importance: Undecided → Critical
Ihar Hrachyshka (ihar-hrachyshka) wrote :

OK, as per logstash: http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22operation%20failed%3A%20failed%20to%20read%20XML%5C%22

the issue started hitting the gate right after the os-vif patch landed (first hits ~5am).

Daniel Berrange (berrange) wrote :
Changed in nova:
importance: Undecided → Critical
status: New → Fix Committed
Changed in neutron:
status: New → Invalid
Changed in os-vif:
status: New → Invalid
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/354143
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=26399c700577f7a98213ec908dd2f478270f494e
Submitter: Jenkins
Branch: master

commit 26399c700577f7a98213ec908dd2f478270f494e
Author: Daniel P. Berrange <email address hidden>
Date: Thu Aug 11 16:11:01 2016 +0100

    network: fix handling of linux-bridge in os-vif conversion

    The nova.network.model.Network class uses names
    'should_create_{bridge,vlan}' not 'should_provide_{bridge,vlan}'

    The bridge_interface attribute should always be set, even if
    to None, since None is a valid value.

    The vlan attribute is compulsory if should_create_vlan is
    set.

    Closes-bug: 1612281
    Change-Id: I245f560156d596be14ef9181bfb881be9680c166
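
The three rules in the commit message above can be illustrated with a minimal sketch. This is not the code from the nova change: the function name `convert_linuxbridge_network`, the plain dicts standing in for the nova and os-vif network objects, and the output key names are all assumptions made for illustration. "Set" is read here as "truthy", which is also an assumption.

```python
from typing import Any, Dict


def convert_linuxbridge_network(meta: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: map legacy network metadata to converted fields.

    `meta` stands in for the metadata of nova.network.model.Network;
    the returned dict stands in for the converted network object.
    """
    converted: Dict[str, Any] = {}

    # Rule 2: bridge_interface is always set, even if to None,
    # because None is a valid value.
    converted["bridge_interface"] = meta.get("bridge_interface")

    # Rule 1: the legacy model uses 'should_create_bridge' and
    # 'should_create_vlan', not 'should_provide_*', so read those keys.
    if meta.get("should_create_bridge") is not None:
        converted["should_provide_bridge"] = meta["should_create_bridge"]

    if meta.get("should_create_vlan") is not None:
        converted["should_provide_vlan"] = meta["should_create_vlan"]
        # Rule 3: vlan is compulsory when should_create_vlan is set
        # (assumption: "set" means truthy).
        if meta["should_create_vlan"]:
            if meta.get("vlan") is None:
                raise ValueError(
                    "vlan is required when should_create_vlan is set")
            converted["vlan"] = meta["vlan"]

    return converted


if __name__ == "__main__":
    print(convert_linuxbridge_network(
        {"should_create_bridge": True, "bridge_interface": None}))
```

Reading the wrong key name fails silently: a lookup for 'should_provide_bridge' on the legacy metadata simply finds nothing, so the bridge settings never reach the converted object.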

Changed in nova:
status: Fix Committed → Fix Released
Matt Riedemann (mriedem)
Changed in nova:
assignee: nobody → Daniel Berrange (berrange)
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Matt Riedemann (<email address hidden>) on branch: master
Review: https://review.openstack.org/354221
Reason: https://review.openstack.org/#/c/354143/ is merged.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 14.0.0.0b3

This issue was fixed in the openstack/nova 14.0.0.0b3 development milestone.
