NetworkAmbiguous traceback in nova-compute logs even though it's a user error

Bug #1823198 reported by Matt Riedemann
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Fix Released
Low
Matt Riedemann
Queens
Triaged
Low
Unassigned
Rocky
Triaged
Low
Unassigned
Stein
Fix Committed
Low
Matt Riedemann

Bug Description

There is a tempest negative test which is trying to attach a network interface to a server but it is not requesting a specific network or port:

http://logs.openstack.org/58/637058/15/check/nova-next/f4e8140/logs/tempest.txt.gz#_2019-04-04_00_07_20_973

2019-04-04 00:07:20.973 7584 INFO tempest.lib.common.rest_client [req-47bc59bc-01de-450c-9185-3c0b9dc4bf51 ] Request (AttachInterfacesTestJSON:test_create_list_show_delete_interfaces_by_network_port): 400 POST https://104.130.239.125/compute/v2.1/servers/7f24a95b-b5be-4269-980b-a92f3c07c7ca/os-interface 0.678s
2019-04-04 00:07:20.974 7584 DEBUG tempest.lib.common.rest_client [req-47bc59bc-01de-450c-9185-3c0b9dc4bf51 ] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': '<omitted>'}
        Body: {"interfaceAttachment": {}}
    Response - Headers: {'date': 'Thu, 04 Apr 2019 00:07:20 GMT', 'server': 'Apache/2.4.29 (Ubuntu)', 'openstack-api-version': 'compute 2.1', 'x-openstack-nova-api-version': '2.1', 'vary': 'OpenStack-API-Version,X-OpenStack-Nova-API-Version', 'content-type': 'application/json; charset=UTF-8', 'content-length': '115', 'x-openstack-request-id': 'req-47bc59bc-01de-450c-9185-3c0b9dc4bf51', 'x-compute-request-id': 'req-47bc59bc-01de-450c-9185-3c0b9dc4bf51', 'connection': 'close', 'status': '400', 'content-location': 'https://104.130.239.125/compute/v2.1/servers/7f24a95b-b5be-4269-980b-a92f3c07c7ca/os-interface'}
        Body: b'{"badRequest": {"code": 400, "message": "Multiple possible networks found, use a Network ID to be more specific."}}' _log_request_full /opt/stack/new/tempest/tempest/lib/common/rest_client.py:440

Which results in a NetworkAmbiguous error response from the compute API as seen above.

The problem is this is a synchronous call from nova-api to nova-compute and results in an exception traceback in the nova-compute logs:

logs.openstack.org/58/637058/15/check/nova-next/f4e8140/logs/screen-n-cpu.txt.gz?level=TRACE#_Apr_04_00_07_20_968719

Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server [None req-47bc59bc-01de-450c-9185-3c0b9dc4bf51 tempest-AttachInterfacesTestJSON-791222649 tempest-AttachInterfacesTestJSON-791222649] Exception during message handling: nova.exception.NetworkAmbiguous: Multiple possible networks found, use a Network ID to be more specific.
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server Traceback (most recent call last):
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python3.6/dist-packages/oslo_messaging/rpc/server.py", line 166, in _process_incoming
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python3.6/dist-packages/oslo_messaging/rpc/dispatcher.py", line 265, in dispatch
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args)
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python3.6/dist-packages/oslo_messaging/rpc/dispatcher.py", line 194, in _do_dispatch
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args)
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server File "/opt/stack/new/nova/nova/exception_wrapper.py", line 79, in wrapped
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server function_name, call_dict, binary, tb)
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server self.force_reraise()
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python3.6/dist-packages/six.py", line 693, in reraise
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server raise value
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server File "/opt/stack/new/nova/nova/exception_wrapper.py", line 69, in wrapped
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server return f(self, context, *args, **kw)
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server File "/opt/stack/new/nova/nova/compute/utils.py", line 1349, in decorated_function
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server return function(self, context, *args, **kwargs)
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server File "/opt/stack/new/nova/nova/compute/manager.py", line 213, in decorated_function
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server kwargs['instance'], e, sys.exc_info())
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server self.force_reraise()
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python3.6/dist-packages/six.py", line 693, in reraise
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server raise value
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server File "/opt/stack/new/nova/nova/compute/manager.py", line 201, in decorated_function
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server return function(self, context, *args, **kwargs)
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server File "/opt/stack/new/nova/nova/compute/manager.py", line 6651, in attach_interface
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server bind_host_id=bind_host_id, tag=tag)
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server File "/opt/stack/new/nova/nova/network/neutronv2/api.py", line 1697, in allocate_port_for_instance
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server bind_host_id=bind_host_id, attach=True)
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server File "/opt/stack/new/nova/nova/network/neutronv2/api.py", line 1067, in allocate_for_instance
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server context, instance, neutron, requested_networks, ordered_networks)
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server File "/opt/stack/new/nova/nova/network/neutronv2/api.py", line 940, in _validate_requested_network_ids
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server raise exception.NetworkAmbiguous(msg)
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server nova.exception.NetworkAmbiguous: Multiple possible networks found, use a Network ID to be more specific.
Apr 04 00:07:20.968719 ubuntu-bionic-rax-iad-0004701322 nova-compute[32008]: ERROR oslo_messaging.rpc.server

We shouldn't traceback on this in the nova-compute logs because it's a user-error, but we also probably shouldn't even be getting this far if the API were validating the request body properly.

According to the API reference:

https://developer.openstack.org/api-ref/compute/?expanded=create-interface-detail#create-interface

port_id and net_id are both optional but mutually exclusive.

The problem is both are optional because if you don't request anything, nova-compute looks up the available network for the requesting tenant and picks one, but if there is more than one it's an error since the user needs to specify a port or network to use since nova can't decide (and there is no default network - see https://blueprints.launchpad.net/nova/+spec/deal-with-networkambiguous-error).

So at the very least we should probably change the NetworkAmbiguous error to a warning and not traceback.

This is an extremely latent bug so it's just marked low severity.

Matt Riedemann (mriedem)
Changed in nova:
assignee: nobody → Matt Riedemann (mriedem)
Matt Riedemann (mriedem)
no longer affects: nova/pike
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/650077

Changed in nova:
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.opendev.org/650077
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=ade6c9393632e830c2368825568769853fce3b99
Submitter: Zuul
Branch: master

commit ade6c9393632e830c2368825568769853fce3b99
Author: Matt Riedemann <email address hidden>
Date: Thu Apr 4 13:04:33 2019 -0400

    Handle Invalid exceptions as expected in attach_interface

    The bug prompting this is a tempest test which is requesting
    a port attachment to a server but not specifying a port or
    network to use, so nova-compute looks for a valid network
    and finds there are two and raises NetworkAmbiguous. This
    is treated as a 400 error in the API but because this is a
    synchronous RPC call from nova-api to nova-compute,
    oslo.messaging logs an exception traceback for the unexpected
    error. That traceback is pretty gross in the compute logs for
    something that is a user error and the cloud operator has
    nothing to do to fix it.

    We can handle the traceback by registering our expected
    exceptions for the attach_interface method with oslo.messaging,
    which is what this change does.

    While looking to just add NetworkAmbiguous it became clear that
    lots of different user errors can be raised from this method
    and none of those should result in a traceback, so this change
    just expects Invalid and its subclasses.

    The one exception is InterfaceAttachFailed which is raised when
    something in allocate_port_for_instance or driver.attach_interface
    fails. That is an unexpected situation so the parent class for
    InterfaceAttachFailed is changed from Invalid to NovaException so
    it continues to be traced in the logs as an exception.
    InterfaceAttachFailedNoNetwork is kept as Invalid since it is a
    user error (trying to attach an interface when the user has no
    access to any networks).

    test_tagged_attach_interface_raises is adjusted to show the
    ExpectedException handling for one of the Invalid cases.

    Change-Id: I927ff1d8c8f45405833d6012b7d7af37b98b10a0
    Closes-Bug: #1823198

Changed in nova:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/stein)

Fix proposed to branch: stable/stein
Review: https://review.opendev.org/672384

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/stein)

Reviewed: https://review.opendev.org/672384
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=e9261764c90403d1b2e33790be96efea89068427
Submitter: Zuul
Branch: stable/stein

commit e9261764c90403d1b2e33790be96efea89068427
Author: Matt Riedemann <email address hidden>
Date: Thu Apr 4 13:04:33 2019 -0400

    Handle Invalid exceptions as expected in attach_interface

    The bug prompting this is a tempest test which is requesting
    a port attachment to a server but not specifying a port or
    network to use, so nova-compute looks for a valid network
    and finds there are two and raises NetworkAmbiguous. This
    is treated as a 400 error in the API but because this is a
    synchronous RPC call from nova-api to nova-compute,
    oslo.messaging logs an exception traceback for the unexpected
    error. That traceback is pretty gross in the compute logs for
    something that is a user error and the cloud operator has
    nothing to do to fix it.

    We can handle the traceback by registering our expected
    exceptions for the attach_interface method with oslo.messaging,
    which is what this change does.

    While looking to just add NetworkAmbiguous it became clear that
    lots of different user errors can be raised from this method
    and none of those should result in a traceback, so this change
    just expects Invalid and its subclasses.

    The one exception is InterfaceAttachFailed which is raised when
    something in allocate_port_for_instance or driver.attach_interface
    fails. That is an unexpected situation so the parent class for
    InterfaceAttachFailed is changed from Invalid to NovaException so
    it continues to be traced in the logs as an exception.
    InterfaceAttachFailedNoNetwork is kept as Invalid since it is a
    user error (trying to attach an interface when the user has no
    access to any networks).

    test_tagged_attach_interface_raises is adjusted to show the
    ExpectedException handling for one of the Invalid cases.

    Change-Id: I927ff1d8c8f45405833d6012b7d7af37b98b10a0
    Closes-Bug: #1823198
    (cherry picked from commit ade6c9393632e830c2368825568769853fce3b99)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 19.0.2

This issue was fixed in the openstack/nova 19.0.2 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 20.0.0.0rc1

This issue was fixed in the openstack/nova 20.0.0.0rc1 release candidate.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.