neutron_lbaas scenario tests are failing

Bug #1802438 reported by YAMAMOTO Takashi
This bug affects 1 person
Affects: networking-midonet
Status: In Progress
Importance: Critical
Assigned to: YAMAMOTO Takashi
Milestone: (none listed)
tags: added: gate-failure lbaas
Changed in networking-midonet:
importance: Undecided → Critical
Changed in networking-midonet:
status: New → In Progress
assignee: nobody → YAMAMOTO Takashi (yamamoto)
YAMAMOTO Takashi (yamamoto) wrote :
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello Yamamoto:

I've reviewed the test results you submitted in this bug report. I looked specifically at pool creation, because that is what the patch you are trying to revert affects.

In none of the failing cases do I see the pool creation call returning an error or anything else wrong [1].

For example, in a previous test execution [2], I can see similar results [3].

Can you explain this revert?

Thank you in advance.

[1]
test_health_monitor_basic:
2018-11-07 13:32:37,586 31013 DEBUG [tempest.lib.common.rest_client] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': '<omitted>'}
        Body: {"pool": {"protocol": "TCP", "lb_algorithm": "ROUND_ROBIN", "listener_id": "494fa76a-fad8-436f-bfc8-0c256df172f4"}}
    Response - Headers: {u'content-type': 'application/json', 'status': '201', u'connection': 'close', 'content-location': 'http://10.4.70.20:9696/v2.0/lbaas/pools', u'x-openstack-request-id': 'req-7311c333-0246-41e9-8615-b21417db14e7', u'date': 'Wed, 07 Nov 2018 13:32:37 GMT', u'content-length': '410'}
        Body: {"pool": {"lb_algorithm": "ROUND_ROBIN", "protocol": "TCP", "description": "", "admin_state_up": true, "loadbalancers": [{"id": "afe147d9-5fe3-4abb-8a0a-c8123f539fa8"}], "tenant_id": "f82e68dd776d449bb8a0d2ec829525cd", "session_persistence": null, "healthmonitor_id": null, "listeners": [{"id": "494fa76a-fad8-436f-bfc8-0c256df172f4"}], "members": [], "id": "76220cf7-37d3-4ad8-aee2-74346924592e", "name": ""}}

test_listener_basic:
2018-11-07 13:22:11,249 31011 DEBUG [tempest.lib.common.rest_client] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': '<omitted>'}
        Body: {"pool": {"protocol": "TCP", "lb_algorithm": "ROUND_ROBIN", "listener_id": "a49cb03e-e889-4135-8f4a-4ea8028f88fa"}}
    Response - Headers: {u'content-type': 'application/json', 'status': '201', u'connection': 'close', 'content-location': 'http://10.4.70.20:9696/v2.0/lbaas/pools', u'x-openstack-request-id': 'req-93f25447-a4e4-41a2-8caa-967dae523133', u'date': 'Wed, 07 Nov 2018 13:22:11 GMT', u'content-length': '410'}
        Body: {"pool": {"lb_algorithm": "ROUND_ROBIN", "protocol": "TCP", "description": "", "admin_state_up": true, "loadbalancers": [{"id": "ebdd24c6-676d-4fab-8773-121a2b88e1d8"}], "tenant_id": "65f20950846b4f5ba21dcc2b2f591153", "session_persistence": null, "healthmonitor_id": null, "listeners": [{"id": "a49cb03e-e889-4135-8f4a-4ea8028f88fa"}], "members": [], "id": "5134ac19-ac3c-4fe0-b4f8-100ac3b7a2da", "name": ""}}

test_load_balancer_basic:
2018-11-07 13:26:16,068 31011 DEBUG [tempest.lib.common.rest_client] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': '<omitted>'}
        Body: {"pool": {"protocol": "TCP", "lb_algorithm": "ROUND_ROBIN", "listener_id": "0f53d1ab-fa7c-4d58-9605-e65ca20f05a8"}}
    Response - Headers: {u'content-type': 'application/json', 'status': '201', u'connection': 'close', 'content-location': 'http://10.4.70.20:9696/v2.0/lbaas/pools', u'x-openstack-request-id': 'req-bdbf4f2e-0de1-4dd0-8ee8-193a386d6cf1', u'date': 'Wed, 07 Nov 2018 13:26:16 G...


Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello Yamamoto:

The error in networking-midonet-tempest-multinode-ml2-full is not consistent. In [1] (09/11/2018, before the patch was merged), you can see those 5 test cases passing.

[1] http://logs.openstack.org/87/199387/127/check/networking-midonet-tempest-multinode-ml2-full/67fbe16/logs/testr_results.html.gz

YAMAMOTO Takashi (yamamoto) wrote :

Rodolfo,

thank you for investigating.
the log you mentioned in #3 was from a run with a Depends-On on the revert.

YAMAMOTO Takashi (yamamoto) wrote :

Rodolfo,

I took a further look.
the objects passed to drivers seem very different after the change in question.

i guess the listener_id field change is what broke midonet. [1]
but i'm not sure if it's the only one.
[1] https://github.com/midonet/midonet/blob/master/docs/neutron_translation.md#create-14

i believe this kind of change should not be backported.

before the change

http://logs.openstack.org/87/199387/127/check/networking-midonet-tempest-multinode-ml2-full/67fbe16/logs/screen-neutron-api.txt.gz

Nov 09 07:54:18.546163 ubuntu-xenial-vexxhost-sjc1-0000444132 neutron-server[16814]: DEBUG midonetclient.api_lib [None req-c61364a3-d27f-4b78-b608-116f7a32d7eb tempest-TestHealthMonitors-2059262633 tempest-TestHealthMonitors-2059262633] do_request: uri=http://38.108.68.71:8181/midonet-api/neutron/pools_v2, method=POST {{(pid=16942) do_request /usr/lib/python2.7/dist-packages/midonetclient/api_lib.py:61}}
Nov 09 07:54:18.547155 ubuntu-xenial-vexxhost-sjc1-0000444132 neutron-server[16814]: DEBUG midonetclient.api_lib [None req-c61364a3-d27f-4b78-b608-116f7a32d7eb tempest-TestHealthMonitors-2059262633 tempest-TestHealthMonitors-2059262633] do_request: body={'lb_algorithm': 'ROUND_ROBIN', 'healthmonitor_id': None, 'listener_id': u'8e24ce2b-d78c-4f92-9771-29047fd633b8', 'protocol': 'TCP', 'description': '', 'admin_state_up': True, 'loadbalancers': [{'id': u'7c96b121-ce13-44b6-8ad1-c36a8e7f2c33'}], 'tenant_id': 'ffe15c520fcb48c6a80d9d73331f288d', 'l7_policies': [], 'session_persistence': None, 'listener': {'l7_policies': [], 'protocol': 'TCP', 'description': '', 'default_tls_container_id': None, 'default_pool': {'lb_algorithm': 'ROUND_ROBIN', 'healthmonitor_id': None, 'protocol': 'TCP', 'description': '', 'name': '', 'admin_state_up': True, 'tenant_id': 'ffe15c520fcb48c6a80d9d73331f288d', 'l7_policies': [], 'session_persistence': None, 'listener': {'l7_policies': [], 'protocol': 'TCP', 'description': '', 'default_tls_container_id': None, 'default_pool': None, 'tenant_id': 'ffe15c520fcb48c6a80d9d73331f288d', 'admin_state_up': True, 'connection_limit': -1, 'loadbalancer_id': '7c96b121-ce13-44b6-8ad1-c36a8e7f2c33', 'default_pool_id': '86ff15c9-7a85-4eb7-a662-0de9ace39ce0', 'operating_status': 'ONLINE', 'sni_containers': [], 'provisioning_status': 'ACTIVE', 'protocol_port': 80, 'id': '8e24ce2b-d78c-4f92-9771-29047fd633b8', 'loadbalancer': {'stats': {'bytes_in': 0, 'total_connections': 0, 'active_connections': 0, 'bytes_out': 0, 'loadbalancer_id': '7c96b121-ce13-44b6-8ad1-c36a8e7f2c33', 'loadbalancer': {'stats': {'bytes_in': 0, 'total_connections': 0, 'active_connections': 0, 'bytes_out': 0, 'loadbalancer_id': '7c96b121-ce13-44b6-8ad1-c36a8e7f2c33', 'loadbalancer': None}, 'description': '', 'admin_state_up': True, 'tenant_id': 'a10b4b4ddd574eeaba818951db6b7a2c', 'provisioning_status': 'PENDING_UPDATE', 'id': '7c96b121-ce13-44b6-8ad1-c36a8e7f2c33', 'vip_subnet_id': 
'e13974ff-9e93-41d9-892b-24825b28aed1', 'listeners': [], 'vip_address': '10.1.0.3', 'vip_port_id'

after the change

http://logs.openstack.org/87/199387/125/check/networking-midonet-tempest-multinode-ml2-full/4e934d4/logs/screen-neutron-api.txt.gz

Nov 07 13:20:05.037206 ubuntu-xenial-limest...


Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello Yamamoto:

Where are those tests located? I ask because the failing tests live in the lbaas repository, and they are passing in lbaas CI, as you can see in any patch submitted to that repository. How are you linking to those tests? I don't think you can do this directly.

This patch was submitted to solve an existing problem in lbaas, related to an earlier patch submitted to speed up read operations. The request came from a client running an LTS version, which is why we cherry-picked this patch to previous versions. Both the patch to improve read operations and mine pass in every lbaas CI job.

Reviewing the logs you pointed out in your previous comment, what I can see is that midonet is not passing the correct arguments when creating a pool. Executing test TestLoadBalancerBasic.test_load_balancer_basic:
- lbaas (using octavia) creating a pool: http://paste.openstack.org/show/735880/
- midonet creating a pool: http://paste.openstack.org/show/735879/

As you said, midonet is passing the "listeners" parameter populated but "listener_id" empty. In lbaas, only "listener_id" is passed with a value (not "listeners"). As you can see in [1], when a pool is created, "listener_id" and "listeners" are used to build a list of listener objects. Both parameters are then deleted from the object creation arguments [2]. Once the pool object is created, the listeners are updated and the "listeners" parameter is populated.
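
The sequence described above can be sketched roughly as follows. This is a minimal illustration and not the neutron-lbaas code: the function names `normalize_pool_args` and `create_pool` are hypothetical, plain dicts stand in for the real pool and listener objects, and the duplicate handling is an assumption.

```python
def normalize_pool_args(pool):
    """Hypothetical helper: fold "listener_id" and "listeners" into one ID list."""
    args = dict(pool)  # copy so the caller's dict is left untouched
    listener_ids = []

    # Both parameters are removed from the creation arguments.
    listener_id = args.pop("listener_id", None)
    if listener_id:
        listener_ids.append(listener_id)
    for listener in args.pop("listeners", None) or []:
        if listener["id"] not in listener_ids:  # assumption: skip duplicates
            listener_ids.append(listener["id"])
    return args, listener_ids


def create_pool(pool):
    """Hypothetical stand-in for pool creation; a dict replaces the DB object."""
    args, listener_ids = normalize_pool_args(pool)
    created = dict(args)  # the pool is created without listener arguments
    # After creation, the "listeners" field is populated from the ID list.
    created["listeners"] = [{"id": lid} for lid in listener_ids]
    return created
```

Under these assumptions, a request carrying only "listener_id" and a request carrying only "listeners" both end up with the same populated "listeners" field, and neither result contains "listener_id".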

Please recheck how midonet creates the pool object. BTW, in [3] (before my patch) you are passing the whole object in "listener", as you can see in [4]. This is incorrect. Have you seen the length of this request?

Regards.

[1] https://github.com/openstack/neutron-lbaas/blob/master/neutron_lbaas/services/loadbalancer/plugin.py#L667
[2] https://github.com/openstack/neutron-lbaas/blob/master/neutron_lbaas/services/loadbalancer/plugin.py#L700-L701
[3] http://logs.openstack.org/87/199387/127/check/networking-midonet-tempest-multinode-ml2-full/67fbe16/logs/screen-neutron-api.txt.gz
[4] http://paste.openstack.org/show/735881/

YAMAMOTO Takashi (yamamoto) wrote :

Rodolfo,

the configuration in the job works this way:
neutron-lbaas --(lbaas driver api)--> midonet driver --(rest api)--> midonet

"midonetclient.api_lib" logs are about the "(rest api)" part above.
the midonet driver basically just passes through the object given via the driver api.
the change in question changed the object passed in the "(lbaas driver api)" part, i.e. it broke the driver api.
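
A rough sketch of that pass-through arrangement, for illustration only. `FakeRestClient` and `PassThroughPoolDriver` are hypothetical names and not the networking-midonet or midonetclient classes; the "neutron/pools_v2" path is borrowed from the log excerpt above purely as a label.

```python
class FakeRestClient:
    """Hypothetical stand-in for the backend REST client; it only records calls."""

    def __init__(self):
        self.requests = []

    def post(self, uri, body):
        self.requests.append((uri, body))


class PassThroughPoolDriver:
    """Hypothetical driver that forwards the driver-API object unchanged."""

    def __init__(self, client):
        self.client = client

    def create(self, pool_dict):
        # No translation step: whatever fields the plugin supplies are
        # exactly what the backend receives in the request body.
        self.client.post("neutron/pools_v2", pool_dict)


client = FakeRestClient()
driver = PassThroughPoolDriver(client)
# Shape reported in this thread after the change: "listeners" populated,
# "listener_id" empty.
driver.create({"protocol": "TCP", "listener_id": None,
               "listeners": [{"id": "example-listener"}]})
uri, body = client.requests[0]
```

With a driver shaped like this, any change to the dict handed over the driver api shows up verbatim in the REST body, so a backend that reads "listener_id" would see an empty value.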

those tempest tests are from lbaas repo.
the job in question runs those tempest tests against a deployment with midonet backend.
tempest tests are expected to be backend agnostic in general.

the patch in question is ok for the reference implementation. that's why it doesn't break lbaas CI.
it happened to break another driver (midonet) though.

i don't understand why you think passing the whole object is incorrect.
yes, i know the length of the request. it's inefficient and i don't like it.
(i vaguely remember that i even filed a bug a while ago complaining that those objects are too verbose and redundant.)
but it works as long as it has the fields used by the backend.
the midonet driver can trim them down as an optimization but it's a separate topic.
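
For illustration, the trimming mentioned above could look something like this. It is a sketch only: `trim_pool` is a hypothetical helper, and the field list is an assumption picked from the keys visible in the log excerpts, not the set of fields the midonet backend actually reads.

```python
# Assumption: an illustrative whitelist, not the backend's real field list.
BACKEND_POOL_FIELDS = (
    "id",
    "tenant_id",
    "protocol",
    "lb_algorithm",
    "listener_id",
    "admin_state_up",
)


def trim_pool(pool):
    """Hypothetical helper: keep only whitelisted top-level fields."""
    return {key: pool[key] for key in BACKEND_POOL_FIELDS if key in pool}


verbose_pool = {
    "protocol": "TCP",
    "lb_algorithm": "ROUND_ROBIN",
    "listener_id": "example-listener",
    "admin_state_up": True,
    # Nested copies like this one are what make the request so long.
    "listener": {"id": "example-listener", "default_pool": {"protocol": "TCP"}},
}
trimmed = trim_pool(verbose_pool)
```

The nested "listener" object is dropped while the flat fields are kept, which shortens the request without changing the values the backend is assumed to use.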

YAMAMOTO Takashi (yamamoto) wrote :

btw, i'm not necessarily against the change in question.
my claim is that it should be done differently.
that is, either:
1. avoid affecting the driver api
or
2. coordinate with driver maintainers, including out of tree ones.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron-lbaas 14.0.0.0rc1

This issue was fixed in the openstack/neutron-lbaas 14.0.0.0rc1 release candidate.
