Race between active_vif retries and pod deletions

Bug #1847441 reported by Luis Tomas Bolivar
This bug affects 1 person
Affects: kuryr-kubernetes
Status: Fix Released
Importance: Undecided
Assigned to: Luis Tomas Bolivar
Milestone: (none listed)

Bug Description

There is a race between the vif handler retrying while it waits for the Neutron ports to become active and the deletion of the same pod (vif handler on_delete), which leads to the following error:

2019-10-08 15:52:42.983 1 DEBUG neutronclient.v2_0.client [-] Error message: {"NeutronError": {"message": "Port d3b2d608-19cd-4ef4-b726-b98119ef0cae could not be found.", "type": "PortNotFound", "detail": ""}} _handle_fault_response /usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py:259
2019-10-08 15:52:42.984 1 ERROR kuryr_kubernetes.handlers.retry [-] Report handler unhealthy VIFHandler: PortNotFoundClient: Port d3b2d608-19cd-4ef4-b726-b98119ef0cae could not be found.
Neutron server returns request_ids: ['req-e97f896f-0b75-4442-a0ed-b51958a6d18b']
2019-10-08 15:52:42.984 1 ERROR kuryr_kubernetes.handlers.retry Traceback (most recent call last):
2019-10-08 15:52:42.984 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/handlers/retry.py", line 56, in __call__
2019-10-08 15:52:42.984 1 ERROR kuryr_kubernetes.handlers.retry self._handler(event)
2019-10-08 15:52:42.984 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/handlers/k8s_base.py", line 75, in __call__
2019-10-08 15:52:42.984 1 ERROR kuryr_kubernetes.handlers.retry self.on_present(obj)
2019-10-08 15:52:42.984 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/handlers/vif.py", line 137, in on_present
2019-10-08 15:52:42.984 1 ERROR kuryr_kubernetes.handlers.retry self._drv_vif_pool.activate_vif(pod, vif)
2019-10-08 15:52:42.984 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/drivers/vif_pool.py", line 1052, in activate_vif
2019-10-08 15:52:42.984 1 ERROR kuryr_kubernetes.handlers.retry self._vif_drvs[vif_drv_alias].activate_vif(pod, vif)
2019-10-08 15:52:42.984 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/drivers/vif_pool.py", line 169, in activate_vif
2019-10-08 15:52:42.984 1 ERROR kuryr_kubernetes.handlers.retry self._drv_vif.activate_vif(pod, vif)
2019-10-08 15:52:42.984 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/drivers/neutron_vif.py", line 90, in activate_vif
2019-10-08 15:52:42.984 1 ERROR kuryr_kubernetes.handlers.retry port = neutron.show_port(vif.id).get('port')
2019-10-08 15:52:42.984 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 799, in show_port
2019-10-08 15:52:42.984 1 ERROR kuryr_kubernetes.handlers.retry return self.get(self.port_path % (port), params=_params)
2019-10-08 15:52:42.984 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 354, in get
2019-10-08 15:52:42.984 1 ERROR kuryr_kubernetes.handlers.retry headers=headers, params=params)
2019-10-08 15:52:42.984 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 331, in retry_request
2019-10-08 15:52:42.984 1 ERROR kuryr_kubernetes.handlers.retry headers=headers, params=params)
2019-10-08 15:52:42.984 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 294, in do_request
2019-10-08 15:52:42.984 1 ERROR kuryr_kubernetes.handlers.retry self._handle_fault_response(status_code, replybody, resp)
2019-10-08 15:52:42.984 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 269, in _handle_fault_response
2019-10-08 15:52:42.984 1 ERROR kuryr_kubernetes.handlers.retry exception_handler_v20(status_code, error_body)
2019-10-08 15:52:42.984 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 93, in exception_handler_v20
2019-10-08 15:52:42.984 1 ERROR kuryr_kubernetes.handlers.retry request_ids=request_ids)
2019-10-08 15:52:42.984 1 ERROR kuryr_kubernetes.handlers.retry PortNotFoundClient: Port d3b2d608-19cd-4ef4-b726-b98119ef0cae could not be found.
2019-10-08 15:52:42.984 1 ERROR kuryr_kubernetes.handlers.retry Neutron server returns request_ids: ['req-e97f896f-0b75-4442-a0ed-b51958a6d18b']

This happens when the pod is deleted before it reaches the Running state and the next retry to activate the port runs after the deletion has already been handled.
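To make the ordering concrete, here is a minimal, self-contained sketch of the race and of one possible way to tolerate it. It is not the code from the merged fix. Everything in it is a hypothetical stand-in: FakeNeutron, PortNotFound (standing in for neutronclient's PortNotFoundClient), ResourceNotReady, pod_exists, and the simplified on_present and activate_vif functions. The assumption illustrated is that a missing port during a retry is harmless when the pod itself is already gone.

```python
class PortNotFound(Exception):
    """Hypothetical stand-in for neutronclient's PortNotFoundClient."""


class ResourceNotReady(Exception):
    """Hypothetical signal telling a retry wrapper to try again later."""


class FakeNeutron:
    """In-memory stand-in for the Neutron API, for illustration only."""

    def __init__(self):
        self.ports = {}

    def show_port(self, port_id):
        if port_id not in self.ports:
            raise PortNotFound("Port %s could not be found." % port_id)
        return {"port": self.ports[port_id]}

    def delete_port(self, port_id):
        self.ports.pop(port_id, None)


def activate_vif(neutron, vif):
    # Mirrors the shape of the call in the traceback: look the port up,
    # and ask for a retry while it is not ACTIVE yet.
    port = neutron.show_port(vif["id"]).get("port")
    if port["status"] != "ACTIVE":
        raise ResourceNotReady(vif["id"])
    vif["active"] = True


def on_present(neutron, pod, vif, pod_exists):
    """Simplified handler: tolerate a missing port if the pod is gone."""
    try:
        activate_vif(neutron, vif)
    except PortNotFound:
        if not pod_exists(pod):
            # Assumption: on_delete already released the port, so there
            # is nothing left to activate and no reason to report failure.
            return "skipped"
        raise
    return "activated"


# Reproduce the ordering described in the bug.
neutron = FakeNeutron()
neutron.ports["p1"] = {"status": "DOWN"}
vif = {"id": "p1", "active": False}
pods = {"pod-a"}


def pod_exists(name):
    return name in pods


# 1. First attempt: the port is still DOWN, so a retry is requested.
try:
    first = on_present(neutron, "pod-a", vif, pod_exists)
except ResourceNotReady:
    first = "retry"

# 2. The pod is deleted before it ever runs; on_delete removes the port.
pods.discard("pod-a")
neutron.delete_port("p1")

# 3. The pending retry fires after the deletion.
second = on_present(neutron, "pod-a", vif, pod_exists)
```

Without the except branch in on_present, step 3 would raise PortNotFound, which corresponds to the "Report handler unhealthy VIFHandler" error in the log above.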

Changed in kuryr-kubernetes:
assignee: nobody → Luis Tomas Bolivar (ltomasbo)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to kuryr-kubernetes (master)

Fix proposed to branch: master
Review: https://review.opendev.org/687504

Changed in kuryr-kubernetes:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to kuryr-kubernetes (master)

Reviewed: https://review.opendev.org/687504
Committed: https://git.openstack.org/cgit/openstack/kuryr-kubernetes/commit/?id=bbf6c29ec453880c92df9018c3dec83478e60f08
Submitter: Zuul
Branch: master

commit bbf6c29ec453880c92df9018c3dec83478e60f08
Author: Luis Tomas Bolivar <email address hidden>
Date: Wed Oct 9 11:10:58 2019 +0200

    Avoid race between activating vif and pod deletion

    Change-Id: Ie4dfca4420820d15c9cb6e78f7ca121fdb0e1658
    Closes-Bug: 1847441

Changed in kuryr-kubernetes:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to kuryr-kubernetes (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/687625

OpenStack Infra (hudson-openstack) wrote : Fix merged to kuryr-kubernetes (stable/train)

Reviewed: https://review.opendev.org/687625
Committed: https://git.openstack.org/cgit/openstack/kuryr-kubernetes/commit/?id=8b6053fea34518f6a8904e45d7cea8b895c51495
Submitter: Zuul
Branch: stable/train

commit 8b6053fea34518f6a8904e45d7cea8b895c51495
Author: Luis Tomas Bolivar <email address hidden>
Date: Wed Oct 9 11:10:58 2019 +0200

    Avoid race between activating vif and pod deletion

    Change-Id: Ie4dfca4420820d15c9cb6e78f7ca121fdb0e1658
    Closes-Bug: 1847441
    (cherry picked from commit bbf6c29ec453880c92df9018c3dec83478e60f08)

tags: added: in-stable-train