NetworkPolicyHandler: KeyError when deleting a network policy

Bug #1860106 reported by Luis Tomas Bolivar
Affects: kuryr-kubernetes
Status: Fix Released
Importance: Undecided
Assigned to: Luis Tomas Bolivar
Milestone: (none)

Bug Description

When several network policies exist in parallel and one of them is deleted, the following KeyError may occur:
2020-01-16 17:08:48.629 1 ERROR kuryr_kubernetes.handlers.retry [-] Report handler unhealthy NetworkPolicyHandler: KeyError: ('140f29c1-2980-405a-9572-6251477a658d', '47c1159c-5bba-44f5-b2a5-f6dae011f0db')
2020-01-16 17:08:48.629 1 ERROR kuryr_kubernetes.handlers.retry Traceback (most recent call last):
2020-01-16 17:08:48.629 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/local/lib/python3.6/site-packages/kuryr_kubernetes/handlers/retry.py", line 78, in __call__
2020-01-16 17:08:48.629 1 ERROR kuryr_kubernetes.handlers.retry self._handler(event)
2020-01-16 17:08:48.629 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/local/lib/python3.6/site-packages/kuryr_kubernetes/handlers/k8s_base.py", line 77, in __call__
2020-01-16 17:08:48.629 1 ERROR kuryr_kubernetes.handlers.retry self.on_deleted(obj)
2020-01-16 17:08:48.629 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/local/lib/python3.6/site-packages/kuryr_kubernetes/controller/handlers/policy.py", line 130, in on_deleted
2020-01-16 17:08:48.629 1 ERROR kuryr_kubernetes.handlers.retry self._drv_vif_pool.remove_sg_from_pools(crd_sg, net_id)
2020-01-16 17:08:48.629 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/local/lib/python3.6/site-packages/kuryr_kubernetes/controller/drivers/vif_pool.py", line 1077, in remove_sg_from_pools
2020-01-16 17:08:48.629 1 ERROR kuryr_kubernetes.handlers.retry vif_drv.remove_sg_from_pools(sg_id, net_id)
2020-01-16 17:08:48.629 1 ERROR kuryr_kubernetes.handlers.retry File "/usr/local/lib/python3.6/site-packages/kuryr_kubernetes/controller/drivers/vif_pool.py", line 313, in remove_sg_from_pools
2020-01-16 17:08:48.629 1 ERROR kuryr_kubernetes.handlers.retry del self._available_ports_pools[pool_key][sg_key]
2020-01-16 17:08:48.629 1 ERROR kuryr_kubernetes.handlers.retry KeyError: ('140f29c1-2980-405a-9572-6251477a658d', '47c1159c-5bba-44f5-b2a5-f6dae011f0db')
2020-01-16 17:08:48.629 1 ERROR kuryr_kubernetes.handlers.retry
2020-01-16 17:08:48.637 1 ERROR kuryr_kubernetes.handlers.logging [-] Failed to handle event {'type': 'DELETED', 'object': {'kind': 'NetworkPolicy', 'apiVersion': 'networking.k8s.io/v1', 'metadata': {'name': 'allow-to-server-a-pod-selector', 'namespace': 'network-policy-6817', 'selfLink': '/apis/networking.k8s.io/v1/namespaces/network-policy-6817/networkpolicies/allow-to-server-a-pod-selector', 'uid': '7ab13f32-861e-45b9-866d-9baa07b6ddd0', 'resourceVersion': '153310', 'generation': 1, 'creationTimestamp': '2020-01-16T17:03:25Z', 'annotations': {'kuryrnetpolicy_selfLink': '/apis/openstack.org/v1/namespaces/network-policy-6817/kuryrnetpolicies/np-allow-to-server-a-pod-selector'}}, 'spec': {'podSelector': {'matchLabels': {'pod-name': 'client-a'}}, 'egress': [{'ports': [{'protocol': 'UDP', 'port': 53}]}, {'to': [{'podSelector': {'matchLabels': {'pod-name': 'server'}}}]}], 'policyTypes': ['Egress']}}}: KeyError: ('140f29c1-2980-405a-9572-6251477a658d', '47c1159c-5bba-44f5-b2a5-f6dae011f0db')
2020-01-16 17:08:48.637 1 ERROR kuryr_kubernetes.handlers.logging Traceback (most recent call last):
2020-01-16 17:08:48.637 1 ERROR kuryr_kubernetes.handlers.logging File "/usr/local/lib/python3.6/site-packages/kuryr_kubernetes/handlers/logging.py", line 37, in __call__
2020-01-16 17:08:48.637 1 ERROR kuryr_kubernetes.handlers.logging self._handler(event)
2020-01-16 17:08:48.637 1 ERROR kuryr_kubernetes.handlers.logging File "/usr/local/lib/python3.6/site-packages/kuryr_kubernetes/handlers/retry.py", line 78, in __call__
2020-01-16 17:08:48.637 1 ERROR kuryr_kubernetes.handlers.logging self._handler(event)
2020-01-16 17:08:48.637 1 ERROR kuryr_kubernetes.handlers.logging File "/usr/local/lib/python3.6/site-packages/kuryr_kubernetes/handlers/k8s_base.py", line 77, in __call__
2020-01-16 17:08:48.637 1 ERROR kuryr_kubernetes.handlers.logging self.on_deleted(obj)
2020-01-16 17:08:48.637 1 ERROR kuryr_kubernetes.handlers.logging File "/usr/local/lib/python3.6/site-packages/kuryr_kubernetes/controller/handlers/policy.py", line 130, in on_deleted
2020-01-16 17:08:48.637 1 ERROR kuryr_kubernetes.handlers.logging self._drv_vif_pool.remove_sg_from_pools(crd_sg, net_id)
2020-01-16 17:08:48.637 1 ERROR kuryr_kubernetes.handlers.logging File "/usr/local/lib/python3.6/site-packages/kuryr_kubernetes/controller/drivers/vif_pool.py", line 1077, in remove_sg_from_pools
2020-01-16 17:08:48.637 1 ERROR kuryr_kubernetes.handlers.logging vif_drv.remove_sg_from_pools(sg_id, net_id)
2020-01-16 17:08:48.637 1 ERROR kuryr_kubernetes.handlers.logging File "/usr/local/lib/python3.6/site-packages/kuryr_kubernetes/controller/drivers/vif_pool.py", line 313, in remove_sg_from_pools
2020-01-16 17:08:48.637 1 ERROR kuryr_kubernetes.handlers.logging del self._available_ports_pools[pool_key][sg_key]
2020-01-16 17:08:48.637 1 ERROR kuryr_kubernetes.handlers.logging KeyError: ('140f29c1-2980-405a-9572-6251477a658d', '47c1159c-5bba-44f5-b2a5-f6dae011f0db')

OpenStack Infra (hudson-openstack) wrote : Fix proposed to kuryr-kubernetes (master)

Fix proposed to branch: master
Review: https://review.opendev.org/703054

Changed in kuryr-kubernetes:
assignee: nobody → Luis Tomas Bolivar (ltomasbo)
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to kuryr-kubernetes (master)

Reviewed: https://review.opendev.org/703054
Committed: https://git.openstack.org/cgit/openstack/kuryr-kubernetes/commit/?id=453e0d8f35af23f42ef5f72f9f401c5be47488c3
Submitter: Zuul
Branch: master

commit 453e0d8f35af23f42ef5f72f9f401c5be47488c3
Author: Luis Tomas Bolivar <email address hidden>
Date: Fri Jan 17 11:43:14 2020 +0100

    Avoid KeyError when deleting NPs

    It seems that when a namespace deletion fails for some reason and
    it is retried, if there are other NPs created, some of the unused
    ports can be reused and moved from one sg_key to another, therefore
    raising the KeyError on the retry.

    This patch ensures that if the sg_key was already deleted, the retry
    skips it instead of failing.

    Change-Id: Ifcebf7660861738128606342dc6dab00186878b5
    Closes-Bug: 1860106
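
The behaviour described in the commit message above (a retried deletion skipping an sg_key that is already gone) can be illustrated with a minimal Python sketch. This is not the merged patch: the helper names drop_sg_key and drop_sg_key_strict, the shape of the nested dictionary, and the sample keys and port names are all assumptions made for illustration. Only the `del ...[pool_key][sg_key]` pattern is taken from the traceback.

```python
# Hypothetical stand-in for the driver's pool state: pool_key -> sg_key -> ports.
# Key shapes and values are illustrative assumptions, not kuryr-kubernetes data.
pool_key = ("host-1", "project-1", "net-1")
sg_key = ("sg-a", "sg-b")
pools = {pool_key: {sg_key: ["port-1", "port-2"]}}


def drop_sg_key_strict(available_ports_pools, pool_key, sg_key):
    """Mirror the `del` seen in the traceback: raises KeyError if sg_key is gone."""
    del available_ports_pools[pool_key][sg_key]


def drop_sg_key(available_ports_pools, pool_key, sg_key):
    """Tolerant variant: return the removed ports, or [] if nothing was there."""
    sg_pools = available_ports_pools.get(pool_key, {})
    # dict.pop with a default never raises, so a retry becomes a no-op.
    return sg_pools.pop(sg_key, [])


first = drop_sg_key(pools, pool_key, sg_key)  # removes the entry
retry = drop_sg_key(pools, pool_key, sg_key)  # entry already gone, no KeyError
```

With the tolerant variant, repeating the call after a partial failure is harmless, whereas drop_sg_key_strict would raise KeyError on the second call, which matches the symptom in the logs.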

Changed in kuryr-kubernetes:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to kuryr-kubernetes (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/703505

OpenStack Infra (hudson-openstack) wrote : Fix merged to kuryr-kubernetes (stable/train)

Reviewed: https://review.opendev.org/703505
Committed: https://git.openstack.org/cgit/openstack/kuryr-kubernetes/commit/?id=461fb381985d9d55c769eda2d7ed7f757eb8d132
Submitter: Zuul
Branch: stable/train

commit 461fb381985d9d55c769eda2d7ed7f757eb8d132
Author: Luis Tomas Bolivar <email address hidden>
Date: Fri Jan 17 11:43:14 2020 +0100

    Avoid KeyError when deleting NPs

    It seems that when a namespace deletion fails for some reason and
    it is retried, if there are other NPs created, some of the unused
    ports can be reused and moved from one sg_key to another, therefore
    raising the KeyError on the retry.

    This patch ensures that if the sg_key was already deleted, the retry
    skips it instead of failing.

    Change-Id: Ifcebf7660861738128606342dc6dab00186878b5
    Closes-Bug: 1860106
    (cherry picked from commit 453e0d8f35af23f42ef5f72f9f401c5be47488c3)

tags: added: in-stable-train