Error deleting namespace resources due to a race with port recycling

Bug #1786447 reported by Luis Tomas Bolivar
This bug affects 1 person
Affects: kuryr-kubernetes
Status: Fix Released
Importance: Critical
Assigned to: Luis Tomas Bolivar
Milestone: (none listed)

Bug Description

When a namespace is deleted, all of its associated resources (pods, services, replication controllers, ...) are deleted too. With port pooling enabled this creates a race: when a pod is deleted, its associated port is marked for reuse and placed in the recyclable_ports dict.

The problem arises because this dict of ports to be reused is checked periodically (every 15 seconds by default), and the function that removes all the ports associated with a given network can run at the same time. Both paths then call _trigger_return_to_pool concurrently, which may cause released ports to be skipped. As a result, the network deletion fails, and consequently so does the deletion of the associated kuryrnet CRD.

In addition, if several namespaces are deleted concurrently, the handler calls delete_network_pools for each namespace subnet, which leads to several calls to _trigger_return_to_pool. The second call ends up moving ports from its own copy of recyclable_ports into available_ports again, leading to duplicate port_ids.
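The duplicate-port symptom can be reproduced in a toy model. The sketch below is illustrative only and is not the kuryr-kubernetes code or the merged fix: the PortPool class, its method names, and the lock-based variant are assumptions made for the example. It shows how two callers that each copy recyclable_ports before either has finished will add every port to available_ports twice, and how draining the dict under a lock avoids that.

```python
import threading


class PortPool:
    """Toy model of a port pool (hypothetical, not the kuryr implementation)."""

    def __init__(self):
        self.recyclable_ports = {}  # port_id -> pool key (e.g. network id)
        self.available_ports = {}   # pool key -> list of reusable port_ids
        self._lock = threading.Lock()

    def release(self, port_id, pool_key):
        # Called when a pod is deleted: mark its port as reusable.
        with self._lock:
            self.recyclable_ports[port_id] = pool_key

    def return_to_pool_unsafe(self, snapshot):
        # Works from a copy taken earlier, so a second caller holding
        # its own copy will re-add the same ports.
        for port_id, pool_key in snapshot.items():
            self.available_ports.setdefault(pool_key, []).append(port_id)
            self.recyclable_ports.pop(port_id, None)

    def return_to_pool_safe(self):
        # Drains the shared dict under a lock: each port is moved once,
        # no matter how many callers run concurrently.
        with self._lock:
            while self.recyclable_ports:
                port_id, pool_key = self.recyclable_ports.popitem()
                self.available_ports.setdefault(pool_key, []).append(port_id)


def demo_duplicates():
    pool = PortPool()
    pool.release("port-1", "net-a")
    pool.release("port-2", "net-a")
    # Two callers copy the dict before either has processed it.
    snap_a = dict(pool.recyclable_ports)
    snap_b = dict(pool.recyclable_ports)
    pool.return_to_pool_unsafe(snap_a)
    pool.return_to_pool_unsafe(snap_b)
    return pool.available_ports["net-a"]


def demo_serialized(workers=4):
    pool = PortPool()
    for i in range(100):
        pool.release(f"port-{i}", "net-a")
    threads = [threading.Thread(target=pool.return_to_pool_safe)
               for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return pool.available_ports["net-a"]


if __name__ == "__main__":
    print(demo_duplicates())
    print(len(demo_serialized()))
```

In the unsafe variant each port_id ends up in the available list twice, which matches the duplicates described above. Serializing the move with a lock is only one possible remedy and may differ from what the actual patch does.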

Changed in kuryr-kubernetes:
assignee: nobody → Luis Tomas Bolivar (ltomasbo)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to kuryr-kubernetes (master)

Fix proposed to branch: master
Review: https://review.openstack.org/590739

Changed in kuryr-kubernetes:
status: New → In Progress
description: updated
Changed in kuryr-kubernetes:
importance: Undecided → Critical
description: updated
OpenStack Infra (hudson-openstack) wrote : Fix merged to kuryr-kubernetes (master)

Reviewed: https://review.openstack.org/590739
Committed: https://git.openstack.org/cgit/openstack/kuryr-kubernetes/commit/?id=f05cb423e00d8f7a943845f306653c42755d4df7
Submitter: Zuul
Branch: master

commit f05cb423e00d8f7a943845f306653c42755d4df7
Author: Luis Tomas Bolivar <email address hidden>
Date: Fri Aug 10 12:14:37 2018 +0200

    Ensure delete_network_pools include all the ports

    This patch ensures all the ports belonging to the network that
    is being emptied are considered, i.e., it ensures the ports
    associated to the pods being deleted as part of the namespace
    deletion are cleaned up too.

    Closes-Bug: 1786447
    Implements: blueprint openshift-project-isolation-support
    Change-Id: I42d058dc79047324ef4dbad6385346f01b855f37

Changed in kuryr-kubernetes:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kuryr-kubernetes 0.5.0

This issue was fixed in the openstack/kuryr-kubernetes 0.5.0 release.
