Kuryr ignores CNI_CONTAINERID when serving requests

Bug #1731485 reported by Michal Dulko
This bug affects 2 people
Affects: kuryr-kubernetes
Status: Fix Released
Importance: Critical
Assigned to: Michal Dulko

Bug Description

Kuryr relies on the pod name to identify VIFs attached to a pod, because the name is easily accessible through the k8s API. The problem is that kubelet and CNI identify requests by CNI_CONTAINERID instead. This leads to a common failure scenario when creating multiple pods simultaneously:

1. Kubelet sends ADD request with CNI_CONTAINERID=A.
2. ADD request fails.
3. Kubelet sends DEL request with CNI_CONTAINERID=A.
4. Kubelet sends ADD request with CNI_CONTAINERID=B.
5. ADD request succeeds.
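6. The DEL request from step 3 takes a long time, so kubelet sends another DEL with CNI_CONTAINERID=A.
7. Kuryr, matching on pod name only, disconnects the pod from the network, even though the current container (CNI_CONTAINERID=B) was attached successfully.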
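
The sequence above can be reproduced with a small model. This is only an illustrative sketch, not kuryr-kubernetes code: the class `NameKeyedPlugin` and its methods are hypothetical, and the requests run sequentially here, whereas the real DEL requests overlap in time. It assumes a plugin that tracks attachments by pod name alone.

```python
# Hypothetical model of the race; none of these names come from kuryr-kubernetes.
class NameKeyedPlugin:
    """Tracks attached VIFs by pod name only, ignoring CNI_CONTAINERID."""

    def __init__(self):
        self.attached = {}  # pod name -> container ID whose VIF is plugged

    def add(self, pod_name, container_id, succeed=True):
        # Record the attachment only when the ADD request succeeds.
        if succeed:
            self.attached[pod_name] = container_id
        return succeed

    def delete(self, pod_name, container_id):
        # container_id is ignored: any DEL for this pod name unplugs it.
        return self.attached.pop(pod_name, None)


plugin = NameKeyedPlugin()
plugin.add("pod-1", "A", succeed=False)  # steps 1-2: ADD with ID A fails
plugin.delete("pod-1", "A")              # step 3: DEL with ID A
plugin.add("pod-1", "B")                 # steps 4-5: ADD with ID B succeeds
removed = plugin.delete("pod-1", "A")    # step 6: retried DEL with ID A
# step 7: `removed` is the attachment made for container B
```

In this model the stale DEL for container A removes the attachment that belongs to container B, which is the disconnect described in step 7.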

Changed in kuryr-kubernetes:
assignee: nobody → Michal Dulko (michal-dulko-f)
status: New → In Progress
Michal Dulko (michal-dulko-f) wrote :

In the CNI daemon patch [1] this is solved by saving the last CNI_CONTAINERID in memory and reacting only to DEL requests that carry the same CNI_CONTAINERID. I don't know how to solve this for the regular CNI plugin case.

[1] https://review.openstack.org/#/c/515186
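
As a rough illustration of that idea (not the code from the patch), the check could look like the sketch below. The class `ContainerIdGuard` and its method names are hypothetical, and it assumes the ID is recorded on every ADD request, including failed ones; the actual patch may record it at a different point.

```python
# Hypothetical sketch of the guard; names are not taken from the patch.
class ContainerIdGuard:
    """Remembers the last CNI_CONTAINERID seen per pod and filters DELs."""

    def __init__(self):
        self._last_id = {}  # pod name -> CNI_CONTAINERID of the latest ADD

    def on_add(self, pod_name, container_id):
        # Assumption: the ID is stored for every ADD, successful or not.
        self._last_id[pod_name] = container_id

    def should_handle_del(self, pod_name, container_id):
        # React only when the DEL carries the ID of the latest ADD.
        if self._last_id.get(pod_name) != container_id:
            return False
        del self._last_id[pod_name]
        return True


guard = ContainerIdGuard()
guard.on_add("pod-1", "A")
first_del = guard.should_handle_del("pod-1", "A")  # DEL for A: handled
guard.on_add("pod-1", "B")
stale_del = guard.should_handle_del("pod-1", "A")  # retried DEL for A: ignored
```

Because the state lives in the memory of a long-running process, this approach fits the daemon but not a plugin executable that starts fresh for each request.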

Changed in kuryr-kubernetes:
importance: Undecided → High
Changed in kuryr-kubernetes:
importance: High → Critical
OpenStack Infra (hudson-openstack) wrote : Fix merged to kuryr-kubernetes (master)

Reviewed: https://review.openstack.org/515186
Committed: https://git.openstack.org/cgit/openstack/kuryr-kubernetes/commit/?id=2f65d993f39c8ef102b692f75a39b7355e2d7c7b
Submitter: Zuul
Branch: master

commit 2f65d993f39c8ef102b692f75a39b7355e2d7c7b
Author: Michał Dulko <email address hidden>
Date: Wed Oct 25 21:29:22 2017 +0200

    CNI split - introducing CNI daemon

    This commit implements basic CNI daemon service. The aim of this new
    entity is to increase scalability of CNI operations by moving watching
    for VIF to a separate process.

    This commit:
    * Introduces kuryr-daemon service
    * Implements communication between CNI driver and CNI daemon using HTTP
    * Consolidates watching for VIF on CNI side to a single Watcher that
      looks for all the pods on the node it is running on.
    * Solves bug 1731485 when running with CNI daemon.
    * Enables new service in DevStack plugin
    * Provides unit tests for new code.

    Follow up patches will include:
    - Documentation.
    - Support for running in containerized mode.

    To test the patch add `enable_service kuryr-daemon` to your DevStack's
    local.conf file.

    Partial-Bug: 1731485
    Co-Authored-By: Janonymous <email address hidden>
    Implements: blueprint cni-split-exec-daemon
    Change-Id: I1bd6406dacab0735a94474e146645c63d933be16

Changed in kuryr-kubernetes:
status: In Progress → Fix Released