Kuryr ignores CNI_CONTAINERID when serving requests

Bug #1731485 reported by Michal Dulko on 2017-11-10
This bug affects 2 people
Affects           Status        Importance  Assigned to   Milestone
kuryr-kubernetes  Fix Released  Critical    Michal Dulko

Bug Description

Kuryr identifies the VIFs attached to a pod by the pod's name, because that is easily accessible through the k8s API. Kubelet and CNI, however, identify requests by CNI_CONTAINERID, so Kuryr cannot tell apart requests issued for different container IDs under the same pod name. This leads to a common failure scenario when creating multiple pods simultaneously:

1. Kubelet sends ADD request with CNI_CONTAINERID=A.
2. ADD request fails.
3. Kubelet sends DEL request with CNI_CONTAINERID=A.
4. Kubelet sends ADD request with CNI_CONTAINERID=B.
5. ADD request succeeds.
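6. The DEL request from step 3 takes a long time, so kubelet sends another DEL with CNI_CONTAINERID=A.
7. Kuryr matches that DEL by pod name only and disconnects the pod from the network, even though the pod is now running with CNI_CONTAINERID=B.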
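
The sequence above can be reproduced with a small simulation. This is only an illustrative sketch, not Kuryr code: `NameKeyedRegistry`, its methods and the pod name `pod-1` are hypothetical stand-ins for a plugin that tracks VIFs by pod name and ignores CNI_CONTAINERID.

```python
class NameKeyedRegistry:
    """Hypothetical model of a plugin that keys VIFs by pod name only."""

    def __init__(self):
        # pod name -> container ID the VIF was plugged for
        self.vifs = {}

    def add(self, pod_name, container_id, succeed=True):
        # A failed ADD leaves nothing registered for the pod.
        if not succeed:
            return False
        self.vifs[pod_name] = container_id
        return True

    def delete(self, pod_name, container_id):
        # container_id is deliberately ignored: this is the bug.
        return self.vifs.pop(pod_name, None) is not None


registry = NameKeyedRegistry()
registry.add("pod-1", "A", succeed=False)    # steps 1-2: ADD for A fails
registry.delete("pod-1", "A")                # step 3: first DEL for A
registry.add("pod-1", "B")                   # steps 4-5: ADD for B succeeds
removed = registry.delete("pod-1", "A")      # step 6: repeated DEL for A
pod_connected = "pod-1" in registry.vifs     # step 7: pod lost its VIF
```

After the last call `removed` is true and `pod_connected` is false: the stale DEL for container A removed the VIF that was plugged for container B.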

Changed in kuryr-kubernetes:
assignee: nobody → Michal Dulko (michal-dulko-f)
status: New → In Progress
Michal Dulko (michal-dulko-f) wrote:

In the CNI daemon patch [1] this is solved by keeping the last CNI_CONTAINERID in memory and only reacting to DEL requests that carry the same CNI_CONTAINERID. I don't know how to solve this for the regular (non-daemon) CNI plugin case.

[1] https://review.openstack.org/#/c/515186
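
The idea described in this comment could look roughly like the following. This is a minimal sketch under assumptions, not the code from the patch: the class `ContainerAwareRegistry` and its method names are hypothetical, and the real daemon also has to handle the actual VIF plugging and unplugging.

```python
class ContainerAwareRegistry:
    """Hypothetical model: remember the last CNI_CONTAINERID seen per pod."""

    def __init__(self):
        # pod name -> CNI_CONTAINERID of the most recent successful ADD
        self.last_container_id = {}

    def add(self, pod_name, container_id):
        self.last_container_id[pod_name] = container_id

    def delete(self, pod_name, container_id):
        # Ignore DEL requests that do not match the remembered container ID.
        if self.last_container_id.get(pod_name) != container_id:
            return False
        del self.last_container_id[pod_name]
        return True


registry = ContainerAwareRegistry()
registry.add("pod-1", "B")                        # ADD for container B succeeded
stale_handled = registry.delete("pod-1", "A")     # late DEL for container A
still_connected = "pod-1" in registry.last_container_id
current_handled = registry.delete("pod-1", "B")   # DEL for the live container
```

Here the late DEL for container A is ignored and the pod stays connected, while a DEL that carries the current container ID is still honoured. The state lives in process memory, which is why this approach needs a long-running daemon.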

Changed in kuryr-kubernetes:
importance: Undecided → High
Changed in kuryr-kubernetes:
importance: High → Critical

Reviewed: https://review.openstack.org/515186
Committed: https://git.openstack.org/cgit/openstack/kuryr-kubernetes/commit/?id=2f65d993f39c8ef102b692f75a39b7355e2d7c7b
Submitter: Zuul
Branch: master

commit 2f65d993f39c8ef102b692f75a39b7355e2d7c7b
Author: Michał Dulko <email address hidden>
Date: Wed Oct 25 21:29:22 2017 +0200

    CNI split - introducing CNI daemon

    This commit implements basic CNI daemon service. The aim of this new
    entity is to increase scalability of CNI operations by moving watching
    for VIF to a separate process.

    This commit:
    * Introduces kuryr-daemon service
    * Implements communication between CNI driver and CNI daemon using HTTP
    * Consolidates watching for VIF on CNI side to a single Watcher that
      looks for all the pods on the node it is running on.
    * Solves bug 1731485 when running with CNI daemon.
    * Enables new service in DevStack plugin
    * Provides unit tests for new code.

    Follow up patches will include:
    - Documentation.
    - Support for running in containerized mode.

    To test the patch add `enable_service kuryr-daemon` to your DevStack's
    local.conf file.

    Partial-Bug: 1731485
    Co-Authored-By: Janonymous <email address hidden>
    Implements: blueprint cni-split-exec-daemon
    Change-Id: I1bd6406dacab0735a94474e146645c63d933be16

Changed in kuryr-kubernetes:
status: In Progress → Fix Released