EEXIST when kubelet triggers retry of ADD request

Bug #1730644 reported by Michal Dulko
Affects: kuryr-kubernetes
Status: Fix Released
Importance: Critical
Assigned to: Michal Dulko
Milestone: queens-rc-final

Bug Description

On errors and timeouts, kubelet triggers a retry of the ADD request. If the error happened after we moved the host-side veth into the host namespace, all future retries fail because pyroute2 raises an exception with EEXIST. To mitigate that, we need to catch that exception and simply ignore it: the veth is already in the host namespace, so we can safely proceed.
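
The mitigation described above could look roughly like the sketch below. It is not the kuryr-kubernetes code. `FakeNetlinkError` and `move_ignoring_eexist` are hypothetical names used only for illustration; the stand-in exception mimics pyroute2's `NetlinkError`, which exposes the errno as `.code`, so the example runs without pyroute2 or root privileges.

```python
import errno


class FakeNetlinkError(Exception):
    """Hypothetical stand-in for pyroute2's NetlinkError (errno in .code)."""

    def __init__(self, code, msg=""):
        super().__init__(code, msg)
        self.code = code


def move_ignoring_eexist(move_to_host_ns, exc_type=FakeNetlinkError):
    """Call move_to_host_ns() and treat EEXIST as "already moved".

    Returns True if the move was performed, False if it was skipped
    because the interface already exists in the target namespace.
    Any other netlink error is re-raised unchanged.
    """
    try:
        move_to_host_ns()
    except exc_type as exc:
        if exc.code != errno.EEXIST:
            raise
        return False
    return True
```

With real pyroute2, one would presumably pass `exc_type=pyroute2.NetlinkError` and a callable that performs the actual move. Only EEXIST is swallowed, so other failures still reach kubelet as errors.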

Revision history for this message
Michal Dulko (michal-dulko-f) wrote :

Okay, I'd say this is low priority; I think I've found the main reason we were even getting into this situation.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to kuryr-kubernetes (master)

Fix proposed to branch: master

Changed in kuryr-kubernetes:
assignee: nobody → Michal Dulko (michal-dulko-f)
status: New → In Progress
Changed in kuryr-kubernetes:
importance: Undecided → Critical
milestone: none → queens-rc-final
OpenStack Infra (hudson-openstack) wrote : Fix merged to kuryr-kubernetes (master)

Submitter: Zuul
Branch: master

commit 9ac4df32e2e2d3111ca0d4e57ba8b22bcc1974c8
Author: Michał Dulko <email address hidden>
Date: Tue Nov 7 12:03:27 2017 +0100

    Fix kubelet retries issues

    When an error happens during kubelet's ADD request kubelet will trigger
    a retry. If aforementioned failure happened after the host-side veth
    interface was moved into the host netns, all further kubelet retries of
    the request will fail with conflict, as it will be impossible to move
    another iface of that name into the host netns.

    This commit aims to fix the issue by checking for existence of
    conflicting interface in the host netns and removing it if needed.

    Change-Id: I1596585c8b076d09a7f8c854bb524c2374d804e8
    Closes-Bug: 1730644
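
The approach in the commit message (look for a conflicting interface in the host netns and remove it) can be sketched as follows. This is an illustration under stated assumptions, not the merged change: `remove_conflicting_iface` and `FakeIPRoute` are hypothetical names, and `ipr` is assumed to behave like a `pyroute2.IPRoute` bound to the host netns, where `link_lookup(ifname=...)` returns a list of interface indexes and `link('del', index=...)` deletes one.

```python
def remove_conflicting_iface(ipr, ifname):
    """Delete every link named ifname that is visible through ipr.

    Returns the list of removed interface indexes (empty if there was
    no conflict), so the caller can log what was cleaned up.
    """
    removed = []
    for index in ipr.link_lookup(ifname=ifname):
        ipr.link('del', index=index)
        removed.append(index)
    return removed


class FakeIPRoute:
    """Hypothetical in-memory stand-in; runs without root or pyroute2."""

    def __init__(self, links):
        self.links = dict(links)  # ifname -> interface index

    def link_lookup(self, ifname):
        return [idx for name, idx in self.links.items() if name == ifname]

    def link(self, command, index):
        if command != 'del':
            raise ValueError('only "del" is supported by this fake')
        self.links = {n: i for n, i in self.links.items() if i != index}
```

Because the lookup returns an empty list when nothing conflicts, the helper is safe to call on the first attempt as well as on every kubelet retry.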

Changed in kuryr-kubernetes:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kuryr-kubernetes 0.4.0

This issue was fixed in the openstack/kuryr-kubernetes 0.4.0 release.
