calico-kube-controllers pod sometimes moves to worker node

Bug #1848773 reported by Yang Liu
Affects: StarlingX
Status: Fix Released
Importance: Medium
Assigned to: Joseph Richard
Milestone: (none)

Bug Description

Brief Description
-----------------
The calico-kube-controllers pod sometimes moves to a worker node after the standby controller is locked and unlocked.

Severity
--------
Minor

Steps to Reproduce
------------------
- The calico-kube-controllers pod is initially on controller-0
- Run system host-swact controller-0 (controller-1 becomes the active controller; the pod is still on controller-0)
- Lock and unlock controller-0, which is now the standby controller
- Check which node the calico-kube-controllers pod is running on

Expected Behavior
------------------
- The calico-kube-controllers pod should move to controller-1

Actual Behavior
----------------
- The calico-kube-controllers pod moved to a worker node (compute-2 in the log below)

Reproducibility
---------------
Intermittent

System Configuration
--------------------
Multi-node system

Branch/Pull Time/Commit
-----------------------
2019-10-09_20-00-00

Last Pass
---------
Not sure

Timestamp/Logs
--------------
# Lock controller-0
[2019-10-10 19:00:34,544] 311 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[face::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne host-lock controller-0'

# Unlock controller-0
[2019-10-10 19:02:11,611] 311 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[face::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne host-unlock controller-0'

# calico controller pod moved to worker:
[2019-10-10 19:08:48,602] 311 DEBUG MainThread ssh.send :: Send 'kubectl get pod -o=wide --all-namespaces'
kube-system calico-kube-controllers-7f985db75c-lz8ph 1/1 Running 0 35m dead:beef::bc1b:6533:4fd4:e140 compute-2 <none> <none>
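
The placement check in the last log entry can be scripted. The following is a minimal Python sketch and not part of the original report: the function name `pods_off_controllers` and the `controller-` hostname prefix are assumptions for illustration, and it assumes the ten whitespace-separated columns that `kubectl get pod -o=wide --all-namespaces` printed in the log above.

```python
def pods_off_controllers(wide_output,
                         name_prefix="calico-kube-controllers",
                         controller_prefix="controller-"):
    """Return (pod, node) pairs for matching pods not running on a controller.

    Assumed column order (hypothetical helper, based on the log line above):
    NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED-NODE READINESS-GATES
    """
    misplaced = []
    for line in wide_output.splitlines():
        fields = line.split()
        # Skip the header, short lines, and pods we are not interested in.
        if len(fields) < 8 or not fields[1].startswith(name_prefix):
            continue
        node = fields[7]
        if not node.startswith(controller_prefix):
            misplaced.append((fields[1], node))
    return misplaced


# The line captured in the log above.
sample = ("kube-system calico-kube-controllers-7f985db75c-lz8ph 1/1 Running 0 35m "
          "dead:beef::bc1b:6533:4fd4:e140 compute-2 <none> <none>")
print(pods_off_controllers(sample))
```

An empty list would mean every matching pod is on a host whose name starts with `controller-`. Newer kubectl versions may print the RESTARTS column with extra text, which would shift the columns, so this parsing is only a rough check.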

Test Activity
-------------
Sanity

Ghada Khalil (gkhalil) wrote :

This is a minor issue as there is no system impact, but architecturally speaking, this pod should be tied to a label so that it runs on the controller nodes only. It would be nice to fix this for stx.3.0.

tags: added: stx.3.0 stx.containers stx.networking
Changed in starlingx:
importance: Undecided → Medium
status: New → Triaged
assignee: nobody → Joseph Richard (josephrichard)
Yang Liu (yliu12)
tags: added: stx.retestneeded
Ghada Khalil (gkhalil)
Changed in starlingx:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ansible-playbooks (master)

Fix proposed to branch: master
Review: https://review.opendev.org/695588

OpenStack Infra (hudson-openstack) wrote : Fix merged to ansible-playbooks (master)

Reviewed: https://review.opendev.org/695588
Committed: https://git.openstack.org/cgit/starlingx/ansible-playbooks/commit/?id=f06c354e0843173aaae9d15d0f79c33f8e29e286
Submitter: Zuul
Branch: master

commit f06c354e0843173aaae9d15d0f79c33f8e29e286
Author: Joseph Richard <email address hidden>
Date: Thu Nov 21 14:06:35 2019 -0500

    restrict calico-kube-controllers to master

    This commit adds a nodeSelector field of node-role.kubernetes.io/master
    to the calico-kube-controllers pod, so that it is only scheduled on
    controller nodes.

    Change-Id: Ia727843f25a0dbd92c3462bd23c0c1e723a82ece
    Closes-bug: 1848773
    Signed-off-by: Joseph Richard <email address hidden>
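
As an illustration of what the commit describes, the sketch below adds a nodeSelector to a Deployment-shaped Python dict. It is not the actual ansible-playbooks change: the helper `restrict_to_master` and the sample dict are hypothetical, and the empty-string label value is an assumption based on how kubeadm labels master nodes.

```python
import copy

# Label key named in the commit message.
MASTER_LABEL = "node-role.kubernetes.io/master"


def restrict_to_master(deployment, label_value=""):
    """Return a copy of a Deployment dict whose pod spec selects master nodes.

    label_value="" is an assumption: kubeadm applies this label with an
    empty value, but the report does not show the value used in the fix.
    """
    patched = copy.deepcopy(deployment)
    pod_spec = (patched.setdefault("spec", {})
                       .setdefault("template", {})
                       .setdefault("spec", {}))
    pod_spec.setdefault("nodeSelector", {})[MASTER_LABEL] = label_value
    return patched


# Hypothetical, heavily trimmed Deployment for illustration only.
deployment = {
    "kind": "Deployment",
    "metadata": {"name": "calico-kube-controllers", "namespace": "kube-system"},
    "spec": {"template": {"spec": {"containers": []}}},
}
patched = restrict_to_master(deployment)
print(patched["spec"]["template"]["spec"]["nodeSelector"])
```

With such a selector in place, the scheduler only considers nodes that carry the label, so worker nodes without it are no longer candidates for the pod.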

Changed in starlingx:
status: In Progress → Fix Released
Yang Liu (yliu12) wrote :

This issue is no longer seen in recent sanity.

tags: removed: stx.retestneeded