Containers: cluster IP address not assigned on computes after lock/unlock - I/F name too long

Bug #1817593 reported by Yang Liu
Affects     Status         Importance   Assigned to   Milestone
StarlingX   Fix Released   Medium       Teresa Ho

Bug Description

Brief Description
-----------------
garbd pod is in CrashLoopBackOff status since install and config

Severity
--------
Major

Steps to Reproduce
------------------
1. install and config platform
2. deploy stx-openstack
3. check pods status

Expected Behavior
------------------
3. all pods are healthy

Actual Behavior
----------------
[wrsroot@controller-0 ~(keystone_admin)]$ kubectl get pods --all-namespaces -o wide | grep -v -e Completed -e Running
NAMESPACE   NAME                                        READY   STATUS             RESTARTS   AGE   IP           NODE        NOMINATED NODE
openstack   osh-openstack-garbd-garbd-5744f5f85-9t5lz   0/1     CrashLoopBackOff   116        10h   172.16.4.2   compute-1   <none>

Reproducibility
---------------
Reproducible (maybe, seen 2/2 times on a regular system)

System Configuration
--------------------
Multi-node system

Branch/Pull Time/Commit
-----------------------
f/stein as of 2019-02-21

Timestamp/Logs
--------------
{"log":"2019-02-25 15:39:15.012 INFO: protonet asio version 0\n","stream":"stderr","time":"2019-02-25T15:39:15.012273135Z"}
{"log":"2019-02-25 15:39:15.012 INFO: Using CRC-32C for message checksums.\n","stream":"stderr","time":"2019-02-25T15:39:15.012417963Z"}
{"log":"2019-02-25 15:39:15.012 INFO: backend: asio\n","stream":"stderr","time":"2019-02-25T15:39:15.01245371Z"}
{"log":"2019-02-25 15:39:15.012 INFO: gcomm thread scheduling priority set to other:0 \n","stream":"stderr","time":"2019-02-25T15:39:15.012475804Z"}
{"log":"2019-02-25 15:39:15.012 WARN: access file(./gvwstate.dat) failed(No such file or directory)\n","stream":"stderr","time":"2019-02-25T15:39:15.012588844Z"}
{"log":"2019-02-25 15:39:15.012 INFO: restore pc from disk failed\n","stream":"stderr","time":"2019-02-25T15:39:15.012594681Z"}
{"log":"2019-02-25 15:39:15.012 INFO: GMCast version 0\n","stream":"stderr","time":"2019-02-25T15:39:15.012702816Z"}
{"log":"2019-02-25 15:39:35.046 WARN: Failed to resolve tcp://mariadb-server-0.mariadb-discovery.openstack.svc.cluster.local:4567\n","stream":"stderr","time":"2019-02-25T15:39:35.046357123Z"}
{"log":"2019-02-25 15:39:55.071 WARN: Failed to resolve tcp://mariadb-server-1.mariadb-discovery.openstack.svc.cluster.local:4567\n","stream":"stderr","time":"2019-02-25T15:39:55.071237092Z"}
{"log":"2019-02-25 15:39:55.071 INFO: (816dbf37, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567\n","stream":"stderr","time":"2019-02-25T15:39:55.071427289Z"}
{"log":"2019-02-25 15:39:55.071 INFO: (816dbf37, 'tcp://0.0.0.0:4567') multicast: , ttl: 1\n","stream":"stderr","time":"2019-02-25T15:39:55.071432896Z"}
{"log":"2019-02-25 15:39:55.071 INFO: EVS version 0\n","stream":"stderr","time":"2019-02-25T15:39:55.07169498Z"}
{"log":"2019-02-25 15:39:55.071 INFO: gcomm: connecting to group 'mariadb-server_openstack', peer 'mariadb-server-0.mariadb-discovery.openstack.svc.cluster.local:,mariadb-server-1.mariadb-discovery.openstack.svc.cluster.local:'\n","stream":"stderr","time":"2019-02-25T15:39:55.071748318Z"}
{"log":"2019-02-25 15:39:55.071 ERROR: failed to open gcomm backend connection: 131: No address to connect (FATAL)\n","stream":"stderr","time":"2019-02-25T15:39:55.071806276Z"}
{"log":"\u0009 at gcomm/src/gmcast.cpp:connect_precheck():310\n","stream":"stderr","time":"2019-02-25T15:39:55.071811556Z"}
{"log":"2019-02-25 15:39:55.071 ERROR: gcs/src/gcs_core.cpp:gcs_core_open():209: Failed to open backend connection: -131 (State not recoverable)\n","stream":"stderr","time":"2019-02-25T15:39:55.071813648Z"}
{"log":"2019-02-25 15:39:55.071 ERROR: gcs/src/gcs.cpp:gcs_open():1458: Failed to open channel 'mariadb-server_openstack' at 'gcomm://mariadb-server-0.mariadb-discovery.openstack.svc.cluster.local,mariadb-server-1.mariadb-discovery.openstack.svc.cluster.local': -131 (State not recoverable)\n","stream":"stderr","time":"2019-02-25T15:39:55.071867156Z"}
{"log":"2019-02-25 15:39:55.071 FATAL: Exception in creating receive loop: Failed to open connection to group: 131 (State not recoverable)\n","stream":"stderr","time":"2019-02-25T15:39:55.071884553Z"}
{"log":"\u0009 at garb/garb_gcs.cpp:Gcs():35\n","stream":"stderr","time":"2019-02-25T15:39:55.071888396Z"}
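
For anyone triaging a similar log, here is a minimal Python sketch that pulls the WARN/ERROR/FATAL messages out of Docker JSON log lines like the ones above. The function name and the three sample lines (copied from this log) are only for illustration; it assumes every message has the "<date> <time> <LEVEL>: <text>" layout seen here.

```python
import json

SEVERITIES = ("WARN", "ERROR", "FATAL")

# Three lines copied verbatim from the garbd container log above.
SAMPLE = [
    r'{"log":"2019-02-25 15:39:15.012 INFO: backend: asio\n","stream":"stderr","time":"2019-02-25T15:39:15.01245371Z"}',
    r'{"log":"2019-02-25 15:39:35.046 WARN: Failed to resolve tcp://mariadb-server-0.mariadb-discovery.openstack.svc.cluster.local:4567\n","stream":"stderr","time":"2019-02-25T15:39:35.046357123Z"}',
    r'{"log":"2019-02-25 15:39:55.071 ERROR: failed to open gcomm backend connection: 131: No address to connect (FATAL)\n","stream":"stderr","time":"2019-02-25T15:39:55.071806276Z"}',
]


def problem_messages(raw_lines):
    """Return the log messages whose level is WARN, ERROR, or FATAL."""
    found = []
    for raw in raw_lines:
        message = json.loads(raw)["log"].rstrip("\n")
        # Expected layout: "<date> <time> <LEVEL>: <text>". Continuation
        # lines (tab-indented "at file:line") have fewer fields and are skipped.
        parts = message.split(" ", 3)
        if len(parts) == 4 and parts[2].rstrip(":") in SEVERITIES:
            found.append(message)
    return found


for line in problem_messages(SAMPLE):
    print(line)
```

On the full log this would surface the two "Failed to resolve" warnings first, which point at name resolution rather than at garbd itself.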

Frank Miller (sensfan22)
tags: added: stx.containers
Changed in starlingx:
importance: Undecided → High
Revision history for this message
Ghada Khalil (gkhalil) wrote :

Marking as release gating; high priority as containers do not come up in standard and storage configs. Recent commits may be suspect.

Changed in starlingx:
assignee: nobody → Chris Friesen (cbf123)
status: New → Triaged
tags: added: stx.2019.05
Frank Miller (sensfan22)
summary: - Containers: garbd not coming up on freshly installed system
+ Containers: cluster IP address was not assigned on some computes after
+ lock/unlock
Ghada Khalil (gkhalil)
Changed in starlingx:
assignee: Chris Friesen (cbf123) → Teresa Ho (teresaho)
summary: - Containers: cluster IP address was not assigned on some computes after
- lock/unlock
+ Containers: cluster IP address not assigned on computes after
+ lock/unlock - - I/F name too long
summary: Containers: cluster IP address not assigned on computes after
- lock/unlock - - I/F name too long
+ lock/unlock - I/F name too long
Revision history for this message
Ghada Khalil (gkhalil) wrote :

Changing the bug summary as further investigation shows that the issue is related to assigning the cluster network IP address.

Details from Don Penney:
------------------------
The cluster IP address was not assigned on compute-1 and compute-2, but was assigned on compute-0. When kubernetes brings up its node, it sees the mgmt IP only, which cannot reach the 172.16.1.11 address. We'd expect to see the 192.168.206.X address here instead:

[wrsroot@controller-0 ~(keystone_admin)]$ kubectl describe node compute-1
Name: compute-1
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/hostname=compute-1
                    openstack-compute-node=enabled
                    openvswitch=enabled
                    sriov=enabled
Annotations: kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    projectcalico.org/IPv4Address: 192.168.204.251/24
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Mon, 25 Feb 2019 04:13:11 +0000
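
A quick way to check which network the node annotation falls on, sketched in Python. The /24 prefix for the cluster network is an assumption (the text above only gives "192.168.206.X"), and the function name is made up for illustration.

```python
import ipaddress

# Assumption: the cluster network is 192.168.206.0/24. Only "192.168.206.X"
# is stated above; the prefix length is a guess for this sketch.
CLUSTER_NETWORK = ipaddress.ip_network("192.168.206.0/24")


def on_cluster_network(annotation_value, network=CLUSTER_NETWORK):
    """Return True if a projectcalico.org/IPv4Address value lies in `network`."""
    address = ipaddress.ip_interface(annotation_value).ip
    return address in network


# compute-1's annotation from the kubectl output above: this is the mgmt address.
print(on_cluster_network("192.168.204.251/24"))  # False
```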

So it seems that the root cause here is the failure to assign the cluster IP, which appears to be due to the interface name being too long in this lab when suffixed with a VLAN tag and an alias identifier (i.e. enp134s0f0.169:5, enp136s0f0.169:5).

It looks like there are a couple of issues exposed here:
1. The kernel defines IFNAMSIZ as 16 bytes, so the maximum interface name length is 15 characters. For the aliased cluster IP address on a VLAN, the full interface name is IFNAME.VLAN:ALIAS, e.g. enp134s0f0.169:5, which is 16 characters. This results in "RTNETLINK answers: Numerical result out of range" errors that block the aliased address from being assigned by "ifup IFNAME.VLAN".
2. The apply_network_config.sh script does ifdown/ifup for changed interfaces, but does not handle aliased files or sort interfaces. If the "find" run by the script returns the base VLAN interface filename first, it then does an ifup on the IFNAME.VLAN:ALIAS name, which a manual test showed gives the RTNETLINK error but still assigns the IP address.
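
The length arithmetic in item 1 can be checked with a small Python sketch. The helper names are hypothetical; the IFNAME.VLAN:ALIAS pattern and the two interface names are the ones quoted above.

```python
IFNAMSIZ = 16                  # kernel buffer size, including the NUL terminator
MAX_IFNAME_LEN = IFNAMSIZ - 1  # 15 usable characters


def alias_name(device, vlan_id, alias):
    """Build the aliased VLAN name in the IFNAME.VLAN:ALIAS form."""
    return f"{device}.{vlan_id}:{alias}"


def fits(name):
    """Return True if the name fits within the kernel limit."""
    return len(name) <= MAX_IFNAME_LEN


for device in ("enp134s0f0", "enp136s0f0"):
    name = alias_name(device, 169, 5)
    print(name, len(name), fits(name))
```

Both names from this lab come out at 16 characters, one over the limit.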

In this lab, we saw that on the initial unlock of each compute, apply_network_config.sh tried to ifup the aliased names ahead of the base VLAN name on compute-1 and compute-2, which fails because the underlying VLAN interface hadn't been set up yet. As a result, the cluster IP was never assigned on those hosts. On compute-0, it did the base VLAN interface first, and was then able to assign the cluster IP address from the aliased filename.

On a reboot, the /etc/init.d/network script brings up the interfaces, but does not have a separate ifup for the aliased name. So we see it attempt to assign the cluster IP, but fail a check and cannot assign the address. Subsequent reboots of compute-0 result in the cluster IP address not being assigned as a result (the apply_network_config.sh script only does the ifdown/ifup if there's a config change). The following logs are seen in daemon.log:

2019-02-25T22:17:48.255 compute-0 network[1875]: info Determining IP information for...


Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to stx-config (master)

Fix proposed to branch: master
Review: https://review.openstack.org/639683

Changed in starlingx:
status: Triaged → In Progress
Revision history for this message
Frank Miller (sensfan22) wrote :

Lowering priority to medium. Analysis indicates this only impacts 1 lab and can be worked around by changing a vlan name.

Changed in starlingx:
importance: High → Medium
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-config (master)

Reviewed: https://review.openstack.org/639683
Committed: https://git.openstack.org/cgit/openstack/stx-config/commit/?id=575be13d684eb4763092bd391f41d168341dbe1f
Submitter: Zuul
Branch: master

commit 575be13d684eb4763092bd391f41d168341dbe1f
Author: Teresa Ho <email address hidden>
Date: Wed Feb 27 09:32:05 2019 -0500

    Fix to not restart alias interface

    The alias interface may at times be brought up before its parent
    interface (ethernet or vlan) depending on how the interface config
    files are listed in the filesystem.
    The fix is to not restart the alias interface. If there is a
    change in the alias interface configuration, the parent interface
    would be restarted instead.

    Partial-Bug: 1817593

    Change-Id: I83b88e1587c6468d85e14a50de934908586cbc9b
    Signed-off-by: Teresa Ho <email address hidden>
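
The idea in this commit message can be illustrated with a short Python sketch. This is not the actual apply_network_config.sh logic (which is a shell script); the function names are hypothetical and it assumes an alias is identified by the ":ALIAS" suffix on the interface name.

```python
def parent_interface(ifname):
    """Strip an alias suffix: 'enp134s0f0.169:5' -> 'enp134s0f0.169'."""
    return ifname.split(":", 1)[0]


def interfaces_to_restart(changed):
    """Map changed interface names to the parent interfaces to restart.

    Aliases are never restarted directly; their parent is restarted instead.
    Duplicates are dropped and the original order is preserved.
    """
    restart = []
    for name in changed:
        parent = parent_interface(name)
        if parent not in restart:
            restart.append(parent)
    return restart


# The alias and its parent collapse into a single restart of the parent.
print(interfaces_to_restart(["enp134s0f0.169:5", "enp134s0f0.169", "eth0"]))
```

Because only parents are restarted, the order in which the alias and parent config files are listed no longer matters.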

Ken Young (kenyis)
tags: added: stx.2.0
removed: stx.2019.05
Ghada Khalil (gkhalil)
tags: added: stx.retestneeded
Ghada Khalil (gkhalil)
tags: added: stx.networking
removed: stx.containers
Revision history for this message
Ghada Khalil (gkhalil) wrote :

Based on input from Matt Peters (networking TL), we should investigate an alternate naming scheme for vlan interfaces (e.g. vlanN) so that the vlanN:alias will not exceed the 15 char limit.

Revision history for this message
Forrest Zhao (forrest.zhao) wrote :

Can be worked around by explicitly defining the cluster network. Defer to stx 3.0 for final fix.

tags: removed: stx.2.0
Frank Miller (sensfan22)
tags: added: stx.3.0
Ghada Khalil (gkhalil)
Changed in starlingx:
status: In Progress → Confirmed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (master)

Fix proposed to branch: master
Review: https://review.opendev.org/686728

Changed in starlingx:
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to metal (master)

Fix proposed to branch: master
Review: https://review.opendev.org/686834

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/686728
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=c823a241f9bf695883ac97c92681478986d1da55
Submitter: Zuul
Branch: master

commit c823a241f9bf695883ac97c92681478986d1da55
Author: Teresa Ho <email address hidden>
Date: Fri Oct 4 09:48:50 2019 -0400

    Change VLAN interface naming scheme

    The VLAN alias interface name may exceed the maximum interface
    name length as defined by the kernel, depending on some hardware.
    This commit changes the vlan naming convention from device name plus
    VLAN ID to VLAN plus VLAN ID for platform interfaces.
    A semantic check is also added to ensure that VLAN IDs are unique
    across all platform interfaces of a host.

    Closes-Bug: 1817593

    Change-Id: Idf732e85aa7a6d19491f13e29cc1b408bb5bfe5d
    Signed-off-by: Teresa Ho <email address hidden>
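
A rough Python sketch of the naming change and the uniqueness check described in this commit message. It assumes the new name is the lowercase prefix "vlan" followed by the ID (the "vlanN" form suggested earlier in this thread); the function names are hypothetical and this is not the code from the review.

```python
from collections import Counter


def vlan_ifname(vlan_id):
    """New-style name: 'vlan' plus the VLAN ID, independent of the device name."""
    return f"vlan{vlan_id}"


def duplicate_vlan_ids(vlan_ids):
    """Return the VLAN IDs used more than once among a host's platform interfaces."""
    counts = Counter(vlan_ids)
    return sorted(vid for vid, n in counts.items() if n > 1)


print(vlan_ifname(169))                     # vlan169
print(duplicate_vlan_ids([169, 170, 169]))  # [169]
```

Since the device name is no longer part of the interface name, two VLANs with the same ID on different devices would presumably map to the same name, which is why a uniqueness check is needed. Even with the largest VLAN ID (4094), "vlan4094:5" is 10 characters, well under the 15-character limit.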

Changed in starlingx:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to metal (master)

Reviewed: https://review.opendev.org/686834
Committed: https://git.openstack.org/cgit/starlingx/metal/commit/?id=d6ea5446e9d008a065a12cecc6c6a4692aa2c3e9
Submitter: Zuul
Branch: master

commit d6ea5446e9d008a065a12cecc6c6a4692aa2c3e9
Author: Teresa Ho <email address hidden>
Date: Fri Oct 4 16:09:23 2019 -0400

    Change VLAN interface naming scheme

    The VLAN alias interface name may exceed the maximum interface
    name length as defined by the kernel, depending on some hardware.
    This commit changes the vlan naming convention from device name plus
    VLAN ID to VLAN plus VLAN ID.

    Closes-Bug: 1817593
    Depends-On: https://review.opendev.org/#/c/686728/

    Change-Id: I8a74e1d47e0ab3ef261f9512a8887d7f0de66064
    Signed-off-by: Teresa Ho <email address hidden>

Revision history for this message
Yang Liu (yliu12) wrote :

Verified on 1102 load. IF naming convention is changed as described.

tags: removed: stx.retestneeded