compute node hangs on unlock due to ovs-vswitchd memory initialization error.

Bug #1829390 reported by Allain Legacy
Affects: StarlingX
Status: Fix Released
Importance: High
Assigned to: ChenjieXu

Bug Description

Brief Description
-----------------
A worker node is failing to unlock because no memory is allocated/reserved for vswitch use. The system has 4 worker nodes, all configured identically, but a single node fails to unlock because ovs-vswitchd fails to start. Over ~6 different installations this has happened twice on initial unlock.
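
For quick triage on the hung node, a minimal sketch (assuming the default Open vSwitch service name and log location) is to check the service and its log directly:
   systemctl status ovs-vswitchd
   sudo tail -n 50 /var/log/openvswitch/ovs-vswitchd.log
The DPDK EAL error captured in the Timestamp/Logs section below is what shows up in ovs-vswitchd.log when this happens.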

Severity
--------
Major.

Steps to Reproduce
------------------
Install controller-0, configure it with Ansible, then install, configure and unlock the remaining nodes. Observe that some worker nodes may not reach the unlocked/enabled state and instead hang during their post-unlock initialization.

Expected Behavior
------------------
Nodes should unlock without issue.

Actual Behavior
----------------
A worker node is hung during initial post-unlock initialization.

Reproducibility
---------------
30-50%

System Configuration
--------------------
2+4

Branch/Pull Time/Commit
-----------------------
Private load rebased on May 10th.

Last Pass
---------
Passes occasionally on this load.

Timestamp/Logs
--------------

Comparing the system memory configuration of a good node (compute-2) and a bad node (compute-3), there is a clear discrepancy in the total memory reported for the two nodes.

[wrsroot@controller-0 ~(keystone_admin)]$ system host-memory-list compute-3
+-----------+---------+------------+---------+----------------+--------+--------+--------+-------+---------+---------------+----------------+------------------+----------------+----------------+------------------+--------------+
| processor | mem_tot | mem_platfo | mem_ava | hugepages(hp)_ | vs_hp_ | vs_hp_ | vs_hp_ | vs_hp | vm_tota | vm_hp_total_2 | vm_hp_avail_2M | vm_hp_pending_2M | vm_hp_total_1G | vm_hp_avail_1G | vm_hp_pending_1G | vm_hp_use_1G |
| | al(MiB) | rm(MiB) | il(MiB) | configured | size(M | total | avail | _reqd | l_4K | M | | | | | | |
| | | | | | iB) | | | | | | | | | | | |
+-----------+---------+------------+---------+----------------+--------+--------+--------+-------+---------+---------------+----------------+------------------+----------------+----------------+------------------+--------------+
| 0 | 1024 | 8000 | 1024 | True | 1024 | 0 | 0 | None | 0 | 0 | 0 | None | 1 | 1 | None | True |
| 1 | 4408 | 2000 | 4408 | True | 1024 | 0 | 0 | None | 866304 | 0 | 0 | None | 1 | 1 | None | True |
+-----------+---------+------------+---------+----------------+--------+--------+--------+-------+---------+---------------+----------------+------------------+----------------+----------------+------------------+--------------+
[wrsroot@controller-0 ~(keystone_admin)]$ system host-memory-list compute-2
+-----------+---------+------------+---------+----------------+--------+--------+--------+-------+---------+--------------+----------------+------------------+----------------+----------------+------------------+--------------+
| processor | mem_tot | mem_platfo | mem_ava | hugepages(hp)_ | vs_hp_ | vs_hp_ | vs_hp_ | vs_hp | vm_tota | vm_hp_total_ | vm_hp_avail_2M | vm_hp_pending_2M | vm_hp_total_1G | vm_hp_avail_1G | vm_hp_pending_1G | vm_hp_use_1G |
| | al(MiB) | rm(MiB) | il(MiB) | configured | size(M | total | avail | _reqd | l_4K | 2M | | | | | | |
| | | | | | iB) | | | | | | | | | | | |
+-----------+---------+------------+---------+----------------+--------+--------+--------+-------+---------+--------------+----------------+------------------+----------------+----------------+------------------+--------------+
| 0 | 58316 | 8000 | 57292 | True | 1024 | 1 | 0 | None | 0 | 28646 | 28646 | None | 0 | 0 | None | True |
| 1 | 62064 | 2000 | 61040 | True | 1024 | 1 | 0 | None | 865894 | 28829 | 28829 | None | 0 | 0 | None | True |
+-----------+---------+------------+---------+----------------+--------+--------+--------+-------+---------+--------------+----------------+------------------+----------------+----------------+------------------+--------------+

The local information on the node does not seem to agree with the system inventory data:

compute-3:~$ free -g
              total        used        free      shared  buff/cache   available
Mem:            125           1         123           0           0         123
Swap:             0           0           0
compute-3:~$ sudo cat /proc/meminfo
Password:
MemTotal: 131810660 kB
MemFree: 129926964 kB
MemAvailable: 129766652 kB
Buffers: 36456 kB
Cached: 414384 kB
SwapCached: 0 kB
Active: 489440 kB
Inactive: 227012 kB
Active(anon): 271960 kB
Inactive(anon): 8348 kB
Active(file): 217480 kB
Inactive(file): 218664 kB
Unevictable: 5424 kB
Mlocked: 5424 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 148 kB
Writeback: 0 kB
AnonPages: 271020 kB
Mapped: 63744 kB
Shmem: 10692 kB
Slab: 149848 kB
SReclaimable: 59264 kB
SUnreclaim: 90584 kB
KernelStack: 10928 kB
PageTables: 8144 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 65905328 kB
Committed_AS: 740776 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 574016 kB
VmallocChunk: 34291828732 kB
HardwareCorrupted: 0 kB
CmaTotal: 16384 kB
CmaFree: 9216 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 17036 kB
DirectMap2M: 3028992 kB
DirectMap1G: 133169152 kB

compute-3:~$ sudo find /sys -name "nr_huge*"
/sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages
/sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
/sys/devices/system/node/node1/hugepages/hugepages-1048576kB/nr_hugepages
/sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages_mempolicy
/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages_mempolicy
compute-3:~$ sudo find /sys -name "nr_huge*" | xargs -L1 grep -E "^"
0
0
0
0
0
0
0
0

compute-3:~$ cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.10.0-957.1.3.el7.1.tis.x86_64 root=UUID=e0a33b53-a3de-480f-a27e-4e4c9427a65c ro security_profile=standard module_blacklist=integrity,ima audit=0 tboot=false crashkernel=auto biosdevname=0 console=ttyS0,115200n8 iommu=pt usbcore.autosuspend=-1 selinux=0 enforcing=0 nmi_watchdog=panic,1 softlockup_panic=1 intel_iommu=on user_namespace.enable=1 hugepagesz=1G hugepages=2 hugepagesz=2M hugepages=0 default_hugepagesz=2M irqaffinity=0 rcu_nocbs=1-35 isolcpus=1-2 kthread_cpus=0 nopti nospectre_v2
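
Note that the kernel command line requests two 1G hugepages (hugepagesz=1G hugepages=2), yet the nr_hugepages counters above all read 0. A minimal sketch (assuming the standard procfs/sysfs paths) to cross-check what was requested at boot against what was actually allocated:
   tr ' ' '\n' < /proc/cmdline | grep -i huge
   grep -i huge /proc/meminfo
   grep -H . /sys/devices/system/node/node*/hugepages/hugepages-*/nr_hugepages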

The end result is that the default hiera data for the node configures 0 memory for vswitch use (192.168.144.28 is compute-3):

[wrsroot@controller-0 ~(keystone_admin)]$ sudo grep -rn socket_mem /opt
Password:
/opt/platform/puppet/19.01/hieradata/192.168.144.28.yaml:291:vswitch::dpdk::socket_mem: '0,0'
/opt/platform/puppet/19.01/hieradata/192.168.144.221.yaml:292:vswitch::dpdk::socket_mem: '1024,1024'
/opt/platform/puppet/19.01/hieradata/192.168.144.52.yaml:292:vswitch::dpdk::socket_mem: '1024,1024'
/opt/platform/puppet/19.01/hieradata/192.168.144.57.yaml:287:vswitch::dpdk::socket_mem: '1024,1024'

2019-05-16T12:36:47.565Z|00001|vlog|INFO|opened log file /var/log/openvswitch/ovs-vswitchd.log
2019-05-16T12:36:47.581Z|00002|ovs_numa|INFO|Discovered 18 CPU cores on NUMA node 0
2019-05-16T12:36:47.581Z|00003|ovs_numa|INFO|Discovered 18 CPU cores on NUMA node 1
2019-05-16T12:36:47.581Z|00004|ovs_numa|INFO|Discovered 2 NUMA nodes and 36 CPU cores
2019-05-16T12:36:47.581Z|00005|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connecting...
2019-05-16T12:36:47.581Z|00006|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connected
2019-05-16T12:36:47.582Z|00007|dpdk|INFO|DPDK Disabled - Use other_config:dpdk-init to enable
2019-05-16T12:36:47.583Z|00008|dpif_netlink|INFO|The kernel module does not support meters.
2019-05-16T12:36:47.588Z|00009|bridge|INFO|ovs-vswitchd (Open vSwitch) 2.11.0
2019-05-16T12:36:47.892Z|00010|dpdk|INFO|Using DPDK 18.11.0
2019-05-16T12:36:47.892Z|00011|dpdk|INFO|DPDK Enabled - initializing...
2019-05-16T12:36:47.892Z|00012|dpdk|INFO|No vhost-sock-dir provided - defaulting to /var/run/openvswitch
2019-05-16T12:36:47.892Z|00013|dpdk|INFO|IOMMU support for vhost-user-client disabled.
2019-05-16T12:36:47.892Z|00014|dpdk|INFO|Per port memory for DPDK devices disabled.
2019-05-16T12:36:47.892Z|00015|dpdk|INFO|EAL ARGS: ovs-vswitchd -n 4 -c 7 --huge-dir /mnt/huge-1048576kB --socket-mem 0,0 --socket-limit 0,0.
2019-05-16T12:36:47.894Z|00016|dpdk|INFO|EAL: Detected 36 lcore(s)
2019-05-16T12:36:47.894Z|00017|dpdk|INFO|EAL: Detected 2 NUMA nodes
2019-05-16T12:36:47.894Z|00018|dpdk|ERR|EAL: invalid parameters for --socket-mem
2019-05-16T12:36:47.894Z|00019|dpdk|ERR|EAL: Invalid 'command line' arguments.
2019-05-16T12:36:47.894Z|00020|dpdk|EMER|Unable to initialize DPDK: Invalid argument
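
The "--socket-mem 0,0" in the EAL args comes directly from the hiera value above. As a sketch (assuming the standard other_config keys that ovs-vswitchd reads for DPDK), the value actually stored in the OVS database on the node can be cross-checked with:
   sudo ovs-vsctl --no-wait get Open_vSwitch . other_config:dpdk-init
   sudo ovs-vsctl --no-wait get Open_vSwitch . other_config:dpdk-socket-mem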

Test Activity
-------------
Developer testing

Revision history for this message
Ghada Khalil (gkhalil) wrote :

The same issue is also reported in: https://bugs.launchpad.net/starlingx/+bug/1829403

tags: added: stx.networking
Changed in starlingx:
assignee: nobody → Forrest Zhao (forrest.zhao)
Changed in starlingx:
assignee: Forrest Zhao (forrest.zhao) → ChenjieXu (midone)
Revision history for this message
ChenjieXu (midone) wrote :

Hi Allain,

Does this bug happen on the same machine across the 6 different installations?

According to your logs, no hugepages are allocated, but OVS-DPDK needs hugepages to start. Could you please run the following commands to check the available hugepages?
   sudo find /sys -name "free_hugepages*"
   sudo find /sys -name "free_hugepages" | xargs -L1 grep -E "^"
   mount | grep -i huge

Could you please try to allocate hugepages with the following commands and then restart ovs-vswitchd?
1. Allocate hugepages on each NUMA node:
   echo 3 | sudo tee /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages
   echo 3 | sudo tee /sys/devices/system/node/node1/hugepages/hugepages-1048576kB/nr_hugepages
   echo 5000 | sudo tee /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
   echo 5000 | sudo tee /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
2. Make sure the hugepages have been allocated by checking free_hugepages:
   sudo cat /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/free_hugepages
   sudo cat /sys/devices/system/node/node1/hugepages/hugepages-1048576kB/free_hugepages
   sudo cat /sys/devices/system/node/node0/hugepages/hugepages-2048kB/free_hugepages
   sudo cat /sys/devices/system/node/node1/hugepages/hugepages-2048kB/free_hugepages
3. Make sure ovsdb-server is running
   systemctl status ovsdb-server
4. Restart ovs-vswitchd:
   systemctl status ovs-vswitchd
   sudo systemctl restart ovs-vswitchd
   systemctl status ovs-vswitchd
5. Please attach the following logs:
   /var/log/openvswitch/ovsdb-server.log
   /var/log/openvswitch/ovs-vswitchd.log
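
For reference, the EAL args above use --huge-dir /mnt/huge-1048576kB, so it is also worth confirming that a hugetlbfs mount for 1G pages exists before the restart (a minimal sketch; the platform normally creates this mount itself):
   mount | grep -i hugetlbfs
   # Only if the 1G mount is missing (manual step, normally not needed):
   sudo mkdir -p /mnt/huge-1048576kB
   sudo mount -t hugetlbfs -o pagesize=1G none /mnt/huge-1048576kB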

Changed in starlingx:
status: New → Incomplete
Revision history for this message
Allain Legacy (alegacy) wrote :

I no longer have access to the system on which this was occurring. I do not recall whether it always occurred on the same node or on different nodes. If you examine the system host-memory-list output that I provided, the system inventory reflects 0 hugepages reserved for vswitch use, and based on the output from /proc/meminfo there are no allocated or reserved 2MB hugepages either.

Revision history for this message
Ghada Khalil (gkhalil) wrote :

Marking as release gating; issue with memory allocation for ovs-dpdk. High priority as the issue has been seen multiple times.

Changed in starlingx:
importance: Undecided → High
tags: added: stx.2.0
Revision history for this message
ChenjieXu (midone) wrote :

Hi Allain,

This bug is the same as https://bugs.launchpad.net/starlingx/+bug/1829403. Both bugs arise because no hugepages are available, but OVS-DPDK needs hugepages to start. According to Peng, this bug has already been seen on different machines.

Peng has an environment we can use for debugging. In addition, I am trying to reproduce this bug.

Revision history for this message
Ghada Khalil (gkhalil) wrote :

Marking as duplicate; we'll track the issue under https://bugs.launchpad.net/starlingx/+bug/1829403

Revision history for this message
Ghada Khalil (gkhalil) wrote :

Duplicate bug is fixed by:
https://review.opendev.org/672634
Merged on 2019-07-29

Changed in starlingx:
status: Incomplete → Fix Released