VM can't send packet through vhostuser port due to missing numa settings in domain xml

Bug #1820378 reported by ChenjieXu on 2019-03-16
This bug affects 1 person
Affects: StarlingX
Importance: High
Assigned to: Chris Friesen

Bug Description

Title
-----
VM can't send packet through vhostuser port

Brief Description
-----------------
A VM can't send packets through its vhostuser port. While booting, the VM contacts the DHCP server to get an IP address. However, the packets never pass through the vhostuser port, so the VM can't get an IP address.

Severity
--------
Critical

Steps to Reproduce
------------------
1. Install Standard 2+2 on 4 bare metals
   Guide: https://wiki.openstack.org/wiki/StarlingX/Containers/Installation
2. Use the scripts from the guide to create the network, subnet and router.
3. Download cirros image on controller-0
   wget http://download.cirros-cloud.net/0.3.4/cirros-0.3.4-x86_64-disk.img
4. Upload cirros image
   openstack image create --container-format bare --disk-format qcow2 --file cirros-0.3.4-x86_64-disk.img cirros
5. Create VM
   openstack server create --image cirros --flavor m1.tiny --network public-net0 vm1
6. Check which host vm1 is running on
   openstack server show vm1
7. On the host where vm1 is running, log in to vm1 and check the IP
   sudo virsh list
   sudo virsh console $num_vm1
   log in to vm1 (user: cirros, password: cubswin:))
   sudo ifconfig

Expected Behavior
------------------
An IP address has been assigned to interface eth0

Actual Behavior
----------------
No IP address assigned to interface eth0

System Configuration
--------------------
System mode: Standard 2+2 on bare metals

Reproducibility
---------------
100%

Branch/Pull Time/Commit
-----------------------
master as of 20190305T060000Z

Timestamp/Logs
--------------
compute-0:/home/wrsroot# virsh list
 Id Name State
-----------------------------------
 18 instance-00000037 running
 21 cirros-ovs-dpdk running
 22 instance-0000003d running

compute-0:/home/wrsroot# virsh console 22
Connected to domain instance-0000003d
Escape character is ^]
Sending discover...
Sending discover...
Usage: /sbin/cirros-dhcpc <up|down>
No lease, failing
WARN: /etc/rc3.d/S40-network failed
cirros-ds 'net' up at 181.29
checking http://169.254.169.254/2009-04-04/instance-id
failed 1/20: up 181.30. request failed
failed 2/20: up 183.45. request failed
failed 3/20: up 185.45. request failed
failed 4/20: up 187.46. request failed
failed 5/20: up 189.46. request failed
failed 6/20: up 191.47. request failed
failed 7/20: up 193.47. request failed
failed 8/20: up 195.48. request failed
failed 9/20: up 197.48. request failed
failed 10/20: up 199.49. request failed
failed 11/20: up 201.49. request failed
failed 12/20: up 203.49. request failed
failed 13/20: up 205.50. request failed
failed 14/20: up 207.50. request failed
failed 15/20: up 209.51. request failed
failed 16/20: up 211.51. request failed
failed 17/20: up 213.52. request failed
failed 18/20: up 215.52. request failed
failed 19/20: up 217.53. request failed
failed 20/20: up 219.53. request failed
failed to read iid from metadata. tried 20
no results found for mode=net. up 221.54. searched: nocloud configdrive ec2
failed to get instance-id of datasource
Starting dropbear sshd: generating rsa key... generating dsa key... OK
=== system information ===
Platform: OpenStack Foundation OpenStack Nova
Container: none
Arch: x86_64
CPU(s): 1 @ 2693.508 MHz
Cores/Sockets/Threads: 1/1/1
Virt-type:
RAM Size: 491MB
Disks:
NAME MAJ:MIN SIZE LABEL MOUNTPOINT
vda 253:0 1073741824
vda1 253:1 1061061120 cirros-rootfs /
=== sshd host keys ===
-----BEGIN SSH HOST KEY KEYS-----
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgwCA7QbmxFnKoleSsaEAIUAio+d3TkrpQMW3pe16n+LhQQRHxnrFzKBzkkeaDD2xEpn49OjwheWJHaxUq6pDQ5pedneLb7MJR/aduum8mGFp3jVKrK+z7KX9iUp3m2oNw0ijnUzxME4nT17xQ99XZZ90rk0+M5Kp7H/JqKpYjZEq85lf root@cirros
ssh-dss AAAAB3NzaC1kc3MAAACBAJb8vo+8XFfUaBqaAXZF5ENkHtFZIOcQYq8TbQm0HXvWnuHd6Ur2MX4opaUnk6ar9t/evjZJfq8A750q6NvOC6zCiMzhp6huqBpCfrUNMHVkdGtHEDtJRbFhLJbjL/57lVwxH+UukWKjtuPX7BTdP0NgT4VEbvGo0nfqGycbxXnnAAAAFQCy7dP1nLBU/jMIQ64OH7gvVcQ91QAAAIBMHGh9cfXdGRdwXuyA8JmzgYIzDzm/L96+RYl8ARJs9u7pp8ZV4kt2zF6zHcy+i3slXv82BoKv5G0mBgFypZE1kUonNn4U33ecYt7xALI5mzZinCk/sZd0dYkAwvogFQuXQHfYiozXNZtousZRB6x24oBgAPteM7Q/c9ckVOhtOQAAAIAVCrPaBucLRNeG7tst2OLGXwFacFq3M4dD/i5Tf9xJJCW26K3ZCoc/UkkdZv5TFWtXnDGCtcl+0t5JAanbZ90MosKvfo/QZEBEGXczh54evSoidgUKXwXBpGOp6jS2mOeIwvsTcMVdhuFi2/k5otbvoR6f/Rk2y3CiVHIo3nUkXw== root@cirros
-----END SSH HOST KEY KEYS-----
=== network info ===
if-info: lo,up,127.0.0.1,8,::1
if-info: eth0,up,,8,fe80::f816:3eff:fe54:33ca
=== datasource: None None ===
=== cirros: current=0.3.4 uptime=221.75 ===
route: fscanf
=== pinging gateway failed, debugging connection ===
############ debug start ##############
### /etc/init.d/sshd start
Starting dropbear sshd: OK
route: fscanf
### ifconfig -a
eth0 Link encap:Ethernet HWaddr FA:16:3E:54:33:CA
          inet6 addr: fe80::f816:3eff:fe54:33ca/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B) TX bytes:1132 (1.1 KiB)

lo Link encap:Local Loopback
          inet addr:127.0.0.1 Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING MTU:16436 Metric:1
          RX packets:12 errors:0 dropped:0 overruns:0 frame:0
          TX packets:12 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:1020 (1020.0 B) TX bytes:1020 (1020.0 B)

### route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
route: fscanf
### cat /etc/resolv.conf
cat: can't open '/etc/resolv.conf': No such file or directory
### gateway not found
/sbin/cirros-status: line 1: can't open /etc/resolv.conf: no such file
### pinging nameservers
### uname -a
Linux cirros 3.2.0-80-virtual #116-Ubuntu SMP Mon Mar 23 17:28:52 UTC 2015 x86_64 GNU/Linux
### lsmod
Module Size Used by Not tainted
nls_iso8859_1 12713 0
nls_cp437 16991 0
vfat 17585 0
fat 61512 1 vfat
isofs 40259 0
ip_tables 27473 0
x_tables 29891 1 ip_tables
pcnet32 42119 0
8139cp 27360 0
ne2k_pci 13691 0
8390 18856 1 ne2k_pci
e1000 108589 0
acpiphp 24231 0
### dmesg | tail
[ 0.970394] acpiphp: Slot [30] registered
[ 0.970405] acpiphp: Slot [31] registered
[ 0.979866] e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k8-NAPI
[ 0.979868] e1000: Copyright (c) 1999-2006 Intel Corporation.
[ 0.988386] ne2k-pci.c:v1.03 9/22/2003 D. Becker/P. Gortmaker
[ 0.994725] 8139cp: 8139cp: 10/100 PCI Ethernet driver v1.3 (Mar 22, 2004)
[ 0.999723] pcnet32: pcnet32.c:v1.35 21.Apr.2008 <email address hidden>
[ 1.006790] ip_tables: (C) 2000-2006 Netfilter Core Team
[ 1.364080] Refined TSC clocksource calibration: 2693.563 MHz.
[ 11.344086] eth0: no IPv6 routers present
### tail -n 25 /var/log/messages
Mar 16 17:47:28 cirros kern.info kernel: [ 0.000000] Centaur CentaurHauls
Mar 16 17:47:28 cirros kern.info kernel: [ 0.000000] BIOS-provided physical RAM map:
Mar 16 17:47:28 cirros kern.info kernel: [ 0.000000] BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
Mar 16 17:47:28 cirros kern.info kernel: [ 0.900022] usb 1-1: new full-speed USB device number 2 using uhci_hcd
Mar 16 17:47:28 cirros kern.info kernel: [ 0.927882] EXT3-fs (vda1): using internal journal
Mar 16 17:47:28 cirros kern.info kernel: [ 0.970055] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
Mar 16 17:47:28 cirros kern.info kernel: [ 0.970116] acpiphp: Slot [3] registered
Mar 16 17:47:28 cirros kern.info kernel: [ 0.970127] acpiphp: Slot [4] registered
Mar 16 17:47:28 cirros kern.info kernel: [ 0.970138] acpiphp: Slot [5] registered
Mar 16 17:47:28 cirros kern.info kernel: [ 0.970148] acpiphp: Slot [6] registered
Mar 16 17:47:28 cirros kern.info kernel: [ 0.970158] acpiphp: Slot [7] registered
Mar 16 17:47:28 cirros kern.info kernel: [ 0.970168] acpiphp: Slot [8] registered
Mar 16 17:47:28 cirros kern.info kernel: [ 0.970178] acpiphp: Slot [9] registered
Mar 16 17:47:28 cirros kern.info kernel: [ 0.970188] acpiphp: Slot [10] registered
Mar 16 17:47:28 cirros kern.info kernel: [ 0.970198] acpiphp: Slot [11] registered
Mar 16 17:47:28 cirros kern.info kernel: [ 0.970209] acpiphp: Slot [12] registered
Mar 16 17:47:28 cirros kern.info kernel: [ 0.979866] e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k8-NAPI
Mar 16 17:47:28 cirros kern.info kernel: [ 0.979868] e1000: Copyright (c) 1999-2006 Intel Corporation.
Mar 16 17:47:28 cirros kern.info kernel: [ 0.988386] ne2k-pci.c:v1.03 9/22/2003 D. Becker/P. Gortmaker
Mar 16 17:47:28 cirros kern.info kernel: [ 0.994725] 8139cp: 8139cp: 10/100 PCI Ethernet driver v1.3 (Mar 22, 2004)
Mar 16 17:47:28 cirros kern.info kernel: [ 0.999723] pcnet32: pcnet32.c:v1.35 21.Apr.2008 <email address hidden>
Mar 16 17:47:28 cirros kern.info kernel: [ 1.006790] ip_tables: (C) 2000-2006 Netfilter Core Team
Mar 16 17:47:29 cirros kern.info kernel: [ 1.364080] Refined TSC clocksource calibration: 2693.563 MHz.
Mar 16 17:47:39 cirros kern.debug kernel: [ 11.344086] eth0: no IPv6 routers present
Mar 16 17:51:09 cirros authpriv.info dropbear[301]: Running in background
############ debug end ##############
  ____ ____ ____
 / __/ __ ____ ____ / __ \/ __/
/ /__ / // __// __// /_/ /\ \
\___//_//_/ /_/ \____/___/
   http://cirros-cloud.net

login as 'cirros' user. default password: 'cubswin:)'. use 'sudo' for root.
cirros login: cirros
Password:
$ ifconfig
eth0 Link encap:Ethernet HWaddr FA:16:3E:54:33:CA
          inet6 addr: fe80::f816:3eff:fe54:33ca/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B) TX bytes:1132 (1.1 KiB)

lo Link encap:Local Loopback
          inet addr:127.0.0.1 Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING MTU:16436 Metric:1
          RX packets:12 errors:0 dropped:0 overruns:0 frame:0
          TX packets:12 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:1020 (1020.0 B) TX bytes:1020 (1020.0 B)

$
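
The check in step 7 can be scripted. The following is a minimal sketch, assuming classic net-tools/busybox `ifconfig` output like the log above; the function name `iface_has_ipv4` is hypothetical and not part of StarlingX or cirros.

```shell
# Hypothetical helper: succeeds if the named interface has an IPv4 address
# in classic net-tools/busybox `ifconfig` output read from stdin.
iface_has_ipv4() {
    awk -v ifc="$1" '
        /^[^ \t]/ { cur = $1 }                     # a stanza starts at column 0 with the interface name
        cur == ifc && /inet addr:/ { found = 1 }   # IPv4 line; "inet6 addr:" does not match
        END { exit (found ? 0 : 1) }
    '
}
```

Inside the guest it could be used as `ifconfig | iface_has_ipv4 eth0 || echo "eth0 has no IPv4 address"`.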

ChenjieXu (midone) wrote :

We manually assigned the correct IP address to vm1 and pinged the DHCP server from inside vm1. The ping failed, and tcpdump captured no packets.

1. Check the IP assigned to the VM:
   openstack server show vm1
2. Manually assign the IP to the VM:
   sudo ifconfig eth0 192.168.101.108/24 up
3. Check that the DHCP namespace exists and find the DHCP server address:
   sudo ip netns
   sudo ip netns exec $dhcp_namespace ifconfig
4. Run tcpdump in the DHCP namespace:
   sudo ip netns exec $dhcp_namespace tcpdump -i $tap_device
5. Ping the DHCP server from the VM:
   ping 192.168.101.2

Based on the above analysis, the vhostuser port doesn't work.

ChenjieXu (midone) wrote :

We added our own vhostuser port to br-int and used the virsh command with cirros-dpdk-vhostuserclient.xml to create a VM. The manually created vhostuser port works.

1. Create the vhostuser port:
   ovs-vsctl add-port br-int vhost-user-1 -- set Interface vhost-user-1 type=dpdkvhostuserclient \
      options:vhost-server-path=/var/run/openvswitch/vhost-user-1
2. Download the image on controller-0 and scp it to compute-0:
   wget http://download.cirros-cloud.net/0.3.4/cirros-0.3.4-x86_64-disk.img
   scp cirros-0.3.4-x86_64-disk.img wrsroot@compute-0:~/
3. Copy the image to /var/lib/nova/instances for creating the VM:
   cp cirros-0.3.4-x86_64-disk.img /var/lib/nova/instances/
4. Bring up br-int to verify connectivity:
   sudo ifconfig br-int 192.168.101.1/24 up
5. Create the VM:
   virsh create cirros-dpdk-vhostuserclient.xml
6. Manually assign an IP to the VM:
   sudo ifconfig eth0 192.168.101.18/24 up
7. In the created VM, ping br-int:
   ping 192.168.101.1

The ping succeeds.

ChenjieXu (midone) wrote :

A Nova expert needs to investigate this bug, because the automatically created vhostuser port is created by stx-nova.

OVS-DPDK itself should be fine, because the manually created vhostuser port works.

ChenjieXu (midone) wrote :

Based on our further investigation, the domain XML file (e.g. cirros-dpdk-vhostuserclient.xml, starlingx_created_vm.xml) is what causes this bug: the domain XML lacks the NUMA-related sections. The vhostuser port created by StarlingX should be fine.

A Nova expert needs to investigate this bug, because the domain XML file is created by stx-nova.
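
One way to check whether a dumped domain XML carries the NUMA-related elements discussed here is sketched below. The helper name `missing_numa_sections` is hypothetical, and the element list (numatune, memnode, numa, cell) is taken from the sections named in this report rather than from any official checklist.

```shell
# Hypothetical helper: print the NUMA-related elements missing from a libvirt
# domain XML file; prints nothing when all of them are present.
missing_numa_sections() {
    xml_file="$1"
    for tag in numatune memnode numa cell; do
        # Match "<tag>", "<tag/>" or "<tag attr=...>", but not a longer tag name
        grep -Eq "<${tag}([[:space:]>/])" "$xml_file" || echo "$tag"
    done
}
```

For example, after `virsh dumpxml $num_vm1 > vm1.xml`, running `missing_numa_sections vm1.xml` lists whichever of these elements are absent.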

ChenjieXu (midone) wrote :

Our debugging steps:
1. Create vm1 on controller-0
   export OS_CLOUD=openstack_helm
   wget http://download.cirros-cloud.net/0.3.4/cirros-0.3.4-x86_64-disk.img
   openstack image create --container-format bare --disk-format qcow2 --file cirros-0.3.4-x86_64-disk.img cirros
   openstack server create --image cirros --flavor m1.tiny --network public-net0 vm1
2. Dump the domain XML on compute-0 (assuming vm1 runs on compute-0)
   sudo bash
   virsh list
   virsh dumpxml $num_vm1 > vm1.xml
3. Add the NUMA-related sections to vm1.xml
   Add numatune after <cputune>...</cputune>:
      <numatune>
       <memnode cellid='0' mode='strict' nodeset='0'/>
      </numatune>
   Add numa inside <cpu></cpu> as follows:
      <cpu>
       <topology sockets='1' cores='1' threads='1'/>
       <numa>
         <cell id='0' cpus='0' memory='524288' unit='KiB' memAccess='shared'/>
       </numa>
      </cpu>
4. Delete and change some sections. Otherwise, the manually created VM will be deleted by StarlingX.
   The uuid and metadata sections should be deleted:
      <uuid>c57a4100-b329-4e81-ab1d-9d68c9f6d67b</uuid>
      <metadata>
       <nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.0">
         <nova:package version="18.1.0"/>
         <nova:name>vm1</nova:name>
         <nova:creationTime>2019-03-18 14:44:22</nova:creationTime>
         <nova:flavor name="m1.tiny">
           <nova:memory>512</nova:memory>
           <nova:disk>1</nova:disk>
           <nova:swap>0</nova:swap>
           <nova:ephemeral>0</nova:ephemeral>
           <nova:vcpus>1</nova:vcpus>
         </nova:flavor>
         <nova:owner>
           <nova:user uuid="bce892a00893431c867e4c927818fc6a">admin</nova:user>
           <nova:project uuid="6ca2bc8fbc3c4cb6abd64c623eafd5c4">admin</nova:project>
         </nova:owner>
         <nova:root type="image" uuid="b8e9c426-3480-48f2-afbe-5fc2ab592095"/>
       </nova:instance>
      </metadata>

   Change the name in name section
      <nova:name>instance-00000001</nova:name>

   Change the uuid in sysinfo section
      <entry name='serial'>79911507-76fd-43b0-b8b1-7696f78dbbe5</entry>
      <entry name='uuid'>79911507-76fd-43b0-b8b1-7696f78dbbe5</entry>
5. Create vm1 using vm1.xml
   virsh create vm1.xml
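
The deletions in step 4 can be sketched with sed. This is only an illustration: the function name `strip_nova_identity` is hypothetical, and it assumes the `<uuid>` element and the `<metadata>` open/close tags each sit on their own line, as in the dump quoted above. The name and sysinfo changes still have to be made by hand.

```shell
# Hypothetical helper: drop the top-level <uuid> line and the whole
# <metadata>...</metadata> block from a `virsh dumpxml` file and print the rest.
# Assumes each of these tags starts on its own line.
strip_nova_identity() {
    sed -e '/^[[:space:]]*<uuid>.*<\/uuid>[[:space:]]*$/d' \
        -e '/^[[:space:]]*<metadata>/,/<\/metadata>/d' "$1"
}
```

Usage would look like `strip_nova_identity vm1.xml > vm1-manual.xml`, where vm1-manual.xml is an arbitrary output name. The `<entry name='uuid'>` line in the sysinfo section is left untouched because it does not start with `<uuid>`.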

ChenjieXu (midone) wrote :
ChenjieXu (midone) on 2019-03-18
summary: - vhostuser port on br-int doesn't work
+ Lacking numa related sections in domain xml files
ChenjieXu (midone) on 2019-03-18
description: updated
summary: - Lacking numa related sections in domain xml files
+ VM can't send packet through vhostuser port

The numa related sections are present in the Wind River test labs, please see attached for full XML of a running instance.

OS="centos"
SW_VERSION="19.01"
BUILD_TARGET="Host Installer"
BUILD_TYPE="Formal"
BUILD_ID="20190316T013000Z"

JOB="STX_build_master_master"
<email address hidden>"
BUILD_NUMBER="21"
BUILD_HOST="starlingx_mirror"
BUILD_DATE="2019-03-16 01:30:00 +0000"

Matt Peters (mpeters-wrs) wrote :
Ghada Khalil (gkhalil) on 2019-03-18
tags: added: stx.networking
Matt Peters (mpeters-wrs) wrote :

Can you confirm that NUMA is enabled in the BIOS for the test system?

ChenjieXu (midone) wrote :

Hi Matt,

I checked the BIOS, and the only NUMA setting I found is "NUMA Optimized Enabled".

Juan Pablo Gomez (jpgomez) wrote :

This issue was also reproduced on Duplex bare metal.

Ricardo Perez (richomx) wrote :

This issue is reproducible using 2+2 Bare Metal.

Although Horizon shows the VM with an IP assigned, if you open the VM console you will see that eth0 has no IP assigned.

Even if you assign the IP manually and try to ping, it doesn't work either.

Ghada Khalil (gkhalil) wrote :

Which server model(s) are used? Are these wolfpass servers with skylake processors?

Ghada Khalil (gkhalil) on 2019-04-01
summary: - VM can't send packet through vhostuser port
+ VM can't send packet through vhostuser port due to missing numa settings
+ in VM xml
summary: VM can't send packet through vhostuser port due to missing numa settings
- in VM xml
+ in domain xml
Chris Winnicki (chriswinnicki) wrote :

For clarification and completeness: at Wind River, the wolfpass BIOS memory settings have "NUMA Optimized" enabled, as shown below:

Enter Setup
 > Advanced
  > Memory Configuration
   > Memory RAS and Performance Configuration
      NUMA Optimized <Enabled>

 If enabled, the BIOS
 includes ACPI tables
 that are required for
 NUMA-aware Operating
 Systems.

ChenjieXu (midone) wrote :

Hi Ghada,

The server model of both compute-0 and compute-1 is:
Manufacturer: Intel Corporation
       Product Name: S2600JF

Compute-0 and compute-1 don’t have skylake processors. Both use:
Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz

Ghada Khalil (gkhalil) on 2019-04-02
Changed in starlingx:
importance: Undecided → High
Chris Friesen (cbf123) wrote :

Just to clarify, guest instances will not have a numa topology by default. They will only get a numa topology if something is specifically requested that requires one (hugepages, cpu pinning, pci devices, etc.)

With the default m1.tiny flavor there is nothing that explicitly requests a numa topology, and so I'm not surprised that the instance doesn't have one.

In the WindRiver labs we generally default to specifying hugepages in the flavor extra-specs and so that's why our instance XML had the numa topology.

From talking with Matt it's not the fact that the instance doesn't have a numa topology that is the source of the problem, but rather the fact that it doesn't have hugepages since apparently OVS-DPDK requires hugepage-backed memory.
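
To confirm whether a given instance actually has hugepage-backed memory, the dumped domain XML can be inspected for a `<hugepages>` element inside `<memoryBacking>`. The following is a minimal sketch; the function name `domain_uses_hugepages` is hypothetical, and it assumes the usual pretty-printed `virsh dumpxml` layout.

```shell
# Hypothetical helper: succeeds if a libvirt domain XML file requests
# hugepage-backed guest memory (<memoryBacking><hugepages>...).
domain_uses_hugepages() {
    # Print only the lines between <memoryBacking> and </memoryBacking>,
    # then look for a <hugepages element among them.
    sed -n '/<memoryBacking>/,/<\/memoryBacking>/p' "$1" | grep -q '<hugepages'
}
```

For example: `virsh dumpxml $num_vm1 > vm1.xml; domain_uses_hugepages vm1.xml || echo "no hugepages requested"`.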

Matt Peters (mpeters-wrs) wrote :

The guests must be configured to use a flavor that has the property hw:mem_page_size=large set.

You can follow this link to read more about the requirements on the guests for OVS-DPDK:
https://docs.openstack.org/neutron/rocky/admin/config-ovs-dpdk.html

Excerpt:
“vhost-user requires file descriptor-backed shared memory. Currently, the only way to request this is by requesting large pages. This is why instances spawned on hosts with OVS-DPDK must request large pages”.
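
A flavor can be checked for this property before booting a guest. The sketch below only parses a properties string from stdin; the function name `flavor_requests_large_pages` is hypothetical, and because the exact text printed by the openstack client differs between versions, the pattern is an assumption that accepts both the `key='value'` form and the dict-style form.

```shell
# Hypothetical helper: succeeds if a flavor "properties" string read from
# stdin sets hw:mem_page_size to large.
# Accepts e.g.  hw:mem_page_size='large'  or  {'hw:mem_page_size': 'large'}
flavor_requests_large_pages() {
    grep -Eq "hw:mem_page_size['\"]?[[:space:]]*[=:][[:space:]]*['\"]?large"
}
```

It could be fed from the CLI as `openstack flavor show m1.tiny -f value -c properties | flavor_requests_large_pages || echo "flavor does not request large pages"`.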

Ghada Khalil (gkhalil) on 2019-04-04
Changed in starlingx:
assignee: nobody → Chris Friesen (cbf123)
Ghada Khalil (gkhalil) wrote :

Waiting for Chenjie to retest with a different VM flavor

Ghada Khalil (gkhalil) on 2019-04-05
Changed in starlingx:
status: New → In Progress
Ghada Khalil (gkhalil) on 2019-04-05
tags: added: stx.2.0
ChenjieXu (midone) wrote :

Hi all,

After setting the property "hw:mem_page_size=large" on the flavor, the newly created VM can get an IP from DHCP and ping another VM successfully. The NUMA-related sections are also present in the domain XML file (the new domain XML, mem_page_size.xml, is attached). My steps are listed below:

1. On the active controller:
   export OS_CLOUD=openstack_helm
   openstack flavor create --ram 512 --disk 1 --vcpus 1 my_tiny
   openstack flavor list
   openstack flavor set $UUID_my_tiny --property hw:mem_page_size=large
   openstack server create --image cirros --flavor my_tiny --network public-net0 vm3
   openstack server create --image cirros --flavor my_tiny --network public-net0 vm4
   openstack server list
   openstack server show vm3
2. SSH to the host where vm3 is running and execute the following commands:
   sudo bash
   virsh list
   virsh console $num_vm3
   log in to vm3
   ifconfig
   ping $IP_vm4

ChenjieXu (midone) wrote :

Maybe we should modify the installation guide to explain how to create a VM in each environment (OVS/OVS-DPDK), because the property "hw:mem_page_size=large" must be set on the flavor when OVS-DPDK is used, while it is not needed when OVS is used.

Ghada Khalil (gkhalil) wrote :

I added the following note in the container installation guide[0]:
IMPORTANT: When deploying OVS-DPDK, VMs must be configured to use a flavor with property: hw:mem_page_size=large

[0] https://wiki.openstack.org/wiki/StarlingX/Containers/Installation#Configure_the_vswitch_type_.28optional.29

I'm closing this bug as Invalid given this was a procedural error as opposed to a software error.

Changed in starlingx:
status: In Progress → Invalid