VXLAN not enabled on StarlingX with containers

Bug #1821135 reported by ChenjieXu
Affects     Status    Importance   Assigned to   Milestone
StarlingX   Invalid   Low          ChenjieXu     -

Bug Description

Title
-----
VXLAN not enabled on StarlingX with containers

Brief Description
-----------------
After setting up a VXLAN data network, assigning an IP address to the data interface, creating a VXLAN tenant network, and booting VMs on that network, a VM cannot ping another VM on a different host. The tunneling_ip of all OVS agents is the same value, "172.17.0.1", which is the IP address of the docker0 interface; it should instead be the IP address assigned to the data interface. As a result, the tunnel port on br-tun is not created and VXLAN traffic cannot reach the other host.
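
A quick way to confirm the symptom is to compare each OVS agent's reported tunneling_ip against the address configured on the data interface. A minimal sketch, reusing the neutron CLI commands referenced later in this report:
   On active controller:
      export OS_CLOUD=openstack_helm
      # list the agents and note the IDs of the "Open vSwitch agent" entries
      neutron agent-list
      # in the configurations section, "tunneling_ip" should be the data-interface
      # address (e.g. 192.168.100.30 on compute-0), not 172.17.0.1 (docker0)
      neutron agent-show <ovs-agent-id>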

Severity
--------
Critical

Steps to Reproduce
------------------
1. On active controller:
   source /etc/platform/openrc
   system host-lock compute-0
   system host-lock compute-1
   system datanetwork-add tenant_vxlan vxlan --multicast_group 224.0.0.1 --ttl 255 --port_num 4789
   system host-if-list -a compute-0
   system host-if-list -a compute-1
   system host-if-modify -m 1500 -n data0 -d tenant_vxlan -c data compute-0 ${DATA0IFUUID}
   system host-if-modify -m 1500 -n data0 -d tenant_vxlan -c data compute-1 ${DATA0IFUUID}
   system host-if-modify --ipv4-mode static compute-0 ${DATA0IFUUID}
   system host-if-modify --ipv4-mode static compute-1 ${DATA0IFUUID}
   system host-addr-add compute-0 ${DATA0IFUUID} 192.168.100.30 24
   system host-addr-add compute-1 ${DATA0IFUUID} 192.168.100.40 24
   system host-unlock compute-0
   system host-unlock compute-1

2. After compute-0 and compute-1 have rebooted, on the active controller:
   export OS_CLOUD=openstack_helm
   ADMINID=`openstack project list | grep admin | awk '{print $2}'`
   openstack network segment range create tenant-vxlan-range --network-type vxlan --minimum 400 --maximum 499 --private --project ${ADMINID}
   neutron net-create --tenant-id ${ADMINID} --provider:network_type=vxlan net1
   neutron subnet-create --tenant-id ${ADMINID} --name subnet1 net1 192.168.101.0/24
   openstack server create --image cirros --flavor m1.tiny --network net1 vm1
   openstack server create --image cirros --flavor m1.tiny --network net1 vm2
   Ensure vm1 and vm2 are scheduled on different hosts (a quick check is sketched below).
   Ping vm2 from vm1.
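
One way to check the placement (a sketch; the OS-EXT-SRV-ATTR:host column is only visible with admin credentials, and the availability-zone host targeting is optional):
      openstack server show vm1 -c OS-EXT-SRV-ATTR:host -c status
      openstack server show vm2 -c OS-EXT-SRV-ATTR:host -c status
      # if both VMs land on the same compute, recreate one of them pinned to the
      # other host, e.g.:
      openstack server create --image cirros --flavor m1.tiny --network net1 \
          --availability-zone nova:compute-1 vm2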

Expected Behavior
------------------
vm1 can ping vm2 successfully.

Actual Behavior
----------------
vm1 cannot ping vm2.

System Configuration
--------------------
System mode: Standard 2+2 on bare metal

Reproducibility
---------------
100%

Branch/Pull Time/Commit
-----------------------
0306 ISO Image built for OVS DPDK Upgrade

Timestamp/Logs
--------------
+---------------------+-------------------------------------------------------+
| Field | Value |
+---------------------+-------------------------------------------------------+
| admin_state_up | True |
| agent_type | Open vSwitch agent |
| alive | True |
| availability_zone | |
| binary | neutron-openvswitch-agent |
| configurations | { |
| | "integration_bridge": "br-int", |
| | "ovs_hybrid_plug": false, |
| | "in_distributed_mode": false, |
| | "datapath_type": "netdev", |
| | "arp_responder_enabled": true, |
| | "resource_provider_inventory_defaults": { |
| | "min_unit": 1, |
| | "allocation_ratio": 1.0, |
| | "step_size": 1, |
| | "reserved": 0 |
| | }, |
| | "vhostuser_socket_dir": "/var/run/openvswitch", |
| | "resource_provider_bandwidths": {}, |
| | "devices": 5, |
| | "ovs_capabilities": { |
| | "datapath_types": [ |
| | "netdev", |
| | "system" |
| | ], |
| | "iface_types": [ |
| | "dpdk", |
| | "dpdkr", |
| | "dpdkvhostuser", |
| | "dpdkvhostuserclient", |
| | "erspan", |
| | "geneve", |
| | "gre", |
| | "internal", |
| | "ip6erspan", |
| | "ip6gre", |
| | "lisp", |
| | "patch", |
| | "stt", |
| | "system", |
| | "tap", |
| | "vxlan" |
| | ] |
| | }, |
| | "extensions": [], |
| | "l2_population": true, |
| | "tunnel_types": [ |
| | "vxlan" |
| | ], |
| | "log_agent_heartbeats": false, |
| | "enable_distributed_routing": false, |
| | "bridge_mappings": { |
| | "physnet0": "br-phy0" |
| | }, |
| | "tunneling_ip": "172.17.0.1" |
| | } |
| created_at | 2019-03-15 21:13:41 |
| description | |
| heartbeat_timestamp | 2019-03-20 21:25:16 |
| host | compute-0 |
| id | ec3192c9-224a-4c23-8c55-d30ad871a2d3 |
| started_at | 2019-03-20 16:15:27 |
| topic | N/A |
+---------------------+-------------------------------------------------------+

 Bridge br-tun
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        Port patch-int
            Interface patch-int
                type: patch
                options: {peer=patch-tun}
        Port br-tun
            Interface br-tun
                type: internal
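
For comparison, on a working setup br-tun also carries a VXLAN tunnel port toward the peer compute. The port below is illustrative only (the neutron OVS agent names tunnel ports after the hex-encoded remote IP), using the 192.168.100.x addressing from the steps above:
        Port "vxlan-c0a86428"
            Interface "vxlan-c0a86428"
                type: vxlan
                options: {df_default="true", in_key=flow, local_ip="192.168.100.30", out_key=flow, remote_ip="192.168.100.40"}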

Last time install passed
------------------------
n/a

ChenjieXu (midone)
description: updated
Revision history for this message
ChenjieXu (midone) wrote :

The tunnel interface is hard-coded as docker0. You can find the code with the following commands:
   On active controller:
      export OS_CLOUD=openstack_helm
      kubectl -n openstack edit cm neutron-bin

The code is listed below:
    tunnel_interface="docker0"
    if [ -z "${tunnel_interface}" ] ; then
        # search for interface with default routing
        # If there is not default gateway, exit
        tunnel_interface=$(ip -4 route list 0/0 | awk -F 'dev' '{ print $2; exit }' | awk '{ print $1 }') || exit 1
    fi

Revision history for this message
ChenjieXu (midone) wrote :

By changing docker0 to br-phy0, VXLAN can be enabled. However, this workaround requires that the data-plane bridges ("br-phyX") on the different compute nodes share the same name. For example:
   compute-0   compute-1
   br-phy0     br-phy0     Works
   br-phy0     br-phy1     Does not work (br-phy0 on compute-1 may not exist or may not have an IP address)
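
A minimal sketch of that edit in the neutron-bin configmap script quoted above (the bridge name is specific to this lab; a more general fix would derive the interface from the host's data network configuration instead of hard-coding either name):
    tunnel_interface="br-phy0"   # was "docker0"; must name the data-plane bridge on every compute
    if [ -z "${tunnel_interface}" ] ; then
        # search for interface with default routing
        # If there is no default gateway, exit
        tunnel_interface=$(ip -4 route list 0/0 | awk -F 'dev' '{ print $2; exit }' | awk '{ print $1 }') || exit 1
    fi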

Revision history for this message
Matt Peters (mpeters-wrs) wrote :

Do you see the local_ip set correctly within the openstack helm overrides?

source /etc/platform/openrc
system helm-override-show neutron openstack
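
If the override is present, the rendered agent configuration should carry the same address. A sketch of one way to check inside the running ovs-agent pod (the label selector and config path follow common openstack-helm conventions and may differ on a given build):
   kubectl -n openstack get pods -l application=neutron,component=neutron-ovs-agent -o wide
   kubectl -n openstack exec <neutron-ovs-agent-pod> -- grep -r local_ip /etc/neutron/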

Revision history for this message
Matt Peters (mpeters-wrs) wrote :

Sample neutron helm overrides from a VxLAN system in Wind River are attached.

Ghada Khalil (gkhalil)
tags: added: stx.networking
Revision history for this message
ChenjieXu (midone) wrote :

Hi Matt,

I tested the VXLAN tenant network again and this time everything works fine:
   The VMs on different hosts can ping each other.
   The tunnel port on br-tun has been created.
   local_ip has been set in the OVS agent's openvswitch_agent.ini.
   The tunneling_ip shown by "neutron agent-show $ovsagent" is correct.
   local_ip has been set correctly within the openstack helm overrides.

It seems this bug doesn't occur 100% of the time, or maybe I missed some steps previously. I think we can wait and see whether Elio can reproduce this bug.

Revision history for this message
Ricardo Perez (richomx) wrote :

This issue is reproducible using 2+2 and Duplex bare metal configurations.

We also observed the following behavior:

* Although Horizon reports the VMs as created properly and shows an assigned IP, the VM console can't be opened via Horizon.

* Once we were able to open the console, using two different methods (virsh console and port forwarding/tunneling), we found that the VMs don't have an IP assigned to the eth0 interface. I believe this issue is already described here: https://bugs.launchpad.net/starlingx/+bug/1820378

* We manually assigned an IP to each of the VMs and tried the ping without success.

Revision history for this message
Ghada Khalil (gkhalil) wrote :

Ricardo/Elio, please re-test the vxlan config using the proper VM flavor ("hw:mem_page_size=large") as described in https://bugs.launchpad.net/starlingx/+bug/1820378
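
For reference, a flavor carrying that property can be created roughly as follows (the flavor name and sizing are illustrative):
   openstack flavor create m1.tiny.hugepage --ram 512 --disk 1 --vcpus 1 \
       --property hw:mem_page_size=large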

Changed in starlingx:
status: New → Incomplete
Revision history for this message
Ghada Khalil (gkhalil) wrote :

Marking as Invalid. Issue was not reproduced by Ricardo/Elio in a month. They have concluded their ovs-dpdk testing and did not report this issue in their final report.

Changed in starlingx:
importance: Undecided → Low
assignee: nobody → ChenjieXu (midone)
status: Incomplete → Invalid