devstack-exercises floating_ips broken

Bug #1262785 reported by dkehn
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
neutron | Fix Released | Medium | Jakub Libosvar |
Havana | Fix Released | Undecided | Unassigned |

Bug Description

With Q_USE_DEBUG_COMMAND=True set in the devstack localrc, devstack calls setup_neutron_debug, and the resulting namespace-based ping check fails. This usually shows up as:
[Call Trace]
/opt/stack/new/devstack/exercises/volumes.sh:147:ping_check
/opt/stack/new/devstack/functions:1700:_ping_check_neutron
/opt/stack/new/devstack/lib/neutron:886:die
[ERROR] /opt/stack/new/devstack/lib/neutron:886 [Fail] Couldn't ping server
=====================================================================
SKIP boot_from_volume
SKIP client-env
SKIP marconi
SKIP savanna
SKIP trove
PASS aggregates
PASS bundle
PASS client-args
PASS euca
PASS horizon
PASS sec_groups
PASS swift
FAILED floating_ips
FAILED neutron-adv-test
FAILED volumes
=====================================================================

The environment is devstack running locally, with the following localrc:
ubuntu@gate-t1:~/reddwarf/gate-t$ cat /opt/stack/new/devstack/localrc
Q_USE_DEBUG_COMMAND=True
NETWORK_GATEWAY=10.1.0.1
Q_USE_DEBUG_COMMAND=True
Q_PLUGIN=ml2
Q_AGENT=openvswitch
DEST=/opt/stack/new
ACTIVE_TIMEOUT=90
BOOT_TIMEOUT=90
ASSOCIATE_TIMEOUT=60
TERMINATE_TIMEOUT=60
MYSQL_PASSWORD=secret
RABBIT_PASSWORD=secret
ADMIN_PASSWORD=secret
SERVICE_PASSWORD=secret
SERVICE_TOKEN=111222333444
SWIFT_HASH=1234123412341234
ROOTSLEEP=0
ERROR_ON_CLONE=False
ENABLED_SERVICES=g-api,g-reg,key,n-api,n-crt,n-obj,n-cpu,n-sch,horizon,mysql,rabbit,swift,cinder,c-api,c-vol,c-sch,n-cond,neutron,q-svc,q-agt,q-dhcp,q-l3,q-meta
SKIP_EXERCISES=boot_from_volume,client-env
SERVICE_HOST=127.0.0.1
SYSLOG=True
SCREEN_LOGDIR=/opt/stack/new/screen-logs
LOGFILE=/opt/stack/new/devstacklog.txt
VERBOSE=True
FIXED_RANGE=10.1.0.0/24
FIXED_NETWORK_SIZE=256
NETWORK_GATEWAY=10.1.0.1
VIRT_DRIVER=libvirt
SWIFT_REPLICAS=1
export OS_NO_CACHE=True
CINDER_SECURE_DELETE=False
API_RATE_LIMIT=False
VOLUME_BACKING_FILE_SIZE=5G
CINDER_SECURE_DELETE=False

This situation also causes the neutron Grenade (check-grenade-dsvm-neutron) tests to fail, see http://logs.openstack.org/63/61663/2/check/check-grenade-dsvm-neutron/5f49167/.

The failing command in floating_ips.sh:
 check_command='while ! sudo /usr/local/bin/neutron-rootwrap /etc/neutron/rootwrap.conf ip netns exec qprobe-24ae41f0-4135-4c67-a16f-2eb5f4c313ec ping -w 1 -c 1 10.1.0.4; do sleep 1; done'
+ timeout 90 sh -c 'while ! sudo /usr/local/bin/neutron-rootwrap /etc/neutron/rootwrap.conf ip netns exec qprobe-24ae41f0-4135-4c67-a16f-2eb5f4c313ec ping -w 1 -c 1 10.1.0.4; do sleep 1; done'
PING 10.1.0.4 (10.1.0.4) 56(84) bytes of data.

--- 10.1.0.4 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

PING 10.1.0.4 (10.1.0.4) 56(84) bytes of data.
From 10.1.0.2 icmp_seq=1 Destination Host Unreachable

--- 10.1.0.4 ping statistics ---
1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms

After setup_neutron_debug (https://github.com/openstack-dev/devstack/blob/master/stack.sh#L1110) is called (https://github.com/openstack-dev/devstack/blob/master/lib/neutron#L847), the ovs-vsctl show output looks like:
ubuntu@gate-t1:~$ sudo ovs-vsctl show
c6a9fce5-7834-47cd-b92a-8d5a22ba5c87
    Bridge br-ex
        Port "qg-3ac19751-f0"
            Interface "qg-3ac19751-f0"
                type: internal
        Port "tap2d117674-ab"
            Interface "tap2d117674-ab"
                type: internal
        Port br-ex
            Interface br-ex
                type: internal
    Bridge br-int
        Port br-int
            Interface br-int
                type: internal
        Port "tap4fe7e74b-0a"
            tag: 1
            Interface "tap4fe7e74b-0a"
        Port "tap24ae41f0-41"
            tag: 4095
            Interface "tap24ae41f0-41"
                type: internal
        Port "qr-b835f1ef-38"
            tag: 1
            Interface "qr-b835f1ef-38"
                type: internal
        Port "tap23056976-35"
            tag: 1
            Interface "tap23056976-35"
                type: internal
    ovs_version: "1.4.3"

Note that Port "tap24ae41f0-41" has a tag of 4095 when it should be 1, and the 10.1.0.4 IP address, which should be private, is pingable from the host.
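
To spot ports like this without reading the whole listing by eye, the ovs-vsctl show output can be scanned for the 4095 tag. The following is a minimal sketch, not part of devstack or neutron: the function name find_ports_with_tag is hypothetical, and it assumes the output layout shown above (a Port line followed by an optional tag line).

```python
import re


def find_ports_with_tag(ovs_show_output, tag=4095):
    """Return names of ports whose 'tag:' line equals the given tag.

    Hypothetical helper: assumes each 'Port' line is followed by that
    port's optional 'tag: N' line, as in the listing above.
    """
    matches = []
    current_port = None
    for line in ovs_show_output.splitlines():
        stripped = line.strip()
        port = re.match(r'Port "?([^"]+)"?$', stripped)
        if port:
            current_port = port.group(1)
            continue
        tagged = re.match(r'tag: (\d+)$', stripped)
        if tagged and current_port and int(tagged.group(1)) == tag:
            matches.append(current_port)
    return matches


SAMPLE = '''
    Bridge br-int
        Port br-int
            Interface br-int
                type: internal
        Port "tap4fe7e74b-0a"
            tag: 1
            Interface "tap4fe7e74b-0a"
        Port "tap24ae41f0-41"
            tag: 4095
            Interface "tap24ae41f0-41"
                type: internal
'''

print(find_ports_with_tag(SAMPLE))
```

Feeding it the real output of sudo ovs-vsctl show (for example via subprocess) would list every port carrying that tag.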

ubuntu@gate-t1:~$ neutron port-list
+--------------------------------------+------+-------------------+-----------------------------------------------------------------------------------+
| id | name | mac_address | fixed_ips |
+--------------------------------------+------+-------------------+-----------------------------------------------------------------------------------+
| 23056976-352c-4563-9c21-5fd4d5226371 | | fa:16:3e:63:ca:eb | {"subnet_id": "0b9eda0b-fbec-482e-834a-2f2f5cbbf800", "ip_address": "10.1.0.3"} |
| 24ae41f0-4135-4c67-a16f-2eb5f4c313ec | | fa:16:3e:aa:b8:a6 | {"subnet_id": "0b9eda0b-fbec-482e-834a-2f2f5cbbf800", "ip_address": "10.1.0.2"} |
| 2d117674-abe5-4f74-a9e9-5ebc6206a510 | | fa:16:3e:78:49:ab | {"subnet_id": "94e8425c-b6f6-4dbe-970f-dcafd50a1324", "ip_address": "172.24.4.3"} |
| 3ac19751-f063-4bb3-88a0-61374c0b3548 | | fa:16:3e:1b:18:0b | {"subnet_id": "94e8425c-b6f6-4dbe-970f-dcafd50a1324", "ip_address": "172.24.4.2"} |
| 4fe7e74b-0a88-4550-a2ec-f499efda8cb3 | | fa:16:3e:b9:4c:bb | {"subnet_id": "0b9eda0b-fbec-482e-834a-2f2f5cbbf800", "ip_address": "10.1.0.4"} |
| 6d102fae-36b3-4421-b721-b6d231b790ee | | fa:16:3e:dc:48:75 | {"subnet_id": "e7642e63-72c9-44f9-8fc9-34c2f3f9b7b9", "ip_address": "10.10.0.2"} |
| 9af77d01-37fe-4fc2-80eb-1609577176f9 | | fa:16:3e:eb:24:0f | {"subnet_id": "e7642e63-72c9-44f9-8fc9-34c2f3f9b7b9", "ip_address": "10.10.0.4"} |
| 9cbb95d4-c36b-476c-a0ad-fcb21345c230 | | fa:16:3e:ee:93:8a | {"subnet_id": "e7642e63-72c9-44f9-8fc9-34c2f3f9b7b9", "ip_address": "10.10.0.3"} |
| b7189d72-07e1-4497-85e7-cb472fc0c429 | | fa:16:3e:8e:71:2d | {"subnet_id": "0b9eda0b-fbec-482e-834a-2f2f5cbbf800", "ip_address": "10.1.0.5"} |
| b835f1ef-387e-4f0e-aa47-725325e55033 | | fa:16:3e:65:fe:57 | {"subnet_id": "0b9eda0b-fbec-482e-834a-2f2f5cbbf800", "ip_address": "10.1.0.1"} |
+--------------------------------------+------+-------------------+-----------------------------------------------------------------------------------+
ubuntu@gate-t1:~$ neutron port-list -c id -c device_owner
+--------------------------------------+--------------------------+
| id | device_owner |
+--------------------------------------+--------------------------+
| 23056976-352c-4563-9c21-5fd4d5226371 | network:dhcp |
| 24ae41f0-4135-4c67-a16f-2eb5f4c313ec | compute:probe |
| 2d117674-abe5-4f74-a9e9-5ebc6206a510 | compute:probe |
| 3ac19751-f063-4bb3-88a0-61374c0b3548 | network:router_gateway |
| 4fe7e74b-0a88-4550-a2ec-f499efda8cb3 | compute:None |
| 6d102fae-36b3-4421-b721-b6d231b790ee | compute:probe |
| 9af77d01-37fe-4fc2-80eb-1609577176f9 | compute:None |
| 9cbb95d4-c36b-476c-a0ad-fcb21345c230 | network:dhcp |
| b7189d72-07e1-4497-85e7-cb472fc0c429 | compute:None |
| b835f1ef-387e-4f0e-aa47-725325e55033 | network:router_interface |
+--------------------------------------+--------------------------+
ubuntu@gate-t1:~$ ip netns list
qdhcp-a52e7df0-a4e8-4fa2-944c-bc4ad9a9da74
qprobe-6d102fae-36b3-4421-b721-b6d231b790ee
qrouter-920be1f4-87ce-4aba-b4d5-3948ed98fc78
qdhcp-1014c6c7-e061-4a8f-bc0a-b13e9c630dfe
qprobe-24ae41f0-4135-4c67-a16f-2eb5f4c313ec
qprobe-2d117674-abe5-4f74-a9e9-5ebc6206a510
ubuntu@gate-t1:~$

dkehn (dekehn) wrote :

From the ovs-vswitchd.log when creating the probe:
Dec 19 20:46:06|00067|bridge|INFO|created port tap594cd882-56 on bridge br-int
Dec 19 20:46:06|00068|netdev_linux|WARN|/sys/class/net/tap594cd882-56/carrier: open failed: No such file or directory
Dec 19 20:46:06|00069|bridge|WARN|bridge br-int: using default bridge Ethernet address 26:27:38:9f:c0:47
Dec 19 20:46:06|00070|netdev_linux|INFO|ioctl(SIOCGIFHWADDR) on tap21778bfb-ec device failed: No such device
Dec 19 20:46:06|00071|bridge|WARN|bridge br-ex: using default bridge Ethernet address 9a:a5:5e:8d:fd:4f
Dec 19 20:46:06|00072|netdev_linux|WARN|ioctl(SIOCGIFINDEX) on tap21778bfb-ec device failed: No such device
Dec 19 20:46:06|00073|netdev|WARN|failed to get flags for network device tap21778bfb-ec: No such device
Dec 19 20:46:07|00074|netdev_linux|WARN|ethtool command ETHTOOL_GSET on network device tap594cd882-56 failed: No such device
Dec 19 20:46:07|00075|netdev_linux|INFO|ioctl(SIOCGIFHWADDR) on tap594cd882-56 device failed: No such device
Dec 19 20:46:07|00076|netdev_linux|WARN|ioctl(SIOCGIFINDEX) on tap594cd882-56 device failed: No such device
Dec 19 20:46:07|00077|netdev_linux|WARN|tap594cd882-56: linux-sys get stats failed 19
Dec 19 20:46:07|00078|netdev_linux|WARN|ethtool command ETHTOOL_GDRVINFO on network device tap594cd882-56 failed: No such device
Dec 19 20:46:07|00079|netdev_linux|WARN|ethtool command ETHTOOL_GSET on network device tap594cd882-56 failed: No such device
Dec 19 20:46:07|00080|netdev_linux|WARN|ioctl(SIOCGIFINDEX) on tap21778bfb-ec device failed: No such device
Dec 19 20:46:07|00081|netdev_linux|WARN|tap21778bfb-ec: linux-sys get stats failed 19
Dec 19 20:46:07|00082|netdev_linux|WARN|ethtool command ETHTOOL_GDRVINFO on network device tap21778bfb-ec failed: No such device
Dec 19 20:46:07|00083|netdev_linux|WARN|ethtool command ETHTOOL_GSET on network device tap21778bfb-ec failed: No such device
Dec 19 20:46:12|00084|netdev_linux|WARN|ioctl(SIOCGIFINDEX) on tap594cd882-56 device failed: No such device
Dec 19 20:46:12|00085|netdev_linux|WARN|tap594cd882-56: linux-sys get stats failed 19
Dec 19 20:46:12|00086|netdev_linux|WARN|ethtool command ETHTOOL_GDRVINFO on network device tap594cd882-56 failed: No such device
Dec 19 20:46:12|00087|netdev_linux|WARN|ethtool command ETHTOOL_GSET on network device tap594cd882-56 failed: No such device
Dec 19 20:46:12|00088|netdev_linux|WARN|ioctl(SIOCGIFINDEX) on tap21778bfb-ec device failed: No such device
Dec 19 20:46:12|00089|netdev_linux|WARN|tap21778bfb-ec: linux-sys get stats failed 19
Dec 19 20:46:12|00090|netdev_linux|WARN|ethtool command ETHTOOL_GDRVINFO on network device tap21778bfb-ec failed: No such device

dkehn (dekehn) wrote :

Changing Q_PLUGIN from ml2 to openvswitch changes two characteristics of this bug: 1) we do not need https://review.openstack.org/#/c/61663/ to get through the migration issues associated with using ml2, and 2) we no longer see the tag: 4095 associated with an internal VLAN. With Q_PLUGIN=openvswitch all internal ports appear as they should, with "tag: 1". The floating_ips exercise still fails, but volumes and neutron-adv-test PASS.

Changed in neutron:
assignee: nobody → Jakub Libosvar (libosvar)
Amir Sadoughi (amir-sadoughi) wrote :

Looking at ovs_neutron_agent, it tags a port as dead with DEAD_VLAN_TAG (4095) if get_device_details (neutron/plugins/ml2/rpc.RpcCallbacks.get_device_details) fails to return a dict with port_id or admin_state_up is False in that dict. Assuming admin_state_up is True, get_device_details can return prematurely due to various failure conditions. Checking the logs for the warnings around those conditions will hopefully get us closer to the problem.
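
The decision described above can be sketched roughly as follows. This is an illustrative simplification, not the actual ovs_neutron_agent code: the function name choose_vlan_tag and the local_vlan argument are hypothetical, and only the two conditions mentioned in this comment are modelled.

```python
DEAD_VLAN_TAG = 4095  # value quoted in the comment above


def choose_vlan_tag(details, local_vlan):
    """Pick the tag for a port from a get_device_details-style dict.

    Hypothetical helper: the port is treated as dead when the dict has
    no port_id or when admin_state_up is False.
    """
    if "port_id" not in details:
        return DEAD_VLAN_TAG
    if not details.get("admin_state_up", False):
        return DEAD_VLAN_TAG
    return local_vlan


# A reply without port_id (lookup failed early) leads to the dead tag.
print(choose_vlan_tag({"device": "tap24ae41f0-41"}, local_vlan=1))
# A complete reply keeps the network's local VLAN.
print(choose_vlan_tag({"port_id": "24ae41f0", "admin_state_up": True}, local_vlan=1))
```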

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/65774

Changed in neutron:
status: New → In Progress
dkehn (dekehn) wrote :

The bug is moving around. I no longer see the DEAD_VLAN_TAG, but the failure still exists on:
# FIXME (anthony): make xs support security groups
if [ "$VIRT_DRIVER" != "xenserver" -a "$VIRT_DRIVER" != "openvz" ]; then
    # Test we can aren't able to ping our floating ip within ASSOCIATE_TIMEOUT seconds
    ping_check "$PUBLIC_NETWORK_NAME" $FLOATING_IP $ASSOCIATE_TIMEOUT Fail
fi
.
.
-- 172.24.4.4 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.696/0.696/0.696/0.000 ms
+ [[ Fail = \T\r\u\e ]]
+ die 898 '[Fail] Could ping server'
+ local exitcode=1
+ set +o xtrace
[Call Trace]
/opt/stack/new/devstack/exercises/floating_ips.sh:182:ping_check
/opt/stack/new/devstack/functions:1729:_ping_check_neutron
/opt/stack/new/devstack/lib/neutron:898:die
[ERROR] /opt/stack/new/devstack/lib/neutron:898 [Fail] Could ping server
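
The two error messages seen in this report ("Couldn't ping server" earlier, "Could ping server" here) come from the same check being run with different expectations. A rough sketch of that logic, assuming the last ping_check argument is either "True" or "Fail" as the trace suggests; the function name ping_check_result is hypothetical and this is not the devstack implementation:

```python
def ping_check_result(ping_succeeded, expected="True"):
    """Return None when the check passes, else the failure message.

    Hypothetical model of the check: with expected == "True" the server
    must answer; with any other value (e.g. "Fail") it must not.
    """
    if expected == "True":
        if not ping_succeeded:
            return "[Fail] Couldn't ping server"
    elif ping_succeeded:
        return "[Fail] Could ping server"
    return None


# The case in this comment: the floating IP answered although the
# exercise expected it to be unreachable.
print(ping_check_result(ping_succeeded=True, expected="Fail"))
```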

Jakub Libosvar (libosvar) wrote :

Together patch for this bug and with https://review.openstack.org/#/c/44596/4 in nova and https://review.openstack.org/#/c/21946/32 in neutron floating_ips exercise passes.

Jakub Libosvar (libosvar) wrote :

Sorry for the chaotic comment above, let me try again:
When all three patches (https://review.openstack.org/65774, https://review.openstack.org/#/c/44596/4 and https://review.openstack.org/#/c/21946/32) are applied, the floating_ips exercise passes.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/havana)

Fix proposed to branch: stable/havana
Review: https://review.openstack.org/66872

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/65774
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=5d02de7d6350329dc592d3b7c14579531837babb
Submitter: Jenkins
Branch: master

commit 5d02de7d6350329dc592d3b7c14579531837babb
Author: Jakub Libosvar <email address hidden>
Date: Thu Jan 9 20:38:37 2014 +0100

    Add binding:host_id when creating port for probe

    When probe is created it needs to have correct bindings otherwise port
    is marked as dead.

    Change-Id: I64441fbe802aab068c129c9647c7144fcd4c50a1
    Partial-Bug: #1262785
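
As an illustration of what the commit message describes, a port-create request body for a probe might carry binding:host_id alongside the usual port attributes. This is a sketch only and not the merged patch: the function name build_probe_port_body, its parameters, and the exact set of fields are assumptions; only the binding:host_id key and the compute:probe device owner appear in this report.

```python
import socket


def build_probe_port_body(network_id, tenant_id, subnet_ids, host=None):
    """Build a port-create body for a probe (hypothetical helper).

    binding:host_id names the host the port lives on, so the port can
    be bound there instead of being marked as dead.
    """
    host = host or socket.gethostname()
    return {
        "port": {
            "admin_state_up": True,
            "network_id": network_id,
            "tenant_id": tenant_id,
            "device_id": host,
            "device_owner": "compute:probe",
            "fixed_ips": [{"subnet_id": s} for s in subnet_ids],
            "binding:host_id": host,
        }
    }


body = build_probe_port_body(
    "a52e7df0-a4e8-4fa2-944c-bc4ad9a9da74",  # placeholder network id
    "demo-tenant",                           # placeholder tenant id
    ["0b9eda0b-fbec-482e-834a-2f2f5cbbf800"],
    host="gate-t1",
)
print(body["port"]["binding:host_id"])
```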

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/havana)

Reviewed: https://review.openstack.org/66872
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=5bbc2f0cb2f41979f7671138b24167087154d167
Submitter: Jenkins
Branch: stable/havana

commit 5bbc2f0cb2f41979f7671138b24167087154d167
Author: Jakub Libosvar <email address hidden>
Date: Thu Jan 9 20:38:37 2014 +0100

    Add binding:host_id when creating port for probe

    When probe is created it needs to have correct bindings otherwise port
    is marked as dead.

    Change-Id: If542b715756ccfed548ec89a97742fda2ad9b454
    Partial-Bug: #1262785

tags: added: in-stable-havana
Changed in neutron:
status: In Progress → Fix Committed
Changed in neutron:
importance: Undecided → Medium
milestone: none → icehouse-3
Thierry Carrez (ttx)
Changed in neutron:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in neutron:
milestone: icehouse-3 → 2014.1