Comment 1 for bug 1771480

Gavin Guo (mimi0213kimo) wrote :

This occurred during deployment of the DPDK compute node. DPDK was not deployed, and the bond0 and bond1 configuration appears to have failed.

-----------8<-----------
Preliminary analysis: LRO appears to be off after the kernel traces, despite the "failed to disable LRO!" message. I am asking for more detail on the nature of the failure.

This bug appears to be relevant:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1660146

All NICs involved (ethtool -i output, shown for 0000:03:00.0):
driver: ixgbe
version: 3.15.1-k
firmware-version: 0x800008bd
bus-info: 0000:03:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no

Bonds:
bond0: Adding slave enp131s0f1.
bond0: Adding slave enp3s0f0.

bond1: Adding slave enp131s0f0.
bond1: Adding slave enp3s0f1.

Overview of dmesg:

[ 27.423698] bonding: bond0: link status definitely up for interface enp131s0f1, 10000 Mbps full duplex.
[ 27.423704] bonding: bond0: link status definitely up for interface enp3s0f0, 10000 Mbps full duplex.
[ 27.423711] bonding: bond1: link status definitely up for interface enp3s0f1, 10000 Mbps full duplex.
[ 27.423716] bonding: bond1: link status definitely up for interface enp131s0f0, 10000 Mbps full duplex.

[ 27.434165] device bond0.2004 entered promiscuous mode
[ 27.434692] ------------[ cut here ]------------
[ 27.434704] WARNING: CPU: 1 PID: 3942 at /build/linux-90Gc2C/linux-3.13.0/net/core/dev.c:1433 dev_disable_lro+0x87/0x90()
[ 27.434708] netdevice: bond0.2004
[ 27.434708] failed to disable LRO!

[ 27.523777] bonding: bond0: link status down for interface enp131s0f1, disabling it in 1000 ms.
[ 28.916892] ixgbe 0000:83:00.1 enp131s0f1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[ 28.920854] bonding: bond0: link status up again after 1000 ms for interface enp131s0f1.
[ 29.020935] bonding: bond0: link status down for interface enp3s0f0, disabling it in 1000 ms.
[ 29.401337] ixgbe 0000:03:00.0 enp3s0f0: detected SFP+: 3
[ 29.541388] ixgbe 0000:03:00.0 enp3s0f0: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[ 29.621400] bonding: bond0: link status up again after 600 ms for interface enp3s0f0.

[ 29.338269] device bond1.2001 entered promiscuous mode
[ 29.338689] ------------[ cut here ]------------
[ 29.338699] WARNING: CPU: 0 PID: 3944 at /build/linux-90Gc2C/linux-3.13.0/net/core/dev.c:1433 dev_disable_lro+0x87/0x90()
[ 29.338702] netdevice: bond1.2001
[ 29.338702] failed to disable LRO!

[ 29.429252] bonding: bond1: link status down for interface enp3s0f1, disabling it in 1000 ms.
[ 29.824836] ixgbe 0000:03:00.1 enp3s0f1: detected SFP+: 4
[ 29.829564] bonding: bond1: link status down for interface enp131s0f0, disabling it in 1000 ms.
[ 30.069790] ixgbe 0000:03:00.1 enp3s0f1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[ 30.129796] bonding: bond1: link status up again after 700 ms for interface enp3s0f1.
[ 31.244068] IPv6: ADDRCONF(NETDEV_UP): br-mesh: link is not ready
[ 31.246724] ixgbe 0000:83:00.0 enp131s0f0: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[ 31.250679] bonding: bond1: link status up again after 1000 ms for interface enp131s0f0.
[ 33.324580] device bond0 entered promiscuous mode
[ 33.324585] device enp131s0f1 entered promiscuous mode
[ 33.324995] device enp3s0f0 entered promiscuous mode
[ 33.337620] device bond1 entered promiscuous mode
[ 33.337624] device enp3s0f1 entered promiscuous mode
[ 33.338094] device enp131s0f0 entered promiscuous mode
-----------8<-----------
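To list which devices hit the warning without reading the whole log, the two-line pattern seen above ("netdevice: <name>" followed by "failed to disable LRO!") can be matched with awk. This is only a sketch: the function name lro_warned_devices is made up for illustration, and it assumes the kernel log keeps the same two-line layout as the excerpt.

```shell
# Print each netdevice named in a "failed to disable LRO!" warning.
# Reads kernel log text on stdin, for example:  dmesg | lro_warned_devices
# (lro_warned_devices is a hypothetical helper name, not an existing tool.)
lro_warned_devices() {
    awk '
        # Remember the most recent "netdevice: <name>" line.
        /netdevice: / { dev = $NF }
        # When the warning text follows, report the remembered name once.
        /failed to disable LRO!/ && dev != "" { print dev; dev = "" }
    ' | sort -u
}
```

Run as "dmesg | lro_warned_devices" on the affected host; each device is printed once even if the warning repeats.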
I reproduced the trace in a KVM VM running Trusty/3.13. Info below:
ubuntu@new-urchin:~$ uname -a
Linux new-urchin 3.13.0-145-generic #194-Ubuntu SMP Thu Apr 5 15:20:44 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
ubuntu@new-urchin:~$ dpkg -l | grep -E "vlan|ifenslave|bridge-utils"
ii bridge-utils 1.5-6ubuntu2 amd64 Utilities for configuring the Linux Ethernet bridge
ii ifenslave 2.4ubuntu1.2 all configure network interfaces for parallel routing (bonding)
ii vlan 1.9-3ubuntu10.5 amd64 user mode programs to enable VLANs on your ethernet devices

I added two extra e1000 NICs to the VM (eth1 and eth2). Both were configured to attach to a host-only network.

Reproducer build:

apt install ifenslave vlan bridge-utils
echo bonding >> /etc/modules

Used this /etc/network/interfaces file: << EOF
auto lo
iface lo inet loopback
dns-nameservers 192.168.200.2
dns-search maas

auto ens6
iface ens6 inet static
dns-nameservers 192.168.200.2
gateway 192.168.200.1
address 192.168.200.65/24
mtu 1500

# Reproducer relevant config
auto bond0.2004
iface bond0.2004 inet manual
mtu 9100
vlan-raw-device bond0
auto bond0
iface bond0 inet manual
mtu 9100
bond-slaves eth1 eth2
bond-mode 802.3ad
bond-miimon 100
bond-use-carrier 1
bond-lacp-rate fast
bond-updelay 3000
bond-downdelay 1000
bond-ad-select bandwidth
bond-xmit-hash-policy layer3+4
post-up sleep 45
auto br-mesh
iface br-mesh inet static
bridge_ports bond0.2004
address 192.168.210.3/24
auto eth1
iface eth1 inet manual
mtu 9100
bond-master bond0
auto eth2
iface eth2 inet manual
mtu 9100
bond-master bond0

source /etc/network/interfaces.d/*.cfg
EOF
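
After rebooting with this file, it may be worth confirming that the bond actually picked up both slaves before looking at the trace. A rough sketch, assuming the usual "Slave Interface: <name>" lines in /proc/net/bonding/bond0; the helper name bond_slaves is made up for illustration.

```shell
# Print the slave interfaces listed in a bonding status file.
# Reads the file contents on stdin, for example:
#   bond_slaves < /proc/net/bonding/bond0
# (bond_slaves is a hypothetical helper name, not an existing tool.)
bond_slaves() {
    # Split each line on ": " and keep the value of "Slave Interface" entries.
    awk -F': ' '$1 == "Slave Interface" { print $2 }'
}
```

With the config above, the expectation would be eth1 and eth2; anything else suggests the bond did not assemble as configured.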

Continued in Part 2
Part 2:

Rebooted, got this trace:
[ 3.832420] device bond0.2004 entered promiscuous mode
[ 3.832447] ------------[ cut here ]------------
[ 3.832452] WARNING: CPU: 0 PID: 1196 at /build/linux-dwxnzD/linux-3.13.0/net/core/dev.c:1433 dev_disable_lro+0x87/0x90()
[ 3.832453] netdevice: bond0.2004
[ 3.832453] failed to disable LRO!
[ 3.832454] Modules linked in: bridge 8021q garp stp mrp llc kvm_intel kvm serio_raw i2c_piix4 bonding mac_hid qxl psmouse e1000 virtio_scsi ttm drm_kms_helper drm pata_acpi floppy
[ 3.832467] CPU: 0 PID: 1196 Comm: brctl Not tainted 3.13.0-145-generic #194-Ubuntu
[ 3.832468] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1~cloud0 04/01/2014
[ 3.832469] 0000000000000000 ffff88003cad1c70 ffffffff81737607 ffff88003cad1cb8
[ 3.832471] 0000000000000009 ffff88003cad1ca8 ffffffff8106c1cd ffff88003cca1000
[ 3.832473] ffff88003cb80880 ffff88003cca1000 0000000000000000 ffff88003cbb3540
[ 3.832474] Call Trace:
[ 3.832478] [<ffffffff81737607>] dump_stack+0x64/0x80
[ 3.832481] [<ffffffff8106c1cd>] warn_slowpath_common+0x7d/0xa0
[ 3.832483] [<ffffffff8106c23c>] warn_slowpath_fmt+0x4c/0x50
[ 3.832493] [<ffffffff8163bb37>] dev_disable_lro+0x87/0x90
[ 3.832499] [<ffffffffa01f1223>] br_add_if+0x1f3/0x430 [bridge]
[ 3.832502] [<ffffffffa01f1c5d>] add_del_if+0x5d/0x90 [bridge]
[ 3.832506] [<ffffffffa01f254b>] br_dev_ioctl+0x5b/0x90 [bridge]
[ 3.832512] [<ffffffff8164e38e>] dev_ifsioc+0x31e/0x370
[ 3.832518] [<ffffffff81634fd9>] ? dev_get_by_name_rcu+0x69/0x90
[ 3.832521] [<ffffffff8164e4d1>] dev_ioctl+0xf1/0x590
[ 3.832526] [<ffffffff811e0dce>] ? evict+0x11e/0x1b0
[ 3.832529] [<ffffffff8161bfad>] sock_do_ioctl+0x4d/0x60
[ 3.832531] [<ffffffff8161c4e8>] sock_ioctl+0x1f8/0x2d0
[ 3.832534] [<ffffffff817487a7>] ? system_call_after_swapgs+0x141/0x170
[ 3.832536] [<ffffffff811d8bc3>] do_vfs_ioctl+0x2e3/0x4d0
[ 3.832538] [<ffffffff8174877d>] ? system_call_after_swapgs+0x117/0x170
[ 3.832540] [<ffffffff81748776>] ? system_call_after_swapgs+0x110/0x170
[ 3.832542] [<ffffffff8174876f>] ? system_call_after_swapgs+0x109/0x170
[ 3.832544] [<ffffffff81748768>] ? system_call_after_swapgs+0x102/0x170
[ 3.832546] [<ffffffff81748761>] ? system_call_after_swapgs+0xfb/0x170
[ 3.832548] [<ffffffff8174875a>] ? system_call_after_swapgs+0xf4/0x170
[ 3.832550] [<ffffffff81748753>] ? system_call_after_swapgs+0xed/0x170
[ 3.832552] [<ffffffff8174874c>] ? system_call_after_swapgs+0xe6/0x170
[ 3.832554] [<ffffffff811d8e31>] SyS_ioctl+0x81/0xa0
[ 3.832556] [<ffffffff8174871b>] ? system_call_after_swapgs+0xb5/0x170
[ 3.832558] [<ffffffff817487f0>] system_call_fastpath+0x1a/0x1f
[ 3.832559] ---[ end trace 996acb2f420d9ef7 ]---

Continued in Part 3
Part 3:

Output of ip a:
ubuntu@new-urchin:~$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 52:54:00:4e:bb:3c brd ff:ff:ff:ff:ff:ff
inet 192.168.200.65/24 brd 192.168.200.255 scope global ens6
valid_lft forever preferred_lft forever
inet6 fe80::5054:ff:fe4e:bb3c/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9100 qdisc pfifo_fast master bond0 state UP group default qlen 1000
link/ether 52:54:00:09:00:6d brd ff:ff:ff:ff:ff:ff
4: eth2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9100 qdisc pfifo_fast master bond0 state UP group default qlen 1000
link/ether 52:54:00:09:00:6d brd ff:ff:ff:ff:ff:ff
5: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 9100 qdisc noqueue state UP group default
link/ether 52:54:00:09:00:6d brd ff:ff:ff:ff:ff:ff
inet6 fe80::5054:ff:fe09:6d/64 scope link
valid_lft forever preferred_lft forever
6: bond0.2004@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9100 qdisc noqueue master br-mesh state UP group default
link/ether 52:54:00:09:00:6d brd ff:ff:ff:ff:ff:ff
inet6 fe80::5054:ff:fe09:6d/64 scope link
valid_lft forever preferred_lft forever
7: br-mesh: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9100 qdisc noqueue state UP group default
link/ether 52:54:00:09:00:6d brd ff:ff:ff:ff:ff:ff
inet 192.168.210.3/24 brd 192.168.210.255 scope global br-mesh
valid_lft forever preferred_lft forever
inet6 fe80::5054:ff:fe09:6d/64 scope link
valid_lft forever preferred_lft forever
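
To check whether LRO really is off on the reproducer interfaces despite the warning, the large-receive-offload line of "ethtool -k" can be read per device. A minimal sketch; the helper name lro_state is made up for illustration, and the device list in the comment is the one from the reproducer config.

```shell
# Print the LRO state ("on" or "off") found in `ethtool -k` output on stdin.
# (lro_state is a hypothetical helper name, not an existing tool.)
lro_state() {
    # The feature line looks like "large-receive-offload: off [fixed]";
    # the second whitespace-separated field is the state.
    awk '/^[ \t]*large-receive-offload:/ { print $2 }'
}

# Example use on the reproducer VM:
#   for dev in eth1 eth2 bond0 bond0.2004; do
#       printf '%s: ' "$dev"; ethtool -k "$dev" | lro_state
#   done
```

If every device reports "off", the warning would appear to be spurious for this setup, which matches the preliminary analysis.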