ipv6 neighbor discovery broken (on a bridge)

Bug #1597806 reported by LaMont Jones
This bug affects 1 person
Affects: linux (Ubuntu)
Status: Expired
Importance: High
Assigned to: Unassigned

Bug Description

I have a xenial (4.4.0-24-generic) machine that loses ipv6 connectivity every time I reboot the gateway it uses.

br0 is a bridge which has eth0.2 as its only member, with (currently) 6 "scope global temporary deprecated dynamic" (privacy) addresses, and:
    inet6 2601:282:8100:3500:24c:40ff:fe1a:c570/64 scope global mngtmpaddr dynamic
       valid_lft 300sec preferred_lft 120sec

The tcpdump trace against eth0.2 of the broken machine is at http://paste.ubuntu.com/18170606/ (fe80::1 is the gateway).

http://paste.ubuntu.com/18173670/ is the output of lspci -vvn on one of the (quad) interfaces on the machine.

Setting the bridge to promisc and turning it back off works around the issue. Tcpdump on the underlying eth0.2 does not.

On another (yakkety) box, running 4.4.0-14-generic, I also see the problem: that interface is also br0, with eth0 (untagged) as its only member.

All of the above leads me to believe that the kernel is not correctly setting up (at least some of?) the multicast addresses it needs to listen to on the bridge.
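
One way to check which IPv6 multicast groups the kernel has actually joined on an interface is to read /proc/net/igmp6. The following is a minimal sketch, assuming the usual layout of that file (interface index, interface name, group address as 32 hex digits, then counters); the function name `joined_groups` is made up for illustration and is not part of any tool mentioned in this report.

```python
import ipaddress


def joined_groups(igmp6_text, ifname):
    """Return the IPv6 multicast groups listed for ifname.

    igmp6_text is the content of /proc/net/igmp6. Each line is assumed
    to look like: <ifindex> <ifname> <32 hex digits> <users> <flags> <timer>
    """
    groups = []
    for line in igmp6_text.splitlines():
        fields = line.split()
        # Skip malformed lines and lines for other interfaces.
        if len(fields) < 3 or fields[1] != ifname:
            continue
        # The group address is printed as raw hex without colons.
        groups.append(ipaddress.IPv6Address(bytes.fromhex(fields[2])))
    return groups


if __name__ == "__main__":
    with open("/proc/net/igmp6") as f:
        for group in joined_groups(f.read(), "br0"):
            print(group)
```

If the solicited-node groups for the addresses on br0 show up here but solicitations still never arrive, the joins themselves are probably fine and the loss is happening elsewhere in the path.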

Revision history for this message
LaMont Jones (lamont) wrote :

One other tidbit: The gateway in question is using eth0.2 on a trunked port to a Cisco WS-C2960G-24TC-L running c2960-lanbasek9-mz.150-2.SE9.bin.

Revision history for this message
LaMont Jones (lamont) wrote :

However, given that eth0.2 is seeing the solicitation, and br0 isn't, I'm leaning strongly toward the bridge code being slightly buggy.

My next test will be to do a non-promisc tcpdump on eth0.2 and see if _THAT_ sees the traffic. That may take me a week before I can do it though, due to schedules and the location of the machines.

Revision history for this message
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1597806

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Revision history for this message
LaMont Jones (lamont) wrote :

Following apw's excellent advice pointing to https://www.v13.gr/blog/?p=378 and abusing another ipv6 address on the afflicted host:

tcpdump -npi br0 ip6 and not port 22 | grep -i neigh
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on br0, link-type EN10MB (Ethernet), capture size 262144 bytes

[ fire up ping6 2601:282:8100:3500:641d:9dd0:5b50:46e6 on the gateway.]
[ no traffic shows up on the bridge ]
[ echo 0 > /sys/devices/virtual/net/br0/bridge/multicast_snooping ]

09:40:49.260756 IP6 fe80::1 > ff02::1:ff50:46e6: ICMP6, neighbor solicitation, who has 2601:282:8100:3500:641d:9dd0:5b50:46e6, length 32
09:40:49.260813 IP6 2601:282:8100:3500:641d:9dd0:5b50:46e6 > fe80::1: ICMP6, neighbor advertisement, tgt is 2601:282:8100:3500:641d:9dd0:5b50:46e6, length 32
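
The destination of the solicitation above, ff02::1:ff50:46e6, is the solicited-node multicast address of the target: the prefix ff02::1:ff00:0/104 followed by the low 24 bits of the unicast address (RFC 4291). A minimal sketch of that derivation, with a helper name (`solicited_node_multicast`) chosen here for illustration:

```python
import ipaddress


def solicited_node_multicast(addr):
    """Return the solicited-node multicast address for a unicast IPv6 address."""
    ip = ipaddress.IPv6Address(addr)
    # Keep only the low 24 bits of the unicast address.
    low24 = int(ip) & 0xFFFFFF
    # Combine them with the fixed ff02::1:ff00:0/104 prefix.
    prefix = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(prefix | low24)


if __name__ == "__main__":
    print(solicited_node_multicast("2601:282:8100:3500:641d:9dd0:5b50:46e6"))
```

Because neighbor solicitations are sent to this multicast group rather than to the unicast address, anything that filters multicast on the bridge can stop them from reaching the host.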

Looks like bridge multicast_snooping causes ipv6 neighbor discovery to break.
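
The workaround used above (writing 0 to the bridge's multicast_snooping file) can also be scripted. This is a sketch only, assuming the sysfs path shown in the echo command; the helper names and the `root` parameter are hypothetical and exist just to make the code easy to try against a scratch directory. Writing to the real file needs root.

```python
from pathlib import Path

# Path taken from the echo command in this comment.
SYSFS_NET = Path("/sys/devices/virtual/net")


def snooping_path(bridge, root=SYSFS_NET):
    """Build the path to a bridge's multicast_snooping control file."""
    return Path(root) / bridge / "bridge" / "multicast_snooping"


def get_snooping(bridge, root=SYSFS_NET):
    """Return True if multicast snooping is enabled on the bridge."""
    return snooping_path(bridge, root).read_text().strip() == "1"


def set_snooping(bridge, enabled, root=SYSFS_NET):
    """Enable or disable multicast snooping on the bridge."""
    snooping_path(bridge, root).write_text("1\n" if enabled else "0\n")
```

For example, `set_snooping("br0", False)` would be the equivalent of the echo above. The setting does not persist across reboots or bridge re-creation.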

Revision history for this message
LaMont Jones (lamont) wrote :

apport-collect is not really an option on the machine, nor should any of that really provide any additional information that's not already in the bug report. Holler at me if I missed anything relevant.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Andy Whitcroft (apw)
Changed in linux (Ubuntu):
importance: Undecided → High
Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.7 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.7-rc5-yakkety/

tags: added: kernel-da-key
Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

The current mainline kernel is now v4.8 final. That would be better to test, versus 4.7-rc5:

v4.8: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.8

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired