VLAN support broken

Bug #658460 reported by magellan
This bug affects 6 people
Affects: linux (Ubuntu)
Status: Expired
Importance: Undecided
Assigned to: Unassigned

Bug Description

VLAN support seems broken in linux-image-2.6.35-22-generic-pae.

I can successfully add a VLAN to an interface, but no traffic goes through that VLAN interface.

To be more precise, outgoing traffic works, but incoming traffic never arrives at the VLAN interface (its incoming byte counter stays at 0). Sniffing the network from another computer shows that the traffic exists, but it never seems to reach the VLAN interface.

As an example, if I run dhclient on the VLAN interface, I can see on the DHCP server that the machine sends DISCOVER messages and that the server answers with an OFFER, but the OFFER never reaches the DHCP client, which keeps sending DISCOVER requests.

Falling back to the previous kernel with the exact same interfaces configuration file makes everything work again.
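
The incoming byte counter mentioned in the description can be read from sysfs. The following is a minimal sketch and not part of the original report: the interface names `eth0` and `eth0.100` are placeholders, and `SYSFS_NET` is a hypothetical override that exists only so the functions can be tried without real interfaces.

```shell
#!/bin/sh
# Print the received-byte counter of one network interface.
# SYSFS_NET is a hypothetical override for testing; by default the
# counters are read from /sys/class/net.
rx_bytes() {
    cat "${SYSFS_NET:-/sys/class/net}/$1/statistics/rx_bytes"
}

# Print the counters of a parent interface and its VLAN sub-interface.
# If the parent counter grows while the VLAN counter stays at 0,
# that matches the symptom described in this report.
compare_rx() {
    printf '%s rx_bytes=%s\n' "$1" "$(rx_bytes "$1")"
    printf '%s rx_bytes=%s\n' "$2" "$(rx_bytes "$2")"
}

# Example (placeholder interface names):
#   compare_rx eth0 eth0.100
```

Running `compare_rx` twice, a few seconds apart, while tagged traffic is being sent to the machine shows whether either counter is moving.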

ProblemType: Bug
DistroRelease: Ubuntu 10.10
Package: linux-image-2.6.35-22-generic-pae 2.6.35-22.33
Regression: Yes
Reproducible: Yes
ProcVersionSignature: Ubuntu 2.6.32-25.44-generic-pae 2.6.32.21+drm33.7
Uname: Linux 2.6.32-25-generic-pae i686
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.21.
Architecture: i386
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: STAC92xx Analog [STAC92xx Analog]
   Subdevices: 2/2
   Subdevice #0: subdevice #0
   Subdevice #1: subdevice #1
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: gea 2481 F.... pulseaudio
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0xd4900000 irq 17'
   Mixer name : 'Intel G45 DEVIBX'
   Components : 'HDA:111d7603,103c1724,00100202 HDA:11c11040,103c3066,00100200 HDA:80862804,80860101,00100000'
   Controls : 23
   Simple ctrls : 14
Date: Mon Oct 11 16:55:23 2010
HibernationDevice: RESUME=UUID=fbbe1f5a-da73-41c2-9648-5d689d327b25
InstallationMedia: Ubuntu 10.04 LTS "Lucid Lynx" - Release i386 (20100429)
MachineType: Hewlett-Packard HP ProBook 6540b
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.32-25-generic-pae root=UUID=b3bdee91-3bbe-4ba0-a26e-bfc97d8099c0 ro quiet splash
ProcEnviron:
 PATH=(custom, user)
 LANG=fr_CH.utf8
 SHELL=/bin/bash
RelatedPackageVersions: linux-firmware 1.38
SourcePackage: linux
dmi.bios.date: 01/27/2010
dmi.bios.vendor: Hewlett-Packard
dmi.bios.version: 68CDD Ver. F.04
dmi.board.name: 1722
dmi.board.vendor: Hewlett-Packard
dmi.board.version: KBC Version 29.2B
dmi.chassis.type: 10
dmi.chassis.vendor: Hewlett-Packard
dmi.modalias: dmi:bvnHewlett-Packard:bvr68CDDVer.F.04:bd01/27/2010:svnHewlett-Packard:pnHPProBook6540b:pvr:rvnHewlett-Packard:rn1722:rvrKBCVersion29.2B:cvnHewlett-Packard:ct10:cvr:
dmi.product.name: HP ProBook 6540b
dmi.sys.vendor: Hewlett-Packard

Revision history for this message
Bill Michaelson (t-launchpad-bill-from-net) wrote :

I can confirm precisely these symptoms with 2.6.35-22. I can see the VLAN packets going out, and I can see the responses leaving a remote system (ARP in this case), but according to tcpdump the responses do not appear on the Maverick box's interface.

Reverting to 2.6.32 (Lucid) solves the problem.

I don't know if this is related, but I observed other problems with the NICs while running this release: they appeared to go out of service completely. I don't have much useful information about this, though.

This bug affects me because I need to use VLAN with bridging to isolate networks of KVM VMs.

Revision history for this message
Tais P. Hansen (taisph) wrote :

These symptoms seem similar to mine, with a little extra twist.

When adding a VLAN to a bonding interface, all VLAN-tagged broadcasts seem to arrive twice on the bonding interface according to tcpdump. Remove the VLAN and the duplicate packets stop.

For example:

modprobe bonding mode=1 miimon=100
ip link set bond0 up
ifenslave bond0 eth0 eth1
tcpdump -i bond0 -nes0
<normal vlan tagged traffic observed>
vconfig add bond0 1000
tcpdump -i bond0 -nes0
<vlan tagged broadcasts appear twice in dump>

This MAY be the cause of the periodic network outages I experience with KVM guests on a Maverick host. It seems that ARP requests/replies aren't handled correctly, or at all.

Revision history for this message
magellan (swissmage) wrote :

I performed a fresh install of Ubuntu 10.10 Desktop on the laptop and the problem remains exactly the same.

Tagged outgoing traffic works fine. Incoming tagged traffic shows up on the physical interface (eth0) but not on the VLAN interface (eth0.x).

Untagged traffic is working fine.

Revision history for this message
magellan (swissmage) wrote :

It seems that nobody cares...

For those who also have this issue, falling back to the 10.04 kernel seems to be a valid workaround.

Brad Figg (brad-figg)
Changed in linux (Ubuntu):
status: New → Confirmed
Revision history for this message
Michael Milligan (milli) wrote :

VLAN interfaces (e.g., eth0.2) work for me in 2.6.35-28-generic, but are broken past that. 2.6.37-* is broken, as is 2.6.38-* in the (soon to be released) natty. The symptom is sporadic loss of received packets; they seem to get lost somewhere in the stack. Turning off generic-segmentation-offload and generic-receive-offload, as suggested by a Google search, helps a little, but problems with packet loss still remain.

I'm surprised this isn't a big deal since I know a lot of folks run Ubuntu on their servers!
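
The offload change mentioned in the comment above could be applied along these lines. This is a sketch under assumptions: `eth0` is a placeholder interface name, and `run` and `DRY_RUN` are hypothetical helpers added here so the command can be previewed without root privileges. As the comment notes, this reduced but did not eliminate the packet loss.

```shell
#!/bin/sh
# Hypothetical helper: with DRY_RUN=1 the command is only printed,
# otherwise it is executed (ethtool -K normally needs root).
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Turn off generic segmentation offload (gso) and
# generic receive offload (gro) on the given interface.
disable_generic_offloads() {
    run ethtool -K "$1" gso off gro off
}

# Example (placeholder interface name):
#   DRY_RUN=1 disable_generic_offloads eth0
```

The current state of both settings can be checked afterwards with `ethtool -k eth0`.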

Revision history for this message
MMeija (mmeija) wrote :

VLAN support also currently works for me on 2.6.35-28-server, whereas 2.6.38-8-server does not.

Revision history for this message
Shawnl (shawnl) wrote :

This is still unassigned and doesn't appear to have been worked on. I rely heavily on VLAN support. Has there been any movement on this, or is there at least a workaround in the works?

Revision history for this message
BDV (bdv) wrote :

vlan support is still broken in the oneiric beta.

But ethtool does give a clue:

ethtool -k eth0
rx-vlan-offload: on
tx-vlan-offload: on

Older kernels (2.6.32-33) do not have vlan-offload support. The vlan support in Ubuntu fails once you use a kernel which does have vlan offloading. Ethtool fails to disable the vlan offloading.

It would be nice to have VLAN support in the next LTS release (12.04). VLAN support is an enterprise feature; people running large networks need it. VLAN offloading support would be very nice.
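
The check described in the comment above can be sketched as follows. The assumptions are that `eth0` is a placeholder interface name and that the `ethtool -k` output uses the `rx-vlan-offload:` / `tx-vlan-offload:` labels quoted above; `vlan_offload_state` is a hypothetical helper name.

```shell
#!/bin/sh
# Reduce `ethtool -k` output (read from stdin) to the two
# VLAN offload lines.
vlan_offload_state() {
    grep -E '^[[:space:]]*[rt]x-vlan-offload:'
}

# Example (placeholder interface name):
#   ethtool -k eth0 | vlan_offload_state
#
# Attempt to disable VLAN offloading (needs root). The comment above
# reports that this did not take effect on the affected system:
#   ethtool -K eth0 rxvlan off txvlan off
```

Running the check again after the `ethtool -K` attempt shows whether the driver accepted the change.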

Revision history for this message
BDV (bdv) wrote :

A bit more testing.

The vlan-offload support does not make any difference (tested on old hardware).

Installing kernel 2.6.38-10 (from lucid-updates) on a running lucid (10.04) system broke both the VLAN interface and a network bridge running on a VLAN interface.

vlan support is now working on my oneiric system (a switch configuration error, my mistake). There is still a bug regarding vlans over a bridge, but that's documented in Bug #771209

Revision history for this message
penalvch (penalvch) wrote :

magellan, thank you for reporting this and helping make Ubuntu better. Maverick reached EOL in April 2012.
Please see this document for currently supported Ubuntu releases:
https://wiki.ubuntu.com/Releases

We were wondering if this is still an issue on a supported release? If so, can you try with the latest development release of Ubuntu? ISO CD images are available from http://cdimage.ubuntu.com/releases/ .

If it remains an issue in a supported release, could you run the following command from a Terminal (Applications->Accessories->Terminal). It will automatically gather and attach updated debug information to this report.

apport-collect -p linux <replace-with-bug-number>

Also, if you could test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. Please let us know your results.

Thanks in advance.

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired
Revision history for this message
Rudd-O (rudd-o) wrote :

On Fedora with a 3.3 kernel I see the same problem. Running tcpdump on the physical network interface eth0, I can clearly see the DHCP replies coming through on VLAN 103, but running tcpdump on eth0.103 (the trunked VLAN interface #103 associated with eth0), the DHCP replies simply do not appear. The DHCP requests appear in both captures.

So this is not fixed, and it goes beyond Ubuntu.

Revision history for this message
penalvch (penalvch) wrote :

Rudd-O, please do not make comments on this report. For more on this, please see https://help.ubuntu.com/community/ReportingBugs#A3._Make_sure_the_bug_hasn.27t_already_been_reported . If you are having a problem in Ubuntu, please file a new report by executing the following via the Terminal and feel free to subscribe me to it:
ubuntu-bug linux

Thanks!
