Comment 1 for bug 1682681

Christian Ehrhardt  (paelzer) wrote :

Thanks Thomas for reassigning, and hi Bartłomeij,
Btw - I'd very much recommend using virtio over the rtl driver anyway [1], but that is not the point here.

Thanks for retesting with brctl and moving OVS out of the equation already.
The difference certainly is within the guest drivers for that network card between the 14.04 and 16.04 guest.

I checked the changes between the respective kernels, and there were not that many for the drivers themselves at least. They are mostly bug fixes, and while it could be anything else in the kernels, this is certainly worth a quick test. One change in particular could be interesting: it enabled TSO offloading by default.
You can check in your guests with
 $ ethtool -k <device>
what the offloads currently are.
Please check whether more than just TSO differs (the list usually grows with newer kernels).
Then, on the 16.04 guest, change the settings one by one to match what you have seen on the 14.04 guest.
If you happen to find a single offload feature that switches between good and bad behavior, please report back here.
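
One way to do the comparison above is to save the `ethtool -k` output on each guest and compare the two files. The following is only a rough sketch: the helper name `offload_diff`, the device name eth0 and the file names are placeholders and not taken from your setup.

```shell
#!/bin/sh
# Sketch: list offload features whose state differs between two saved
# `ethtool -k <device>` listings. Helper and file names are hypothetical.
offload_diff() {
    awk -F': *' '
        # Skip the "Features for <device>:" header and any line without a value.
        $2 == "" { next }
        {
            gsub(/^[ \t]+/, "", $1)   # drop the indentation of sub-features
            sub(/ \[.*\]$/, "", $2)   # drop tags such as "[fixed]"
        }
        NR == FNR { old[$1] = $2; next }   # first file: remember each state
        !($1 in old) { print $1 ": (absent) -> " $2; next }
        old[$1] != $2 { print $1 ": " old[$1] " -> " $2 }
    ' "$1" "$2"
}

# Usage, assuming the device is eth0 in both guests:
#   ethtool -k eth0 > offloads-14.04.txt     (on the 14.04 guest)
#   ethtool -k eth0 > offloads-16.04.txt     (on the 16.04 guest)
#   offload_diff offloads-14.04.txt offloads-16.04.txt
# Then flip one feature at a time on the 16.04 guest and retest, e.g.:
#   sudo ethtool -K eth0 tso off
```

Settings changed with `ethtool -K` do not survive a reboot, so a reboot brings the guest back to its defaults between tests.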

Furthermore we can exclude other packages here by using the HWE kernels [2]. Could you confirm that the 14.04 guest with the HWE-x kernel booted shows the same bad behavior?
That would rule out anything in 16.04 other than the kernel as the cause of the difference.

If so it would be great to further shrink the range we are looking at by trying HWE-
To do so in your case, take the 14.04 guest and install the packages
linux-image-virtual-lts-utopic, linux-image-virtual-lts-vivid, linux-image-virtual-lts-wily, linux-image-virtual-lts-xenial. Then modify the boot loader (or interactively select at the prompt) to boot one after the other, and check your results as well as the possibly related offload settings mentioned above.
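
The installation step could look roughly like the sketch below. The package names are the ones listed above; the helper name `hwe_packages` is made up for illustration, and the apt-get call is left as a comment so that nothing is installed by accident.

```shell
#!/bin/sh
# Sketch: print the HWE kernel packages named above, oldest series first.
# The helper name is hypothetical.
hwe_packages() {
    for series in utopic vivid wily xenial; do
        echo "linux-image-virtual-lts-$series"
    done
}

# On the 14.04 guest, install all four in one go:
#   sudo apt-get install $(hwe_packages)
# After each reboot, confirm which kernel is actually running before testing:
#   uname -r
```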

Also, to better reproduce this, could you outline what kind/direction of workload you are testing?
- Is it Guest-to-Guest or Traffic from the outside?
- What network traffic are you using? Can it be generated with network tools from the archive, or only with a custom workload in your setup?

Summary:
- please verify that the same happens in 14.04 + HWE-x kernel (go on with that setup if it shows the issue)
- please check the different HWE levels to find the first one that shows the issue (go on with the oldest of the HWE kernels that show the issue)
- please compare and test different offload settings as outlined above
- please describe your workload in more detail so that we can try to reproduce

[1]: http://www.linux-kvm.org/page/Tuning_KVM
[2]: https://wiki.ubuntu.com/Kernel/LTSEnablementStack