WARNING: at /build/buildd/linux-3.2.0/net/core/dev.c:1960 skb_gso_segment+0x341/0x3b0()

Bug #1014350 reported by Bane Ivosev
This bug affects 3 people
Affects         Status   Importance  Assigned to  Milestone
linux (Ubuntu)  Invalid  Low         Unassigned

Bug Description

We have an IBM x3650 running 12.04 amd64 server as a KVM virtualization host, and kern.log is suddenly filling at a dramatic rate with this warning. The system worked normally for a while; the problem started five days ago. Any help?

WORKAROUND: We have several freebsd 9 amd64 guests with virtio network drivers. After changing net drv to e1000 the problem disappears.

[29955.815267] WARNING: at /build/buildd/linux-3.2.0/net/core/dev.c:1960 skb_gso_segment+0x341/0x3b0()
[29955.815269] Hardware name: IBM System x3650 -[7979KPG]-
[29955.815272] 802.1Q VLAN Support: caps=(0x30095823, 0x0) len=1925 data_len=0 ip_summed=0
[29955.815275] Modules linked in: ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp iptable_filter ip_tables x_tables kvm_intel kvm nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc bridge 8021q garp stp radeon ttm ibmpex drm_kms_helper shpchp ibmaem ipmi_msghandler ics932s401 psmouse drm serio_raw i2c_algo_bit i5000_edac edac_core i5k_amb ioatdma mac_hid dca lp parport ses enclosure bnx2 aacraid
[29955.815312] Pid: 7912, comm: kvm Tainted: G W 3.2.0-25-generic #40-Ubuntu
[29955.815314] Call Trace:
[29955.815316] <IRQ> [<ffffffff810672af>] warn_slowpath_common+0x7f/0xc0
[29955.815322] [<ffffffff810673a6>] warn_slowpath_fmt+0x46/0x50
[29955.815326] [<ffffffff8153ebd1>] skb_gso_segment+0x341/0x3b0
[29955.815330] [<ffffffff8105695e>] ? update_curr+0x21e/0x230
[29955.815335] [<ffffffff8154234a>] dev_hard_start_xmit+0x11a/0x580
[29955.815341] [<ffffffffa021c150>] ? br_flood+0xc0/0xc0 [bridge]
[29955.815345] [<ffffffff81542a5a>] dev_queue_xmit+0x2aa/0x420
[29955.815350] [<ffffffffa021c1bc>] br_dev_queue_push_xmit+0x6c/0xa0 [bridge]
[29955.815356] [<ffffffffa021c248>] br_forward_finish+0x58/0x60 [bridge]
[29955.815361] [<ffffffffa021c3fb>] __br_forward+0xab/0xd0 [bridge]
[29955.815367] [<ffffffffa021c4bd>] br_forward+0x5d/0x70 [bridge]
[29955.815372] [<ffffffffa021d182>] br_handle_frame_finish+0x182/0x2a0 [bridge]
[29955.815378] [<ffffffffa021d468>] br_handle_frame+0x1c8/0x270 [bridge]
[29955.815384] [<ffffffffa021d2a0>] ? br_handle_frame_finish+0x2a0/0x2a0 [bridge]
[29955.815388] [<ffffffff8153fd82>] __netif_receive_skb+0x1e2/0x520
[29955.815391] [<ffffffff815404e1>] process_backlog+0xb1/0x190
[29955.815395] [<ffffffff815417d4>] net_rx_action+0x134/0x290
[29955.815399] [<ffffffff8106ea58>] __do_softirq+0xa8/0x210
[29955.815402] [<ffffffff81667eac>] call_softirq+0x1c/0x30
[29955.815404] <EOI> [<ffffffff81015305>] do_softirq+0x65/0xa0
[29955.815410] [<ffffffff81541cb8>] netif_rx_ni+0x28/0x30
[29955.815414] [<ffffffff814787c6>] tun_get_user+0x306/0x4a0
[29955.815417] [<ffffffff81479ca4>] tun_chr_aio_write+0x64/0x90
[29955.815421] [<ffffffff81479c40>] ? tun_chr_aio_read+0xd0/0xd0
[29955.815424] [<ffffffff811784c3>] do_sync_readv_writev+0xd3/0x110
[29955.815428] [<ffffffff812d7768>] ? apparmor_file_permission+0x18/0x20
[29955.815431] [<ffffffff8129cefc>] ? security_file_permission+0x2c/0xb0
[29955.815435] [<ffffffff81177bc1>] ? rw_verify_area+0x61/0xf0
[29955.815438] [<ffffffff81178794>] do_readv_writev+0xd4/0x1d0
[29955.815441] [<ffffffff811be755>] ? eventfd_ctx_read+0x1a5/0x210
[29955.815445] [<ffffffff811788cc>] vfs_writev+0x3c/0x50
[29955.815447] [<ffffffff81178a2a>] sys_writev+0x4a/0xb0
[29955.815451] [<ffffffff81665c42>] system_call_fastpath+0x16/0x1b
[29955.815453] ---[ end trace 33dfb9cf0396c073 ]---

Revision history for this message
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1014350

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: precise
Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

It would be great if you could run apport-collect as requested in comment #1. That will provide additional details about your system.

Also, would it be possible for you to boot the previous kernel[0], which should be 3.2.0-24.39 and report back if this bug still exists?

[0] https://launchpad.net/ubuntu/+source/linux/3.2.0-24.39

Changed in linux (Ubuntu):
importance: Undecided → Medium
tags: added: kernel-da-key
Revision history for this message
David (mardraum) wrote :

I also get similar warnings on a couple of test KVM hosts using VLANs and bridge interfaces for the VMs. I also tried upgrading to the testing release and got similar output; see below for both 12.04 and testing:

Jun 10 21:04:47 host1 kernel: [ 6518.495373] WARNING: at /build/buildd/linux-3.2.0/net/core/dev.c:1960 skb_gso_segment+0x341/0x3b0()
Jun 10 21:04:47 host1 kernel: [ 6518.495375] Hardware name: TECRA A9
Jun 10 21:04:47 host1 kernel: [ 6518.495377] 802.1Q VLAN Support: caps=(0x20115829, 0x0) len=2076 data_len=0 ip_summed=0
Jun 10 21:04:47 host1 kernel: [ 6518.495379] Modules linked in: ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables kvm_intel kvm nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc ext2 tpm_infineon bridge joydev 8021q garp stp pcmcia arc4 i915 dm_multipath yenta_socket snd_hda_codec_si3054 snd_hda_codec_realtek psmouse drm_kms_helper drm pcmcia_rsrc serio_raw iwl4965 pcmcia_core tifm_7xx1 i2c_algo_bit tifm_core iwl_legacy mac80211 snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_timer snd soundcore snd_page_alloc cfg80211 video tpm_tis toshiba_acpi sparse_keymap wmi mac_hid lp parport sdhci_pci sdhci e1000e
Jun 10 21:04:47 host1 kernel: [ 6518.495420] Pid: 3634, comm: kvm Tainted: G W 3.2.0-24-generic #39-Ubuntu
Jun 10 21:04:47 host1 kernel: [ 6518.495422] Call Trace:
Jun 10 21:04:47 host1 kernel: [ 6518.495423] <IRQ> [<ffffffff8106725f>] warn_slowpath_common+0x7f/0xc0
Jun 10 21:04:47 host1 kernel: [ 6518.495429] [<ffffffff8155e8e5>] ? sch_direct_xmit+0x85/0x1d0
Jun 10 21:04:47 host1 kernel: [ 6518.495432] [<ffffffff81067356>] warn_slowpath_fmt+0x46/0x50
Jun 10 21:04:47 host1 kernel: [ 6518.495435] [<ffffffff8153e051>] skb_gso_segment+0x341/0x3b0
Jun 10 21:04:47 host1 kernel: [ 6518.495441] [<ffffffffa037699a>] ? br_nf_dev_queue_xmit+0x2a/0x90 [bridge]
Jun 10 21:04:47 host1 kernel: [ 6518.495446] [<ffffffff81541b3a>] dev_hard_start_xmit+0x11a/0x580
Jun 10 21:04:47 host1 kernel: [ 6518.495451] [<ffffffffa0370234>] ? br_forward_finish+0x44/0x60 [bridge]
Jun 10 21:04:47 host1 kernel: [ 6518.495454] [<ffffffff8154224a>] dev_queue_xmit+0x2aa/0x420
Jun 10 21:04:47 host1 kernel: [ 6518.495458] [<ffffffff8165c885>] ? _raw_read_unlock_bh+0x15/0x20
Jun 10 21:04:47 host1 kernel: [ 6518.495463] [<ffffffffa03701bc>] br_dev_queue_push_xmit+0x6c/0xa0 [bridge]
Jun 10 21:04:47 host1 kernel: [ 6518.495469] [<ffffffffa037699a>] br_nf_dev_queue_xmit+0x2a/0x90 [bridge]
Jun 10 21:04:47 host1 kernel: [ 6518.495474] [<ffffffffa0377500>] br_nf_post_routing+0x280/0x2e0 [bridge]
Jun 10 21:04:47 host1 kernel: [ 6518.495479] [<ffffffff8156c065>] nf_iterate+0x85/0xc0
Jun 10 21:04:47 host1 kernel: [ 6518.495484] [<ffffffffa0370150>] ? br_flood+0xc0/0xc0 [bridge]
Jun 10 21:04:47 host1 kernel: [ 6518.495487] [<ffffffff8156c115>] nf_hook_slow+0x75/0x150
Jun 10 21:04:47 host1 kernel: [ 6518.495492] [<ffffffffa0370150>] ? br_flood+0xc0/0xc0 [bridge]
Jun 10 21:04:47 host1 kernel: [ 6518.495497] [<ffffffffa03701f0>] ? br_dev_queue_push_xmit+0xa0/0xa0 [bridge]
Jun 10 21:04:47 host1 kernel: [ 6518.495503] [<ffffffffa0376a00>] ? br_nf_dev_queue_xmit+0x90/0x90 [bridge]
Jun 10 21:04:47 host1 kernel: [ 6518....

Revision history for this message
David (mardraum) wrote :

Getting the same thing with latest quantal fresh install on the host from today:

Jul 7 00:08:07 host2 kernel: [ 3971.352402] WARNING: at /build/buildd/linux-3.5.0/net/core/dev.c:1888 skb_warn_bad_offload+0xc2/0xcf()
Jul 7 00:08:07 host2 kernel: [ 3971.352406] Hardware name: TECRA A9
Jul 7 00:08:07 host2 kernel: [ 3971.352412] : caps=(0x0000000020115829, 0x0000000000000000) len=1732 data_len=0 gso_size=1448 gso_type=5 ip_summed=0
Jul 7 00:08:07 host2 kernel: [ 3971.352415] Modules linked in: vhost_net macvtap macvlan ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables bridge nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc ext2 8021q garp stp llc arc4 snd_hda_codec_si3054 snd_hda_codec_realtek i915 iwl4965 iwlegacy snd_hda_intel snd_hda_codec mac80211 snd_hwdep snd_pcm pcmcia gpio_ich drm_kms_helper drm tpm_infineon tifm_7xx1 snd_timer joydev snd cfg80211 soundcore dm_multipath snd_page_alloc lpc_ich coretemp kvm_intel kvm yenta_socket pcmcia_rsrc i2c_algo_bit pcmcia_core tifm_core microcode psmouse scsi_dh tpm_tis mac_hid toshiba_acpi serio_raw sparse_keymap wmi video lp parport sdhci_pci sdhci e1000e
Jul 7 00:08:07 host2 kernel: [ 3971.352531] Pid: 4203, comm: vhost-4202 Tainted: G W 3.5.0-3-generic #3-Ubuntu
Jul 7 00:08:07 host2 kernel: [ 3971.352535] Call Trace:
Jul 7 00:08:07 host2 kernel: [ 3971.352538] <IRQ> [<ffffffff81051c0f>] warn_slowpath_common+0x7f/0xc0
Jul 7 00:08:07 host2 kernel: [ 3971.352554] [<ffffffff81051d06>] warn_slowpath_fmt+0x46/0x50
Jul 7 00:08:07 host2 kernel: [ 3971.352562] [<ffffffff81674a7e>] ? _raw_spin_lock+0xe/0x20
Jul 7 00:08:07 host2 kernel: [ 3971.352570] [<ffffffff816719a4>] skb_warn_bad_offload+0xc2/0xcf
Jul 7 00:08:07 host2 kernel: [ 3971.352579] [<ffffffff81566fc0>] ? dev_queue_xmit+0x1c0/0x650
Jul 7 00:08:07 host2 kernel: [ 3971.352587] [<ffffffff815639f1>] skb_gso_segment+0x221/0x290
Jul 7 00:08:07 host2 kernel: [ 3971.352595] [<ffffffff815669d0>] dev_hard_start_xmit+0x200/0x630
Jul 7 00:08:07 host2 kernel: [ 3971.352604] [<ffffffff8106fde5>] ? queue_work_on+0x25/0x30
Jul 7 00:08:07 host2 kernel: [ 3971.352630] [<ffffffffa043597e>] ? rpc_make_runnable+0x7e/0x80 [sunrpc]
Jul 7 00:08:07 host2 kernel: [ 3971.352655] [<ffffffffa0435aa2>] ? rpc_wake_up_task_queue_locked+0x122/0x210 [sunrpc]
Jul 7 00:08:07 host2 kernel: [ 3971.352664] [<ffffffff81567197>] dev_queue_xmit+0x397/0x650
Jul 7 00:08:07 host2 kernel: [ 3971.352672] [<ffffffff81674c65>] ? _raw_read_unlock_bh+0x15/0x20
Jul 7 00:08:07 host2 kernel: [ 3971.352681] [<ffffffffa04f8715>] ? ebt_do_table+0x635/0x6f4 [ebtables]
Jul 7 00:08:07 host2 kernel: [ 3971.352694] [<ffffffffa054fd9f>] br_dev_queue_push_xmit+0x7f/0xd0 [bridge]
Jul 7 00:08:07 host2 kernel: [ 3971.352707] [<ffffffffa05564fa>] br_nf_dev_queue_xmit+0x2a/0x90 [bridge]
Jul 7 00:08:07 host2 kernel: [ 3971.352721] [<ffffffffa0556d73>] br_nf_post_routing+0x223/0x340 [bridge]
Jul 7 00:08:07 host2 kernel: [ 3971.352746] [<ffffffff8158ffa4>] nf_iterate+0x84/0xb0
Jul 7 00:08:07 host2 kernel: [ 3971.352772] [<ffffffffa054fd20>] ? deliver_clone+0x60/0x60 [bridge]
Jul 7 00:08:07 host2 kernel: [ 3971.352796] [<ffff...
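
Editor's sketch: the 3.5 trace above prints gso_type and ip_summed as raw integers, which are hard to read at a glance. The Python snippet below decodes them into symbolic names. The bit and enum values are assumptions recalled from the 3.5-era include/linux/skbuff.h and should be checked against the headers of the kernel actually in use; the function name decode_offload_line is hypothetical.

```python
import re

# ASSUMPTION: flag values as recalled from the 3.5-era include/linux/skbuff.h.
# Verify against your kernel headers before relying on them.
GSO_FLAGS = {
    1 << 0: "SKB_GSO_TCPV4",
    1 << 1: "SKB_GSO_UDP",
    1 << 2: "SKB_GSO_DODGY",
    1 << 3: "SKB_GSO_TCP_ECN",
    1 << 4: "SKB_GSO_TCPV6",
    1 << 5: "SKB_GSO_FCOE",
}

# ASSUMPTION: checksum states as recalled from the same header.
IP_SUMMED = {
    0: "CHECKSUM_NONE",
    1: "CHECKSUM_UNNECESSARY",
    2: "CHECKSUM_COMPLETE",
    3: "CHECKSUM_PARTIAL",
}

# Matches decimal key=value pairs; "caps=(0x..." is skipped on purpose.
FIELD_RE = re.compile(r"(\w+)=(\d+)")


def decode_offload_line(line):
    """Return the numeric fields of a bad-offload line plus decoded names."""
    fields = {key: int(value) for key, value in FIELD_RE.findall(line)}
    result = dict(fields)
    if "gso_type" in fields:
        result["gso_flags"] = [
            name for bit, name in sorted(GSO_FLAGS.items())
            if fields["gso_type"] & bit
        ]
    if "ip_summed" in fields:
        result["ip_summed"] = IP_SUMMED.get(fields["ip_summed"], "unknown")
    return result


sample = (
    "Jul 7 00:08:07 host2 kernel: [ 3971.352412] : "
    "caps=(0x0000000020115829, 0x0000000000000000) len=1732 data_len=0 "
    "gso_size=1448 gso_type=5 ip_summed=0"
)
decoded = decode_offload_line(sample)
print(decoded["gso_flags"], decoded["ip_summed"])
```

Under those assumed values, gso_type=5 reads as TCPv4 segmentation offload flagged as coming from an untrusted source, and ip_summed=0 as no checksum offload state. That would be consistent with the TSO-related workaround reported later in this thread, but it is an interpretation, not a confirmed diagnosis.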


Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.5 kernel[0] (not a kernel in the daily directory) and install both the linux-image and linux-image-extra .deb packages.

Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag (only that one tag; please leave the other tags). This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text.

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.5-rc6-quantal/

tags: added: needs-upstream-testing
David (mardraum)
tags: added: kernel-bug-exists-upstream quantal
removed: needs-upstream-testing
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Revision history for this message
David (mardraum) wrote :

Tested with mainline 3.5.0-030500rc7-generic on up-to-date quantal and could reproduce the bug. Log attached.

Revision history for this message
David (mardraum) wrote :

Moving to openvswitch for bridging VMs (not using brcompat) instead of ebtables in a quantal install produces similar warnings (attached).

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

This issue appears to be an upstream bug, since you tested the latest upstream kernel. Would it be possible for you to open an upstream bug report at bugzilla.kernel.org [1]? That will allow the upstream developers to examine the issue, and may provide a quicker resolution to the bug.

If you are comfortable with opening a bug upstream, it would be great if you could report back the upstream bug number in this bug report. That will allow us to link this bug to the upstream report.

[1] https://wiki.ubuntu.com/Bugs/Upstream/kernel

Changed in linux (Ubuntu):
status: Confirmed → Triaged
Revision history for this message
Bane Ivosev (bane-ivosev) wrote :

For me the problem is partially solved. We have several FreeBSD 9 amd64 guests with virtio network drivers. After changing the net drv to e1000 the problem disappears.

In the log we found which pid causes the kernel warning and changed the net drv in that guest. Interestingly, several other FreeBSD guests are running without a problem, and all of them were created from the same template.
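
Editor's sketch: finding the offending pid in a fast-growing kern.log can be done by tallying the "Pid: ..., comm: ..." lines that accompany each warning. The Python snippet below is a minimal illustration, assuming the line format shown in the traces in this report; the function name count_warning_pids is hypothetical, and the tally includes every kernel warning in the input, not only skb_gso_segment ones.

```python
import re
from collections import Counter

# Matches the "Pid: 7912, comm: kvm" part of a kernel warning header.
PID_RE = re.compile(r"Pid: (\d+), comm: (\S+)")


def count_warning_pids(lines):
    """Count how often each (pid, comm) pair appears in warning headers."""
    counts = Counter()
    for line in lines:
        match = PID_RE.search(line)
        if match:
            counts[(int(match.group(1)), match.group(2))] += 1
    return counts


# Sample lines copied from the traces in this report.
sample_lines = [
    "[29955.815312] Pid: 7912, comm: kvm Tainted: G W 3.2.0-25-generic #40-Ubuntu",
    "[29955.815314] Call Trace:",
    "Jun 10 21:04:47 host1 kernel: [ 6518.495420] Pid: 3634, comm: kvm Tainted: G W 3.2.0-24-generic #39-Ubuntu",
    "[29956.000000] Pid: 7912, comm: kvm Tainted: G W 3.2.0-25-generic #40-Ubuntu",
]

for (pid, comm), hits in count_warning_pids(sample_lines).most_common():
    print(pid, comm, hits)
```

On a real host the same function can be fed an open file object for the kernel log, and the most frequent pid can then be matched against the running kvm processes to find the guest.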

Revision history for this message
penalvch (penalvch) wrote :

Bane Ivosev, did you have any collateral issues with using the WORKAROUND, or is there a net drv you would prefer over e1000 for your FreeBSD guests?

description: updated
Changed in linux (Ubuntu):
importance: Medium → Low
status: Triaged → Incomplete
Revision history for this message
penalvch (penalvch) wrote :

Bane Ivosev, this bug report is being closed due to your personal e-mail comments:
>"no we haven't any issues. with virtio you have to disable tso or use e1000 as a driver. both work without any single problem. so with virtio just put in rc.conf:
ifconfig_vtnet0="inet XXX.XXX.XXX.XXX/XX -tso""

regarding this being fixed with a configuration change. For future reference you can manage the status of your own bugs by clicking on the current status in the yellow line and then choosing a new status in the revealed drop down box. You can learn more about bug statuses at https://wiki.ubuntu.com/Bugs/Status. Thank you again for taking the time to report this bug and helping to make Ubuntu better. Please submit any future bugs you may find.

Changed in linux (Ubuntu):
status: Incomplete → Invalid
Revision history for this message
Bane Ivosev (bane-ivosev) wrote :

Working workaround, thoroughly tested, without any problems:

variant 1:
use e1000 as a kvm network driver

variant 2:
use the virtio driver and disable TSO for the network card in the guest, for example in rc.conf:
ifconfig_vtnet0="inet 192.168.X.X/24 -tso"

both variants work without problems.
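
Editor's sketch for variant 1: the thread only says "use e1000 as a kvm network driver" and does not say how the guests are defined. Assuming they are managed through libvirt, the NIC model lives in the <model type='...'/> element of each <interface> in the domain XML. The Python snippet below rewrites that attribute in an XML string; the function name set_nic_model and the bridge name br0 are hypothetical, and the result would still have to be applied to the guest by whatever tooling manages it.

```python
import xml.etree.ElementTree as ET


def set_nic_model(domain_xml, new_model="e1000"):
    """Return domain XML with every <interface> using the given NIC model."""
    root = ET.fromstring(domain_xml)
    for iface in root.iter("interface"):
        model = iface.find("model")
        if model is None:
            # No explicit model element yet: add one.
            model = ET.SubElement(iface, "model")
        model.set("type", new_model)
    return ET.tostring(root, encoding="unicode")


# Hypothetical, heavily trimmed domain definition for illustration only.
sample_xml = (
    "<domain type='kvm'><devices>"
    "<interface type='bridge'>"
    "<source bridge='br0'/>"
    "<model type='virtio'/>"
    "</interface>"
    "</devices></domain>"
)

print(set_nic_model(sample_xml))
```

Switching the model changes the device the guest sees, so the interface name inside a FreeBSD guest would be expected to change as well (vtnet0 is the virtio name used in variant 2), and rc.conf would need to match.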

Revision history for this message
Thiago Martins (martinx) wrote :

Hi!

I'm facing this problem, it is very easy to reproduce.

1- Host: Ubuntu 12.04.3 (Linux 3.8 or 3.11);
2- KVM (1.5.0 from Ubuntu Cloud Archive, new KVM for LTS);
3- OpenVSwitch 2.0.0 compiled for Ubuntu LTS by me (using `dpkg-buildpackage` on the host itself);
4- Guest: PFSense 2.1 with VirtIO drivers;

Every access to the PFSense web GUI, every single click, triggers the following error in the host's kern.log:

http://paste.ubuntu.com/6800489/

Also, this very same problem happens with Ubuntu 14.04 with Linux 3.13 / KVM 1.7.0.

NOTE: You'll need to manually enable VirtIO for your PFSense; follow the instructions here: https://doc.pfsense.org/index.php/VirtIO_Driver_Support

BTW, all my KVM Guests running behind this bnx2 NIC, have poor network performance. For example:

* From client-x to host: iperf shows ~900Mbits/s
* From client-x to guest-1 (ubuntu): iperf shows ~150Mbits/s
* From client-x to guest-2 (pfsense - e1000): iperf shows ~200Mbits/s
* From client-x to guest-2 (pfsense - virtio): iperf shows ~120Mbits/s

So, I'm starting to think that the bnx2 driver for Linux is crap.

What do you guys think?!

Tks!
Thiago

Revision history for this message
Thiago Martins (martinx) wrote :

Guys,

Please, ignore the following content from my previous post:

---
BTW, all my KVM Guests running behind this bnx2 NIC, have poor network performance. For example:

* From client-x to host: iperf shows ~900Mbits/s
* From client-x to guest-1 (ubuntu): iperf shows ~150Mbits/s
* From client-x to guest-2 (pfsense - e1000): iperf shows ~200Mbits/s
* From client-x to guest-2 (pfsense - virtio): iperf shows ~120Mbits/s

So, I'm starting to think that the bnx2 driver for Linux is crap.

What do you guys think?!
---

Simply because it is not related to the topic and I'm doing more tests. The Ubuntu guest is okay now; only the pfsense guest still has poor network performance and triggers the "bnx2" error in kern.log on the host (every single click on pfsense's web GUI triggers this error on the host!). Sorry about the buzz...

Best,
Thiago

Revision history for this message
penalvch (penalvch) wrote :

Thiago Martins, given this bug report is closed, it wouldn't be addressing any problem you may have. However, if you are having a problem in Ubuntu, feel free to file a new report about it via a terminal:
ubuntu-bug linux

Please feel free to subscribe me to it.

Thank you for your understanding.

Helpful bug reporting tips:
https://wiki.ubuntu.com/ReportingBugs
