Panic on 3.13.0-24 with bnx2, iptables and MASQUERADE

Bug #1313591 reported by Endre Karlson
This bug affects 4 people
Affects          Status    Importance  Assigned to  Milestone
linux (Ubuntu)   Expired   High        Unassigned

Bug Description

I get a kernel panic when I use MASQUERADE on a subnet towards a NIC that uses the bnx2 driver.

ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: linux-image-3.13.0-24-generic 3.13.0-24.46
ProcVersionSignature: Ubuntu 3.13.0-24.46-generic 3.13.9
Uname: Linux 3.13.0-24-generic x86_64
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Apr 28 11:15 seq
 crw-rw---- 1 root audio 116, 33 Apr 28 11:15 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.14.1-0ubuntu3
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory: 'iw'
Date: Mon Apr 28 11:27:29 2014
HibernationDevice: RESUME=/dev/mapper/vg_root-lv_swap
InstallationDate: Installed on 2014-03-28 (30 days ago)
InstallationMedia: Ubuntu-Server 14.04 LTS "Trusty Tahr" - Alpha amd64+mac (20140222)
MachineType: HP ProLiant DL380 G7
PciMultimedia:

ProcFB: 0 radeondrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.13.0-24-generic root=/dev/mapper/vg_root-lv_root ro biosdevname=0
RelatedPackageVersions:
 linux-restricted-modules-3.13.0-24-generic N/A
 linux-backports-modules-3.13.0-24-generic N/A
 linux-firmware 1.127
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 05/05/2011
dmi.bios.vendor: HP
dmi.bios.version: P67
dmi.chassis.type: 23
dmi.chassis.vendor: HP
dmi.modalias: dmi:bvnHP:bvrP67:bd05/05/2011:svnHP:pnProLiantDL380G7:pvr:cvnHP:ct23:cvr:
dmi.product.name: ProLiant DL380 G7
dmi.sys.vendor: HP

Brad Figg (brad-figg) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Endre Karlson (endre-karlson) wrote :

Note that this appears to be fixed in 3.14.1. I tested it and it works OK.

Changed in linux (Ubuntu):
importance: Undecided → High
tags: added: performing-bisect
Joseph Salisbury (jsalisbury) wrote :

@Endre,

Thanks for taking the time to report this bug. I can assist you with a reverse bisect to identify the commit that fixes this in 3.14.1.

The current Trusty kernel you are running is based on upstream 3.13.9. However, the 3.13.11 kernel is now also available. Would it be possible for you to test 3.13.11 to see if this bug is already fixed there? It can be downloaded from:

http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.13.11-trusty/

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
jason bishop (jason-bishop) wrote :

I'm also experiencing this crash. I would be glad to provide any info.

I tested with your kernel without seeing any obvious difference in behavior. The machine is an old Dell 1950 with an onboard bnx2 and a bnx2 add-in PCIe card. I'm not sure, but I think it's exploding when a packet comes from an external host to a tenant VM via GRE tunnels. This would be received on eth3 and sent out on eth1. I've included ethtool output below.

openstack-neutron:/root# uname -a
Linux openstack-neutron.stanford.edu 3.13.11-031311-generic #201404222035 SMP Wed Apr 23 00:36:02 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

openstack-neutron login: [ 236.982433] ------------[ cut here ]------------
[ 236.986375] kernel BUG at /home/apw/COD/linux/net/core/skbuff.c:2903!

[ 236.996021] invalid opcode: 0000 [#1] SMP
[ 236.996021] Modules linked in: xt_nat xt_conntrack ip6table_filter ip6_tables iptable_filter xt_REDIRECT xt__
[ 236.996021] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.13.11-031311-generic #201404222035
[ 236.996021] Hardware name: Dell Inc. PowerEdge 1950/0TT740, BIOS 2.5.0 09/12/2008
[ 236.996021] task: ffff8802341b97f0 ti: ffff8802341b4000 task.ti: ffff8802341b4000
[ 236.996021] RIP: 0010:[<ffffffff8162de14>] [<ffffffff8162de14>] skb_segment+0x8a4/0x8c0
[ 236.996021] RSP: 0018:ffff88023fc432e8 EFLAGS: 00010202
[ 236.996021] RAX: 0000000000000000 RBX: ffff880232dcba00 RCX: 0000000000000050
[ 236.996021] RDX: ffff88022f88c4f0 RSI: ffff88022f88c400 RDI: ffff880232dcab00
[ 236.996021] RBP: ffff88023fc433b8 R08: 0000000000000042 R09: 0000000000000050
[ 236.996021] R10: 00000000000005b8 R11: 0000000000000000 R12: ffff88022f88c8f0
[ 236.996021] R13: 0000000000000000 R14: ffff880232dcb200 R15: ffff880232dcab00
[ 236.996021] FS: 0000000000000000(0000) GS:ffff88023fc40000(0000) knlGS:0000000000000000
[ 236.996021] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 236.996021] CR2: 0000000001c7c170 CR3: 0000000232eec000 CR4: 00000000000007e0
[ 236.996021] Stack:
[ 236.996021] 0000000000000000 0000000000000000 0000000000000000 ffff88022f88c894
[ 236.996021] ffff88022f88c400 000000000000000e 00000000000005b8 0000000100000000
[ 236.996021] 0000000000000000 ffffffffffffffbe ffffffffffffffd6 000000000000006c
[ 236.996021] Call Trace:
[ 236.996021] <IRQ>
[ 236.996021] [<ffffffff8169b6da>] tcp_gso_segment.part.7+0x11a/0x3c0
[ 236.996021] [<ffffffff8169b9b1>] tcp_gso_segment+0x31/0x60
[ 236.996021] [<ffffffff816ab5c5>] inet_gso_segment+0x135/0x370
[ 236.996021] [<ffffffff8163c5fe>] skb_mac_gso_segment+0xae/0x180
[ 236.996021] [<ffffffffa0085930>] gre_gso_segment+0x130/0x370 [gre]
[ 236.996021] [<ffffffff816ab5c5>] inet_gso_segment+0x135/0x370
[ 236.996021] [<ffffffff8163c5fe>] skb_mac_gso_segment+0xae/0x180
[ 236.996021] [<ffffffff8163c72e>] __skb_gso_segment+0x5e/0xc0
[ 236.996021] [<ffffffff8163c919>] dev_hard_start_xmit+0x189/0x5a0
[ 236.996021] [<ffffffff8165bfce>] sch_direct_xmit+0xfe/0x1d0
[ 236.996021] [<ffffffff8163cea8>] __dev_queue_xmit+0x178/0x4b0
[ 236.996021] [<ffffffff81677080>] ? __ip_append_data.isra.40+0x9d0/0x9d0
[ 236.996021] [<ffffffff8163d200>] dev_queue_xmit+0x10/0x20
[ 236.996021] [<ffffffff81677...
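To make the nesting in a trace like the one above easier to see, the frames can be reduced to function names. The following Python sketch is an illustration only: the regular expression assumes the oops layout shown here ("[<address>] function+0xoffset/0xsize [module]"), and parse_frames is a hypothetical helper name. Frames prefixed with "?" are marked as unreliable, since the kernel prints that marker for addresses it found on the stack but could not verify as part of the call chain.

```python
import re

# Matches frames such as "[<ffffffffa0085930>] gre_gso_segment+0x130/0x370 [gre]".
# A RIP line has the same shape and is matched too; truncated frames are skipped.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<guess>\?\s+)?(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(text):
    """Return (function, module or None, reliable) for each frame, innermost first."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append(
                (match.group("func"), match.group("module"), match.group("guess") is None)
            )
    return frames


# A few frames copied from the trace in this report.
SAMPLE = """\
[ 236.996021] [<ffffffff8169b9b1>] tcp_gso_segment+0x31/0x60
[ 236.996021] [<ffffffff816ab5c5>] inet_gso_segment+0x135/0x370
[ 236.996021] [<ffffffffa0085930>] gre_gso_segment+0x130/0x370 [gre]
[ 236.996021] [<ffffffff81677080>] ? __ip_append_data.isra.40+0x9d0/0x9d0
"""

if __name__ == "__main__":
    for func, module, reliable in parse_frames(SAMPLE):
        marker = "" if reliable else " (unverified)"
        origin = module or "core kernel"
        print(f"{func} [{origin}]{marker}")
```

Read innermost first, the reliable frames show TCP segmentation being called from inside the GRE GSO handler, which is consistent with the suspicion that the crash involves GRE-tunnelled traffic. It does not by itself identify the faulty commit.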


jason bishop (jason-bishop) wrote :

adding dump as attachment

jason bishop (jason-bishop) wrote :

One more bit of info: my crash goes away after running these two commands:

ethtool -K eth3 gro off
ethtool -K eth3 gso off

openstack-neutron:/root# ethtool -i eth3
driver: bnx2
version: 2.2.4
firmware-version: 6.4.5 bc 5.2.3
bus-info: 0000:0e:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no
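
The workaround above can be scripted for every interface bound to the bnx2 driver. The following is a minimal Python sketch, not a tested fix: it assumes the usual sysfs layout in which /sys/class/net/<iface>/device/driver is a symlink whose target ends in the driver name, and that ethtool is installed. The function names (interfaces_using_driver, offload_off_commands, apply_commands) are hypothetical helpers introduced here for illustration. By default it only prints the commands it would run.

```python
import os
import subprocess


def interfaces_using_driver(driver, sysfs_net="/sys/class/net"):
    """Return the names of interfaces bound to the given kernel driver."""
    if not os.path.isdir(sysfs_net):
        return []
    found = []
    for iface in sorted(os.listdir(sysfs_net)):
        link = os.path.join(sysfs_net, iface, "device", "driver")
        # Assumption: 'driver' is a symlink whose target ends in the driver name.
        if os.path.islink(link) and os.path.basename(os.path.realpath(link)) == driver:
            found.append(iface)
    return found


def offload_off_commands(ifaces, features=("gro", "gso")):
    """Build one 'ethtool -K <iface> <feature> off' command per interface and feature."""
    return [["ethtool", "-K", iface, feature, "off"]
            for iface in ifaces for feature in features]


def apply_commands(commands, dry_run=True):
    """Print the commands (dry run) or execute them; executing requires root."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run: show what would be executed for every bnx2 interface.
    apply_commands(offload_off_commands(interfaces_using_driver("bnx2")))
```

Settings changed with ethtool -K do not persist across reboots, so the commands would need to be reapplied at boot for as long as the workaround is needed.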

Endre Karlson (endre-karlson) wrote :

As far as I can tell, I've tested with the kernel as requested and the panic happens there as well. I also tested using VXLAN, and the same or similar panics seem to happen there too.

Joseph Salisbury (jsalisbury) wrote :

Can you also test the 3.13.11.3 kernel? If it is not fixed there, we can perform a reverse bisect to find the fix in 3.14.

http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.13.11.3-trusty/

Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired