ixgbe: Kernel Oops when attempting to disable spoofchk in a non-existing VF

Bug #1815501 reported by Heitor R. Alves de Siqueira on 2019-02-11
This bug affects 1 person
Affects: linux (Ubuntu); Importance: Undecided; Assigned to: Unassigned
Affects: Trusty; Importance: Medium; Assigned to: Heitor R. Alves de Siqueira

Bug Description

[Impact]
The Trusty 3.13 kernel oopses because the ixgbe driver does not check that a VF exists before using it.

[Description]
In the current Trusty kernel, asking the ixgbe driver to enable or disable spoofchk for a nonexistent VF causes a kernel oops. ixgbe_ndo_set_vf_spoofchk() is missing a check that the VF exists before it dereferences the VF. An upstream commit fixes this issue (600a507ddcb99 "ixgbe: check for vfs outside of sriov_num_vfs before dereference"), but it needs to be cherry-picked into 3.13 for Trusty.

Upstream commit: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=600a507ddcb99

$ git describe --contains 600a507ddcb99
Ubuntu-lts-3.19.0-7.7_14.04.1~1051^2~26^2

$ rmadison linux-generic
=> linux-generic | 3.13.0.24.28 | trusty | amd64, ...
=> linux-generic | 3.13.0.165.175 | trusty-security | amd64, ...
=> linux-generic | 3.13.0.165.175 | trusty-updates | amd64, ...
   linux-generic | 4.4.0.21.22 | xenial | amd64, ...
   linux-generic | 4.15.0.20.23 | bionic | amd64, ...
   linux-generic | 4.18.0.10.11 | cosmic | amd64, ...
   linux-generic | 4.19.0.12.13 | disco | amd64, ...

[Fix]
The fix is to check if the requested VF exists before dereferencing it in the driver. Upstream commit 600a507ddcb99 introduced this check, and it's a clean cherry pick into the latest Trusty kernel.
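
As an illustration of the kind of check described above, here is a minimal userspace sketch. It is not the upstream patch: the struct layouts and the names `vf_state`, `adapter_state`, and `set_vf_spoofchk` are hypothetical stand-ins, and the real driver code may differ in detail.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver's per-VF and adapter state. */
struct vf_state {
    bool spoofchk_enabled;
};

struct adapter_state {
    unsigned int num_vfs;     /* number of VFs currently enabled */
    struct vf_state *vfinfo;  /* may be NULL when no VFs are enabled */
};

/* Reject any VF index outside [0, num_vfs) before touching vfinfo. */
int set_vf_spoofchk(struct adapter_state *adapter, int vf, bool setting)
{
    if (vf < 0 || (unsigned int)vf >= adapter->num_vfs)
        return -EINVAL;

    adapter->vfinfo[vf].spoofchk_enabled = setting;
    return 0;
}
```

With a check like this in place, a request for VF -1 on an adapter with no VFs would be expected to fail with an error instead of writing through an invalid pointer.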

[Test Case]
1) Deploy a Trusty system with an ixgbe adapter and the latest kernel from -updates:
# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 14.04.5 LTS
Release: 14.04
Codename: trusty
# uname -r
3.13.0-165-generic
# lspci -v -s 04:00.0
04:00.0 Ethernet controller: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 (rev 01)
        Subsystem: Hewlett-Packard Company 561FLR-T 2-port 10Gb Ethernet Adapter
        Flags: bus master, fast devsel, latency 0, IRQ 16
        Memory at 92e00000 (32-bit, prefetchable) [size=2M]
        Memory at 93004000 (32-bit, prefetchable) [size=16K]
        [virtual] Expansion ROM at 93080000 [disabled] [size=512K]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Capabilities: [70] MSI-X: Enable+ Count=64 Masked-
        Capabilities: [a0] Express Endpoint, MSI 00
        Capabilities: [e0] Vital Product Data
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
        Capabilities: [160] Single Root I/O Virtualization (SR-IOV)
        Capabilities: [1d0] Access Control Services
        Kernel driver in use: ixgbe

2) Attempt to disable spoofchk with VF -1:
# ip link set dev eth4 vf -1 spoofchk off
Killed
# dmesg
[ 241.066440] BUG: unable to handle kernel paging request at fffffffffffffffa
[ 241.066880] IP: [<ffffffffa014775c>] ixgbe_ndo_set_vf_spoofchk+0x3c/0xc0 [ixgbe]
[ 241.067331] PGD 2c13067 PUD 2c15067 PMD 0
[ 241.067591] Oops: 0002 [#1] SMP
[ 241.067793] Modules linked in: ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi dm_crypt gpio_ich x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm hpilo lpc_ich ioatdma ipmi_si shpchp acpi_power_meter mac_hid nls_iso8859_1 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd ixgbe tg3 dca ptp pps_core hpsa nvme mdio wmi
[ 241.070462] CPU: 43 PID: 2214 Comm: ip Not tainted 3.13.0-165-generic #215-Ubuntu
[ 241.070908] Hardware name: HP ProLiant DL360 Gen9, BIOS P89 05/06/2015
[ 241.071302] task: ffff880035c4c800 ti: ffff8810275c8000 task.ti: ffff8810275c8000
[ 241.071751] RIP: 0010:[<ffffffffa014775c>] [<ffffffffa014775c>] ixgbe_ndo_set_vf_spoofchk+0x3c/0xc0 [ixgbe]
[ 241.072349] RSP: 0018:ffff8810275c9858 EFLAGS: 00010283
[ 241.072663] RAX: 0000000000000000 RBX: ffff881022360000 RCX: 00000000ffffffff
[ 241.073090] RDX: 0000000000000000 RSI: 00000000000081fc RDI: ffff881022360000
[ 241.073522] RBP: ffff8810275c9858 R08: fffffffffffffffb R09: ffffffffffffffa8
[ 241.073953] R10: 00000000ffffffa1 R11: 0000000000000246 R12: ffff8810275c9950
[ 241.074381] R13: 0000000000000000 R14: ffffffffa01511c0 R15: 00000000ffffffea
[ 241.074814] FS: 00007f3188ce6740(0000) GS:ffff88203f3e0000(0000) knlGS:0000000000000000
[ 241.075299] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 241.075642] CR2: fffffffffffffffa CR3: 000000202662a000 CR4: 0000000000160770
[ 241.076365] Stack:
[ 241.076470] ffff8810275c98f0 ffffffff8164f6fb ffff8810275c9888 ffff882000000010
[ 241.076938] ffffffffa01511c0 ffff8820268b0c24 0000000000000000 0000000000000000
[ 241.077403] 0000000000000000 0000000000000000 ffff8820268b0c28 0000000000000000
[ 241.077865] Call Trace:
[ 241.078037] [<ffffffff8164f6fb>] do_setlink+0x87b/0x9a0
[ 241.078352] [<ffffffff8139c4c6>] ? nla_parse+0xb6/0x120
[ 241.078669] [<ffffffff8164fd1a>] rtnl_newlink+0x3ba/0x620
[ 241.078994] [<ffffffff81161923>] ? __alloc_pages_nodemask+0x1a3/0xb90
[ 241.079393] [<ffffffff812e5f7e>] ? security_capable+0x1e/0x20
[ 241.079733] [<ffffffff81077e29>] ? ns_capable+0x29/0x50
[ 241.080027] [<ffffffff8164c718>] rtnetlink_rcv_msg+0x98/0x250
[ 241.080381] [<ffffffff811afc88>] ? __kmalloc_node_track_caller+0x58/0x2b0
[ 241.080802] [<ffffffff8162c98e>] ? __alloc_skb+0x7e/0x2b0
[ 241.081121] [<ffffffff8164c680>] ? rtnetlink_rcv+0x30/0x30
[ 241.081464] [<ffffffff8166b0ab>] netlink_rcv_skb+0xab/0xc0
[ 241.081792] [<ffffffff8164c678>] rtnetlink_rcv+0x28/0x30
[ 241.082118] [<ffffffff8166a7a0>] netlink_unicast+0xe0/0x1b0
[ 241.082455] [<ffffffff8166ab7e>] netlink_sendmsg+0x30e/0x680
[ 241.082801] [<ffffffff81623251>] sock_sendmsg+0x91/0xc0
[ 241.099175] [<ffffffff811bd526>] ? __mem_cgroup_commit_charge+0x156/0x3d0
[ 241.115633] [<ffffffff81622f2e>] ? move_addr_to_kernel.part.14+0x1e/0x60
[ 241.132150] [<ffffffff81623d81>] ? move_addr_to_kernel+0x21/0x30
[ 241.148222] [<ffffffff81623659>] ___sys_sendmsg+0x389/0x3a0
[ 241.163907] [<ffffffff81621ecf>] ? sock_destroy_inode+0x2f/0x40
[ 241.179322] [<ffffffff817487d4>] ? __do_page_fault+0x214/0x570
[ 241.194220] [<ffffffff811dffdd>] ? dput+0xad/0x190
[ 241.209158] [<ffffffff811e94e4>] ? mntput+0x24/0x40
[ 241.223576] [<ffffffff811cab81>] ? __fput+0x181/0x260
[ 241.237701] [<ffffffff81624462>] __sys_sendmsg+0x42/0x80
[ 241.251846] [<ffffffff816244b2>] SyS_sendmsg+0x12/0x20
[ 241.265579] [<ffffffff8174d3bc>] system_call_fastpath+0x26/0x2b
[ 241.278978] Code: 8d 0c 06 83 e1 07 29 c1 48 63 c6 c1 fe 03 4c 8d 04 80 8d 34 b5 00 82 00 00 4e 8d 0c 40 48 8b 87 90 85 00 00 48 63 f6 49 c1 e1 03 <42> 88 54 08 52 48 89 f0 48 03 87 80 16 00 00 8b 00 41 ba 01 00
[ 241.306211] RIP [<ffffffffa014775c>] ixgbe_ndo_set_vf_spoofchk+0x3c/0xc0 [ixgbe]
[ 241.319310] RSP <ffff8810275c9858>
[ 241.332404] CR2: fffffffffffffffa
[ 241.345011] ---[ end trace a45b72690a7e13be ]---
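
The faulting address in CR2 can be related to the invalid VF index with a little arithmetic. The sketch below is an interpretation of the register dump and the "Code:" bytes, not something taken from the driver source: it assumes that RAX (0) is the base of the per-VF array, that R9 (ffffffffffffffa8, i.e. -88) is the VF index -1 scaled by an 88-byte entry size, and that 0x52 is the offset of the field written by the faulting store. The names used are hypothetical.

```c
#include <stdint.h>

/*
 * Values inferred from the oops, not from the driver source;
 * treat them as assumptions.
 */
#define VF_ENTRY_STRIDE 88u   /* R9 = ffffffffffffffa8 = -88 for vf = -1 */
#define FIELD_OFFSET    0x52u /* displacement used by the faulting store */

/* base + vf * stride + offset, with 64-bit wraparound as on x86-64 */
uint64_t faulting_address(uint64_t vfinfo_base, int vf)
{
    return vfinfo_base + (uint64_t)(int64_t)vf * VF_ENTRY_STRIDE + FIELD_OFFSET;
}
```

Under these assumptions, `faulting_address(0, -1)` evaluates to 0xfffffffffffffffa, which matches the CR2 value reported in the trace.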

[Regression Potential]
The regression potential is low, since the fix is a simple check that the VF exists before operating on it. Other functions of the ixgbe driver already implement this check; only the spoofchk function is missing it. The patch was also tested on an affected system and confirmed to resolve the kernel oops without further problems.

Tags: sts
Changed in linux (Ubuntu):
status: New → Incomplete
status: Incomplete → New
status: New → Fix Released
assignee: Heitor R. Alves de Siqueira (halves) → nobody
Eric Desrochers (slashd) on 2019-02-11
Changed in linux (Ubuntu Trusty):
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Heitor R. Alves de Siqueira (halves)
Eric Desrochers (slashd) on 2019-02-11
tags: added: sts