ixgbe: Kernel Oops when attempting to disable spoofchk in a non-existing VF

Bug #1815501 reported by Heitor Alves de Siqueira
Affects         Status        Importance  Assigned to               Milestone
linux (Ubuntu)  Fix Released  Undecided   Unassigned
Trusty          Fix Released  Medium      Heitor Alves de Siqueira

Bug Description

[Impact]
The Trusty 3.13 kernel oopses because the ixgbe driver does not check whether the requested VF exists.

[Description]
In the current Trusty kernel, asking the ixgbe driver to enable or disable spoofchk for a non-existing VF causes a kernel oops. This happens because ixgbe_ndo_set_vf_spoofchk() does not check that the VF exists before dereferencing it. Upstream commit 600a507ddcb99 ("ixgbe: check for vfs outside of sriov_num_vfs before dereference") fixes the issue, but it needs to be cherry-picked into 3.13 for Trusty.

Upstream commit: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=600a507ddcb99

$ git describe --contains 600a507ddcb99
Ubuntu-lts-3.19.0-7.7_14.04.1~1051^2~26^2

$ rmadison linux-generic
=> linux-generic | 3.13.0.24.28 | trusty | amd64, ...
=> linux-generic | 3.13.0.165.175 | trusty-security | amd64, ...
=> linux-generic | 3.13.0.165.175 | trusty-updates | amd64, ...
   linux-generic | 4.4.0.21.22 | xenial | amd64, ...
   linux-generic | 4.15.0.20.23 | bionic | amd64, ...
   linux-generic | 4.18.0.10.11 | cosmic | amd64, ...
   linux-generic | 4.19.0.12.13 | disco | amd64, ...

[Fix]
The fix is to check that the requested VF exists before dereferencing it in the driver. Upstream commit 600a507ddcb99 introduced this check, and it cherry-picks cleanly onto the latest Trusty kernel.
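
The shape of that check can be sketched in standalone C. This is an illustration only, not the upstream patch: struct mock_vf_data, struct mock_adapter and mock_set_vf_spoofchk() are hypothetical stand-ins for the driver's real types and for ixgbe_ndo_set_vf_spoofchk(), and the explicit test for negative indexes is an assumption of this sketch. The -EINVAL return is chosen to match the "RTNETLINK answers: Invalid argument" reply seen with the fixed kernel.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's per-VF state. */
struct mock_vf_data {
    bool spoofchk_enabled;
};

/* Hypothetical stand-in for the adapter's private data. */
struct mock_adapter {
    unsigned int num_vfs;         /* number of VFs currently enabled */
    struct mock_vf_data *vfinfo;  /* NULL when no VFs are enabled */
};

/* Returns 0 on success, -EINVAL when vf does not name an existing VF. */
int mock_set_vf_spoofchk(struct mock_adapter *adapter, int vf, bool setting)
{
    /* Reject negative indexes and indexes past the enabled VFs
     * before vfinfo is touched at all. */
    if (vf < 0 || (unsigned int)vf >= adapter->num_vfs)
        return -EINVAL;

    adapter->vfinfo[vf].spoofchk_enabled = setting;
    return 0;
}
```

With an adapter that has no VFs (num_vfs of 0 and a NULL vfinfo), calling mock_set_vf_spoofchk() with vf set to -1 returns -EINVAL instead of writing through an invalid pointer.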

[Test Case]
1) Deploy a Trusty system with an ixgbe adapter and the latest kernel from -updates:
# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 14.04.5 LTS
Release: 14.04
Codename: trusty
# uname -r
3.13.0-165-generic
# lspci -v -s 04:00.0
04:00.0 Ethernet controller: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 (rev 01)
        Subsystem: Hewlett-Packard Company 561FLR-T 2-port 10Gb Ethernet Adapter
        Flags: bus master, fast devsel, latency 0, IRQ 16
        Memory at 92e00000 (32-bit, prefetchable) [size=2M]
        Memory at 93004000 (32-bit, prefetchable) [size=16K]
        [virtual] Expansion ROM at 93080000 [disabled] [size=512K]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Capabilities: [70] MSI-X: Enable+ Count=64 Masked-
        Capabilities: [a0] Express Endpoint, MSI 00
        Capabilities: [e0] Vital Product Data
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
        Capabilities: [160] Single Root I/O Virtualization (SR-IOV)
        Capabilities: [1d0] Access Control Services
        Kernel driver in use: ixgbe

2) Attempt to disable spoofchk with VF -1:
# ip link set dev eth4 vf -1 spoofchk off
Killed
# dmesg
[ 241.066440] BUG: unable to handle kernel paging request at fffffffffffffffa
[ 241.066880] IP: [<ffffffffa014775c>] ixgbe_ndo_set_vf_spoofchk+0x3c/0xc0 [ixgbe]
[ 241.067331] PGD 2c13067 PUD 2c15067 PMD 0
[ 241.067591] Oops: 0002 [#1] SMP
[ 241.067793] Modules linked in: ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi dm_crypt gpio_ich x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm hpilo lpc_ich ioatdma ipmi_si shpchp acpi_power_meter mac_hid nls_iso8859_1 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd ixgbe tg3 dca ptp pps_core hpsa nvme mdio wmi
[ 241.070462] CPU: 43 PID: 2214 Comm: ip Not tainted 3.13.0-165-generic #215-Ubuntu
[ 241.070908] Hardware name: HP ProLiant DL360 Gen9, BIOS P89 05/06/2015
[ 241.071302] task: ffff880035c4c800 ti: ffff8810275c8000 task.ti: ffff8810275c8000
[ 241.071751] RIP: 0010:[<ffffffffa014775c>] [<ffffffffa014775c>] ixgbe_ndo_set_vf_spoofchk+0x3c/0xc0 [ixgbe]
[ 241.072349] RSP: 0018:ffff8810275c9858 EFLAGS: 00010283
[ 241.072663] RAX: 0000000000000000 RBX: ffff881022360000 RCX: 00000000ffffffff
[ 241.073090] RDX: 0000000000000000 RSI: 00000000000081fc RDI: ffff881022360000
[ 241.073522] RBP: ffff8810275c9858 R08: fffffffffffffffb R09: ffffffffffffffa8
[ 241.073953] R10: 00000000ffffffa1 R11: 0000000000000246 R12: ffff8810275c9950
[ 241.074381] R13: 0000000000000000 R14: ffffffffa01511c0 R15: 00000000ffffffea
[ 241.074814] FS: 00007f3188ce6740(0000) GS:ffff88203f3e0000(0000) knlGS:0000000000000000
[ 241.075299] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 241.075642] CR2: fffffffffffffffa CR3: 000000202662a000 CR4: 0000000000160770
[ 241.076365] Stack:
[ 241.076470] ffff8810275c98f0 ffffffff8164f6fb ffff8810275c9888 ffff882000000010
[ 241.076938] ffffffffa01511c0 ffff8820268b0c24 0000000000000000 0000000000000000
[ 241.077403] 0000000000000000 0000000000000000 ffff8820268b0c28 0000000000000000
[ 241.077865] Call Trace:
[ 241.078037] [<ffffffff8164f6fb>] do_setlink+0x87b/0x9a0
[ 241.078352] [<ffffffff8139c4c6>] ? nla_parse+0xb6/0x120
[ 241.078669] [<ffffffff8164fd1a>] rtnl_newlink+0x3ba/0x620
[ 241.078994] [<ffffffff81161923>] ? __alloc_pages_nodemask+0x1a3/0xb90
[ 241.079393] [<ffffffff812e5f7e>] ? security_capable+0x1e/0x20
[ 241.079733] [<ffffffff81077e29>] ? ns_capable+0x29/0x50
[ 241.080027] [<ffffffff8164c718>] rtnetlink_rcv_msg+0x98/0x250
[ 241.080381] [<ffffffff811afc88>] ? __kmalloc_node_track_caller+0x58/0x2b0
[ 241.080802] [<ffffffff8162c98e>] ? __alloc_skb+0x7e/0x2b0
[ 241.081121] [<ffffffff8164c680>] ? rtnetlink_rcv+0x30/0x30
[ 241.081464] [<ffffffff8166b0ab>] netlink_rcv_skb+0xab/0xc0
[ 241.081792] [<ffffffff8164c678>] rtnetlink_rcv+0x28/0x30
[ 241.082118] [<ffffffff8166a7a0>] netlink_unicast+0xe0/0x1b0
[ 241.082455] [<ffffffff8166ab7e>] netlink_sendmsg+0x30e/0x680
[ 241.082801] [<ffffffff81623251>] sock_sendmsg+0x91/0xc0
[ 241.099175] [<ffffffff811bd526>] ? __mem_cgroup_commit_charge+0x156/0x3d0
[ 241.115633] [<ffffffff81622f2e>] ? move_addr_to_kernel.part.14+0x1e/0x60
[ 241.132150] [<ffffffff81623d81>] ? move_addr_to_kernel+0x21/0x30
[ 241.148222] [<ffffffff81623659>] ___sys_sendmsg+0x389/0x3a0
[ 241.163907] [<ffffffff81621ecf>] ? sock_destroy_inode+0x2f/0x40
[ 241.179322] [<ffffffff817487d4>] ? __do_page_fault+0x214/0x570
[ 241.194220] [<ffffffff811dffdd>] ? dput+0xad/0x190
[ 241.209158] [<ffffffff811e94e4>] ? mntput+0x24/0x40
[ 241.223576] [<ffffffff811cab81>] ? __fput+0x181/0x260
[ 241.237701] [<ffffffff81624462>] __sys_sendmsg+0x42/0x80
[ 241.251846] [<ffffffff816244b2>] SyS_sendmsg+0x12/0x20
[ 241.265579] [<ffffffff8174d3bc>] system_call_fastpath+0x26/0x2b
[ 241.278978] Code: 8d 0c 06 83 e1 07 29 c1 48 63 c6 c1 fe 03 4c 8d 04 80 8d 34 b5 00 82 00 00 4e 8d 0c 40 48 8b 87 90 85 00 00 48 63 f6 49 c1 e1 03 <42> 88 54 08 52 48 89 f0 48 03 87 80 16 00 00 8b 00 41 ba 01 00
[ 241.306211] RIP [<ffffffffa014775c>] ixgbe_ndo_set_vf_spoofchk+0x3c/0xc0 [ixgbe]
[ 241.319310] RSP <ffff8810275c9858>
[ 241.332404] CR2: fffffffffffffffa
[ 241.345011] ---[ end trace a45b72690a7e13be ]---
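
The faulting address in CR2 can be reproduced from the register dump above. The sketch below is one reading of that dump, not something taken from the driver source: it assumes that RAX (0) holds the base of the per-VF array, that the -88 in R09 (ffffffffffffffa8) is the VF index -1 scaled by the element size, and that the 0x52 in the marked store instruction of the Code: line is the displacement of the field being written. The constant and function names are hypothetical.

```c
#include <stdint.h>

/* Values inferred from the oops for this specific kernel build:
 * 88 is the element size implied by R09 (-88 for vf = -1),
 * 82 (0x52) is the displacement in the faulting store instruction. */
#define INFERRED_VF_ELEM_SIZE  88
#define INFERRED_FIELD_OFFSET  82

/* Address the store would target for a given array base and VF index. */
uint64_t inferred_fault_address(uint64_t vfinfo_base, int vf)
{
    int64_t offset = (int64_t)vf * INFERRED_VF_ELEM_SIZE
                     + INFERRED_FIELD_OFFSET;

    /* Unsigned addition wraps, as pointer arithmetic does on x86-64. */
    return vfinfo_base + (uint64_t)offset;
}
```

Under those assumptions, a base of 0 and a VF index of -1 give 0 - 88 + 82 = -6, which is 0xfffffffffffffffa as a 64-bit address and matches CR2.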

[Regression Potential]
The regression potential is low, since the fix is a simple check that the VF exists before any operation is performed on it. Other functions in the ixgbe driver already implement this check; only the spoofchk function is missing it. Nonetheless, the patch was tested on an affected system and confirmed to resolve the kernel oops without further problems.

Heitor Alves de Siqueira (halves) wrote :
Changed in linux (Ubuntu):
status: New → Incomplete
status: Incomplete → New
status: New → Fix Released
assignee: Heitor R. Alves de Siqueira (halves) → nobody
Eric Desrochers (slashd)
Changed in linux (Ubuntu Trusty):
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Heitor R. Alves de Siqueira (halves)
Eric Desrochers (slashd)
tags: added: sts
Heitor Alves de Siqueira (halves) wrote :
Heitor Alves de Siqueira (halves) wrote :
Changed in linux (Ubuntu Trusty):
status: In Progress → Fix Committed
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-trusty' to 'verification-done-trusty'. If the problem still exists, change the tag 'verification-needed-trusty' to 'verification-failed-trusty'.

If verification is not done within 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-trusty
Heitor Alves de Siqueira (halves) wrote :

Tested linux-generic 3.13.0.168.179 from trusty-proposed. Followed the test case from the description and verified that ixgbe is fixed:

# uname -r
3.13.0-168-generic

# apt-cache madison linux-generic
linux-generic | 3.13.0.168.179 | http://archive.ubuntu.com/ubuntu/trusty-proposed/main amd64 Packages

# ip link set dev eth4 vf -1 spoofchk off
RTNETLINK answers: Invalid argument

dmesg is clean after the spoofchk command, and network connectivity continues to operate as expected.

tags: added: verification-done-trusty
removed: verification-needed-trusty
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.13.0-168.218

---------------
linux (3.13.0-168.218) trusty; urgency=medium

  * linux: 3.13.0-168.218 -proposed tracker (LP: #1819663)

  * CVE-2019-9213
    - mm: enforce min addr even if capable() in expand_downwards()

  * CVE-2019-3460
    - Bluetooth: Check L2CAP option sizes returned from l2cap_get_conf_opt

  * CVE-2017-1000410
    - Bluetooth: Prevent stack info leak from the EFS element.

  * ixgbe: Kernel Oops when attempting to disable spoofchk in a non-existing VF
    (LP: #1815501)
    - ixgbe: check for vfs outside of sriov_num_vfs before dereference

  * CVE-2018-19824
    - ALSA: usb-audio: Fix UAF decrement if card has no live interfaces in card.c

  * CVE-2019-3459
    - Bluetooth: Verify that l2cap_get_conf_opt provides large enough buffer

  * CVE-2019-7222
    - KVM: x86: work around leak of uninitialized stack contents (CVE-2019-7222)

  * CVE-2019-6974
    - kvm: fix kvm_ioctl_create_device() reference counting (CVE-2019-6974)

  * CVE-2017-18360
    - USB: serial: io_ti: fix div-by-zero in set_termios

 -- Stefan Bader <email address hidden> Thu, 14 Mar 2019 14:44:53 +0100

Changed in linux (Ubuntu Trusty):
status: Fix Committed → Fix Released