i40e: Setting VF MAC address causes General Protection Fault

Bug #1852432 reported by Heitor Alves de Siqueira
This bug affects 2 people
Affects         Status        Importance  Assigned to               Milestone
linux (Ubuntu)  Fix Released  High        Heitor Alves de Siqueira
  Bionic        Fix Released  High        Heitor Alves de Siqueira
  Disco         Fix Released  High        Heitor Alves de Siqueira
  Eoan          Fix Released  High        Heitor Alves de Siqueira
  Focal         Fix Released  High        Heitor Alves de Siqueira

Bug Description

[Impact]
 * Creating SR-IOV-enabled VMs in OpenStack can sometimes trigger a general protection fault (GPF) and leave the system unusable

[Test Case]
 * Continuously spin up VFs and set their MAC addresses, e.g. with ifconfig
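
A minimal sketch of one round of such a stress test, under stated assumptions: it uses `ip link set ... vf N mac` on the PF in place of ifconfig, the function name `gen_round` and the MAC numbering scheme are hypothetical, and the PF name `ens1f0` in the usage note is borrowed from the verification comments in this report. The function only prints commands, so nothing touches the hardware until the output is piped to a root shell.

```shell
# gen_round PF NUM_VFS
# Print one round of reproduction commands: tear down the VFs, recreate
# them, then set a MAC on every VF right away, while VF resets may still
# be in progress. Hypothetical helper, not taken from the bug report.
gen_round() {
    pf="$1"
    num_vfs="$2"    # assumed to be below 256 so the last MAC octet fits

    # sriov_numvfs must go back to 0 before a new non-zero value is accepted.
    echo "echo 0 > /sys/class/net/$pf/device/sriov_numvfs"
    echo "echo $num_vfs > /sys/class/net/$pf/device/sriov_numvfs"

    i=0
    while [ "$i" -lt "$num_vfs" ]; do
        # 02:... is a locally administered unicast prefix (arbitrary choice).
        printf 'ip link set dev %s vf %d mac 02:00:00:00:00:%02x\n' "$pf" "$i" "$i"
        i=$((i + 1))
    done
}
```

To actually run a round, something like `gen_round ens1f0 64 | sudo sh` could be repeated in a loop; on an affected kernel this may crash the host, so it belongs on a test machine only.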

[Fix]
 * The fix updates the VSI pointer passed down to the i40e_set_vf_mac() function if the adapter is still in reset, preventing the GPF.

[Regression Potential]
 * Regression potential should be low, as we're now updating the VSI using the ID stored in the VF pointer
 * Regressions could arise from issues in VF creation or reset, as that would corrupt the new VSI pointer
 * Patch was validated and tested in a production environment

description: updated
Changed in linux (Ubuntu Eoan):
status: New → Confirmed
Changed in linux (Ubuntu Disco):
status: New → Confirmed
Changed in linux (Ubuntu Bionic):
status: New → Confirmed
Changed in linux (Ubuntu Xenial):
status: New → Confirmed
Changed in linux (Ubuntu Eoan):
importance: Undecided → High
Changed in linux (Ubuntu Disco):
importance: Undecided → High
Changed in linux (Ubuntu Xenial):
importance: Undecided → High
Changed in linux (Ubuntu Bionic):
importance: Undecided → High
Changed in linux (Ubuntu Eoan):
assignee: nobody → Heitor Alves de Siqueira (halves)
Changed in linux (Ubuntu Disco):
assignee: nobody → Heitor Alves de Siqueira (halves)
Changed in linux (Ubuntu Bionic):
assignee: nobody → Heitor Alves de Siqueira (halves)
Changed in linux (Ubuntu Xenial):
assignee: nobody → Heitor Alves de Siqueira (halves)
no longer affects: linux (Ubuntu Xenial)
description: updated
Changed in linux (Ubuntu Bionic):
status: Confirmed → Fix Committed
Changed in linux (Ubuntu Disco):
status: Confirmed → Fix Committed
Changed in linux (Ubuntu Eoan):
status: Confirmed → In Progress
status: In Progress → Fix Committed
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-disco' to 'verification-done-disco'. If the problem still exists, change the tag 'verification-needed-disco' to 'verification-failed-disco'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-disco
tags: added: verification-needed-bionic
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-eoan' to 'verification-done-eoan'. If the problem still exists, change the tag 'verification-needed-eoan' to 'verification-failed-eoan'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-eoan
Nobuto Murata (nobuto) wrote :

The general protection fault was reproducible with the current 5.3 kernel by creating 10 SR-IOV sequentially. After updating to the -proposed kernel (linux-image-5.3.0-25-generic 5.3.0-25.27), no general protection fault occurred with the same operations, so we can consider the fix verified with the eoan-proposed kernel.

tags: added: verification-done-eoan
removed: verification-needed-eoan
Heitor Alves de Siqueira (halves) wrote :

Verified on Bionic with the following test case:
$ echo 64 | sudo tee /sys/class/net/ens1f0/device/sriov_numvfs && virsh attach-interface valuable-bluefish hostdev 0000:08:02.6 --managed --live

On 4.15.0-72.81, I get the following GPF:
[119044.656412] general protection fault: 0000 [#1] SMP PTI
[119044.656455] Modules linked in: i40evf vfio_pci vfio_virqfd vfio_iommu_type1 vfio vhost_net vhost tap ebtable_filter ebtables devlink ip6table_filter ip6_tables kvm_intel binfmt_misc ipt_REJECT nf_reject_ipv4 xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_CHECKSUM xt_comment xt_tcpudp iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter bridge stp llc dummy ixgbevf ipmi_ssif nls_iso8859_1 intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass intel_cstate intel_rapl_perf lpc_ich hpilo ioatdma shpchp ipmi_si ipmi_devintf ipmi_msghandler mac_hid acpi_power_meter sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10
[119044.656929] raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear mgag200 i2c_algo_bit crct10dif_pclmul crc32_pclmul ttm ghash_clmulni_intel ses pcbc enclosure drm_kms_helper aesni_intel syscopyarea aes_x86_64 crypto_simd sysfillrect glue_helper cryptd sysimgblt ixgbe fb_sys_fops i40e tg3 dca drm nvme ptp hpsa pps_core nvme_core mdio scsi_transport_sas wmi [last unloaded: kvm_intel]
[119044.657204] CPU: 11 PID: 10300 Comm: libvirtd Not tainted 4.15.0-72-generic #81-Ubuntu
[119044.657255] Hardware name: HP ProLiant DL360 Gen9, BIOS P89 05/06/2015
[119044.657310] RIP: 0010:i40e_sync_vsi_filters+0x95/0xd00 [i40e]
[119044.657349] RSP: 0018:ffffbcc5479fb718 EFLAGS: 00010202
[119044.657385] RAX: 77b7ed74a738c96d RBX: ffff9aad58382000 RCX: 0000000000000000
[119044.657431] RDX: 0000000000000001 RSI: 00000000fffffe01 RDI: ffff9aad58382000
[119044.657477] RBP: ffffbcc5479fb7b8 R08: 0000000000000000 R09: 0000000000003494
[119044.657523] R10: 0000000000000000 R11: 0000d9714ff8ecd4 R12: ffff9aad58382000
[119044.657569] R13: ffff9aad384662a0 R14: ffff9aad58382a28 R15: ffff9aad67b1643c
[119044.657616] FS: 00007fd7aef27700(0000) GS:ffff9aad7f0c0000(0000) knlGS:0000000000000000
[119044.657668] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[119044.657706] CR2: 00007fd79001c058 CR3: 0000001029f30001 CR4: 00000000001626e0
[119044.657752] Call Trace:
[119044.657778] ? del_timer_sync+0x45/0x50
[119044.657809] ? __next_timer_interrupt+0xe0/0xe0
[119044.657851] i40e_ndo_set_vf_mac+0x109/0x2b0 [i40e]
[119044.657890] do_setlink+0x8a5/0xed0
[119044.657919] ? kmalloc_large_node+0x3b/0x60
[119044.657951] ? security_sock_rcv_skb+0x41/0x60
[119044.657986] rtnl_setlink+0xdc/0x130
[119044.658017] rtnetlink_rcv_msg+0x221/0x2b0
[119044.658049] ? aa_label_sk_perm+0x129/0x140
[119044.658081] ? _cond_resched+0x19/0x40
[119044.658110] ? rtnl_calcit.isra.25+0x110/0x110
[119044.658142] netlink_rcv_skb+0x54/0x130
[119044.658172] rtnetlink_rcv+0x15/0x20
[119044.658198] netlink_unicast+0x19e/0x240
[11904...

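
When repeating this verification, the kernel log can be scanned for the signature shown in the trace above. The helper below is only a sketch: the name `has_vf_mac_gpf` is hypothetical, and it assumes that a "general protection fault" line followed later by an `i40e_ndo_set_vf_mac` frame is a sufficient match for this bug.

```shell
# has_vf_mac_gpf: read kernel log text on stdin and exit 0 if it contains
# a general protection fault followed by an i40e_ndo_set_vf_mac frame,
# exit 1 otherwise. Hypothetical helper, matching only the trace pattern
# quoted in this report.
has_vf_mac_gpf() {
    awk '
        /general protection fault/      { gpf = 1 }
        gpf && /i40e_ndo_set_vf_mac/    { found = 1 }
        END                             { exit !found }
    '
}
```

A possible use after a test run is `sudo dmesg | has_vf_mac_gpf && echo "GPF signature present"`; no output means the pattern was not seen in the current log buffer.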

tags: added: verification-done-bionic
removed: verification-needed-bionic
Heitor Alves de Siqueira (halves) wrote :

Verified on disco with the following test case:

$ uname -r
5.0.0-38-generic
$ echo 64 | sudo tee /sys/class/net/ens1f0/device/sriov_numvfs && virsh attach-interface iov hostdev 0000:08:02.6 --managed --live
64
Interface attached successfully

tags: added: verification-done-disco
removed: verification-needed-disco
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.3.0-26.28

---------------
linux (5.3.0-26.28) eoan; urgency=medium

  * eoan/linux: 5.3.0-26.28 -proposed tracker (LP: #1856807)

  * nvidia-435 is in eoan, linux-restricted-modules only builds against 430,
    ubiquity gives me the self-signed modules experience instead of using the
    Canonical-signed modules (LP: #1856407)
    - Add nvidia-435 dkms build

linux (5.3.0-25.27) eoan; urgency=medium

  * eoan/linux: 5.3.0-25.27 -proposed tracker (LP: #1854762)

  * CVE-2019-14901
    - SAUCE: mwifiex: Fix heap overflow in mmwifiex_process_tdls_action_frame()

  * CVE-2019-14896 // CVE-2019-14897
    - SAUCE: libertas: Fix two buffer overflows at parsing bss descriptor

  * CVE-2019-14895
    - SAUCE: mwifiex: fix possible heap overflow in mwifiex_process_country_ie()

  * [CML] New device id's for CMP-H (LP: #1846335)
    - mmc: sdhci-pci: Add another Id for Intel CML
    - i2c: i801: Add support for Intel Comet Lake PCH-H
    - mtd: spi-nor: intel-spi: Add support for Intel Comet Lake-H SPI serial flash
    - mfd: intel-lpss: Add Intel Comet Lake PCH-H PCI IDs

  * i915: Display flickers (monitor loses signal briefly) during "flickerfree"
    boot, while showing the BIOS logo on a black background (LP: #1836858)
    - [Config] FRAMEBUFFER_CONSOLE_DEFERRED_TAKEOVER=y

  * Please add patch fixing RK818 ID detection (LP: #1853192)
    - SAUCE: mfd: rk808: Fix RK818 ID template

  * Kernel build log filled with "/bin/bash: line 5: warning: command
    substitution: ignored null byte in input" (LP: #1853843)
    - [Debian] Fix warnings when checking for modules signatures

  * Lenovo dock MAC Address pass through doesn't work in Ubuntu (LP: #1827961)
    - r8152: Add macpassthru support for ThinkPad Thunderbolt 3 Dock Gen 2

  * Dell XPS 13 9350/9360 headphone audio hiss (LP: #1654448) // [XPS 13 9360,
    Realtek ALC3246, Black Headphone Out, Front] High noise floor (LP: #1845810)
    - ALSA: hda/realtek: Reduce the Headphone static noise on XPS 9350/9360

  * no HDMI video output since GDM greeter after linux-oem-osp1 version
    5.0.0-1026 (LP: #1852386)
    - drm/i915: Add new CNL PCH ID seen on a CML platform
    - SAUCE: drm/i915: Fix detection for a CMP-V PCH

  * [broadwell-rt286, playback] Since Linux 5.2rc2 audio playback no longer
    works on Dell Venue 11 Pro 7140 (LP: #1846539)
    - [Config] Drop snd-sof-intel-bdw build
    - SAUCE: ASoC: SOF: Intel: Broadwell: clarify mutual exclusion with legacy
      driver

  * [CML-S62] Need enable turbostat patch support for Comet lake- S 6+2
    (LP: #1847451)
    - SAUCE: tools/power turbostat: Add Cometlake support

  * External microphone can't work on some dell machines with the codec alc256
    or alc236 (LP: #1853791)
    - SAUCE: ALSA: hda/realtek - Move some alc256 pintbls to fallback table
    - SAUCE: ALSA: hda/realtek - Move some alc236 pintbls to fallback table

  * Memory leak in net/xfrm/xfrm_state.c - 8 pages per ipsec connection
    (LP: #1853197)
    - xfrm: Fix memleak on xfrm state destroy

  * CVE-2019-18660: patches for Ubuntu (LP: #1853142) // CVE-2019-18660
    - powerpc/64s: support nospectre_v2 cmdline option
    - powerp...

Changed in linux (Ubuntu Eoan):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.0.0-38.41

---------------
linux (5.0.0-38.41) disco; urgency=medium

  * disco/linux: 5.0.0-38.41 -proposed tracker (LP: #1854788)

  * [Regression] Failed to boot disco kernel built from master-next (kernel
    NULL pointer dereference) (LP: #1853981)
    - SAUCE: blk-mq: Fix blk_mq_make_request for mq devices

  * CVE-2019-14901
    - SAUCE: mwifiex: Fix heap overflow in mmwifiex_process_tdls_action_frame()

  * CVE-2019-14896 // CVE-2019-14897
    - SAUCE: libertas: Fix two buffer overflows at parsing bss descriptor

  * CVE-2019-14895
    - SAUCE: mwifiex: fix possible heap overflow in mwifiex_process_country_ie()

  * [CML] New device id's for CMP-H (LP: #1846335)
    - mmc: sdhci-pci: Add another Id for Intel CML
    - i2c: i801: Add support for Intel Comet Lake PCH-H
    - mtd: spi-nor: intel-spi: Add support for Intel Comet Lake-H SPI serial flash
    - mfd: intel-lpss: Add Intel Comet Lake PCH-H PCI IDs

  * Please add patch fixing RK818 ID detection (LP: #1853192)
    - SAUCE: mfd: rk808: Fix RK818 ID template

  * [SRU][B/OEM-B/OEM-OSP1/D] Enable new Elan touchpads which are not in current
    whitelist (LP: #1853246)
    - Input: elan_i2c - export the device id whitelist
    - HID: quirks: Refactor ELAN 400 and 401 handling

  * Lenovo dock MAC Address pass through doesn't work in Ubuntu (LP: #1827961)
    - r8152: Add macpassthru support for ThinkPad Thunderbolt 3 Dock Gen 2

  * [CML-S62] Need enable turbostat patch support for Comet lake- S 6+2
    (LP: #1847451)
    - SAUCE: tools/power turbostat: Add Cometlake support

  * External microphone can't work on some dell machines with the codec alc256
    or alc236 (LP: #1853791)
    - SAUCE: ALSA: hda/realtek - Move some alc256 pintbls to fallback table
    - SAUCE: ALSA: hda/realtek - Move some alc236 pintbls to fallback table

  * Memory leak in net/xfrm/xfrm_state.c - 8 pages per ipsec connection
    (LP: #1853197)
    - xfrm: Fix memleak on xfrm state destroy

  * CVE-2019-18660: patches for Ubuntu (LP: #1853142) // CVE-2019-18660
    - powerpc/64s: support nospectre_v2 cmdline option
    - powerpc/book3s64: Fix link stack flush on context switch
    - KVM: PPC: Book3S HV: Flush link stack on guest exit to host kernel

  * Raydium Touchscreen on ThinkPad L390 does not work (LP: #1849721)
    - HID: i2c-hid: fix no irq after reset on raydium 3118

  * Make Goodix I2C touchpads work (LP: #1853842)
    - HID: i2c-hid: Remove runtime power management
    - HID: i2c-hid: Send power-on command after reset

  * Touchpad doesn't work on Dell Inspiron 7000 2-in-1 (LP: #1851901)
    - Revert "UBUNTU: SAUCE: mfd: intel-lpss: add quirk for Dell XPS 13 7390
      2-in-1"
    - lib: devres: add a helper function for ioremap_uc
    - mfd: intel-lpss: Use devm_ioremap_uc for MMIO

  * CVE-2019-19055
    - nl80211: fix memory leak in nl80211_get_ftm_responder_stats

  * [CML-S62] Need enable intel_rapl patch support for Comet lake- S 6+2
    (LP: #1847454)
    - powercap/intel_rapl: add support for CometLake Mobile
    - powercap/intel_rapl: add support for Cometlake desktop

  * [CML-S62] Need enable intel_pmc_core driver patch for Comet l...

Changed in linux (Ubuntu Disco):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.15.0-74.84

---------------
linux (4.15.0-74.84) bionic; urgency=medium

  * bionic/linux: 4.15.0-74.84 -proposed tracker (LP: #1856749)

  * [Hyper-V] KVP daemon fails to start on first boot of disco VM (LP: #1820063)
    - [Packaging] bind hv_kvp_daemon startup to hv_kvp device

  * Unrevert "arm64: Use firmware to detect CPUs that are not affected by
    Spectre-v2" (LP: #1854207)
    - arm64: Get rid of __smccc_workaround_1_hvc_*
    - arm64: Use firmware to detect CPUs that are not affected by Spectre-v2

  * Bionic kernel panic on Cavium ThunderX CN88XX (LP: #1853485)
    - SAUCE: irqchip/gic-v3-its: Add missing return value in
      its_irq_domain_activate()

linux (4.15.0-73.82) bionic; urgency=medium

  * bionic/linux: 4.15.0-73.82 -proposed tracker (LP: #1854819)

  * CVE-2019-14901
    - SAUCE: mwifiex: Fix heap overflow in mmwifiex_process_tdls_action_frame()

  * CVE-2019-14896 // CVE-2019-14897
    - SAUCE: libertas: Fix two buffer overflows at parsing bss descriptor

  * CVE-2019-14895
    - SAUCE: mwifiex: fix possible heap overflow in mwifiex_process_country_ie()

  * CVE-2019-18660: patches for Ubuntu (LP: #1853142) // CVE-2019-18660
    - powerpc/64s: support nospectre_v2 cmdline option
    - powerpc/book3s64: Fix link stack flush on context switch
    - KVM: PPC: Book3S HV: Flush link stack on guest exit to host kernel

  * Please add patch fixing RK818 ID detection (LP: #1853192)
    - SAUCE: mfd: rk808: Fix RK818 ID template

  * [SRU][B/OEM-B/OEM-OSP1/D] Enable new Elan touchpads which are not in current
    whitelist (LP: #1853246)
    - HID: quirks: Fix keyboard + touchpad on Lenovo Miix 630
    - Input: elan_i2c - export the device id whitelist
    - HID: quirks: Refactor ELAN 400 and 401 handling

  * Lenovo dock MAC Address pass through doesn't work in Ubuntu (LP: #1827961)
    - r8152: Add macpassthru support for ThinkPad Thunderbolt 3 Dock Gen 2

  * s390/dasd: reduce the default queue depth and nr of hardware queues
    (LP: #1852257)
    - s390/dasd: reduce the default queue depth and nr of hardware queues

  * External microphone can't work on some dell machines with the codec alc256
    or alc236 (LP: #1853791)
    - SAUCE: ALSA: hda/realtek - Move some alc256 pintbls to fallback table
    - SAUCE: ALSA: hda/realtek - Move some alc236 pintbls to fallback table

  * Memory leak in net/xfrm/xfrm_state.c - 8 pages per ipsec connection
    (LP: #1853197)
    - xfrm: Fix memleak on xfrm state destroy

  * CVE-2019-19083
    - drm/amd/display: memory leak

  * update ENA driver for DIMLIB dynamic interrupt moderation (LP: #1853180)
    - net: ena: add intr_moder_rx_interval to struct ena_com_dev and use it
    - net: ena: switch to dim algorithm for rx adaptive interrupt moderation
    - net: ena: reimplement set/get_coalesce()
    - net: ena: enable the interrupt_moderation in driver_supported_features
    - net: ena: remove code duplication in
      ena_com_update_nonadaptive_moderation_interval _*()
    - net: ena: remove old adaptive interrupt moderation code from ena_netdev
    - net: ena: remove ena_restore_ethtool_params() and relevant fields
    - net: ena: remov...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
Changed in linux (Ubuntu):
status: Confirmed → Fix Released
Changed in linux (Ubuntu Focal):
status: Confirmed → Fix Released