i40e: general protection fault in i40e_config_vf_promiscuous_mode

Bug #1852663 reported by gerald.yang on 2019-11-15
12
This bug affects 1 person
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
Undecided
gerald.yang
Eoan
Undecided
gerald.yang

Bug Description

SRU Justification

[Impact]
Assign some VFs to VMs, when deleting VMs, a general protection fault occurs in i40e_config_vf_promiscuous_mode

general protection fault: 0000 [#1] SMP PTI
CPU: 54 PID: 6200 Comm: libvirtd Not tainted 5.3.0-21-generic #22~18.04.1-UbuntuHardware name: HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10, BIOS U30 05/21/2019
RIP: 0010:i40e_config_vf_promiscuous_mode+0x172/0x330 [i40e]
Code: 48 8b 00 83 d1 00 48 85 c0 75 ef 49 83 c4 08 4c 39 e6 75 dd 85 c9 74 73 0f b6 45 c0 45 31 d2 89 45 d0 4d 8b 3e 4d 85 ff 74 53 <41> 0f b7 4f 16 66 81 f9 ff 0f 77 3f 0f b7 b3 ea 0c 00 00 8b 55 d0
RSP: 0018:ffffb987b5c77760 EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff9bb5df5a9000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000006000000 RDI: ffff9bace4bce350
RBP: ffffb987b5c777b0 R08: 0000000000000000 R09: ffff9bace56a9da0
R10: 0000000000000000 R11: 0000000000000100 R12: ffff9bb5df5a9a28
R13: ffff9bace4bce008 R14: ffff9bb5df5a9338 R15: 26c2b975d54f5980
FS: 00007f9f07fff700(0000) GS:ffff9bfcff480000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa73c9c0e10 CR3: 000000f6ab37a002 CR4: 00000000007626e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
i40e_ndo_set_vf_port_vlan+0x1a2/0x440[i40e]
do_setlink+0x53f/0xee0
?update_load_avg+0x596/0x620
?update_curr+0x7a/0x1d0
?__switch_to_asm+0x40/0x70
?__switch_to_asm+0x34/0x70
?__switch_to_asm+0x40/0x70
?__switch_to_asm+0x34/0x70
rtnl_setlink+0x113/0x150
rtnetlink_rcv_msg+0x296/0x340
?aa_label_sk_perm.part.4+0x10f/0x160
?_cond_resched+0x19/0x40
?rtnl_calcit.isra.30+0x120/0x120
netlink_rcv_skb+0x51/0x120
rtnetlink_rcv+0x15/0x20
netlink_unicast+0x1a4/0x250
netlink_sendmsg+0x2d7/0x3d0
sock_sendmsg+0x63/0x70
___sys_sendmsg+0x2a9/0x320
?aa_label_sk_perm.part.4+0x10f/0x160
?_raw_spin_unlock_bh+0x1e/0x20
?release_sock+0x8f/0xa0
__sys_sendmsg+0x63/0xa0
?__sys_sendmsg+0x63/0xa0
__x64_sys_sendmsg+0x1f/0x30
do_syscall_64+0x5a/0x130
entry_SYSCALL_64_after_hwframe+0x44/0xa9

This issue also happens when deleting k8s pod if VF is used by k8s pod, there was a bug reported in the e1000-devel mailing list
https://sourceforge.net/p/e1000/mailman/message/36766306/

The fix is suggested by Billy McFall, to add protection when accessing the hash list(vsi->mac_filter_hash), but it's not upstream yet

[Test Case]
Spin up some VMs with VFs, then delete all VMs

[Regression Potential]
Low, the fix is to add a protection for a hash list, shouldn't have potential regression

Changed in linux (Ubuntu):
assignee: nobody → gerald.yang (gerald-yang-tw)
Changed in linux (Ubuntu Eoan):
assignee: nobody → gerald.yang (gerald-yang-tw)
Changed in linux (Ubuntu):
status: New → In Progress
Changed in linux (Ubuntu Eoan):
status: New → In Progress
tags: added: sts
Changed in linux (Ubuntu Eoan):
status: In Progress → Fix Committed

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-eoan' to 'verification-done-eoan'. If the problem still exists, change the tag 'verification-needed-eoan' to 'verification-failed-eoan'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-eoan
Changed in linux (Ubuntu Eoan):
status: Fix Committed → Invalid
status: Invalid → Fix Committed
Nobuto Murata (nobuto) wrote :

The general protection fault is reproducible with the current 5.3 kernel as follows by creating 10 SR-IOV enabled VMs with OpenStack and deleting 5 VMs sequentially. After updating it to the -proposed one as
 linux-image-5.3.0-25-generic 5.3.0-25.27, there is no such general protection fault happened with the same operations. So we can consider the fix is verified.

tags: added: verification-done-eoan
removed: verification-needed-eoan
Launchpad Janitor (janitor) wrote :
Download full text (27.4 KiB)

This bug was fixed in the package linux - 5.3.0-26.28

---------------
linux (5.3.0-26.28) eoan; urgency=medium

  * eoan/linux: 5.3.0-26.28 -proposed tracker (LP: #1856807)

  * nvidia-435 is in eoan, linux-restricted-modules only builds against 430,
    ubiquity gives me the self-signed modules experience instead of using the
    Canonical-signed modules (LP: #1856407)
    - Add nvidia-435 dkms build

linux (5.3.0-25.27) eoan; urgency=medium

  * eoan/linux: 5.3.0-25.27 -proposed tracker (LP: #1854762)

  * CVE-2019-14901
    - SAUCE: mwifiex: Fix heap overflow in mmwifiex_process_tdls_action_frame()

  * CVE-2019-14896 // CVE-2019-14897
    - SAUCE: libertas: Fix two buffer overflows at parsing bss descriptor

  * CVE-2019-14895
    - SAUCE: mwifiex: fix possible heap overflow in mwifiex_process_country_ie()

  * [CML] New device id's for CMP-H (LP: #1846335)
    - mmc: sdhci-pci: Add another Id for Intel CML
    - i2c: i801: Add support for Intel Comet Lake PCH-H
    - mtd: spi-nor: intel-spi: Add support for Intel Comet Lake-H SPI serial flash
    - mfd: intel-lpss: Add Intel Comet Lake PCH-H PCI IDs

  * i915: Display flickers (monitor loses signal briefly) during "flickerfree"
    boot, while showing the BIOS logo on a black background (LP: #1836858)
    - [Config] FRAMEBUFFER_CONSOLE_DEFERRED_TAKEOVER=y

  * Please add patch fixing RK818 ID detection (LP: #1853192)
    - SAUCE: mfd: rk808: Fix RK818 ID template

  * Kernel build log filled with "/bin/bash: line 5: warning: command
    substitution: ignored null byte in input" (LP: #1853843)
    - [Debian] Fix warnings when checking for modules signatures

  * Lenovo dock MAC Address pass through doesn't work in Ubuntu (LP: #1827961)
    - r8152: Add macpassthru support for ThinkPad Thunderbolt 3 Dock Gen 2

  * Dell XPS 13 9350/9360 headphone audio hiss (LP: #1654448) // [XPS 13 9360,
    Realtek ALC3246, Black Headphone Out, Front] High noise floor (LP: #1845810)
    - ALSA: hda/realtek: Reduce the Headphone static noise on XPS 9350/9360

  * no HDMI video output since GDM greeter after linux-oem-osp1 version
    5.0.0-1026 (LP: #1852386)
    - drm/i915: Add new CNL PCH ID seen on a CML platform
    - SAUCE: drm/i915: Fix detection for a CMP-V PCH

  * [broadwell-rt286, playback] Since Linux 5.2rc2 audio playback no longer
    works on Dell Venue 11 Pro 7140 (LP: #1846539)
    - [Config] Drop snd-sof-intel-bdw build
    - SAUCE: ASoC: SOF: Intel: Broadwell: clarify mutual exclusion with legacy
      driver

  * [CML-S62] Need enable turbostat patch support for Comet lake- S 6+2
    (LP: #1847451)
    - SAUCE: tools/power turbostat: Add Cometlake support

  * External microphone can't work on some dell machines with the codec alc256
    or alc236 (LP: #1853791)
    - SAUCE: ALSA: hda/realtek - Move some alc256 pintbls to fallback table
    - SAUCE: ALSA: hda/realtek - Move some alc236 pintbls to fallback table

  * Memory leak in net/xfrm/xfrm_state.c - 8 pages per ipsec connection
    (LP: #1853197)
    - xfrm: Fix memleak on xfrm state destroy

  * CVE-2019-18660: patches for Ubuntu (LP: #1853142) // CVE-2019-18660
    - powerpc/64s: support nospectre_v2 cmdline option
    - powerp...

Changed in linux (Ubuntu Eoan):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :
Download full text (8.6 KiB)

This bug was fixed in the package linux - 5.4.0-9.12

---------------
linux (5.4.0-9.12) focal; urgency=medium

  * alsa/hda/realtek: the line-out jack doens't work on a dell AIO
    (LP: #1855999)
    - SAUCE: ALSA: hda/realtek - Line-out jack doesn't work on a Dell AIO

  * scsi: hisi_sas: Check sas_port before using it (LP: #1855952)
    - scsi: hisi_sas: Check sas_port before using it

  * CVE-2019-19078
    - ath10k: fix memory leak

  * cifs: DFS Caching feature causing problems traversing multi-tier DFS setups
    (LP: #1854887)
    - cifs: Fix retrieval of DFS referrals in cifs_mount()

  * Support DPCD aux brightness control (LP: #1856134)
    - SAUCE: drm/i915: Fix eDP DPCD aux max backlight calculations
    - SAUCE: drm/i915: Assume 100% brightness when not in DPCD control mode
    - SAUCE: drm/i915: Fix DPCD register order in intel_dp_aux_enable_backlight()
    - SAUCE: drm/i915: Auto detect DPCD backlight support by default
    - SAUCE: drm/i915: Force DPCD backlight mode on X1 Extreme 2nd Gen 4K AMOLED
      panel
    - USUNTU: SAUCE: drm/i915: Force DPCD backlight mode on Dell Precision 4K sku

  * The system cannot resume from S3 if user unplugs the TB16 during suspend
    state (LP: #1849269)
    - PCI: pciehp: Do not disable interrupt twice on suspend
    - PCI: pciehp: Prevent deadlock on disconnect

  * change kconfig of the soundwire bus driver from y to m (LP: #1855685)
    - [Config]: SOUNDWIRE=m

  * alsa/sof: change to use hda hdmi codec driver to make hdmi audio on the
    docking station work (LP: #1855666)
    - ALSA: hda/hdmi - implement mst_no_extra_pcms flag
    - ASoC: hdac_hda: add support for HDMI/DP as a HDA codec
    - ASoC: Intel: skl-hda-dsp-generic: use snd-hda-codec-hdmi
    - ASoC: Intel: skl-hda-dsp-generic: fix include guard name
    - ASoC: SOF: Intel: add support for snd-hda-codec-hdmi
    - ASoC: Intel: bxt-da7219-max98357a: common hdmi codec support
    - ASoC: Intel: glk_rt5682_max98357a: common hdmi codec support
    - ASoC: intel: sof_rt5682: common hdmi codec support
    - ASoC: Intel: bxt_rt298: common hdmi codec support
    - ASoC: SOF: enable sync_write in hdac_bus
    - [config]: SND_SOC_SOF_HDA_COMMON_HDMI_CODEC=y

  * Fix unusable USB hub on Dell TB16 after S3 (LP: #1855312)
    - SAUCE: USB: core: Make port power cycle a seperate helper function
    - SAUCE: USB: core: Attempt power cycle port when it's in eSS.Disabled state

  * Focal update: v5.4.3 upstream stable release (LP: #1856583)
    - rsi: release skb if rsi_prepare_beacon fails
    - arm64: tegra: Fix 'active-low' warning for Jetson TX1 regulator
    - arm64: tegra: Fix 'active-low' warning for Jetson Xavier regulator
    - perf scripts python: exported-sql-viewer.py: Fix use of TRUE with SQLite
    - sparc64: implement ioremap_uc
    - lp: fix sparc64 LPSETTIMEOUT ioctl
    - time: Zero the upper 32-bits in __kernel_timespec on 32-bit
    - mailbox: tegra: Fix superfluous IRQ error message
    - staging/octeon: Use stubs for MIPS && !CAVIUM_OCTEON_SOC
    - usb: gadget: u_serial: add missing port entry locking
    - serial: 8250-mtk: Use platform_get_irq_optional() for optional irq
    - tty: serial: fsl_lpuart: use the sg ...

Read more...

Changed in linux (Ubuntu):
status: In Progress → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers