i40e: Setting VF MAC address causes General Protection Fault

Bug #1852432 reported by Heitor Alves de Siqueira on 2019-11-13
This bug affects 2 people
Affects: linux (Ubuntu), status tracked in Focal

Series   Importance   Assigned to
Bionic   High         Heitor Alves de Siqueira
Disco    High         Heitor Alves de Siqueira
Eoan     High         Heitor Alves de Siqueira
Focal    High         Heitor Alves de Siqueira

Bug Description

[Impact]
 * Creating SR-IOV enabled VMs in OpenStack can sometimes trigger a general protection fault (GPF) and leave the system unusable

[Test Case]
 * Continuously spin up VFs and set their MAC addresses, e.g. with ifconfig

[Fix]
 * The fix updates the VSI pointer passed down to the i40e_set_vf_mac() function if the adapter is still in reset, preventing the GPF.

[Regression Potential]
 * Regression potential should be low, as we're now updating the VSI using the ID stored in the VF pointer
 * Regressions could arise from issues in VF creation or reset, as that would corrupt the new VSI pointer
 * Patch was validated and tested in a production environment

description: updated
Changed in linux (Ubuntu Eoan):
status: New → Confirmed
Changed in linux (Ubuntu Disco):
status: New → Confirmed
Changed in linux (Ubuntu Bionic):
status: New → Confirmed
Changed in linux (Ubuntu Xenial):
status: New → Confirmed
Changed in linux (Ubuntu Eoan):
importance: Undecided → High
Changed in linux (Ubuntu Disco):
importance: Undecided → High
Changed in linux (Ubuntu Xenial):
importance: Undecided → High
Changed in linux (Ubuntu Bionic):
importance: Undecided → High
Changed in linux (Ubuntu Eoan):
assignee: nobody → Heitor Alves de Siqueira (halves)
Changed in linux (Ubuntu Disco):
assignee: nobody → Heitor Alves de Siqueira (halves)
Changed in linux (Ubuntu Bionic):
assignee: nobody → Heitor Alves de Siqueira (halves)
Changed in linux (Ubuntu Xenial):
assignee: nobody → Heitor Alves de Siqueira (halves)
no longer affects: linux (Ubuntu Xenial)
description: updated
Changed in linux (Ubuntu Bionic):
status: Confirmed → Fix Committed
Changed in linux (Ubuntu Disco):
status: Confirmed → Fix Committed
Changed in linux (Ubuntu Eoan):
status: Confirmed → In Progress
status: In Progress → Fix Committed

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-disco' to 'verification-done-disco'. If the problem still exists, change the tag 'verification-needed-disco' to 'verification-failed-disco'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-disco
tags: added: verification-needed-bionic

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-eoan' to 'verification-done-eoan'. If the problem still exists, change the tag 'verification-needed-eoan' to 'verification-failed-eoan'.

tags: added: verification-needed-eoan
Nobuto Murata (nobuto) wrote:

The general protection fault was reproducible with the current 5.3 kernel by creating 10 SR-IOV sequentially. After updating to the -proposed kernel, linux-image-5.3.0-25-generic 5.3.0-25.27, the general protection fault no longer occurred with the same operations, so the fix can be considered verified with the eoan-proposed kernel.

tags: added: verification-done-eoan
removed: verification-needed-eoan

Verified on Bionic with the following test case:
$ echo 64 | sudo tee /sys/class/net/ens1f0/device/sriov_numvfs && virsh attach-interface valuable-bluefish hostdev 0000:08:02.6 --managed --live

On 4.15.0-72.81, I get the following GPF:
[119044.656412] general protection fault: 0000 [#1] SMP PTI
[119044.656455] Modules linked in: i40evf vfio_pci vfio_virqfd vfio_iommu_type1 vfio vhost_net vhost tap ebtable_filter ebtables devlink ip6table_filter ip6_tables kvm_intel binfmt_misc ipt_REJECT nf_reject_ipv4 xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_CHECKSUM xt_comment xt_tcpudp iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter bridge stp llc dummy ixgbevf ipmi_ssif nls_iso8859_1 intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass intel_cstate intel_rapl_perf lpc_ich hpilo ioatdma shpchp ipmi_si ipmi_devintf ipmi_msghandler mac_hid acpi_power_meter sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10
[119044.656929] raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear mgag200 i2c_algo_bit crct10dif_pclmul crc32_pclmul ttm ghash_clmulni_intel ses pcbc enclosure drm_kms_helper aesni_intel syscopyarea aes_x86_64 crypto_simd sysfillrect glue_helper cryptd sysimgblt ixgbe fb_sys_fops i40e tg3 dca drm nvme ptp hpsa pps_core nvme_core mdio scsi_transport_sas wmi [last unloaded: kvm_intel]
[119044.657204] CPU: 11 PID: 10300 Comm: libvirtd Not tainted 4.15.0-72-generic #81-Ubuntu
[119044.657255] Hardware name: HP ProLiant DL360 Gen9, BIOS P89 05/06/2015
[119044.657310] RIP: 0010:i40e_sync_vsi_filters+0x95/0xd00 [i40e]
[119044.657349] RSP: 0018:ffffbcc5479fb718 EFLAGS: 00010202
[119044.657385] RAX: 77b7ed74a738c96d RBX: ffff9aad58382000 RCX: 0000000000000000
[119044.657431] RDX: 0000000000000001 RSI: 00000000fffffe01 RDI: ffff9aad58382000
[119044.657477] RBP: ffffbcc5479fb7b8 R08: 0000000000000000 R09: 0000000000003494
[119044.657523] R10: 0000000000000000 R11: 0000d9714ff8ecd4 R12: ffff9aad58382000
[119044.657569] R13: ffff9aad384662a0 R14: ffff9aad58382a28 R15: ffff9aad67b1643c
[119044.657616] FS: 00007fd7aef27700(0000) GS:ffff9aad7f0c0000(0000) knlGS:0000000000000000
[119044.657668] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[119044.657706] CR2: 00007fd79001c058 CR3: 0000001029f30001 CR4: 00000000001626e0
[119044.657752] Call Trace:
[119044.657778] ? del_timer_sync+0x45/0x50
[119044.657809] ? __next_timer_interrupt+0xe0/0xe0
[119044.657851] i40e_ndo_set_vf_mac+0x109/0x2b0 [i40e]
[119044.657890] do_setlink+0x8a5/0xed0
[119044.657919] ? kmalloc_large_node+0x3b/0x60
[119044.657951] ? security_sock_rcv_skb+0x41/0x60
[119044.657986] rtnl_setlink+0xdc/0x130
[119044.658017] rtnetlink_rcv_msg+0x221/0x2b0
[119044.658049] ? aa_label_sk_perm+0x129/0x140
[119044.658081] ? _cond_resched+0x19/0x40
[119044.658110] ? rtnl_calcit.isra.25+0x110/0x110
[119044.658142] netlink_rcv_skb+0x54/0x130
[119044.658172] rtnetlink_rcv+0x15/0x20
[119044.658198] netlink_unicast+0x19e/0x240
[11904...

tags: added: verification-done-bionic
removed: verification-needed-bionic

Verified on disco with the following test case:

$ uname -r
5.0.0-38-generic
$ echo 64 | sudo tee /sys/class/net/ens1f0/device/sriov_numvfs && virsh attach-interface iov hostdev 0000:08:02.6 --managed --live
64
Interface attached successfully

tags: added: verification-done-disco
removed: verification-needed-disco