kernel BUG at /build/linux-vxxS7y/linux-4.15.0/mm/slub.c:296!

Bug #1812086 reported by Juerg Haefliger
This bug affects 1 person
Affects          Status         Importance   Assigned to   Milestone
linux (Ubuntu)   Invalid        Undecided    Unassigned
Bionic           Fix Released   Medium       Unassigned

Bug Description

== SRU Justification ==

Rebooting an iSCSI target while the initiator is writing to a LUN leads to the following trace:

[ 59.879202] ------------[ cut here ]------------
[ 59.879202] kernel BUG at /build/linux-vxxS7y/linux-4.15.0/mm/slub.c:296!
[ 59.880636] invalid opcode: 0000 [#1] SMP PTI
[ 59.881569] Modules linked in: iscsi_target_mod target_core_pscsi target_core_file target_core_iblock target_core_user uio target_core_mod nls_iso8859_1 kvm_intel isofs kvm irqbypass joydev input_leds serio_raw sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear psmouse virtio_blk virtio_net floppy
[ 59.891096] CPU: 0 PID: 1027 Comm: iscsi_np Not tainted 4.15.0-43-generic #46-Ubuntu
[ 59.892726] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1ubuntu1 04/01/2014
[ 59.894606] RIP: 0010:kfree+0x16a/0x180
[ 59.895429] RSP: 0018:ffffac0d8050fe58 EFLAGS: 00010246
[ 59.896531] RAX: ffff9cf099475800 RBX: ffff9cf099475800 RCX: ffff9cf099475800
[ 59.898083] RDX: 0000000000011bbb RSI: ffff9cf09fc27140 RDI: ffff9cf09f002000
[ 59.899627] RBP: ffffac0d8050fe70 R08: 0000000000000000 R09: ffffffffc07a329b
[ 59.901186] R10: ffffe95780651d40 R11: ffffffffa511dc90 R12: ffff9cf099625600
[ 59.902769] R13: ffffffffc07a329b R14: ffff9cf09ee07600 R15: ffff9cf099475800
[ 59.904321] FS: 0000000000000000(0000) GS:ffff9cf09fc00000(0000) knlGS:0000000000000000
[ 59.906120] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 59.907806] CR2: 00007f7153b88470 CR3: 000000001babe000 CR4: 00000000000006f0
[ 59.909376] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 59.910950] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 59.913098] Call Trace:
[ 59.913783] iscsi_target_login_sess_out+0x1fb/0x250 [iscsi_target_mod]
[ 59.915292] iscsi_target_login_thread+0x44d/0x1060 [iscsi_target_mod]
[ 59.916775] kthread+0x121/0x140
[ 59.917622] ? iscsi_target_login_sess_out+0x250/0x250 [iscsi_target_mod]
[ 59.919244] ? kthread_create_worker_on_cpu+0x70/0x70
[ 59.920483] ? do_syscall_64+0x73/0x130
[ 59.921460] ? SyS_exit_group+0x14/0x20
[ 59.922583] ret_from_fork+0x35/0x40
[ 59.923523] Code: c4 80 74 04 41 8b 72 6c 4c 89 d7 e8 61 1c f9 ff eb 86 41 b8 01 00 00 00 48 89 d9 48 89 da 4c 89 d6 e8 8b f6 ff ff e9 6d ff ff ff <0f> 0b 48 8b 3d 6d c4 1c 01 e9 c9 fe ff ff 0f 1f 84 00 00 00 00
[ 59.927778] RIP: kfree+0x16a/0x180 RSP: ffffac0d8050fe58
[ 59.929063] ---[ end trace 082da4d341633d3e ]---
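
A note on reading the trace: in the "Code:" line the bytes in angle brackets mark the instruction at RIP. Here they are 0f 0b, which is the x86 UD2 instruction. BUG() compiles to UD2 on x86, which is why the oops is reported as "invalid opcode". The helper below is only an illustrative sketch for checking that pattern in a pasted "Code:" line; the function name is made up for this example and is not part of any kernel tool.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Illustrative helper (hypothetical name): inspect an oops "Code:" line.
 * Returns  1 if the byte marked with <..> and the byte after it are 0f 0b (UD2),
 *          0 if a marker is present but the bytes are something else,
 *         -1 if no <..> marker can be parsed.
 */
int faulting_insn_is_ud2(const char *code_line)
{
    assert(code_line != NULL);

    const char *open = strchr(code_line, '<');
    if (!open)
        return -1;

    const char *close = strchr(open, '>');
    if (!close)
        return -1;

    unsigned int first, second;

    /* The marked byte sits right after '<'. */
    if (sscanf(open + 1, "%2x", &first) != 1)
        return -1;

    /* The next byte follows '>' (sscanf skips the separating space). */
    if (sscanf(close + 1, "%2x", &second) != 1)
        return -1;

    return first == 0x0f && second == 0x0b;
}
```

Passing the tail of the "Code:" line above (the part containing "<0f> 0b 48 8b") makes the helper return 1. A marker on any other byte pair returns 0.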

== Fix ==

Backport the following 3 commits:
 * scsi: iscsi: target: Fix conn_ops double free
 * scsi: iscsi: target: Set conn->sess to NULL when
   iscsi_login_set_conn_values fails
 * iscsi target: fix session creation failure handling
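
For readers unfamiliar with this class of bug, here is a minimal userspace sketch of a double free on an error path and one common way to guard against it. It is not the upstream patch. The struct and function names are invented for illustration, and the real commits may restructure the login code differently. The sketch only shows the general idea that appears in the second commit title: clear a pointer once it has been released, so that a later cleanup path does not free it again.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the real connection structures. */
struct conn_ops { int header_digest; };
struct conn { struct conn_ops *ops; };

/* Release ops and clear the pointer so a later cleanup path sees NULL. */
void conn_free_ops(struct conn *c)
{
    assert(c != NULL);
    free(c->ops);      /* free(NULL) is a no-op, so a repeated call is harmless */
    c->ops = NULL;
}

/*
 * Hypothetical setup step that can fail part-way through.
 * On failure it cleans up after itself, as an inner helper might.
 */
int conn_setup(struct conn *c, int simulate_failure)
{
    c->ops = calloc(1, sizeof(*c->ops));
    if (!c->ops)
        return -1;

    if (simulate_failure) {
        conn_free_ops(c);
        return -1;
    }
    return 0;
}

/*
 * Hypothetical outer error path. It also releases ops. Without the
 * NULL assignment in conn_free_ops() this would free the same
 * allocation a second time after a failed conn_setup().
 */
void conn_teardown(struct conn *c)
{
    conn_free_ops(c);
}
```

With this arrangement, calling conn_teardown() after a failed conn_setup() is safe, because the second release sees a NULL pointer.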

== Regression Potential ==

Low. Clean cherry-picks that modify a very isolated area.

== Test ==

Set up an iSCSI target using the scsi_target_user module and tcmu_runner. Set up an initiator to connect to the target and perform I/O. Reboot the target. When the target comes back up, its kernel crashes when the initiator tries to reconnect.

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1812086

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: bionic
Juerg Haefliger (juergh)
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Juerg Haefliger (juergh) wrote :

For clarification, this happens on the target after it comes back from the reboot and restores the iSCSI configuration.

Stefan Bader (smb)
Changed in linux (Ubuntu Bionic):
importance: Undecided → Medium
Juerg Haefliger (juergh)
description: updated
description: updated
Changed in linux (Ubuntu Bionic):
status: New → Fix Committed
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
Juerg Haefliger (juergh) wrote :

Tested successfully.

tags: added: verification-done-bionic
removed: verification-needed-bionic
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.15.0-46.49

---------------
linux (4.15.0-46.49) bionic; urgency=medium

  * linux: 4.15.0-46.49 -proposed tracker (LP: #1814726)

  * mprotect fails on ext4 with dax (LP: #1799237)
    - x86/speculation/l1tf: Exempt zeroed PTEs from inversion

  * kernel BUG at /build/linux-vxxS7y/linux-4.15.0/mm/slub.c:296! (LP: #1812086)
    - iscsi target: fix session creation failure handling
    - scsi: iscsi: target: Set conn->sess to NULL when iscsi_login_set_conn_values
      fails
    - scsi: iscsi: target: Fix conn_ops double free

  * user_copy in user from ubuntu_kernel_selftests failed on KVM kernel
    (LP: #1812198)
    - selftests: user: return Kselftest Skip code for skipped tests
    - selftests: kselftest: change KSFT_SKIP=4 instead of KSFT_PASS
    - selftests: kselftest: Remove outdated comment

  * RTL8822BE WiFi Disabled in Kernel 4.18.0-12 (LP: #1806472)
    - SAUCE: staging: rtlwifi: allow RTLWIFI_DEBUG_ST to be disabled
    - [Config] CONFIG_RTLWIFI_DEBUG_ST=n
    - SAUCE: Add r8822be to signature inclusion list

  * kernel oops in bcache module (LP: #1793901)
    - SAUCE: bcache: never writeback a discard operation

  * CVE-2018-18397
    - userfaultfd: use ENOENT instead of EFAULT if the atomic copy user fails
    - userfaultfd: shmem: allocate anonymous memory for MAP_PRIVATE shmem
    - userfaultfd: shmem/hugetlbfs: only allow to register VM_MAYWRITE vmas
    - userfaultfd: shmem: add i_size checks
    - userfaultfd: shmem: UFFDIO_COPY: set the page dirty if VM_WRITE is not set

  * Ignore "incomplete report" from Elan touchpanels (LP: #1813733)
    - HID: i2c-hid: Ignore input report if there's no data present on Elan
      touchpanels

  * Vsock connect fails with ENODEV for large CID (LP: #1813934)
    - vhost/vsock: fix vhost vsock cid hashing inconsistent

  * SRU: Fix thinkpad 11e 3rd boot hang (LP: #1804604)
    - ACPI / LPSS: Force LPSS quirks on boot

  * Bionic update: upstream stable patchset 2019-01-17 (LP: #1812229)
    - scsi: sd_zbc: Fix variable type and bogus comment
    - KVM/Eventfd: Avoid crash when assign and deassign specific eventfd in
      parallel.
    - x86/apm: Don't access __preempt_count with zeroed fs
    - x86/events/intel/ds: Fix bts_interrupt_threshold alignment
    - x86/MCE: Remove min interval polling limitation
    - fat: fix memory allocation failure handling of match_strdup()
    - ALSA: hda/realtek - Add Panasonic CF-SZ6 headset jack quirk
    - ARCv2: [plat-hsdk]: Save accl reg pair by default
    - ARC: Fix CONFIG_SWAP
    - ARC: configs: Remove CONFIG_INITRAMFS_SOURCE from defconfigs
    - ARC: mm: allow mprotect to make stack mappings executable
    - mm: memcg: fix use after free in mem_cgroup_iter()
    - mm/huge_memory.c: fix data loss when splitting a file pmd
    - cpufreq: intel_pstate: Register when ACPI PCCH is present
    - vfio/pci: Fix potential Spectre v1
    - stop_machine: Disable preemption when waking two stopper threads
    - drm/i915: Fix hotplug irq ack on i965/g4x
    - drm/nouveau: Use drm_connector_list_iter_* for iterating connectors
    - drm/nouveau: Avoid looping through fake MST connectors
    - gen_stats: Fix netl...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
Juerg Haefliger (juergh)
Changed in linux (Ubuntu):
status: Confirmed → Invalid
Chris Read (chris-read) wrote :

Looks like this patch has not been applied to the 4.18 kernels yet:

kernel BUG at /build/linux-hwe-mCxn_j/linux-hwe-4.18.0/mm/slub.c:296!
invalid opcode: 0000 [#1] SMP PTI
CPU: 15 PID: 25039 Comm: kworker/15:0 Tainted: G OE 4.18.0-25-generic #26~18.04.1-Ubuntu
Workqueue: nfsiod rpc_async_release [sunrpc]
RIP: 0010:__slab_free+0x19c/0x340
Code: 00 48 89 c7 fa 66 0f 1f 44 00 00 f0 49 0f ba 2c 24 00 72 64 4d 3b 6c 24 20 74 11 49 0f ba 34 24 00 57 9d 0f 1f 44 00 00 eb a1 <0f> 0b 49 3b 54 24 28 75 e8 49 89 5c 24 20 49 89 4c 24 28 49 0f ba
RSP: 0018:ffffae647af3fc90 EFLAGS: 00010246
RAX: ffff93a5d88057c0 RBX: ffff93a5d88057c0 RCX: ffff93a5d88057c0
RDX: 0000000080400028 RSI: ffffd70cdf620140 RDI: ffff93961f407780
RBP: ffffae647af3fd30 R08: 0000000000000001 R09: ffffffffc0d8c536
R10: ffff93a5d88057c0 R11: 0000000000000001 R12: ffffd70cdf620140
R13: ffff93a5d88057c0 R14: ffff93961f407780 R15: 0ffffce643fbff40
FS: 0000000000000000(0000) GS:ffff93ae1f3c0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fbc681326c0 CR3: 0000002cde80a002 CR4: 00000000001606e0
Call Trace:
 ? __slab_free+0x21c/0x340
 ? percpu_counter_add_batch+0x4f/0x60
 ? filelayout_free_lseg+0x56/0x90 [nfs_layout_nfsv41_files]
 kfree+0x165/0x180
 ? __switch_to_asm+0x40/0x70
 ? kfree+0x165/0x180
 filelayout_free_lseg+0x56/0x90 [nfs_layout_nfsv41_files]
 pnfs_put_lseg+0xbc/0x130 [nfsv4]
 pnfs_writehdr_free+0x16/0x30 [nfsv4]
 nfs_write_completion+0xc6/0x1d0 [nfs]
 ? __switch_to_asm+0x34/0x70
 ? __switch_to_asm+0x40/0x70
 ? refcount_dec_and_lock+0x12/0x50
 nfs_pgio_release+0x16/0x20 [nfs]
 pnfs_generic_rw_release+0x29/0x30 [nfsv4]
 rpc_free_task+0x33/0x70 [sunrpc]
 rpc_async_release+0x12/0x20 [sunrpc]
 process_one_work+0x1fd/0x3f0
 worker_thread+0x34/0x410
 kthread+0x121/0x140
 ? process_one_work+0x3f0/0x3f0
 ? kthread_create_worker_on_cpu+0x70/0x70
 ret_from_fork+0x35/0x40

Juerg Haefliger (juergh) wrote :

That's a different stack trace and looks like a different problem that just results in the kernel dying in the same place. And 4.18 is no longer supported.
