CacheFiles: Error: Overlong wait for old active object to go away.

Bug #1776254 reported by Kiran Kumar Modukuri
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
Fix Released
Medium
Unassigned
Trusty
Fix Released
Undecided
Unassigned
Xenial
Fix Released
Undecided
Unassigned
Bionic
Fix Released
Undecided
Unassigned

Bug Description

== SRU Justification ==

[Impact]
Oops during heavy NFS + FSCache + Cachefiles use:

 CacheFiles: Error: Overlong wait for old active object to go away.
 BUG: unable to handle kernel NULL pointer dereference at 0000000000000002

 CacheFiles: Error: Object already active
 kernel BUG at fs/cachefiles/namei.c:163!

[Cause]
  In a heavily loaded system with big files being read and truncated,
  an fscache object for a cookie is being dropped and a new object being looked.
  The new object being looked for has to wait for the old object to go away before the
  new object is moved to active state.

[Fix]
     Clear the flag 'CACHEFILES_OBJECT_ACTIVE' for the new object when retrying
     the object lookup.
     Remove the BUG() for the case where the old object is still being dropped
     and convert to WARN()

[Testcase]
A user has run ~100 hours of NFS stress tests and not seen this bug recur.

[Regression Potential]
 - Limited to fscache/cachefiles.

Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1776254

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Incomplete → Triaged
Changed in linux (Ubuntu Trusty):
status: New → Fix Committed
Changed in linux (Ubuntu Xenial):
status: New → Fix Committed
Revision history for this message
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-trusty' to 'verification-done-trusty'. If the problem still exists, change the tag 'verification-needed-trusty' to 'verification-failed-trusty'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-trusty
Changed in linux (Ubuntu Bionic):
status: New → Fix Committed
Revision history for this message
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (12.2 KiB)

This bug was fixed in the package linux - 4.17.0-9.10

---------------
linux (4.17.0-9.10) cosmic; urgency=medium

  * linux: 4.17.0-9.10 -proposed tracker (LP: #1787988)

  * Cosmic update to 4.17.17 stable release (LP: #1787973)
    - x86/speculation/l1tf: Exempt zeroed PTEs from inversion
    - Linux 4.17.17

  * Cosmic update to 4.17.16 stable release (LP: #1787972)
    - x86/l1tf: Fix build error seen if CONFIG_KVM_INTEL is disabled
    - x86: i8259: Add missing include file
    - x86/platform/UV: Mark memblock related init code and data correctly
    - x86/mm/pti: Clear Global bit more aggressively
    - xen/pv: Call get_cpu_address_sizes to set x86_virt/phys_bits
    - x86/mm: Disable ioremap free page handling on x86-PAE
    - kbuild: verify that $DEPMOD is installed
    - crypto: ccree - fix finup
    - crypto: ccree - fix iv handling
    - crypto: ccp - Check for NULL PSP pointer at module unload
    - crypto: ccp - Fix command completion detection race
    - crypto: x86/sha256-mb - fix digest copy in sha256_mb_mgr_get_comp_job_avx2()
    - crypto: vmac - require a block cipher with 128-bit block size
    - crypto: vmac - separate tfm and request context
    - crypto: blkcipher - fix crash flushing dcache in error path
    - crypto: ablkcipher - fix crash flushing dcache in error path
    - crypto: skcipher - fix aligning block size in skcipher_copy_iv()
    - crypto: skcipher - fix crash flushing dcache in error path
    - ioremap: Update pgtable free interfaces with addr
    - x86/mm: Add TLB purge to free pmd/pte page interfaces
    - Linux 4.17.16

  * Cosmic update to 4.17.16 stable release (LP: #1787972) // CVE-2018-9363
    - Bluetooth: hidp: buffer overflow in hidp_process_report

  * linux-cloud-tools-common: Ensure hv-kvp-daemon.service starts before
    walinuxagent.service (LP: #1739107)
    - [Debian] hyper-v -- Ensure that hv-kvp-daemon.service starts before
      walinuxagent.service

  * Miscellaneous Ubuntu changes
    - [Packaging] retpoline -- fix temporary filenaming

linux (4.17.0-8.9) cosmic; urgency=medium

  * linux: 4.17.0-8.9 -proposed tracker (LP: #1787259)

  * Cosmic update to v4.17.15 stable release (LP: #1787257)
    - parisc: Enable CONFIG_MLONGCALLS by default
    - parisc: Define mb() and add memory barriers to assembler unlock sequences
    - Mark HI and TASKLET softirq synchronous
    - stop_machine: Disable preemption after queueing stopper threads
    - sched/deadline: Update rq_clock of later_rq when pushing a task
    - zram: remove BD_CAP_SYNCHRONOUS_IO with writeback feature
    - xen/netfront: don't cache skb_shinfo()
    - bpf, sockmap: fix leak in bpf_tcp_sendmsg wait for mem path
    - bpf, sockmap: fix bpf_tcp_sendmsg sock error handling
    - scsi: sr: Avoid that opening a CD-ROM hangs with runtime power management
      enabled
    - scsi: qla2xxx: Fix memory leak for allocating abort IOCB
    - init: rename and re-order boot_cpu_state_init()
    - root dentries need RCU-delayed freeing
    - make sure that __dentry_kill() always invalidates d_seq, unhashed or not
    - fix mntput/mntput race
    - fix __legitimize_mnt()/mntput() race
    - ARM: dts: imx6sx: fix irq for pcie bridge
  ...

Changed in linux (Ubuntu):
status: Triaged → Fix Released
Revision history for this message
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'. If the problem still exists, change the tag 'verification-needed-xenial' to 'verification-failed-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-xenial
Revision history for this message
David Coronel (davecore) wrote :

I have confirmation that the kernel in -proposed fixes the issue in xenial.

tags: added: verification-done-xenial
removed: verification-needed-xenial
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (32.9 KiB)

This bug was fixed in the package linux - 4.15.0-34.37

---------------
linux (4.15.0-34.37) bionic; urgency=medium

  * linux: 4.15.0-34.37 -proposed tracker (LP: #1788744)

  * Bionic update: upstream stable patchset 2018-08-09 (LP: #1786352)
    - MIPS: c-r4k: Fix data corruption related to cache coherence
    - MIPS: ptrace: Expose FIR register through FP regset
    - MIPS: Fix ptrace(2) PTRACE_PEEKUSR and PTRACE_POKEUSR accesses to o32 FGRs
    - KVM: Fix spelling mistake: "cop_unsuable" -> "cop_unusable"
    - affs_lookup(): close a race with affs_remove_link()
    - fs: don't scan the inode cache before SB_BORN is set
    - aio: fix io_destroy(2) vs. lookup_ioctx() race
    - ALSA: timer: Fix pause event notification
    - do d_instantiate/unlock_new_inode combinations safely
    - mmc: sdhci-iproc: remove hard coded mmc cap 1.8v
    - mmc: sdhci-iproc: fix 32bit writes for TRANSFER_MODE register
    - mmc: sdhci-iproc: add SDHCI_QUIRK2_HOST_OFF_CARD_ON for cygnus
    - libata: Blacklist some Sandisk SSDs for NCQ
    - libata: blacklist Micron 500IT SSD with MU01 firmware
    - xen-swiotlb: fix the check condition for xen_swiotlb_free_coherent
    - drm/vmwgfx: Fix 32-bit VMW_PORT_HB_[IN|OUT] macros
    - arm64: lse: Add early clobbers to some input/output asm operands
    - powerpc/64s: Clear PCR on boot
    - IB/hfi1: Use after free race condition in send context error path
    - IB/umem: Use the correct mm during ib_umem_release
    - idr: fix invalid ptr dereference on item delete
    - Revert "ipc/shm: Fix shmat mmap nil-page protection"
    - ipc/shm: fix shmat() nil address after round-down when remapping
    - mm/kasan: don't vfree() nonexistent vm_area
    - kasan: free allocated shadow memory on MEM_CANCEL_ONLINE
    - kasan: fix memory hotplug during boot
    - kernel/sys.c: fix potential Spectre v1 issue
    - KVM: s390: vsie: fix < 8k check for the itdba
    - KVM: x86: Update cpuid properly when CR4.OSXAVE or CR4.PKE is changed
    - kvm: x86: IA32_ARCH_CAPABILITIES is always supported
    - powerpc/64s: Improve RFI L1-D cache flush fallback
    - powerpc/pseries: Restore default security feature flags on setup
    - powerpc/64s: Fix section mismatch warnings from setup_rfi_flush()
    - MIPS: generic: Fix machine compatible matching
    - mac80211: mesh: fix wrong mesh TTL offset calculation
    - ARC: Fix malformed ARC_EMUL_UNALIGNED default
    - ptr_ring: prevent integer overflow when calculating size
    - arm64: dts: rockchip: fix rock64 gmac2io stability issues
    - arm64: dts: rockchip: correct ep-gpios for rk3399-sapphire
    - libata: Fix compile warning with ATA_DEBUG enabled
    - selftests: sync: missing CFLAGS while compiling
    - selftest/vDSO: fix O=
    - selftests: pstore: Adding config fragment CONFIG_PSTORE_RAM=m
    - selftests: memfd: add config fragment for fuse
    - ARM: OMAP2+: timer: fix a kmemleak caused in omap_get_timer_dt
    - ARM: OMAP3: Fix prm wake interrupt for resume
    - ARM: OMAP2+: Fix sar_base inititalization for HS omaps
    - ARM: OMAP1: clock: Fix debugfs_create_*() usage
    - tls: retrun the correct IV in getsockopt
    - xhci: workaround for AMD Promontory disabled ports w...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.13.0-158.208

---------------
linux (3.13.0-158.208) trusty; urgency=medium

  * linux: 3.13.0-158.208 -proposed tracker (LP: #1788764)

  * CVE-2018-3620 // CVE-2018-3646
    - SAUCE: x86/fremap: Invert the offset when converting to/from a PTE

  * BUG: scheduling while atomic (Kernel : Ubuntu-3.13 + VMware: 6.0 and late)
    (LP: #1780470)
    - VSOCK: sock_put wasn't safe to call in interrupt context
    - VSOCK: Fix lockdep issue.
    - VSOCK: Detach QP check should filter out non matching QPs.

  * CacheFiles: Error: Overlong wait for old active object to go away.
    (LP: #1776254)
    - cachefiles: Fix missing clear of the CACHEFILES_OBJECT_ACTIVE flag
    - cachefiles: Wait rather than BUG'ing on "Unexpected object collision"

  * fscache cookie refcount updated incorrectly during fscache object allocation
    (LP: #1776277)
    - fscache: Fix reference overput in fscache_attach_object() error handling

  * FS-Cache: Assertion failed: FS-Cache: 6 == 5 is false (LP: #1774336)
    - Revert "UBUNTU: SAUCE: CacheFiles: fix a read_waiter/read_copier race"
    - fscache: Allow cancelled operations to be enqueued
    - cachefiles: Fix refcounting bug in backing-file read monitoring

 -- Kleber Sacilotto de Souza <email address hidden> Fri, 24 Aug 2018 15:08:23 +0000

Changed in linux (Ubuntu Trusty):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.4.0-135.161

---------------
linux (4.4.0-135.161) xenial; urgency=medium

  * linux: 4.4.0-135.161 -proposed tracker (LP: #1788766)

  * [Regression] APM Merlin boards fail to recover link after interface down/up
    (LP: #1785739)
    - net: phylib: fix interrupts re-enablement in phy_start
    - net: phy: fix phy_start to consider PHY_IGNORE_INTERRUPT

  * qeth: don't clobber buffer on async TX completion (LP: #1786057)
    - s390/qeth: don't clobber buffer on async TX completion

  * nvme: avoid cqe corruption (LP: #1788035)
    - nvme: avoid cqe corruption when update at the same time as read

  * CacheFiles: Error: Overlong wait for old active object to go away.
    (LP: #1776254)
    - cachefiles: Fix missing clear of the CACHEFILES_OBJECT_ACTIVE flag
    - cachefiles: Wait rather than BUG'ing on "Unexpected object collision"

  * fscache cookie refcount updated incorrectly during fscache object allocation
    (LP: #1776277) // fscache cookie refcount updated incorrectly during fscache
    object allocation (LP: #1776277)
    - fscache: Fix reference overput in fscache_attach_object() error handling

  * FS-Cache: Assertion failed: FS-Cache: 6 == 5 is false (LP: #1774336)
    - Revert "UBUNTU: SAUCE: CacheFiles: fix a read_waiter/read_copier race"
    - fscache: Allow cancelled operations to be enqueued
    - cachefiles: Fix refcounting bug in backing-file read monitoring

  * linux-cloud-tools-common: Ensure hv-kvp-daemon.service starts before
    walinuxagent.service (LP: #1739107)
    - [Debian] hyper-v -- Ensure that hv-kvp-daemon.service starts before
      walinuxagent.service

 -- Khalid Elmously <email address hidden> Sun, 26 Aug 2018 23:56:50 -0400

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released
Andy Whitcroft (apw)
tags: added: kernel-fixup-verification-needed-bionic
removed: verification-needed-bionic
Revision history for this message
Andy Whitcroft (apw) wrote :

This bug was erroneously marked for verification in bionic; verification is not required and verification-needed-bionic is being removed.

tags: added: verification-done-bionic
Brad Figg (brad-figg)
tags: added: cscc
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.