Doing multiple squashfs (and other loop?) mounts in parallel breaks

Bug #1836914 reported by John Lenton
This bug affects 4 people
Affects: linux (Ubuntu)
Status: Fix Released
Importance: Critical
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

On a system running a 5.2 kernel, doing a large number of mounts of squashfs filesystems in parallel results in the mounts getting out of sync with their backing devices.

To reproduce, see:

https://paste.ubuntu.com/p/VCpzGxvy6h/

This breaks systems with ~40 mounted snaps. That number is easy to reach: snapd leaves up to 3 previous revisions of each snap mounted at any given time, so a user with ~13 frequently updated snaps will hit it.
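For readers who cannot open the paste, the following is a minimal sketch of what a parallel-mount reproducer could look like. It is not the script from the paste. All function names here are hypothetical, and the check assumes the loop driver's sysfs attribute /sys/block/loopN/loop/backing_file is available to read which file actually backs each loop device.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def mount_command(image, mountpoint):
    """Build the mount(8) invocation for one squashfs image on a loop device."""
    return ["mount", "-t", "squashfs", "-o", "loop,ro", str(image), str(mountpoint)]


def backing_file(loop_dev, sysfs_root="/sys/block"):
    """Return the file backing a loop device (e.g. /dev/loop3) as reported by sysfs."""
    name = Path(loop_dev).name
    return (Path(sysfs_root) / name / "loop" / "backing_file").read_text().strip()


def find_mismatches(expected, actual):
    """Return the mountpoints whose actual backing file is not the expected image.

    expected: {mountpoint: image path that was requested}
    actual:   {mountpoint: backing file observed after mounting}
    """
    return sorted(mp for mp, image in expected.items() if actual.get(mp) != image)


def mount_all_in_parallel(expected, workers=40, run=subprocess.run):
    """Issue every mount concurrently. Needs root and real squashfs images."""
    commands = [mount_command(image, mp) for mp, image in expected.items()]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda cmd: run(cmd, check=False), commands))
```

On an affected kernel one would, as root, call mount_all_in_parallel with about 40 image/mountpoint pairs, look up each mountpoint's loop device in /proc/mounts, read its backing file with backing_file, and pass the result to find_mismatches. A non-empty result would indicate mounts that are out of sync with their backing devices.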

ProblemType: Bug
DistroRelease: Ubuntu 19.10
Package: linux-image-5.2.0-8-generic 5.2.0-8.9
ProcVersionSignature: Ubuntu 5.2.0-8.9-generic 5.2.0
Uname: Linux 5.2.0-8-generic x86_64
ApportVersion: 2.20.11-0ubuntu5
Architecture: amd64
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CurrentDesktop: ubuntu:GNOME
Date: Wed Jul 17 15:11:15 2019
InstallationDate: Installed on 2019-07-17 (0 days ago)
InstallationMedia: Ubuntu 19.10 "Eoan Ermine" - Alpha amd64 (20190716)
IwConfig:
 ens3 no wireless extensions.

 lo no wireless extensions.
Lsusb: Error: command ['lsusb'] failed with exit code 1:
MachineType: QEMU Standard PC (i440FX + PIIX, 1996)
ProcFB: 0 qxldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.2.0-8-generic root=UUID=9618ebbb-955d-4daf-b29a-37162ec7821a ro quiet splash vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-5.2.0-8-generic N/A
 linux-backports-modules-5.2.0-8-generic N/A
 linux-firmware 1.180
RfKill:

SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 04/01/2014
dmi.bios.vendor: SeaBIOS
dmi.bios.version: Ubuntu-1.8.2-1ubuntu1
dmi.chassis.type: 1
dmi.chassis.vendor: QEMU
dmi.chassis.version: pc-i440fx-xenial
dmi.modalias: dmi:bvnSeaBIOS:bvrUbuntu-1.8.2-1ubuntu1:bd04/01/2014:svnQEMU:pnStandardPC(i440FX+PIIX,1996):pvrpc-i440fx-xenial:cvnQEMU:ct1:cvrpc-i440fx-xenial:
dmi.product.name: Standard PC (i440FX + PIIX, 1996)
dmi.product.version: pc-i440fx-xenial
dmi.sys.vendor: QEMU

John Lenton (chipaca) wrote :
tags: added: rls-ee-incoming
Changed in linux (Ubuntu):
importance: Undecided → Critical
status: New → Confirmed
summary: - Doing multiple squashfs ((and other loop?) mounts in parallel breaks
+ Doing multiple squashfs (and other loop?) mounts in parallel breaks
John Lenton (chipaca) wrote :

I haven't yet tried to reproduce this with an upstream kernel, but given that it's affecting Arch users with 5.2 as well, the bug is fairly likely to be present upstream too.

John Lenton (chipaca) wrote :

The mainline builds for i386 and amd64 are failing right now, so I'm not sure I can do more.

Jean-Baptiste Lallement (jibel) wrote :

Test script

Kai-Heng Feng (kaihengfeng) wrote :

Commit that causes the regression:
commit 33ec3e53e7b1869d7851e59e126bdb0fe0bd1982
Author: Jan Kara <email address hidden>
Date: Thu May 16 16:01:27 2019 +0200

    loop: Don't change loop device under exclusive opener

    Loop module allows calling LOOP_SET_FD while there are other openers of
    the loop device. Even exclusive ones. This can lead to weird
    consequences such as kernel deadlocks like:

    mount_bdev() lo_ioctl()
      udf_fill_super()
        udf_load_vrs()
          sb_set_blocksize() - sets desired block size B
          udf_tread()
            sb_bread()
              __bread_gfp(bdev, block, B)
                                              loop_set_fd()
                                                set_blocksize()
                - now __getblk_slow() indefinitely loops because B != bdev
                  block size

    Fix the problem by disallowing LOOP_SET_FD ioctl when there are
    exclusive openers of a loop device.

    [Deliberately chosen not to CC stable as a user with priviledges to
    trigger this race has other means of taking the system down and this
    has a potential of breaking some weird userspace setup]

The commit seems to fix a real race, so I'll raise the issue with the commit author.
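To make the ioctl named in the commit message concrete, here is a rough sketch of the LOOP_SET_FD call that userspace issues to bind a backing file to a loop device. The helper names are hypothetical. According to the commit message, a patched kernel refuses this ioctl while another process holds the loop device open exclusively. This report does not say which errno is returned, so the sketch simply hands back whatever the kernel reports.

```python
import errno
import fcntl
import os

LOOP_SET_FD = 0x4C00  # ioctl request number from <linux/loop.h>


def try_attach(loop_path, backing_path):
    """Bind backing_path to loop_path; return 0 on success or the errno on failure."""
    try:
        backing_fd = os.open(backing_path, os.O_RDONLY)
    except OSError as err:
        return err.errno
    try:
        loop_fd = os.open(loop_path, os.O_RDONLY)
        try:
            # The third argument is the file descriptor of the backing file.
            fcntl.ioctl(loop_fd, LOOP_SET_FD, backing_fd)
            return 0
        finally:
            os.close(loop_fd)
    except OSError as err:
        return err.errno
    finally:
        os.close(backing_fd)


def errno_name(code):
    """Translate an errno value into its symbolic name, e.g. 2 -> 'ENOENT'."""
    return errno.errorcode.get(code, str(code))
```

Running try_attach against a real /dev/loopN requires root. Comparing the returned code before and after the revert, with and without a second exclusive opener, is one way to observe the behaviour change the commit describes.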

Seth Forshee (sforshee) wrote :

I confirmed the issue and that reverting the patch fixes it. I've pushed a revert as a temporary fix. Kai-Heng, please keep me posted on the activity upstream.

Brad Figg (brad-figg)
tags: added: cscc
John Lenton (chipaca) wrote :

Can somebody build and test a kernel with Jan's patch, from
https://<email address hidden>/
?

Kai-Heng Feng (kaihengfeng) wrote :

A test kernel can be found here:
https://people.canonical.com/~khfeng/lp1836914/

tags: added: patch
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.2.0-10.11

---------------
linux (5.2.0-10.11) eoan; urgency=medium

  * eoan/linux: 5.2.0-10.11 -proposed tracker (LP: #1838113)

  * Packaging resync (LP: #1786013)
    - [Packaging] resync git-ubuntu-log

  * Eoan update: v5.2.4 upstream stable release (LP: #1838428)
    - bnx2x: Prevent load reordering in tx completion processing
    - caif-hsi: fix possible deadlock in cfhsi_exit_module()
    - hv_netvsc: Fix extra rcu_read_unlock in netvsc_recv_callback()
    - igmp: fix memory leak in igmpv3_del_delrec()
    - ipv4: don't set IPv6 only flags to IPv4 addresses
    - ipv6: rt6_check should return NULL if 'from' is NULL
    - ipv6: Unlink sibling route in case of failure
    - net: bcmgenet: use promisc for unsupported filters
    - net: dsa: mv88e6xxx: wait after reset deactivation
    - net: make skb_dst_force return true when dst is refcounted
    - net: neigh: fix multiple neigh timer scheduling
    - net: openvswitch: fix csum updates for MPLS actions
    - net: phy: sfp: hwmon: Fix scaling of RX power
    - net_sched: unset TCQ_F_CAN_BYPASS when adding filters
    - net: stmmac: Re-work the queue selection for TSO packets
    - net/tls: make sure offload also gets the keys wiped
    - nfc: fix potential illegal memory access
    - r8169: fix issue with confused RX unit after PHY power-down on RTL8411b
    - rxrpc: Fix send on a connected, but unbound socket
    - sctp: fix error handling on stream scheduler initialization
    - sctp: not bind the socket in sctp_connect
    - sky2: Disable MSI on ASUS P6T
    - tcp: be more careful in tcp_fragment()
    - tcp: fix tcp_set_congestion_control() use from bpf hook
    - tcp: Reset bytes_acked and bytes_received when disconnecting
    - vrf: make sure skb->data contains ip header to make routing
    - net/mlx5e: IPoIB, Add error path in mlx5_rdma_setup_rn
    - net: bridge: mcast: fix stale nsrcs pointer in igmp3/mld2 report handling
    - net: bridge: mcast: fix stale ipv6 hdr pointer when handling v6 query
    - net: bridge: don't cache ether dest pointer on input
    - net: bridge: stp: don't cache eth dest pointer before skb pull
    - macsec: fix use-after-free of skb during RX
    - macsec: fix checksumming after decryption
    - netrom: fix a memory leak in nr_rx_frame()
    - netrom: hold sock when setting skb->destructor
    - selftests: txring_overwrite: fix incorrect test of mmap() return value
    - net/tls: fix poll ignoring partially copied records
    - net/tls: reject offload of TLS 1.3
    - net/mlx5e: Fix port tunnel GRE entropy control
    - net/mlx5e: Rx, Fix checksum calculation for new hardware
    - net/mlx5e: Fix return value from timeout recover function
    - net/mlx5e: Fix error flow in tx reporter diagnose
    - bnxt_en: Fix VNIC accounting when enabling aRFS on 57500 chips.
    - mlxsw: spectrum_dcb: Configure DSCP map as the last rule is removed
    - net/mlx5: E-Switch, Fix default encap mode
    - mlxsw: spectrum: Do not process learned records with a dummy FID
    - dma-buf: balance refcount inbalance
    - dma-buf: Discard old fence_excl on retrying get_fences_rcu for realloc
    - Revert "gpio/spi: Fix spi-gpio...

Changed in linux (Ubuntu):
status: Confirmed → Fix Released