usbguard stops responding when recvmsg receives ENOBUFS

Bug #1855189 reported by Jamie Strandboge
28
This bug affects 3 people
Affects Status Importance Assigned to Milestone
usbguard
New
Unknown
usbguard (Ubuntu)
Fix Released
Undecided
Unassigned
Focal
Fix Committed
Medium
Heitor Alves de Siqueira

Bug Description

[Impact]
usbguard-daemon will no longer process device events until restarted

[Test Plan]
1. Spin up a Focal VM with usbguard enabled
2. Run the script below:

  while /bin/true ; do
    sudo udevadm control --reload-rules
    sudo udevadm trigger
    sudo udevadm settle --timeout=3
  done

3. journald logs will indicate the following errors:
  usbguard-daemon[12488]: ueventProcessRead: failed to read pending uevent: rc=-1 errno=105
  usbguard-daemon[12488]: UEventDeviceManager thread: UEvent device manager: recvmsg: No buffer space available

[Where problems could occur]
The fix introduces an additional check in the usbguard event handler, so we should properly test this path. Potential regressions could cause USB devices to not be probed correctly, or usbguard dropping device events. We should also make sure the usbguard daemon is recovering from ENOBUFS and other errors correctly without needing to be restarted.

[Other Info]
The fix has been introduced upstream in version usbguard-0.7.7:
$ git describe --contains 5f297079b843
usbguard-0.7.7~2

As such, releases starting with Jammy already contain the fix:
$ rmadison usbguard
 usbguard | 0.7.2+ds-1 | bionic/universe | source, amd64, arm64, armhf, i386, ppc64el, s390x
 usbguard | 0.7.6+ds-1build1 | focal/universe | source, amd64, arm64, armhf, ppc64el, riscv64, s390x
 usbguard | 1.1.1+ds-3 | jammy/universe | source, amd64, arm64, armhf, ppc64el, riscv64, s390x
 usbguard | 1.1.2+ds-4 | mantic/universe | source, amd64, arm64, armhf, ppc64el, riscv64, s390x
 usbguard | 1.1.2+ds-6 | noble/universe | source, amd64, arm64, armhf, ppc64el, riscv64, s390x

--
[Original description]
With 0.7.4+ds-1 from 19.10, usbguard may stop responding to events when recvmsg fails with ENOBUFS. To reproduce:

  while /bin/true ; do
    sudo udevadm control --reload-rules
    sudo udevadm trigger
    sudo udevadm settle --timeout=3
  done

Eventually, this pops out in the journald logs:

  usbguard-daemon[12488]: ueventProcessRead: failed to read pending uevent: rc=-1 errno=105
  usbguard-daemon[12488]: UEventDeviceManager thread: UEvent device manager: recvmsg: No buffer space available

and usbguard will no longer process events until restarted (sudo systemctl restart usbguard).

I then installed usbguard 0.7.5 from the desktop team's PPA[1], ran the loop and saw the equivalent error pop out:

  usbguard-daemon[5958]: [1575487887.438] (E) ueventProcessRead: failed to read pending uevent: rc=-1 errno=105
  usbguard-daemon[5958]: [1575487887.438] (E) UEventDeviceManager thread: UEvent device manager: recvmsg: No buffer space available

This is coming from https://github.com/USBGuard/usbguard/blob/master/src/Library/UEventDeviceManager.cpp#L528 where recvmsg() is returning -1 with 'No buffer space available' (ENOBUFS). Looking at sysdeps/gnu/errlist.c in glibc, ENOBUFS is documented as:

  #ifdef ENOBUFS
  /*
  TRANS The kernel's buffers for I/O operations are all in use. In GNU, this
  TRANS error is always synonymous with @code{ENOMEM}; you may get one or the
  TRANS other from network operations. */
      [ERR_REMAP (ENOBUFS)] = N_("No buffer space available"),

It seems that usbguard should be able to handle transient ENOBUFS scenarios with uevent storms and recover when the kernel's buffers are back in order. In my testing, I treated ENOBUFS similarly to EAGAIN/EWOULDBLOCK by logging a warning and simply returning[2]:

  usbguard-daemon[2405]: ueventProcessRead: failed to read pending uevent (returning): rc=-1 errno=105

It appears usbguard continues to function (eg, if I do the loop for a long time, usbguard remains responsive and processes other uevents, but I'm not sure this is the correct approach since, for example, UEventDeviceManager::ueventOpen() speaks to this error condition by saying: "Set a 1MiB receive buffer on the netlink socket to avoid ENOBUFS error in recvmsg").

[1]https://launchpad.net/~ubuntu-desktop/+archive/ubuntu/usbguard
[2]exploratory patch (do not commit in Ubuntu without upstream confirmation)
Index: usbguard-0.7.4+ds/src/Library/UEventDeviceManager.cpp
===================================================================
--- usbguard-0.7.4+ds.orig/src/Library/UEventDeviceManager.cpp
+++ usbguard-0.7.4+ds/src/Library/UEventDeviceManager.cpp
@@ -468,6 +468,12 @@ namespace usbguard
           << "reading from uevent source would block thread execution";
         return;
       }
+ else if (saved_errno == ENOBUFS) {
+ USBGUARD_LOG(Error) << "ueventProcessRead: "
+ << "failed to read pending uevent (returning): "
+ << "rc=" << rc << " errno=" << saved_errno;
+ return;
+ }
       else {
         USBGUARD_LOG(Error) << "ueventProcessRead: "
           << "failed to read pending uevent: "

description: updated
Revision history for this message
Jamie Strandboge (jdstrand) wrote :

FYI, IMHO, this bug needs to be fixed as part of the MIR process in bug #1816548.

Changed in usbguard (Ubuntu):
status: New → Confirmed
Changed in usbguard:
status: Unknown → New
Revision history for this message
Jamie Strandboge (jdstrand) wrote :

FYI, upstream committed https://github.com/USBGuard/usbguard/pull/378 to address this issue. It is essentially what I mentioned in the bug description.

Changed in usbguard (Ubuntu):
status: Confirmed → Triaged
Revision history for this message
Sebastien Bacher (seb128) wrote :

The issue is fixed in the current version now

https://launchpad.net/ubuntu/+source/usbguard/0.7.8+ds-1

Changed in usbguard (Ubuntu):
status: Triaged → Fix Released
Changed in usbguard (Ubuntu Focal):
importance: Undecided → Medium
status: New → In Progress
assignee: nobody → Heitor Alves de Siqueira (halves)
description: updated
Revision history for this message
Heitor Alves de Siqueira (halves) wrote :

We've had users report this bug under Focal, and the usbguard pkg there does seem to be missing this fix. I've updated the bug with the SRU template, and tested the patches (builds available at [0]). Users also reported that this patched version resolved this issue, so I've pushed it to the Focal upload queue.

[0] https://launchpad.net/~halves/+archive/ubuntu/test

Revision history for this message
Robie Basak (racb) wrote :

Accepting, but in the Test Plan, please could you elaborate on how long you need to typically wait to see the reproducer, so that a third party will be able to verify if they are not affected? Eg. 1 minute, 1 hour, 1 day or 1 week? Then, during SRU verification, if we wait (say) double that time, then we can have some documented confidence that the bug is fixed. Thanks!

Changed in usbguard (Ubuntu Focal):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-focal
Revision history for this message
Robie Basak (racb) wrote : Please test proposed package

Hello Jamie, or anyone else affected,

Accepted usbguard into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/usbguard/0.7.6+ds-1ubuntu1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Revision history for this message
Heitor Alves de Siqueira (halves) wrote :

Thanks, Robie! Our users reported that issues usually appear within one or two hours, but that's likely tied to machine-specific workloads.

I've been able to force this issue to occur in a VM when running the test script from the description together with stress-ng memory/vm stressors. The "failed to read" logs then trigger consistently within a few minutes.

Considering that we do need memory pressure for the ENOBUFS issues to happen, I think this is a valid test scenario. I'll add this to the test plan section of the bug, and see if I can do a long duration test with the package from proposed.

Revision history for this message
Heitor Alves de Siqueira (halves) wrote :
Download full text (3.6 KiB)

# VERIFICATION FOCAL

Validation performed on a 6-core VM with 8Gb or RAM. The following stress-ng command was running throughout the test from the description:
# stress-ng --class memory --class vm --all 1 --timeout 96h

Afterwards, the udevadm test loop was executed as below:
# date; while /bin/true; do
    sudo udevadm control --reload-rules
    sudo udevadm trigger
    sudo udevadm settle --timeout=3
  done
Thu Apr 11 19:19:18 UTC 2024
...

A few minutes later, the errors started showing up on journald:
# journalctl --follow -b -u usbguard | grep -A2 "failed to read"
Apr 11 19:21:59 z-rotomvm34 usbguard-daemon[197214]: [1712863318.753] (E) ueventProcessRead: failed to read pending uevent: rc=-1 errno=105
Apr 11 19:21:59 z-rotomvm34 usbguard-daemon[197214]: [1712863318.755] (E) UEventDeviceManager thread: UEvent device manager: recvmsg: No buffer space available

..and usbguard became unresponsive.

This seems good indication that the reproducer is valid. I then upgraded to the usbguard version from proposed and repeated the tests (plus stress-ng workload). The VM has been running for many hours since then, and journald indicates that we've triggered the ENOBUFS issues more than a few times. usbguard-daemon continued chugging along and responding to events, so we can mark this as verified.

Below are the verification logs.
###

$ dpkg -l | grep usbguard
ii libusbguard0 0.7.6+ds-1ubuntu1 amd64 USB device authorization policy framework - shared library
ii usbguard 0.7.6+ds-1ubuntu1 amd64 USB device authorization policy framework

$ date; while /bin/true; do
  sudo udevadm control --reload-rules
  sudo udevadm trigger
  sudo udevadm settle --timeout=3
done
Tue Apr 16 10:20:33 UTC 2024

# stress-ng --class memory --class vm --all 1 --timeout 96h

$ sudo journalctl -b -u usbguard | grep -A1 "failed to read"
Apr 16 13:40:40 z-rotomvm34 usbguard-daemon[712]: [1713274840.663] (E) ueventProcessRead: failed to read pending uevent (returning): rc=
-1 errno=105
Apr 16 13:40:40 z-rotomvm34 usbguard-daemon[712]: [1713274840.827] (A) uid=0 pid=712 result='SUCCESS' device.rule='allow id 1d6b:0001 se
rial "0000:00:04.0" name "UHCI Host Controller" hash "sKXn6PthDDlGgdxZHdnlUQ9DROkH/YSojkBlfpcnsaU=" parent-hash "9Ii0Zm8Mvu2nYz9z/EgAXJ/ed6bLW8Ctv1iUD5rh6qY=" via-port "usb2" with-interface 09:00:00 with-connect-type ""' device.system_name='/devices/pci0000:00/0000:00:04.0/usb2' type='Device.Present'
--
Apr 16 14:15:45 z-rotomvm34 usbguard-daemon[712]: [1713276945.707] (E) ueventProcessRead: failed to read pending uevent (returning): rc=-1 errno=105
Apr 16 14:15:45 z-rotomvm34 usbguard-daemon[712]: [1713276945.709] (A) uid=0 pid=712 result='SUCCESS' device.rule='allow id 0627:0001 serial "28754-0000:00:04.7-1" name "QEMU USB Tablet" hash "74abnf32ZntGQv1XK1k5t/IqWo3wxePUnjpidwCzMk4=" parent-hash "CsKOZ6IY8v3eojsc1fqKDW84V+MMhD6HsjjojcZBjSg=" via-port "1-1" with-interface 03:00:00 with-connect-type "unknown"' device.system_name='/devices/pci0000:00/0000:00:04.7/usb1/1-1' type='Device.Present'
--
Apr 16 14:43:09 z-rotomvm34 usbguard-daemon[712]: [1713278589.740]...

Read more...

tags: added: verification-done verification-done-focal
removed: verification-needed verification-needed-focal
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.