Bug #2067631 “New oom-killer related crash for low RAM UC device...” : Bugs : snapd

Fernando Bravo Hernández (ferbraher) on 2024-05-30

description:

updated

Massimiliano Girardi (hook25) on 2024-05-31

description:

updated

Zygmunt Krynicki (zyga) on 2024-05-31

Changed in snapd:
assignee:	nobody → Zygmunt Krynicki (zyga)

Massimiliano Girardi (hook25) on 2024-05-31

description:

updated

Zygmunt Krynicki (zyga) wrote on 2024-06-04 (last edit on 2024-06-04):

#1

I've filed a Jira task, https://warthogs.atlassian.net/browse/SNAPDENG-22672, for internal scheduling.

Changed in snapd:
status:	New → In Progress

Zygmunt Krynicki (zyga) wrote on 2024-06-04:

#2

During the x86_64/efi core24 boot up process, with snapd 2.63, using the pre-created image from cdimage.ubuntu.com:

d82b5b9b86e7b592dc6b48edfbce7b16be6c5064c779db91f720a1ce071622a0 ubuntu-core-24-amd64.img.xz

I hit an OOM during boot-up, when snapd.recovery-chooser-trigger.service is killed by the OOM killer:

Jun 04 10:58:03 localhost systemd[1]: Starting snapd.recovery-chooser-trigger.service - Wait for the Ubuntu Core chooser trigger...
Jun 04 10:58:03 localhost snap-bootstrap[147]: cmd_recovery_chooser_trigger.go:91: trigger wait timeout 10s
Jun 04 10:58:03 localhost snap-bootstrap[147]: cmd_recovery_chooser_trigger.go:92: device timeout 2s
Jun 04 10:58:03 localhost snap-bootstrap[147]: cmd_recovery_chooser_trigger.go:93: marker file /run/snapd-recovery-chooser-triggered
Jun 04 10:58:03 localhost snap-bootstrap[147]: triggerwatch.go:108: waiting for trigger key: KEY_1
Jun 04 10:58:03 localhost snap-bootstrap[147]: evdev.go:91: isa0060/serio0/input0: AT Translated Set 2 keyboard: starting wait, hold 2s to trigger
Jun 04 10:58:03 localhost snap-bootstrap[147]: evdev.go:91: isa0060/serio0/input0: AT Translated Set 2 keyboard: starting wait, hold 2s to trigger
Jun 04 10:58:08 localhost snap-bootstrap[147]: triggerwatch.go:146: Switching root
Jun 04 10:58:10 localhost systemd[1]: snapd.recovery-chooser-trigger.service: A process of this unit has been killed by the OOM killer.
Jun 04 10:58:17 localhost systemd[1]: Starting snapd.recovery-chooser-trigger.service - Wait for the Ubuntu Core chooser trigger...
Jun 04 10:58:19 localhost snap-bootstrap[904]: cmd_recovery_chooser_trigger.go:91: trigger wait timeout 10s
Jun 04 10:58:19 localhost snap-bootstrap[904]: cmd_recovery_chooser_trigger.go:92: device timeout 2s
Jun 04 10:58:19 localhost snap-bootstrap[904]: cmd_recovery_chooser_trigger.go:93: marker file /run/snapd-recovery-chooser-triggered
Jun 04 10:58:19 localhost snap-bootstrap[904]: triggerwatch.go:108: waiting for trigger key: KEY_1
Jun 04 10:58:19 localhost snap-bootstrap[904]: evdev.go:91: isa0060/serio0/input0: AT Translated Set 2 keyboard: starting wait, hold 2s to trigger
Jun 04 10:58:32 localhost systemd[1]: snapd.recovery-chooser-trigger.service: Main process exited, code=killed, status=9/KILL
Jun 04 10:58:32 localhost systemd[1]: snapd.recovery-chooser-trigger.service: Failed with result 'signal'.
Jun 04 10:58:32 localhost systemd[1]: Failed to start snapd.recovery-chooser-trigger.service - Wait for the Ubuntu Core chooser trigger.
Jun 04 10:58:32 localhost systemd[1]: snapd.recovery-chooser-trigger.service: Consumed 4.889s CPU time.

Rolling back snapd to 2.62 (21470), I get the same kind of crash on boot:

[ 10.166281] Out of memory: Killed process 147 (snap-bootstrap) total-vm:2240016kB, anon-rss:400284kB, file-rss:0kB, shmem-rss:9216kB, UID:0 pgtables:1004kB oom_score_adj:0
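
The kernel line above carries the per-process memory figures that matter for triage. A minimal sketch of pulling them out, assuming the line format shown here; the function and field names are made up for illustration and are not part of snapd or any kernel tooling:

```python
import re

# Pattern for the kernel's "Out of memory: Killed process" line quoted above.
KILL_RE = re.compile(
    r"Out of memory: Killed process (?P<pid>\d+) \((?P<comm>[^)]+)\) "
    r"total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB, "
    r"file-rss:(?P<file_rss>\d+)kB, shmem-rss:(?P<shmem_rss>\d+)kB"
)


def parse_oom_kill(line):
    """Return pid, command name and memory figures (kB) from an OOM kill line, or None."""
    match = KILL_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    result = {"pid": int(fields["pid"]), "comm": fields["comm"]}
    for key in ("total_vm", "anon_rss", "file_rss", "shmem_rss"):
        result[key + "_kb"] = int(fields[key])
    # Resident memory is the sum of the anonymous, file-backed and shmem parts.
    result["rss_kb"] = (
        result["anon_rss_kb"] + result["file_rss_kb"] + result["shmem_rss_kb"]
    )
    return result


line = (
    "[ 10.166281] Out of memory: Killed process 147 (snap-bootstrap) "
    "total-vm:2240016kB, anon-rss:400284kB, file-rss:0kB, shmem-rss:9216kB, "
    "UID:0 pgtables:1004kB oom_score_adj:0"
)
info = parse_oom_kill(line)
print(info["comm"], info["rss_kb"])
```

For the line quoted above, this reports snap-bootstrap with 409500 kB resident, almost all of it anonymous memory.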

Jun 04 11:29:43 localhost systemd[1]: Starting snapd.recovery-chooser-trigger.service - Wait for the Ubuntu Core chooser trigger...
Jun 04 11:29:43 localhost snap-bootstrap[147]: cmd_recovery_chooser_trigger.go:91: trigger wait timeout 10s
Jun 04 11:29:43 localhost snap-bootstrap[147]: cmd_recovery_chooser_trigger.go:92: device timeout 2s
Jun 04 11:29:43 localhost snap-boot...

Tested snapd revisions:

snapd         2.62                             21473  latest/stable  canonical✓           snapd
snapd         2.63                             21761  latest/stable  canonical✓           snapd

I've limited my Pi 3B+ board to 400MB of memory by adding a constraint to piboot's config.txt:

zyga@pi3-2:~$ tail -n2 /run/mnt/ubuntu-seed/config.txt

total_mem=400

That leaves 304MB of total memory for Linux (some of the 400MB of "total" memory is reserved for the VideoCore):

zyga@pi3-2:~$ free -m
               total        used        free      shared  buff/cache   available
Mem:             304         139          18           4         158         164
Swap:              0           0           0

zyga@pi3-2:~$ cat /etc/os-release 
NAME="Ubuntu Core"
VERSION="24"
ID=ubuntu-core
PRETTY_NAME="Ubuntu Core 24"
VERSION_ID="24"
HOME_URL="https://snapcraft.io/"
BUG_REPORT_URL="https://bugs.launchpad.net/snappy/"

zyga@pi3-2:~$ snap list
Name          Version                          Rev    Tracking       Publisher            Notes
checkbox      3.3.0-dev19                      5219   uc18/stable    ce-certification-qa  devmode
checkbox18    3.3.0-dev19                      3005   latest/stable  ce-certification-qa  -
console-conf  24.04.1+git45g5f9fae19+gd81a15d  41     24/stable      canonical✓           -
core18        20240416                         2826   latest/stable  canonical✓           base
core24        20240528                         424    latest/stable  canonical✓           base
pi            24-1                             142    24/stable      canonical✓           gadget
pi-kernel     6.8.0-1005.5                     852    24/stable      canonical✓           kernel
snapd         2.63                             21761  latest/stable  canonical✓           snapd

Even with this I'm able to run checkbox:

sudo checkbox.checkbox-cli run com.canonical.certification::sru-ubuntucore

(lots of output removed)

☑ : Collect information about the CPU
 ☐ : Attach detailed sysfs property output from udev
 ☑ : Create resource info for environment variables
 ☑ : Collect information about installed system (lsb-release)
 ☑ : Attach a copy of /proc/cmdline
 ☑ : Collect information about installed software packages
 ☑ : Attach dump of udev database
 ☑ : Collect information about system memory (/proc/meminfo)
 ☑ : Attach the contents of /etc/modprobe.*
 ☑ : Attach PCI configuration space hex dump
 ☑ : Collect information about the running kernel
 ☑ : Attaches json dumps of udev_resource.py
 ☑ : Provide links to requirements documents
 ☐ : Attaches json dumps of system info tools
 ☑ : Collect information about installed snap packages
 ☑ : Attaches json dumps of installed dkms package information.
 ☑ : Enumerate available system executables
 ☑ : Resource to detect if dmi data is present
 ☐ : Attaches json dumps of raw dmi devices
 ☑ : Collect information about installation media (casper)
 ☑ : Collect information about dpkg version
 ☑ : Collect information about kernel modules
 ☐ : Attach a copy of /sys/class/dmi/id/*
 ☑ : Collect information about hardware devices (udev)
 ☐ : Collect information about hardware devices (DMI)
 ☑ : Attach info block devices and their mount points
 ☑ : Collect information about the EFI configuration
 ☐ : Check that data for a complete result are present
 ☑ : acpi_sleep_attachment
 ☒ : codecs_attachment
 ☑ : Attach a copy of /proc/cpuinfo
 ☑ : Attach a copy of dmesg
 ☐ : Attach output of dmidecode
 ☑ : Attaches firmware version info
 ☑ : Attach a list of PCI devices
 ☑ : Attach copy of /proc/meminfo
 ☑ : Attach the contents of /etc/modprobe.*
 ☑ : Attach the contents of /etc/modules
 ☑ : Identify what service is managing each physical network interface
 ☑ : Collect logging from the net_if_management job
 ☑ : Attach sysctl configuration files.
 ☑ : Attach a list of currently running kernel modules
 ☑ : Hardware Manifest
 ☐ : Check that at least one audio capture device exists
 ☐ : Check that at least one audio playback device exists
 ☐ : Captured sound matches played one (automated)
 ☐ : bluetooth/detect-output
 ☑ : Test the CPU scaling capabilities
 ☑ : Attach CPU scaling capabilities log
 ☑ : cpu_offlining
 ☐ : Test offlining of each CPU core
 ☐ : Check CPU topology for accuracy between proc and sysfs
 ☐ : Automated test of SD Card reading & writing (udisk2)
 ☑ : Check amount of memory reported by meminfo against DMI
 ☐ : Detect if at least one ethernet device is detected
 ☑ : Gather info on current state of network devices
 ☒ : networking/http
 ☑ : Can ping another machine over Ethernet port enxb827eb9f89d1
 ☑ : Creates resource info for RTC
 ☐ : Test that RTC functions properly (if present)
 ☑ : power-management/fwts_wakealarm
 ☑ : power-management/fwts_wakealarm-log-attach
 ☑ : Display USB devices attached to SUT
 ☐ : Test USB 2.0 or 1.1 ports
 ☑ : Collect information about connections
 ☐ : Test system can discover Wi-Fi networks on wlan0
 ☐ : Connect to WPA-encrypted 802.11b/g Wi-Fi network on wlan0
 ☐ : Connect to unencrypted 802.11b/g Wi-Fi network on wlan0
 ☐ : Connect to WPA-encrypted 802.11n Wi-Fi network on wlan0
 ☐ : Connect to unencrypted 802.11n Wi-Fi network on wlan0
 ☑ : Resource job to identify Wi-Fi STA supported protocols
 ☐ : Connect to WPA-encrypted 802.11ac Wi-Fi network on wlan0
 ☐ : Connect to unencrypted 802.11ac Wi-Fi network on wlan0
 ☒ : Test system can get beacon EddyStone URL advertisements on the hci0 adapter
 ☑ : Verify that all the CPUs are online before suspending
 ☑ : Store memory info before suspending
 ☑ : Create resource info for supported sleep states
 ☐ : Automated test of suspend function
 ☐ : Automated check of the suspend log for errors reported by fwts
 ☐ : Attaches the log from the single suspend/resume test
 ☐ : Network reconnect resume test (wired)
 ☐ : Network reconnect resume test (wifi)
 ☐ : Test USB 2.0 or 1.1 ports after suspend (S3)
 ☐ : Captured sound matches played one (automated) after suspend (S3)
 ☐ : Test system can get beacon EddyStone URL advertisements on the hci0 adapter after suspend (S3)
 ☐ : Verify that all the CPUs are online after suspending
 ☐ : Compare memory info to the state prior to suspend
 ☐ : Connect to unencrypted 802.11b/g Wi-Fi network on wlan0 after suspend (S3)
 ☐ : Connect to unencrypted 802.11n Wi-Fi network on wlan0 after suspend (S3)
 ☐ : Connect to WPA-encrypted 802.11b/g Wi-Fi network on wlan0 after suspend (S3)
 ☐ : Connect to WPA-encrypted 802.11n Wi-Fi network on wlan0 after suspend (S3)
 ☐ : Connect to WPA-encrypted 802.11ac Wi-Fi network on wlan0 after suspend (S3)
 ☐ : Connect to unencrypted 802.11ac Wi-Fi network on wlan0 after suspend (S3)

Looking at the root -.slice, I can see that I still have lots of headroom:

zyga@pi3-2:~$ systemctl status -- -.slice | grep '[M]emory'
     Memory: 222.1M ()

As such I cannot yet repeat the crash you've seen.

Zygmunt Krynicki (zyga) wrote on 2024-06-10:

#9

My Raspberry Pi 3B+ no longer boots core24 with the memory limit set to 395MB (assuming 128MB is used by the VideoCore side, this leaves just 267MB of memory for all of Linux). I have nothing on the serial port, but there's a backtrace on HDMI where plymouthd crashes somewhere in freetype.
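
The arithmetic in the paragraph above can be written down as a small helper. This is only a sketch of the simple model used in this comment (Linux gets total_mem minus the VideoCore share); the function name is hypothetical, and the total later reported by `free -m` is lower than this model predicts, presumably because the kernel reserves some memory for itself:

```python
def linux_memory_mb(total_mem, gpu_mem):
    """Memory left for Linux under the simple model total_mem - gpu_mem (MB)."""
    if gpu_mem > total_mem:
        raise ValueError("gpu_mem cannot exceed total_mem")
    return total_mem - gpu_mem


# The non-booting case: total_mem=395 with an assumed 128MB VideoCore share.
print(linux_memory_mb(395, 128))
# The config.txt values used later in this comment: total_mem=395, gpu_mem=64.
print(linux_memory_mb(395, 64))
```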

With some more memory I was able to move to checkbox for core-24 from edge:

zyga@pi3-2:~$ snap list
Name          Version                          Rev    Tracking       Publisher            Notes
checkbox      4.0.0-dev297                     7268   uc24/edge      ce-certification-qa  devmode
checkbox18    3.3.0-dev19                      3005   latest/stable  ce-certification-qa  disabled
checkbox24    4.0.0-dev297                     25     latest/edge    ce-certification-qa  -
console-conf  24.04.1+git45g5f9fae19+gd81a15d  41     24/stable      canonical✓           -
core18        20240416                         2826   latest/stable  canonical✓           base
core24        20240528                         424    latest/stable  canonical✓           base
pi            24-1                             142    24/stable      canonical✓           gadget
pi-kernel     6.8.0-1005.5                     852    24/stable      canonical✓           kernel
snapd         2.63                             21761  latest/stable  canonical✓           snapd

Note that "some more memory" is still significantly less than 512MB:

zyga@pi3-2:~$ free -m
               total        used        free      shared  buff/cache   available
Mem:             299         150          57           4         104         148
Swap:              0           0           0

Again I limited memory with config.txt:

# XXX: artificially limit available memory to stress-test the system.
total_mem=395
gpu_mem=64

With this limit I can complete a checkbox run. While not the best overall estimate, the peak memory use in system.slice was about 180MiB (1024-based units):

zyga@pi3-2:/sys/fs/cgroup$ cat system.slice/memory.peak 
190345216
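
For reference, memory.peak is reported in bytes. A small sketch of the conversion, showing both the 1024-based and the 1000-based figure; the helper names are made up for illustration:

```python
def bytes_to_mib(value):
    """Convert a byte count, as reported by memory.peak, to MiB (1024-based)."""
    return value / (1024 * 1024)


def bytes_to_mb(value):
    """Convert a byte count to MB (1000-based), for comparison."""
    return value / 1_000_000


peak = 190345216  # value printed by `cat system.slice/memory.peak` above
print(f"{bytes_to_mib(peak):.1f} MiB, {bytes_to_mb(peak):.1f} MB")
```

This prints 181.5 MiB and 190.3 MB for the value above, so the choice of unit accounts for a difference of roughly 9 in the headline number.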

The result I've obtained:

Finalizing session that hasn't been submitted anywhere: checkbox-run-2024-06-10T08.58.22
==================================[ Results ]===================================
 ☑ : Collect information about system memory (/proc/meminfo)
 ☑ : Enumerate available system executables
 ☑ : Attach information about block devices and their mount points.
 ☑ : Collect information about the CPU
 ☐ : Attaches JSON dumps of system info tools
 ☑ : Provide links to requirements documents
 ☑ : Attach a copy of /proc/cmdline
 ☑ : Collect information about installed software packages
 ☑ : Resource to detect if dmi data is present
 ☐ : Attaches JSON dumps of raw DMI devices
 ☑ : Collect information about the running kernel
 ☑ : Collect information about the EFI configuration
 ☑ : Collect information about installed system (lsb-release)
 ☑ : Attaches json dumps of installed dkms package information.
 ☑ : Create resource info for environment variables
 ☐ : Attach detailed sysfs property output from udev
 ☑ : Collect information about hardware devices (udev)
 ☒ : Collect information about installed snap packages
 ☑ : Collect information about kernel modules
 ☑ : Attach PCI configuration space hex dump
 ☐ : Attach a copy of /sys/class/dmi/id/*
 ☐ : Collect information about hardware devices (DMI)
 ☑ : Collect information about dpkg version
 ☑ : Attaches JSON dumps of udev resource information.
 ☑ : Attach dump of udev database
 ☑ : Attach the contents of /etc/modprobe.*
 ☑ : Collect information about installation media (casper)
 ☐ : Check that data for a complete result are present
 ☑ : Attach the contents of /proc/acpi/sleep for further analysis.
 ☒ : Attach a report of installed codecs for Intel HDA.
 ☑ : Attach a copy of /proc/cpuinfo
 ☑ : Attach a copy of dmesg or the current dmesg buffer to the test results.
 ☐ : Attach output of dmidecode
 ☑ : Attaches firmware version info
 ☑ : Attach a list of PCI devices
 ☑ : Attach copy of /proc/meminfo
 ☑ : Attach the contents of /etc/modprobe.*
 ☑ : Attach the contents of /etc/modules
 ☑ : Identify what service is managing each physical network interface
 ☑ : Collect logging from the net_if_management job
 ☑ : Attach sysctl configuration files.
 ☑ : Attach a list of currently running kernel modules
 ☑ : Hardware Manifest
 ☐ : Check that at least one audio capture device exists
 ☐ : Check that at least one audio playback device exists
 ☐ : Captured sound matches played one (automated)
 ☐ : bluetooth/detect-output
 ☑ : Test the CPU scaling capabilities
 ☑ : Attach CPU scaling capabilities log
 ☑ : cpu_offlining
 ☐ : Test offlining of each CPU core
 ☐ : Check CPU topology for accuracy between proc and sysfs
 ☐ : Automated test of SD Card reading & writing (udisks2)
 ☑ : Check the amount of memory reported by meminfo against DMI
 ☐ : Detect if at least one ethernet device is detected
 ☑ : Gather info on the current state of network devices
 ☒ : Ensure downloading files through HTTP works correctly.
 ☑ : Can ping the gateway with any cable Ethernet interface
 ☑ : Creates resource info for RTC
 ☐ : Test that RTC functions properly (if present)
 ☑ : Executes ACPI Wakealarm test to validate functionality.
 ☑ : Attach and display fwts wakealarm test log.
 ☑ : Display USB devices attached to SUT
 ☐ : Test USB 2.0 or 1.1 ports
 ☒ : Collect information about connections
 ☐ : Test system can discover Wi-Fi networks on wlan0
 ☐ : Connect to WPA-encrypted 802.11b/g Wi-Fi network on wlan0
 ☐ : Connect to an unencrypted 802.11b/g Wi-Fi network on wlan0
 ☐ : Connect to a WPA-encrypted 802.11n Wi-Fi network on wlan0
 ☐ : Connect to an unencrypted 802.11n Wi-Fi network on wlan0
 ☑ : Resource job to identify Wi-Fi STA supported protocols
 ☐ : Connect to WPA-encrypted 802.11ac Wi-Fi network on wlan0
 ☐ : Connect to unencrypted 802.11ac Wi-Fi network on wlan0
 ☒ : Test system can get beacon EddyStone URL advertisements on the hci0 adapter
 ☑ : Verify that all the CPUs are online before suspending
 ☑ : Store memory info before suspending
 ☑ : Create resource info for supported sleep states
 ☐ : Automated test of suspend function
 ☐ : Automated check of the suspend log for errors reported by fwts
 ☐ : Attaches the log from the single suspend/resume test
 ☐ : Network reconnect resume test (wired)
 ☐ : Network reconnect resume test (wifi)
 ☐ : Test USB 2.0 or 1.1 ports after suspend (S3)
 ☐ : Captured sound matches played one (automated) after suspend (S3)
 ☐ : Test system can get beacon EddyStone URL advertisements on the hci0 adapter after suspend (S3)
 ☐ : Verify that all the CPUs are online after suspending
 ☐ : Compare memory info to the state prior to suspend
 ☐ : Connect to an unencrypted 802.11b/g Wi-Fi network on wlan0 after suspend (S3)
 ☐ : Connect to an unencrypted 802.11n Wi-Fi network on wlan0 after suspend (S3)
 ☐ : Connect to WPA-encrypted 802.11b/g Wi-Fi network on wlan0 after suspend (S3)
 ☐ : Connect to a WPA-encrypted 802.11n Wi-Fi network on wlan0 after suspend (S3)
 ☐ : Connect to WPA-encrypted 802.11ac Wi-Fi network on wlan0 after suspend (S3)
 ☐ : Connect to unencrypted 802.11ac Wi-Fi network on wlan0 after suspend (S3)

All of this was done with snapd 2.63. Reverting to snapd 2.62 I have the following results:

==================================[ Results ]===================================
 ☑ : Collect information about dpkg version
 ☑ : Collect information about the running kernel
 ☑ : Collect information about the CPU
 ☐ : Attaches JSON dumps of system info tools
 ☑ : Attaches json dumps of installed dkms package information.
 ☑ : Collect information about installation media (casper)
 ☑ : Create resource info for environment variables
 ☑ : Resource to detect if dmi data is present
 ☐ : Collect information about hardware devices (DMI)
 ☑ : Attach a copy of /proc/cmdline
 ☑ : Attach the contents of /etc/modprobe.*
 ☑ : Attach dump of udev database
 ☑ : Collect information about installed software packages
 ☑ : Collect information about system memory (/proc/meminfo)
 ☑ : Collect information about installed system (lsb-release)
 ☑ : Attach PCI configuration space hex dump
 ☐ : Attach detailed sysfs property output from udev
 ☒ : Collect information about installed snap packages
 ☑ : Enumerate available system executables
 ☑ : Attach information about block devices and their mount points.
 ☐ : Attaches JSON dumps of raw DMI devices
 ☑ : Provide links to requirements documents
 ☐ : Attach a copy of /sys/class/dmi/id/*
 ☑ : Collect information about kernel modules
 ☑ : Collect information about hardware devices (udev)
 ☑ : Attaches JSON dumps of udev resource information.
 ☑ : Collect information about the EFI configuration
 ☐ : Check that data for a complete result are present
 ☑ : Attach the contents of /proc/acpi/sleep for further analysis.
 ☒ : Attach a report of installed codecs for Intel HDA.
 ☑ : Attach a copy of /proc/cpuinfo
 ☑ : Attach a copy of dmesg or the current dmesg buffer to the test results.
 ☐ : Attach output of dmidecode
 ☑ : Attaches firmware version info
 ☑ : Attach a list of PCI devices
 ☑ : Attach copy of /proc/meminfo
 ☑ : Attach the contents of /etc/modprobe.*
 ☑ : Attach the contents of /etc/modules
 ☑ : Identify what service is managing each physical network interface
 ☑ : Collect logging from the net_if_management job
 ☑ : Attach sysctl configuration files.
 ☑ : Attach a list of currently running kernel modules
 ☑ : Hardware Manifest
 ☐ : Check that at least one audio capture device exists
 ☐ : Check that at least one audio playback device exists
 ☐ : Captured sound matches played one (automated)
 ☐ : bluetooth/detect-output
 ☑ : Test the CPU scaling capabilities
 ☑ : Attach CPU scaling capabilities log
 ☑ : cpu_offlining
 ☐ : Test offlining of each CPU core
 ☐ : Check CPU topology for accuracy between proc and sysfs
 ☐ : Automated test of SD Card reading & writing (udisks2)
 ☑ : Check the amount of memory reported by meminfo against DMI
 ☐ : Detect if at least one ethernet device is detected
 ☑ : Gather info on the current state of network devices
 ☒ : Ensure downloading files through HTTP works correctly.
 ☑ : Can ping the gateway with any cable Ethernet interface
 ☑ : Creates resource info for RTC
 ☐ : Test that RTC functions properly (if present)
 ☑ : Executes ACPI Wakealarm test to validate functionality.
 ☑ : Attach and display fwts wakealarm test log.
 ☑ : Display USB devices attached to SUT
 ☐ : Test USB 2.0 or 1.1 ports
 ☒ : Collect information about connections
 ☐ : Test system can discover Wi-Fi networks on wlan0
 ☐ : Connect to WPA-encrypted 802.11b/g Wi-Fi network on wlan0
 ☐ : Connect to an unencrypted 802.11b/g Wi-Fi network on wlan0
 ☐ : Connect to a WPA-encrypted 802.11n Wi-Fi network on wlan0
 ☐ : Connect to an unencrypted 802.11n Wi-Fi network on wlan0
 ☑ : Resource job to identify Wi-Fi STA supported protocols
 ☐ : Connect to WPA-encrypted 802.11ac Wi-Fi network on wlan0
 ☐ : Connect to unencrypted 802.11ac Wi-Fi network on wlan0
 ☒ : Test system can get beacon EddyStone URL advertisements on the hci0 adapter
 ☑ : Verify that all the CPUs are online before suspending
 ☑ : Store memory info before suspending
 ☑ : Create resource info for supported sleep states
 ☐ : Automated test of suspend function
 ☐ : Automated check of the suspend log for errors reported by fwts
 ☐ : Attaches the log from the single suspend/resume test
 ☐ : Network reconnect resume test (wired)
 ☐ : Network reconnect resume test (wifi)
 ☐ : Test USB 2.0 or 1.1 ports after suspend (S3)
 ☐ : Captured sound matches played one (automated) after suspend (S3)
 ☐ : Test system can get beacon EddyStone URL advertisements on the hci0 adapter after suspend (S3)
 ☐ : Verify that all the CPUs are online after suspending
 ☐ : Compare memory info to the state prior to suspend
 ☐ : Connect to an unencrypted 802.11b/g Wi-Fi network on wlan0 after suspend (S3)
 ☐ : Connect to an unencrypted 802.11n Wi-Fi network on wlan0 after suspend (S3)
 ☐ : Connect to WPA-encrypted 802.11b/g Wi-Fi network on wlan0 after suspend (S3)
 ☐ : Connect to a WPA-encrypted 802.11n Wi-Fi network on wlan0 after suspend (S3)
 ☐ : Connect to WPA-encrypted 802.11ac Wi-Fi network on wlan0 after suspend (S3)
 ☐ : Connect to unencrypted 802.11ac Wi-Fi network on wlan0 after suspend (S3)

The memory stats for that run are as follows. Freshly after booting:

zyga@pi3-2:~$ cat /sys/fs/cgroup/system.slice/memory.peak 
160915456

zyga@pi3-2:~$ cat /sys/fs/cgroup/system.slice/memory.peak 
199782400

To clarify, the peak with 2.62 is higher than the one measured with 2.63. As such, I see no clearly measurable difference that would show snapd 2.63 using dramatically more memory than snapd 2.62 did.
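
The comparison can be put into numbers. A minimal sketch, assuming the second 2.62 reading (199782400) was taken after the checkbox run, as the 2.63 reading was; the function name is hypothetical:

```python
def compare_peaks(baseline, candidate):
    """Return (absolute difference in MiB, relative difference in percent)."""
    diff = candidate - baseline
    return diff / (1024 * 1024), 100.0 * diff / baseline


peak_263 = 190345216  # system.slice memory.peak after the run with snapd 2.63
peak_262 = 199782400  # second memory.peak reading with snapd 2.62 (assumed post-run)
diff_mib, diff_pct = compare_peaks(peak_263, peak_262)
print(f"2.62 peak is {diff_mib:.1f} MiB ({diff_pct:.1f}%) above the 2.63 peak")
```

With these two readings the 2.62 peak comes out 9.0 MiB (about 5%) above the 2.63 peak. A single pair of runs is not enough to call that a real difference in either direction.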

One important caveat is that I was neither in the certification environment (some tests got skipped, like the Wi-Fi connection tests) nor using the same hardware (I was using a Raspberry Pi 3B+, as the 3B does not boot with core 18, presumably due to u-boot). Given that I limited memory on my device to much less than what the certification team was using, and that I was able to complete the test on both versions of snapd with roughly the same amount of memory, I will now seek help from the cert team to debug what they are observing in their environment.

Zygmunt Krynicki (zyga) wrote on 2024-06-10:

#10

I've reproduced the failure:

[ 2174.310348] dmesg invoked oom-killer: gfp_mask=0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
[ 2174.310349] dmesg cpuset=/ mems_allowed=0
[ 2174.310362] CPU: 0 PID: 2226 Comm: dmesg Not tainted 4.15.0-225-generic #237-Ubuntu
[ 2174.310363] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 2024.02-2 03/11/2024
[ 2174.310363] Call Trace:
[ 2174.310392]  dump_stack+0x6d/0x8b
[ 2174.310395]  dump_header+0x71/0x282
[ 2174.310398]  ? ___ratelimit+0x9c/0x100
[ 2174.310410]  oom_kill_process+0x21f/0x420
[ 2174.310411]  out_of_memory+0x116/0x4e0
[ 2174.310412]  __alloc_pages_slowpath+0xa3d/0xe70
[ 2174.310417]  ? alloc_pages_current+0x6a/0xe0
[ 2174.310418]  __alloc_pages_nodemask+0x29a/0x2c0
[ 2174.310419]  alloc_pages_current+0x6a/0xe0
[ 2174.310422]  __page_cache_alloc+0x81/0xa0
[ 2174.310423]  filemap_fault+0x42f/0x750
[ 2174.310424]  ? filemap_map_pages+0x181/0x390
[ 2174.310427]  __do_fault+0x34/0x100
[ 2174.310428]  __handle_mm_fault+0x982/0xc50
[ 2174.310429]  handle_mm_fault+0xe7/0x260
[ 2174.310439]  __do_page_fault+0x281/0x4b0
[ 2174.310440]  ? __schedule+0x256/0x890
[ 2174.310441]  do_page_fault+0x2e/0xe0
[ 2174.310444]  ? async_page_fault+0x2f/0x50
[ 2174.310449]  do_async_page_fault+0x51/0x80
[ 2174.310449]  async_page_fault+0x45/0x50
[ 2174.310453] RIP: 0033:0x7f419bfeb100
[ 2174.310454] RSP: 002b:00007ffd5f9b4cf8 EFLAGS: 00010246
[ 2174.310477] RAX: 000055930a25d169 RBX: 00007f419c31c760 RCX: 000055930a25d16a
[ 2174.310477] RDX: 0000000000000001 RSI: 000055930a25d169 RDI: 000055930b43a8a7
[ 2174.310478] RBP: 0000000000000001 R08: 00007f419c31c760 R09: 0000000000000000
[ 2174.310478] R10: 0000000000000005 R11: 000055930a058f02 R12: 0000000000000001
[ 2174.310478] R13: 000055930a25d16a R14: 0000000000000000 R15: 0000000000000000
[ 2174.310481] Mem-Info:
[ 2174.310483] active_anon:32340 inactive_anon:996 isolated_anon:0
                active_file:236 inactive_file:281 isolated_file:0
                unevictable:0 dirty:0 writeback:0 unstable:0
                slab_reclaimable:3321 slab_unreclaimable:13715
                mapped:37 shmem:1375 pagetables:1063 bounce:0
                free:637 free_pcp:6 free_cma:0
[ 2174.310486] Node 0 active_anon:129360kB inactive_anon:3984kB active_file:944kB inactive_file:1124kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:148kB dirty:0kB writeback:0kB shmem:5500kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? yes
[ 2174.310487] Node 0 DMA free:868kB min:128kB low:160kB high:192kB active_anon:8228kB inactive_anon:608kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15000kB managed:14912kB mlocked:0kB kernel_stack:32kB pagetables:92kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 2174.310490] lowmem_reserve[]: 0 185 185 185 185
[ 2174.310491] Node 0 DMA32 free:1680kB min:1680kB low:2100kB high:2520kB active_anon:121132kB inactive_anon:3376kB active_file:944kB inactive_file:1124kB unevictable:0kB writepending:0kB present:240772kB managed:213444kB mlocked:0kB kernel_stack:1648kB pagetables:4160kB bounce:0kB free_pcp:24kB local_pcp:24kB free_cma:0kB
[ 2174.310494] lowmem_reserve[]: 0 0 0 0 0
[ 2174.310495] Node 0 DMA: 167*4kB (UME) 21*8kB (UME) 2*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 868kB
[ 2174.310498] Node 0 DMA32: 224*4kB (M) 0*8kB 1*16kB (U) 10*32kB (U) 7*64kB (U) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1680kB
[ 2174.310503] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 2174.310503] 1894 total pagecache pages
[ 2174.310505] 0 pages in swap cache
[ 2174.310506] Swap cache stats: add 0, delete 0, find 0/0
[ 2174.310506] Free swap  = 0kB
[ 2174.310506] Total swap = 0kB
[ 2174.310507] 63943 pages RAM
[ 2174.310507] 0 pages HighMem/MovableOnly
[ 2174.310507] 6854 pages reserved
[ 2174.310507] 0 pages cma reserved
[ 2174.310508] 0 pages hwpoisoned
[ 2174.310508] [ pid ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[ 2174.310510] [  448]     0   448    19706      167   167936        0             0 systemd-journal
[ 2174.310511] [  478]     0   478     8537      343   122880        0         -1000 systemd-udevd
[ 2174.310511] [  507]   104   507    19981      158   176128        0             0 systemd-network
[ 2174.310512] [  560]   105   560    17624      142   180224        0             0 systemd-resolve
[ 2174.310513] [  561]   103   561    35447      126   184320        0             0 systemd-timesyn
[ 2174.310513] [  632]   100   632    12454      149   135168        0          -900 dbus-daemon
[ 2174.310514] [  633]     0   633    11310      143   126976        0             0 wpa_supplicant
[ 2174.310515] [  646]     0   646    18135      168   176128        0             0 systemd-logind
[ 2174.310515] [  669]     0   669    27525     8529   258048        0             0 python3
[ 2174.310516] [  695]     0   695    18076      181   180224        0         -1000 sshd
[ 2174.310516] [  754]     0   754    19581      143   196608        0             0 login
[ 2174.310517] [  755]     0   755     3313       34    77824        0             0 agetty
[ 2174.310518] [  766]     0   766    26446      251   245760        0             0 sshd
[ 2174.310518] [  768]  1000   768    19264      346   188416        0             0 systemd
[ 2174.310526] [  769]  1000   769    48400      614   262144        0             0 (sd-pam)
[ 2174.310527] [  780]  1000   780    26446      253   237568        0             0 sshd
[ 2174.310527] [  781]  1000   781     5795      309    94208        0             0 bash
[ 2174.310528] [ 2193]     0  2193    26446      252   249856        0             0 sshd
[ 2174.310529] [ 2202]  1000  2202    26446      254   241664        0             0 sshd
[ 2174.310529] [ 2203]  1000  2203     5795      309    86016        0             0 bash
[ 2174.310530] [ 2226]  1000  2226     3200       44    69632        0             0 dmesg
[ 2174.310530] [ 2403]  1000  2403     5823      361    90112        0             0 bash
[ 2174.310531] [ 2436]  1000  2436    15458      122   151552        0             0 systemd-cgtop
[ 2174.310532] [ 2559]     0  2559   314660     3453   258048        0          -900 snapd
[ 2174.310532] [ 2769]     0  2769    15031      123   163840        0             0 sudo
[ 2174.310533] [ 2770]     0  2770   107873    14669   344064        0             0 python3
[ 2174.310533] [ 3154]     0  3154     2952       84    65536        0             0 bash
[ 2174.310534] [ 3726]     0  3726     1777       14    49152        0             0 udevadm
[ 2174.310535] [ 3727]     0  3727     3363       23    61440        0             0 sed
[ 2174.310535] Out of memory: Kill process 2770 (python3) score 250 or sacrifice child
[ 2174.312598] Killed process 3154 (bash) total-vm:11808kB, anon-rss:336kB, file-rss:0kB, shmem-rss:0kB
[ 2174.315011] audit: type=1400 audit(1718029659.207:350485): apparmor="ALLOWED" operation="capable" info="optional: no audit" error=-1 profile="snap.checkbox.agent" pid=2226 comm="dmesg" capability=21  capname="sys_admin"
[ 2174.315012] audit: type=1400 audit(1718029659.207:350486): apparmor="ALLOWED" operation="capable" info="optional: no audit" error=-1 profile="snap.checkbox.checkbox-cli" pid=2226 comm="dmesg" capability=21  capname="sys_admin"
[ 2174.324786] oom_reaper: reaped process 3154 (bash), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[ 2174.538675] audit: type=1326 audit(1718029659.435:350487): auid=1000 uid=0 gid=0 ses=4 pid=3726 comm="udevadm" exe="/bin/udevadm" sig=0 arch=c000003e syscall=9 compat=0 ip=0x7fd975beb123 code=0x7ffc0000
[ 2174.538698] audit: type=1326 audit(1718029659.435:350488): auid=1000 uid=0 gid=0 ses=4 pid=3726 comm="udevadm" exe="/bin/udevadm" sig=0 arch=c000003e syscall=10 compat=0 ip=0x7fd975beb1d7 code=0x7ffc0000
[ 2174.538707] audit: type=1326 audit(1718029659.435:350489): auid=1000 uid=0 gid=0 ses=4 pid=3726 comm="udevadm" exe="/bin/udevadm" sig=0 arch=c000003e syscall=9 compat=0 ip=0x7fd975beb123 code=0x7ffc0000
[ 2180.862174] systemd-logind invoked oom-killer: gfp_mask=0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
[ 2180.862176] systemd-logind cpuset=/ mems_allowed=0
[ 2180.862179] CPU: 0 PID: 646 Comm: systemd-logind Not tainted 4.15.0-225-generic #237-Ubuntu
[ 2180.862180] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 2024.02-2 03/11/2024
[ 2180.862180] Call Trace:
[ 2180.862186]  dump_stack+0x6d/0x8b
[ 2180.862187]  dump_header+0x71/0x282
[ 2180.862189]  ? ___ratelimit+0x9c/0x100
[ 2180.862191]  oom_kill_process+0x21f/0x420
[ 2180.862192]  out_of_memory+0x116/0x4e0
[ 2180.862193]  __alloc_pages_slowpath+0xa3d/0xe70
[ 2180.862195]  ? alloc_pages_current+0x6a/0xe0
[ 2180.862196]  __alloc_pages_nodemask+0x29a/0x2c0
[ 2180.862197]  alloc_pages_current+0x6a/0xe0
[ 2180.862198]  __page_cache_alloc+0x81/0xa0
[ 2180.862199]  filemap_fault+0x42f/0x750
[ 2180.862200]  ? filemap_map_pages+0x181/0x390
[ 2180.862201]  __do_fault+0x34/0x100
[ 2180.862202]  __handle_mm_fault+0x982/0xc50
[ 2180.862203]  handle_mm_fault+0xe7/0x260
[ 2180.862205]  __do_page_fault+0x281/0x4b0
[ 2180.862210]  ? wake_up_q+0x80/0x80
[ 2180.862211]  do_page_fault+0x2e/0xe0
[ 2180.862213]  ? async_page_fault+0x2f/0x50
[ 2180.862214]  do_async_page_fault+0x51/0x80
[ 2180.862235]  async_page_fault+0x45/0x50
[ 2180.862237] RIP: 0033:0x7f82dba41907
[ 2180.862238] RSP: 002b:00007ffc9c417c68 EFLAGS: 00010246
[ 2180.862238] RAX: 0000000000000001 RBX: 00005602500edef0 RCX: 00007f82dba41907
[ 2180.862239] RDX: 000000000000000f RSI: 00007ffc9c417c70 RDI: 0000000000000004
[ 2180.862239] RBP: 00007ffc9c417e30 R08: 00007ffc9c417c70 R09: 2e33623336656534
[ 2180.862240] R10: 00000000ffffffff R11: 0000000000000246 R12: 0000000000000286
[ 2180.862240] R13: ffffffffffffffff R14: 00007ffc9c417c70 R15: 0000000000000001
[ 2180.862241] Mem-Info:
[ 2180.862243] active_anon:32260 inactive_anon:996 isolated_anon:0
                active_file:214 inactive_file:291 isolated_file:32
                unevictable:0 dirty:0 writeback:0 unstable:0
                slab_reclaimable:3321 slab_unreclaimable:13727
                mapped:14 shmem:1375 pagetables:1054 bounce:0
                free:637 free_pcp:6 free_cma:0
[ 2180.862244] Node 0 active_anon:129040kB inactive_anon:3984kB active_file:856kB inactive_file:1164kB unevictable:0kB isolated(anon):0kB isolated(file):128kB mapped:56kB dirty:0kB writeback:0kB shmem:5500kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? yes
[ 2180.862245] Node 0 DMA free:868kB min:128kB low:160kB high:192kB active_anon:8228kB inactive_anon:608kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15000kB managed:14912kB mlocked:0kB kernel_stack:32kB pagetables:92kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 2180.862247] lowmem_reserve[]: 0 185 185 185 185
[ 2180.862248] Node 0 DMA32 free:1680kB min:1680kB low:2100kB high:2520kB active_anon:120812kB inactive_anon:3376kB active_file:856kB inactive_file:1164kB unevictable:0kB writepending:0kB present:240772kB managed:213444kB mlocked:0kB kernel_stack:1632kB pagetables:4124kB bounce:0kB free_pcp:24kB local_pcp:24kB free_cma:0kB
[ 2180.862250] lowmem_reserve[]: 0 0 0 0 0
[ 2180.862251] Node 0 DMA: 167*4kB (UME) 21*8kB (UME) 2*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 868kB
[ 2180.862254] Node 0 DMA32: 224*4kB (UM) 2*8kB (M) 0*16kB 10*32kB (M) 7*64kB (M) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1680kB
[ 2180.862257] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 2180.862257] 1914 total pagecache pages
[ 2180.862258] 0 pages in swap cache
[ 2180.862258] Swap cache stats: add 0, delete 0, find 0/0
[ 2180.862259] Free swap  = 0kB
[ 2180.862259] Total swap = 0kB
[ 2180.862259] 63943 pages RAM
[ 2180.862260] 0 pages HighMem/MovableOnly
[ 2180.862260] 6854 pages reserved
[ 2180.862260] 0 pages cma reserved
[ 2180.862260] 0 pages hwpoisoned
[ 2180.862261] [ pid ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[ 2180.862265] [  448]     0   448    19706      167   167936        0             0 systemd-journal
[ 2180.862266] [  478]     0   478     8537      343   122880        0         -1000 systemd-udevd
[ 2180.862267] [  507]   104   507    19981      158   176128        0             0 systemd-network
[ 2180.862268] [  560]   105   560    17624      142   180224        0             0 systemd-resolve
[ 2180.862269] [  561]   103   561    35447      126   184320        0             0 systemd-timesyn
[ 2180.862270] [  632]   100   632    12454      149   135168        0          -900 dbus-daemon
[ 2180.862270] [  633]     0   633    11310      143   126976        0             0 wpa_supplicant
[ 2180.862271] [  646]     0   646    18135      168   176128        0             0 systemd-logind
[ 2180.862271] [  669]     0   669    27525     8529   258048        0             0 python3
[ 2180.862272] [  695]     0   695    18076      181   180224        0         -1000 sshd
[ 2180.862273] [  754]     0   754    19581      143   196608        0             0 login
[ 2180.862274] [  755]     0   755     3313       34    77824        0             0 agetty
[ 2180.862274] [  766]     0   766    26446      251   245760        0             0 sshd
[ 2180.862282] [  768]  1000   768    19264      346   188416        0             0 systemd
[ 2180.862283] [  769]  1000   769    48400      614   262144        0             0 (sd-pam)
[ 2180.862284] [  780]  1000   780    26446      253   237568        0             0 sshd
[ 2180.862284] [  781]  1000   781     5795      309    94208        0             0 bash
[ 2180.862285] [ 2193]     0  2193    26446      252   249856        0             0 sshd
[ 2180.862286] [ 2202]  1000  2202    26446      254   241664        0             0 sshd
[ 2180.862287] [ 2203]  1000  2203     5795      309    86016        0             0 bash
[ 2180.862287] [ 2226]  1000  2226     3200       44    69632        0             0 dmesg
[ 2180.862288] [ 2403]  1000  2403     5823      361    90112        0             0 bash
[ 2180.862289] [ 2436]  1000  2436    15458      122   151552        0             0 systemd-cgtop
[ 2180.862289] [ 2559]     0  2559   314660     3453   258048        0          -900 snapd
[ 2180.862290] [ 2769]     0  2769    15031      123   163840        0             0 sudo
[ 2180.862291] [ 2770]     0  2770   107873    14669   344064        0             0 python3
[ 2180.862291] [ 3726]     0  3726     2297       15    53248        0             0 udevadm
[ 2180.862292] [ 3727]     0  3727     3363       26    61440        0             0 sed
[ 2180.862293] Out of memory: Kill process 2770 (python3) score 250 or sacrifice child
[ 2180.864342] Killed process 2770 (python3) total-vm:431492kB, anon-rss:58676kB, file-rss:0kB, shmem-rss:0kB
[ 2180.868824] kauditd_printk_skb: 37 callbacks suppressed
[ 2180.868825] audit: type=1400 audit(1718029665.756:350527): apparmor="ALLOWED" operation="capable" info="optional: no audit" error=-1 profile="snap.checkbox.agent" pid=646 comm="systemd-logind" capability=21  capname="sys_admin"
[ 2180.868827] audit: type=1400 audit(1718029665.756:350528): apparmor="ALLOWED" operation="capable" info="optional: no audit" error=-1 profile="snap.checkbox.checkbox-cli" pid=646 comm="systemd-logind" capability=21  capname="sys_admin"
[ 2180.876987] oom_reaper: reaped process 2770 (python3), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[ 2181.011360] audit: type=1326 audit(1718029665.904:350529): auid=1000 uid=0 gid=0 ses=4 pid=3726 comm="udevadm" exe="/bin/udevadm" sig=0 arch=c000003e syscall=257 compat=0 ip=0x7fd975beaebd code=0x7ffc0000
[ 2181.100489] audit: type=1326 audit(1718029665.992:350530): auid=1000 uid=0 gid=0 ses=4 pid=3726 comm="udevadm" exe="/bin/udevadm" sig=0 arch=c000003e syscall=0 compat=0 ip=0x7fd975beaf84 code=0x7ffc0000
[ 2181.110703] audit: type=1326 audit(1718029666.004:350531): auid=1000 uid=0 gid=0 ses=4 pid=3726 comm="udevadm" exe="/bin/udevadm" sig=0 arch=c000003e syscall=5 compat=0 ip=0x7fd975beae23 code=0x7ffc0000
[ 2181.143486] audit: type=1326 audit(1718029666.036:350532): auid=1000 uid=0 gid=0 ses=4 pid=3726 comm="udevadm" exe="/bin/udevadm" sig=0 arch=c000003e syscall=9 compat=0 ip=0x7fd975beb123 code=0x7ffc0000
[ 2181.143502] audit: type=1326 audit(1718029666.036:350533): auid=1000 uid=0 gid=0 ses=4 pid=3726 comm="udevadm" exe="/bin/udevadm" sig=0 arch=c000003e syscall=10 compat=0 ip=0x7fd975beb1d7 code=0x7ffc0000
[ 2181.143532] audit: type=1326 audit(1718029666.036:350534): auid=1000 uid=0 gid=0 ses=4 pid=3726 comm="udevadm" exe="/bin/udevadm" sig=0 arch=c000003e syscall=9 compat=0 ip=0x7fd975beb123 code=0x7ffc0000
[ 2181.174440] audit: type=1326 audit(1718029666.068:350535): auid=1000 uid=0 gid=0 ses=4 pid=3726 comm="udevadm" exe="/bin/udevadm" sig=0 arch=c000003e syscall=9 compat=0 ip=0x7fd975beb123 code=0x7ffc0000
[ 2181.174453] audit: type=1326 audit(1718029666.068:350536): auid=1000 uid=0 gid=0 ses=4 pid=3726 comm="udevadm" exe="/bin/udevadm" sig=0 arch=c000003e syscall=3 compat=0 ip=0x7fd975beb0b7 code=0x7ffc0000

Changed in snapd:
importance:	Undecided → High

Revision history for this message

Zygmunt Krynicki (zyga) wrote on 2024-06-10:

#11

I'm working on bisecting the failure. Having said that, the system _is_ running on fumes in terms of RAM. Checkbox is using a good fraction of available system RAM when the failure occurs (both python processes are checkbox).
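
To put a rough number on "a good fraction", the sketch below sums the rss column for the two python3 rows in the OOM task table above and compares it with the RAM the kernel actually manages. It is an illustration only, and the helper name, regular expression and row selection are assumptions made for this example. The rss column is taken to be in 4 KiB pages, which agrees with the dump itself (32340 active_anon pages are reported as 129360kB), and "managed" RAM is taken as "pages RAM" minus "pages reserved", which equals the sum of the managed: values of the DMA and DMA32 zones.

```python
import re

# Page size in KiB. Assumption backed by the dump: 32340 active_anon pages
# are reported as 129360kB, i.e. 4 kB per page.
PAGE_KIB = 4

# Three rows copied from the first OOM dump above, timestamps stripped.
TASK_ROWS = """\
[  669]     0   669    27525     8529   258048        0             0 python3
[ 2559]     0  2559   314660     3453   258048        0          -900 snapd
[ 2770]     0  2770   107873    14669   344064        0             0 python3
"""

# Columns: pid uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
ROW_RE = re.compile(
    r"\[\s*(\d+)\]\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(-?\d+)\s+(.+)"
)


def rss_kib_by_name(rows):
    """Sum resident set size (KiB) per process name (hypothetical helper)."""
    totals = {}
    for line in rows.splitlines():
        match = ROW_RE.search(line)
        if match is None:
            continue  # skip anything that is not a task-table row
        name = match.group(9).strip()
        totals[name] = totals.get(name, 0) + int(match.group(5)) * PAGE_KIB
    return totals


totals = rss_kib_by_name(TASK_ROWS)

# "63943 pages RAM" minus "6854 pages reserved", both taken from the dump.
managed_kib = (63943 - 6854) * PAGE_KIB
share = totals["python3"] / managed_kib

print(f"python3 rss: {totals['python3']} KiB of {managed_kib} KiB managed ({share:.1%})")
```

Under those assumptions the two python3 processes together hold roughly two fifths of managed RAM as resident memory. rss counts resident pages only, so this is an approximation of checkbox's footprint rather than an exact accounting.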

Zygmunt Krynicki (zyga) 45 minutes ago

Changed in snapd:
status:	In Progress → Fix Committed

snapd

New oom-killer related crash for low RAM UC devices
