fanctl causes hard system lockup with wrong subnet mask

Bug #2031553 reported by Alastair Flynn
This bug affects 1 person
Affects: ubuntu-fan (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

Running the command:

sudo fanctl up 240.0.0.0/8 192.168.0.0/24

triggers an error in the syslog. The whole OS then freezes minutes later. This is the error message in the syslog:

```
2023-08-16T16:30:49.491095+01:00 aflynn-T14 kernel: [ 326.049553] ------------[ cut here ]------------
2023-08-16T16:30:49.491120+01:00 aflynn-T14 kernel: [ 326.049561] Voluntary context switch within RCU read-side critical section!
2023-08-16T16:30:49.491122+01:00 aflynn-T14 kernel: [ 326.049573] WARNING: CPU: 8 PID: 2708 at kernel/rcu/tree_plugin.h:318 rcu_note_context_switch+0x2a7/0x2f0
2023-08-16T16:30:49.491124+01:00 aflynn-T14 kernel: [ 326.049590] Modules linked in: xt_nat nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables libcrc32c nfnetlink vxlan ip6_udp_tunnel udp_tunnel bridge stp llc ccm rfcomm cmac algif_hash algif_skcipher af_alg snd_seq_dummy snd_hrtimer bnep binfmt_misc nls_iso8859_1 snd_ctl_led snd_soc_dmic snd_acp3x_pdm_dma snd_acp3x_rn snd_hda_codec_realtek snd_sof_amd_rembrandt snd_sof_amd_renoir snd_hda_codec_generic snd_sof_amd_acp snd_sof_pci snd_sof_xtensa_dsp snd_hda_codec_hdmi snd_sof snd_hda_intel snd_sof_utils snd_intel_dspcfg snd_intel_sdw_acpi snd_soc_core snd_hda_codec snd_compress thinkpad_acpi snd_hda_core btusb ac97_bus snd_pcm_dmaengine btrtl nvram snd_hwdep btbcm snd_pci_ps intel_rapl_msr snd_rpl_pci_acp6x btintel intel_rapl_common mt7921e snd_acp_pci uvcvideo btmtk snd_pci_acp6x snd_seq_midi edac_mce_amd snd_seq_midi_event mt7921_common videobuf2_vmalloc snd_rawmidi snd_pcm bluetooth videobuf2_memops mt76_connac_lib videobuf2_v4l2 kvm_amd snd_seq mt76
2023-08-16T16:30:49.491128+01:00 aflynn-T14 kernel: [ 326.049723] ecdh_generic snd_pci_acp5x ecc videobuf2_common tps6598x snd_seq_device kvm mac80211 snd_rn_pci_acp3x snd_timer irqbypass snd_acp_config rapl snd snd_soc_acpi think_lmi soundcore firmware_attributes_class cfg80211 ccp wmi_bmof snd_pci_acp3x ipmi_devintf ledtrig_audio libarc4 platform_profile k10temp ipmi_msghandler joydev serial_multi_instantiate amd_pmc input_leds mac_hid serio_raw v4l2loopback(O) videodev mc msr parport_pc ppdev lp parport efi_pstore dmi_sysfs ip_tables x_tables autofs4 dm_crypt amdgpu iommu_v2 drm_buddy gpu_sched i2c_algo_bit drm_ttm_helper ttm drm_display_helper cec rc_core drm_kms_helper syscopyarea sysfillrect rtsx_pci_sdmmc sysimgblt crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 aesni_intel crypto_simd cryptd drm nvme psmouse r8169 rtsx_pci xhci_pci nvme_core xhci_pci_renesas i2c_piix4 ucsi_acpi typec_ucsi video realtek nvme_common typec wmi i2c_scmi
2023-08-16T16:30:49.491131+01:00 aflynn-T14 kernel: [ 326.049867] CPU: 8 PID: 2708 Comm: syncthing Tainted: G O 6.2.0-27-generic #28-Ubuntu
2023-08-16T16:30:49.491132+01:00 aflynn-T14 kernel: [ 326.049873] Hardware name: LENOVO 20UES27B01/20UES27B01, BIOS R1BET75W(1.44 ) 06/13/2023
2023-08-16T16:30:49.491134+01:00 aflynn-T14 kernel: [ 326.049877] RIP: 0010:rcu_note_context_switch+0x2a7/0x2f0
2023-08-16T16:30:49.491136+01:00 aflynn-T14 kernel: [ 326.049884] Code: 08 f0 83 44 24 fc 00 48 89 de 4c 89 f7 e8 31 c6 ff ff e9 1e fe ff ff 48 c7 c7 98 a4 76 ab c6 05 ee ad 3f 02 01 e8 b9 0b f3 ff <0f> 0b e9 bd fd ff ff a9 ff ff ff 7f 0f 84 75 fe ff ff 65 48 8b 3c
2023-08-16T16:30:49.491138+01:00 aflynn-T14 kernel: [ 326.049888] RSP: 0018:ffffa7c9c22f3c38 EFLAGS: 00010046
2023-08-16T16:30:49.491140+01:00 aflynn-T14 kernel: [ 326.049893] RAX: 0000000000000000 RBX: ffff94abbfa32e40 RCX: 0000000000000000
2023-08-16T16:30:49.491141+01:00 aflynn-T14 kernel: [ 326.049897] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
2023-08-16T16:30:49.491143+01:00 aflynn-T14 kernel: [ 326.049899] RBP: ffffa7c9c22f3c58 R08: 0000000000000000 R09: 0000000000000000
2023-08-16T16:30:49.491170+01:00 aflynn-T14 kernel: [ 326.049902] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
2023-08-16T16:30:49.491172+01:00 aflynn-T14 kernel: [ 326.049904] R13: ffff94a8d09ab300 R14: 0000000000000000 R15: 000000c001480148
2023-08-16T16:30:49.491173+01:00 aflynn-T14 kernel: [ 326.049908] FS: 000000c001480090(0000) GS:ffff94abbfa00000(0000) knlGS:0000000000000000
2023-08-16T16:30:49.491175+01:00 aflynn-T14 kernel: [ 326.049912] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
2023-08-16T16:30:49.491178+01:00 aflynn-T14 kernel: [ 326.049916] CR2: 00007f4c646325d8 CR3: 000000011d816000 CR4: 0000000000350ee0
2023-08-16T16:30:49.491179+01:00 aflynn-T14 kernel: [ 326.049920] Call Trace:
2023-08-16T16:30:49.491181+01:00 aflynn-T14 kernel: [ 326.049923] <TASK>
2023-08-16T16:30:49.491183+01:00 aflynn-T14 kernel: [ 326.049930] __schedule+0xcc/0x610
2023-08-16T16:30:49.491185+01:00 aflynn-T14 kernel: [ 326.049942] schedule+0x63/0x110
2023-08-16T16:30:49.491187+01:00 aflynn-T14 kernel: [ 326.049949] futex_wait_queue+0x66/0xa0
2023-08-16T16:30:49.491188+01:00 aflynn-T14 kernel: [ 326.049957] futex_wait+0x177/0x270
2023-08-16T16:30:49.491190+01:00 aflynn-T14 kernel: [ 326.049962] ? __sys_sendto+0x199/0x1b0
2023-08-16T16:30:49.491192+01:00 aflynn-T14 kernel: [ 326.049976] do_futex+0x151/0x200
2023-08-16T16:30:49.491194+01:00 aflynn-T14 kernel: [ 326.049982] __x64_sys_futex+0x95/0x200
2023-08-16T16:30:49.491195+01:00 aflynn-T14 kernel: [ 326.049988] ? __secure_computing+0x89/0xf0
2023-08-16T16:30:49.491197+01:00 aflynn-T14 kernel: [ 326.049996] do_syscall_64+0x5b/0x90
2023-08-16T16:30:49.491198+01:00 aflynn-T14 kernel: [ 326.050004] ? exit_to_user_mode_prepare+0x30/0xb0
2023-08-16T16:30:49.491200+01:00 aflynn-T14 kernel: [ 326.050013] ? syscall_exit_to_user_mode+0x29/0x50
2023-08-16T16:30:49.491201+01:00 aflynn-T14 kernel: [ 326.050019] ? do_syscall_64+0x67/0x90
2023-08-16T16:30:49.491202+01:00 aflynn-T14 kernel: [ 326.050024] ? do_syscall_64+0x67/0x90
2023-08-16T16:30:49.491204+01:00 aflynn-T14 kernel: [ 326.050030] entry_SYSCALL_64_after_hwframe+0x72/0xdc
2023-08-16T16:30:49.491205+01:00 aflynn-T14 kernel: [ 326.050036] RIP: 0033:0x46f023
2023-08-16T16:30:49.491207+01:00 aflynn-T14 kernel: [ 326.050041] Code: 24 20 c3 cc cc cc cc 48 8b 7c 24 08 8b 74 24 10 8b 54 24 14 4c 8b 54 24 18 4c 8b 44 24 20 44 8b 4c 24 28 b8 ca 00 00 00 0f 05 <89> 44 24 30 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
2023-08-16T16:30:49.491209+01:00 aflynn-T14 kernel: [ 326.050045] RSP: 002b:000000c00140dd90 EFLAGS: 00000286 ORIG_RAX: 00000000000000ca
2023-08-16T16:30:49.491210+01:00 aflynn-T14 kernel: [ 326.050050] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000000000046f023
2023-08-16T16:30:49.491211+01:00 aflynn-T14 kernel: [ 326.050053] RDX: 0000000000000000 RSI: 0000000000000080 RDI: 000000c001480148
2023-08-16T16:30:49.491213+01:00 aflynn-T14 kernel: [ 326.050056] RBP: 000000c00140ddd8 R08: 0000000000000000 R09: 0000000000000000
2023-08-16T16:30:49.491214+01:00 aflynn-T14 kernel: [ 326.050058] R10: 0000000000000000 R11: 0000000000000286 R12: 000000c00140dde8
2023-08-16T16:30:49.491216+01:00 aflynn-T14 kernel: [ 326.050061] R13: 0000000000000001 R14: 000000c0014821a0 R15: 0000000000000001
2023-08-16T16:30:49.491217+01:00 aflynn-T14 kernel: [ 326.050070] </TASK>
2023-08-16T16:30:49.491219+01:00 aflynn-T14 kernel: [ 326.050072] ---[ end trace 0000000000000000 ]---
```

The final lines printed to the log before it stops are:
```
2023-08-16T16:55:55.394081+01:00 aflynn-T14 kernel: [ 1356.862270] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P2838 } 68966 jiffies s: 1517 root: 0x0/T
2023-08-16T16:55:55.394105+01:00 aflynn-T14 kernel: [ 1356.862295] rcu: blocking rcu_node structures (internal RCU debug):
```
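
To check whether another machine shows the same symptom, the two messages above can be searched for in a saved log. The following is a minimal sketch: the marker strings are copied from the log lines in this report, while the function name `find_rcu_events` is purely illustrative.

```python
import re

# Message fragments copied from the kernel log lines quoted in this report.
RCU_MARKERS = (
    "Voluntary context switch within RCU read-side critical section!",
    "detected expedited stalls on CPUs/tasks",
)

# Kernel uptime stamp such as "[ 326.049561]".
UPTIME = re.compile(r"\[\s*(\d+\.\d+)\]")


def find_rcu_events(log_text):
    """Return (uptime_seconds, marker) for every log line containing a marker.

    uptime_seconds is None when the line carries no kernel uptime stamp.
    """
    events = []
    for line in log_text.splitlines():
        for marker in RCU_MARKERS:
            if marker in line:
                match = UPTIME.search(line)
                uptime = float(match.group(1)) if match else None
                events.append((uptime, marker))
    return events
```

Passing the contents of a syslog file to `find_rcu_events` returns the matches in log order, so the initial warning and any later stall report can be told apart by their uptime stamps.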

ProblemType: Bug
DistroRelease: Ubuntu 23.04
Package: ubuntu-fan 0.12.16
ProcVersionSignature: Ubuntu 6.2.0-27.28-generic 6.2.15
Uname: Linux 6.2.0-27-generic x86_64
ApportVersion: 2.26.1-0ubuntu2
Architecture: amd64
CasperMD5CheckResult: pass
CurrentDesktop: ubuntu:GNOME
Date: Wed Aug 16 17:02:24 2023
InstallationDate: Installed on 2023-07-31 (15 days ago)
InstallationMedia: Ubuntu 23.04 "Lunar Lobster" - Release amd64 (20230418)
PackageArchitecture: all
ProcEnviron:
 LANG=en_US.UTF-8
 PATH=(custom, no user)
 SHELL=/usr/bin/zsh
 TERM=xterm-256color
 XDG_RUNTIME_DIR=<set>
SourcePackage: ubuntu-fan
UpgradeStatus: No upgrade log present (probably fresh install)

Revision history for this message
Alastair Flynn (aflynn50) wrote :

On further investigation, this appears to be caused by my using an underlay subnet with a /24 mask. The Limitations section of the man page notes that this is not supported. However, there is no error message at all, and the result is a hard system crash. It would be good if this were not the case and there were at least a warning.

This bug also appears on 22.04.
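
A pre-flight check of the kind asked for here might look like the following minimal sketch. It is not part of fanctl; the function name `check_fan_networks` is hypothetical, and the set of unsupported prefix lengths contains only the /24 case from this report, which is an assumption to be checked against the man page.

```python
import ipaddress

# Assumption: only the /24 underlay from this report is modelled as unsupported.
# Extend this set after checking the Limitations section of the fanctl man page.
UNSUPPORTED_UNDERLAY_PREFIXES = {24}


def check_fan_networks(overlay, underlay):
    """Return a list of problems; an empty list means nothing was objected to."""
    try:
        # strict=False also accepts a host address with a mask, e.g. 192.168.0.5/16.
        ipaddress.ip_network(overlay, strict=False)
        underlay_net = ipaddress.ip_network(underlay, strict=False)
    except ValueError as exc:
        return [f"invalid network: {exc}"]

    problems = []
    if underlay_net.prefixlen in UNSUPPORTED_UNDERLAY_PREFIXES:
        problems.append(
            f"underlay {underlay_net} uses a /{underlay_net.prefixlen} mask, "
            "which is treated as unsupported"
        )
    return problems
```

A wrapper script could call `check_fan_networks("240.0.0.0/8", "192.168.0.0/24")` before invoking `fanctl up` and exit with an error message if the returned list is not empty.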

Revision history for this message
John A Meinel (jameinel) wrote :

I certainly don't think it should be possible for a user configuration to cause a hard crash. If it "isn't supported", then it should be rejected in some fashion, rather than accepted and then allowed to cause system instability.

summary: - fanctl causes hard system crash in ubuntu 23.04
+ fanctl causes hard system crash with wrong subnet mask
Revision history for this message
Stefan Bader (smb) wrote :

I changed the title since this is not exactly a crash. On its own, the message shown is only a warning, though not one that sounds good: allowing execution to be interrupted inside a critical section is very likely to cause trouble.

When you say the bug also appears on 22.04, is that with the 5.15 release kernel or the HWE kernel (5.19 or 6.2)?

summary: - fanctl causes hard system crash with wrong subnet mask
+ fanctl causes hard system lockup with wrong subnet mask
Revision history for this message
Alastair Flynn (aflynn50) wrote :

That is with the 5.19.0-50-generic kernel in 22.04.

Revision history for this message
Stefan Bader (smb) wrote :

Is there any additional info you could share about the setup? I tried locally with a 23.04 VM based on cloud-images. Using the command exactly as stated above does nothing at all (no error message, but nothing is created either). Only if I change the 192.168... underlay to match the existing network in use is a fan-240 bridge created. Still no errors in the log. I seem to be missing something.

Revision history for this message
Alastair Flynn (aflynn50) wrote :

Sorry for the slow reply. I've attached some extra system information, including the output of ip a and the list of installed packages. The laptop I am testing on was my old work laptop, so it is unfortunately quite bloated.

From my observations of the errors, it seems that some IO is blocking in the kernel. Several different processes have threads that time out, including firefox, syncthing and wpa_supplicant. The system doesn't lock up straight away; it seems to happen only when certain processes freeze. Shutting down the PC through the GUI triggers a freeze.
