LXD fan bridge causes blocked tasks

Bug #2064176 reported by Wesley Hershberger
This bug affects 1 person
Affects: linux (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Hi, cross-posting this from https://github.com/canonical/lxd/issues/12161

I've got an LXD cluster running across 3 VMs using the fan bridge. I'm using a dev revision of LXD based on 6413a948. Creating a container triggers the kernel warning in the attached syslog snippet, and the container creation process then hangs indefinitely. SSH logins, `lxc shell cluster1`, and `ps -aux` also hang.

Apr 29 17:15:01 cluster1 kernel: [ 161.250951] ------------[ cut here ]------------
Apr 29 17:15:01 cluster1 kernel: [ 161.250957] Voluntary context switch within RCU read-side critical section!
Apr 29 17:15:01 cluster1 kernel: [ 161.250990] WARNING: CPU: 2 PID: 510 at kernel/rcu/tree_plugin.h:320 rcu_note_context_switch+0x2a7/0x2f0
Apr 29 17:15:01 cluster1 kernel: [ 161.251003] Modules linked in: nft_masq nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 vxlan ip6_udp_tunnel udp_tunnel dummy bridge stp llc zfs(PO) spl(O) nf_tables libcrc32c nfnetlink vhost_vsock vhost vhost_iotlb binfmt_misc nls_iso8859_1 intel_rapl_msr intel_rapl_common kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul virtio_gpu polyval_clmulni polyval_generic ghash_clmulni_intel sha256_ssse3 sha1_ssse3 virtio_dma_buf aesni_intel vmw_vsock_virtio_transport 9pnet_virtio xhci_pci drm_shmem_helper i2c_i801 ahci 9pnet vmw_vsock_virtio_transport_common xhci_pci_renesas drm_kms_helper libahci crypto_simd joydev virtio_input cryptd lpc_ich virtiofs i2c_smbus vsock psmouse input_leds mac_hid serio_raw rapl qemu_fw_cfg vmgenid nfsd dm_multipath auth_rpcgss scsi_dh_rdac nfs_acl lockd scsi_dh_emc scsi_dh_alua grace sch_fq_codel drm sunrpc efi_pstore virtio_rng ip_tables x_tables autofs4
Apr 29 17:15:01 cluster1 kernel: [ 161.251085] CPU: 2 PID: 510 Comm: nmbd Tainted: P O 6.5.0-28-generic #29~22.04.1-Ubuntu
Apr 29 17:15:01 cluster1 kernel: [ 161.251089] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009)/LXD, BIOS unknown 2/2/2022
Apr 29 17:15:01 cluster1 kernel: [ 161.251091] RIP: 0010:rcu_note_context_switch+0x2a7/0x2f0
Apr 29 17:15:01 cluster1 kernel: [ 161.251095] Code: 08 f0 83 44 24 fc 00 48 89 de 4c 89 f7 e8 d1 af ff ff e9 1e fe ff ff 48 c7 c7 d0 60 56 88 c6 05 e6 27 40 02 01 e8 79 b2 f2 ff <0f> 0b e9 bd fd ff ff a9 ff ff ff 7f 0f 84 75 fe ff ff 65 48 8b 3c
Apr 29 17:15:01 cluster1 kernel: [ 161.251098] RSP: 0018:ffffb9cbc11dbbc8 EFLAGS: 00010046
Apr 29 17:15:01 cluster1 kernel: [ 161.251101] RAX: 0000000000000000 RBX: ffff941ef7cb3f80 RCX: 0000000000000000
Apr 29 17:15:01 cluster1 kernel: [ 161.251103] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Apr 29 17:15:01 cluster1 kernel: [ 161.251104] RBP: ffffb9cbc11dbbe8 R08: 0000000000000000 R09: 0000000000000000
Apr 29 17:15:01 cluster1 kernel: [ 161.251106] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Apr 29 17:15:01 cluster1 kernel: [ 161.251111] R13: ffff941d893e9980 R14: 0000000000000000 R15: ffff941d80ad7a80
Apr 29 17:15:01 cluster1 kernel: [ 161.251113] FS: 00007c7dcbdb8a00(0000) GS:ffff941ef7c80000(0000) knlGS:0000000000000000
Apr 29 17:15:01 cluster1 kernel: [ 161.251115] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 29 17:15:01 cluster1 kernel: [ 161.251117] CR2: 00005a30877ae488 CR3: 0000000105888003 CR4: 0000000000170ee0
Apr 29 17:15:01 cluster1 kernel: [ 161.251122] Call Trace:
Apr 29 17:15:01 cluster1 kernel: [ 161.251128] <TASK>
Apr 29 17:15:01 cluster1 kernel: [ 161.251133] ? show_regs+0x6d/0x80
Apr 29 17:15:01 cluster1 kernel: [ 161.251145] ? __warn+0x89/0x160
Apr 29 17:15:01 cluster1 kernel: [ 161.251152] ? rcu_note_context_switch+0x2a7/0x2f0
Apr 29 17:15:01 cluster1 kernel: [ 161.251155] ? report_bug+0x17e/0x1b0
Apr 29 17:15:01 cluster1 kernel: [ 161.251172] ? handle_bug+0x46/0x90
Apr 29 17:15:01 cluster1 kernel: [ 161.251187] ? exc_invalid_op+0x18/0x80
Apr 29 17:15:01 cluster1 kernel: [ 161.251190] ? asm_exc_invalid_op+0x1b/0x20
Apr 29 17:15:01 cluster1 kernel: [ 161.251202] ? rcu_note_context_switch+0x2a7/0x2f0
Apr 29 17:15:01 cluster1 kernel: [ 161.251205] ? rcu_note_context_switch+0x2a7/0x2f0
Apr 29 17:15:01 cluster1 kernel: [ 161.251208] __schedule+0xcc/0x750
Apr 29 17:15:01 cluster1 kernel: [ 161.251218] schedule+0x63/0x110
Apr 29 17:15:01 cluster1 kernel: [ 161.251222] schedule_hrtimeout_range_clock+0xbc/0x130
Apr 29 17:15:01 cluster1 kernel: [ 161.251238] ? __pfx_hrtimer_wakeup+0x10/0x10
Apr 29 17:15:01 cluster1 kernel: [ 161.251245] schedule_hrtimeout_range+0x13/0x30
Apr 29 17:15:01 cluster1 kernel: [ 161.251248] ep_poll+0x33f/0x390
Apr 29 17:15:01 cluster1 kernel: [ 161.251254] ? __pfx_ep_autoremove_wake_function+0x10/0x10
Apr 29 17:15:01 cluster1 kernel: [ 161.251257] do_epoll_wait+0xdb/0x100
Apr 29 17:15:01 cluster1 kernel: [ 161.251259] __x64_sys_epoll_wait+0x6f/0x110
Apr 29 17:15:01 cluster1 kernel: [ 161.251265] do_syscall_64+0x5b/0x90
Apr 29 17:15:01 cluster1 kernel: [ 161.251270] ? do_epoll_ctl+0x3cb/0x860
Apr 29 17:15:01 cluster1 kernel: [ 161.251273] ? __task_pid_nr_ns+0x6c/0xc0
Apr 29 17:15:01 cluster1 kernel: [ 161.251279] ? exit_to_user_mode_prepare+0x30/0xb0
Apr 29 17:15:01 cluster1 kernel: [ 161.251284] ? syscall_exit_to_user_mode+0x37/0x60
Apr 29 17:15:01 cluster1 kernel: [ 161.251286] ? do_syscall_64+0x67/0x90
Apr 29 17:15:01 cluster1 kernel: [ 161.251288] ? syscall_exit_to_user_mode+0x37/0x60
Apr 29 17:15:01 cluster1 kernel: [ 161.251300] ? do_syscall_64+0x67/0x90
Apr 29 17:15:01 cluster1 kernel: [ 161.251304] ? syscall_exit_to_user_mode+0x37/0x60
Apr 29 17:15:01 cluster1 kernel: [ 161.251306] ? do_syscall_64+0x67/0x90
Apr 29 17:15:01 cluster1 kernel: [ 161.251309] ? do_syscall_64+0x67/0x90
Apr 29 17:15:01 cluster1 kernel: [ 161.251313] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
Apr 29 17:15:01 cluster1 kernel: [ 161.251316] RIP: 0033:0x7c7dcf325dea
Apr 29 17:15:01 cluster1 kernel: [ 161.251333] Code: 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 e8 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 5e c3 0f 1f 44 00 00 48 83 ec 28 89 54 24 18
Apr 29 17:15:01 cluster1 kernel: [ 161.251335] RSP: 002b:00007ffdde5e0278 EFLAGS: 00000246 ORIG_RAX: 00000000000000e8
Apr 29 17:15:01 cluster1 kernel: [ 161.251338] RAX: ffffffffffffffda RBX: 00005a30877a2ea0 RCX: 00007c7dcf325dea
Apr 29 17:15:01 cluster1 kernel: [ 161.251340] RDX: 0000000000000001 RSI: 00007ffdde5e02ac RDI: 0000000000000005
Apr 29 17:15:01 cluster1 kernel: [ 161.251341] RBP: 00005a3087794590 R08: 00000000000f423f R09: 00007ffdde5e0357
Apr 29 17:15:01 cluster1 kernel: [ 161.251343] R10: 00000000000003e8 R11: 0000000000000246 R12: 00005a30877a2f30
Apr 29 17:15:01 cluster1 kernel: [ 161.251345] R13: 00000000000003e8 R14: 0000000000000090 R15: 000000000000000a
Apr 29 17:15:01 cluster1 kernel: [ 161.251348] </TASK>
Apr 29 17:15:01 cluster1 kernel: [ 161.251349] ---[ end trace 0000000000000000 ]---
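
In case it helps anyone triaging similar logs, here is a rough sketch for pulling the key fields (CPU, PID, source location, symbol, task name) out of a trace like the one above. The function name `summarize_warnings` and both regular expressions are my own assumptions, written against the line shapes in this snippet only; other kernel versions or log formats may need different patterns.

```python
import re

# Assumed pattern for the kernel WARNING header line, based on the trace above.
WARNING_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at (?P<location>\S+) (?P<symbol>\S+)"
)
# Assumed pattern for the "CPU: .. PID: .. Comm: <name>" line that names the task.
COMM_RE = re.compile(r"CPU: \d+ PID: (?P<pid>\d+) Comm: (?P<comm>\S+)")


def summarize_warnings(lines):
    """Return one dict per kernel WARNING found in an iterable of log lines."""
    warnings = []
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            entry = match.groupdict()
            entry["comm"] = None  # filled in if a Comm: line with the same PID follows
            warnings.append(entry)
            continue
        match = COMM_RE.search(line)
        if match and warnings and warnings[-1]["pid"] == match.group("pid"):
            warnings[-1]["comm"] = match.group("comm")
    return warnings


# Two lines copied from the syslog snippet in this report.
SAMPLE = [
    "Apr 29 17:15:01 cluster1 kernel: [ 161.250990] WARNING: CPU: 2 PID: 510 at "
    "kernel/rcu/tree_plugin.h:320 rcu_note_context_switch+0x2a7/0x2f0",
    "Apr 29 17:15:01 cluster1 kernel: [ 161.251085] CPU: 2 PID: 510 Comm: nmbd "
    "Tainted: P O 6.5.0-28-generic #29~22.04.1-Ubuntu",
]

if __name__ == "__main__":
    for warning in summarize_warnings(SAMPLE):
        print(warning)
```

To run it against a real log, pass an open file instead of `SAMPLE`, for example `summarize_warnings(open("/var/log/syslog"))`; that path is only an example and depends on how the system logs kernel messages.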
