Activity log for bug #1982456

Date Who What changed Old value New value Message
2022-07-21 07:17:53 Fred Kimmy bug added bug
2022-07-21 07:19:31 Fred Kimmy nominated for series kunpeng920/ubuntu-20.04-hwe
2022-07-21 07:19:31 Fred Kimmy bug task added kunpeng920/ubuntu-20.04-hwe
2022-07-21 07:20:13 Fred Kimmy description

Old value:

Summary: dangling pointers are left when unregistering a device with references to channels

Further information:
kunpeng920 bug reporting guidelines: Please use the following bug template:

[Bug Description]
Currently, if dma_async_device_unregister is invoked while some clients still hold references to some channels, the device cannot be released. This leaves dangling pointers in dma_device_list and causes crashes in the functions that try to use it.

[Steps to Reproduce]
1) insmod async_tx.ko and hisi_dma.ko.
2) Unbind the DMA devices that are bound to hisi_dma.
3) Bind the DMA devices to hisi_dma.
4) Repeat 2) and 3) several times.

[Actual Results]
1) The refcount of hisi_dma is not zero after step 1).
2) After all DMA devices are unbound, the refcount of hisi_dma is not zero.
3) On the first unbind, a warning reports that __dma_async_device_channel_unregister was called while some clients hold references.
4) After unbinding the device, the following call trace may be reported:
[ 1594.902108] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 1594.910871] Mem abort info:
[ 1594.913653] ESR = 0x96000004
[ 1594.916713] EC = 0x25: DABT (current EL), IL = 32 bits
[ 1594.922019] SET = 0, FnV = 0
[ 1594.925069] EA = 0, S1PTW = 0
[ 1594.928207] Data abort info:
[ 1594.931084] ISV = 0, ISS = 0x00000004
[ 1594.934916] CM = 0, WnR = 0
[ 1594.937870] user pgtable: 4k pages, 48-bit VAs, pgdp=00000020e8b2a000
[ 1594.944294] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
[ 1594.951077] Internal error: Oops: 96000004 [#1] SMP
[ 1595.098816] pstate: a0400009 (NzCv daif +PAN -UAO -TCO BTYPE=--)
[ 1595.104797] pc : dma_channel_rebalance+0xf8/0x338
[ 1595.109491] lr : dma_channel_rebalance+0xa0/0x338
[ 1595.114173] sp : ffff8000b759bb00
[ 1595.117472] x29: ffff8000b759bb00 x28: ffff002085903d00
[ 1595.122761] x27: 0000000000000000 x26: 0000000000000000
[ 1595.128048] x25: 0000000000000000 x24: ffffa8dfb33bd3b8
[ 1595.133337] x23: 000000000000000f x22: ffffa8dfb312e440
[ 1595.138624] x21: 0000000000000010 x20: ffffa8dfb312e660
[ 1595.143913] x19: ffffa8dfb3593e80 x18: 0000000000000000
[ 1595.149201] x17: 0000000000000000 x16: ffffa8dfb1ec19e8
[ 1595.154488] x15: 0000000000000040 x14: ffffa8dfb34210a0
[ 1595.159777] x13: 0000000000000228 x12: 0000000000000000
[ 1595.165064] x11: 0000000000000000 x10: 0000000000000000
[ 1595.170352] x9 : ffffa8dfb1b48f58 x8 : ffff0020e81336e0
[ 1595.175640] x7 : 0000000000000000 x6 : 0000000000000003
[ 1595.180929] x5 : 0000000000000000 x4 : 0000000000000000
[ 1595.186218] x3 : ffff20200f9180a0 x2 : ffff20200f918090
[ 1595.191505] x1 : 0000000000000000 x0 : ffffffffffffffc8
[ 1595.196794] Call trace:
[ 1595.199230] dma_channel_rebalance+0xf8/0x338
[ 1595.203566] dma_async_device_unregister+0x90/0x148
[ 1595.208424] dmam_device_release+0x1c/0x28
[ 1595.212502] release_nodes+0x1c0/0x240
[ 1595.216243] devres_release_all+0x68/0x2c0
[ 1595.220322] device_release_driver_internal+0x138/0x1e8
[ 1595.225524] device_driver_detach+0x20/0x30
[ 1595.229689] unbind_store+0xe8/0x110
[ 1595.233247] drv_attr_store+0x2c/0x40
[ 1595.236893] sysfs_kf_write+0x4c/0x60
[ 1595.240544] kernfs_fop_write_iter+0x130/0x1c0
[ 1595.244969] new_sync_write+0xf0/0x198
[ 1595.248706] vfs_write+0x1ec/0x2c0
[ 1595.252094] ksys_write+0x74/0x108
[ 1595.255482] __arm64_sys_write+0x24/0x30
[ 1595.259387] el0_svc_common.constprop.0+0x84/0x218
[ 1595.264163] do_el0_svc+0x2c/0x98
[ 1595.267464] el0_svc+0x20/0x30
[ 1595.270512] el0_sync_handler+0xb0/0xb8
[ 1595.274331] el0_sync+0x184/0x1c0
[ 1595.277634] Code: eb00007f d100e000 54fffec0 d503201f (f9401c01)
[ 1595.283701] ---[ end trace 2476858dc1c23bcf ]---
5) A similar call trace in dma_channel_rebalance may be reported from dma_async_device_register during binding.
6) Results 4) or 5) are always produced during step 4).

[Expected Results]
1. The refcounts of hisi_dma and of the DMA channels decrease correctly when devices are unbound, even if some clients hold references to DMA channels.
2. After all DMA devices are unbound, the refcount of hisi_dma is zero.
3. There are no dangling pointers in dma_device_list.
4. The call trace in step 3) never occurs.

[Reproducibility]
The problem occurs whenever the reproduction procedure is followed.

[Additional information]
(Firmware version, kernel version, affected hardware, etc. if required022041209162):
BIOS:5.B211.01
OS:Ubuntu 20.04.3 LTS
kernel version:Linux tx 5.11.0-27-generic #29~20.04.1-Ubuntu SMP Wed Aug 11 15:58:08 UTC 2021 aarch64 aarch64 aarch64 GNU/Linux

[Resolution]

New value:

Summary: dangling pointers are left when unregistering a device with references to channels

Further information:
kunpeng920 bug reporting guidelines: Please use the following bug template:

[Bug Description]
Currently, if dma_async_device_unregister is invoked while some clients still hold references to some channels, the device cannot be released. This leaves dangling pointers in dma_device_list and causes crashes in the functions that try to use it.

[Steps to Reproduce]
1) insmod async_tx.ko and hisi_dma.ko.
2) Unbind the DMA devices that are bound to hisi_dma.
3) Bind the DMA devices to hisi_dma.
4) Repeat 2) and 3) several times.

[Actual Results]
1) The refcount of hisi_dma is not zero after step 1).
2) After all DMA devices are unbound, the refcount of hisi_dma is not zero.
3) On the first unbind, a warning reports that __dma_async_device_channel_unregister was called while some clients hold references.
4) After unbinding the device, the following call trace may be reported:
[ 1594.902108] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 1594.910871] Mem abort info:
[ 1594.913653] ESR = 0x96000004
[ 1594.916713] EC = 0x25: DABT (current EL), IL = 32 bits
[ 1594.922019] SET = 0, FnV = 0
[ 1594.925069] EA = 0, S1PTW = 0
[ 1594.928207] Data abort info:
[ 1594.931084] ISV = 0, ISS = 0x00000004
[ 1594.934916] CM = 0, WnR = 0
[ 1594.937870] user pgtable: 4k pages, 48-bit VAs, pgdp=00000020e8b2a000
[ 1594.944294] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
[ 1594.951077] Internal error: Oops: 96000004 [#1] SMP
[ 1595.098816] pstate: a0400009 (NzCv daif +PAN -UAO -TCO BTYPE=--)
[ 1595.104797] pc : dma_channel_rebalance+0xf8/0x338
[ 1595.109491] lr : dma_channel_rebalance+0xa0/0x338
[ 1595.114173] sp : ffff8000b759bb00
[ 1595.117472] x29: ffff8000b759bb00 x28: ffff002085903d00
[ 1595.122761] x27: 0000000000000000 x26: 0000000000000000
[ 1595.128048] x25: 0000000000000000 x24: ffffa8dfb33bd3b8
[ 1595.133337] x23: 000000000000000f x22: ffffa8dfb312e440
[ 1595.138624] x21: 0000000000000010 x20: ffffa8dfb312e660
[ 1595.143913] x19: ffffa8dfb3593e80 x18: 0000000000000000
[ 1595.149201] x17: 0000000000000000 x16: ffffa8dfb1ec19e8
[ 1595.154488] x15: 0000000000000040 x14: ffffa8dfb34210a0
[ 1595.159777] x13: 0000000000000228 x12: 0000000000000000
[ 1595.165064] x11: 0000000000000000 x10: 0000000000000000
[ 1595.170352] x9 : ffffa8dfb1b48f58 x8 : ffff0020e81336e0
[ 1595.175640] x7 : 0000000000000000 x6 : 0000000000000003
[ 1595.180929] x5 : 0000000000000000 x4 : 0000000000000000
[ 1595.186218] x3 : ffff20200f9180a0 x2 : ffff20200f918090
[ 1595.191505] x1 : 0000000000000000 x0 : ffffffffffffffc8
[ 1595.196794] Call trace:
[ 1595.199230] dma_channel_rebalance+0xf8/0x338
[ 1595.203566] dma_async_device_unregister+0x90/0x148
[ 1595.208424] dmam_device_release+0x1c/0x28
[ 1595.212502] release_nodes+0x1c0/0x240
[ 1595.216243] devres_release_all+0x68/0x2c0
[ 1595.220322] device_release_driver_internal+0x138/0x1e8
[ 1595.225524] device_driver_detach+0x20/0x30
[ 1595.229689] unbind_store+0xe8/0x110
[ 1595.233247] drv_attr_store+0x2c/0x40
[ 1595.236893] sysfs_kf_write+0x4c/0x60
[ 1595.240544] kernfs_fop_write_iter+0x130/0x1c0
[ 1595.244969] new_sync_write+0xf0/0x198
[ 1595.248706] vfs_write+0x1ec/0x2c0
[ 1595.252094] ksys_write+0x74/0x108
[ 1595.255482] __arm64_sys_write+0x24/0x30
[ 1595.259387] el0_svc_common.constprop.0+0x84/0x218
[ 1595.264163] do_el0_svc+0x2c/0x98
[ 1595.267464] el0_svc+0x20/0x30
[ 1595.270512] el0_sync_handler+0xb0/0xb8
[ 1595.274331] el0_sync+0x184/0x1c0
[ 1595.277634] Code: eb00007f d100e000 54fffec0 d503201f (f9401c01)
[ 1595.283701] ---[ end trace 2476858dc1c23bcf ]---
5) A similar call trace in dma_channel_rebalance may be reported from dma_async_device_register during binding.
6) Results 4) or 5) are always produced during step 4).

[Expected Results]
1. The refcounts of hisi_dma and of the DMA channels decrease correctly when devices are unbound, even if some clients hold references to DMA channels.
2. After all DMA devices are unbound, the refcount of hisi_dma is zero.
3. There are no dangling pointers in dma_device_list.
4. The call trace in step 3) never occurs.

[Reproducibility]
The problem occurs whenever the reproduction procedure is followed.

[Additional information]
(Firmware version, kernel version, affected hardware, etc. if required022041209162):
kernel version:Linux tx 5.11.0-27-generic #29~20.04.1-Ubuntu SMP Wed Aug 11 15:58:08 UTC 2021 aarch64 aarch64 aarch64 GNU/Linux

[Resolution]
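
Editorial sketch (not part of the original report): steps 2) to 4) of the reproduction procedure in the description above go through the driver's sysfs bind and unbind attributes, as the unbind_store frame in the call trace suggests. A minimal Python sketch of that loop follows. The helper name rebind_cycle, the driver directory /sys/bus/pci/drivers/hisi_dma and the device address 0000:7b:00.0 are assumptions for illustration and are not taken from the report. On real hardware this needs root and may trigger the crash described.

```python
from pathlib import Path


def rebind_cycle(driver_dir, device_id, rounds=3):
    """Unbind and rebind one device `rounds` times via a driver's sysfs files.

    driver_dir is the directory holding the driver's `unbind` and `bind`
    attribute files; device_id is the bus address written to them.
    Returns the list of (action, device_id) writes, in order.
    """
    driver_dir = Path(driver_dir)
    performed = []
    for _ in range(rounds):
        for action in ("unbind", "bind"):
            # Writing the device address to the attribute file asks the
            # driver core to detach or attach that device.
            (driver_dir / action).write_text(device_id)
            performed.append((action, device_id))
    return performed


# Hypothetical usage on an affected machine (path and address are assumed):
#   rebind_cycle("/sys/bus/pci/drivers/hisi_dma", "0000:7b:00.0", rounds=5)
```

The function takes the driver directory as a parameter so it can be pointed at a scratch directory for a dry run before touching real sysfs files.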
2022-07-25 08:27:37 Ike Panhc kunpeng920/ubuntu-20.04-hwe: status New Incomplete
2022-07-25 08:27:39 Ike Panhc kunpeng920: status New Incomplete
2023-03-15 06:47:31 Ike Panhc kunpeng920/ubuntu-20.04-hwe: status Incomplete Won't Fix
2023-03-15 06:47:34 Ike Panhc kunpeng920: status Incomplete Won't Fix