net:tls selftest caused an oops on scobee with the lunar:generic-arm64k kernel

Bug #2023051 reported by Roxana Nicolescu
This bug affects 1 person
Affects              Status       Importance  Assigned to  Milestone
ubuntu-kernel-tests  New          Undecided   Unassigned
linux (Ubuntu)       New          Undecided   Unassigned
Mantic               In Progress  Undecided   Ike Panhc

Bug Description

Trace
[ 1835.154044] hisi_sec2 0000:76:00.0: the number of entries in input scatterlist is bigger than SGL pool setting.

[ 1956.475063] Unable to handle kernel paging request at virtual address dead000000000122
[ 1956.482962] Mem abort info:
[ 1956.485744] ESR = 0x0000000096000044
[ 1956.489476] EC = 0x25: DABT (current EL), IL = 32 bits
[ 1956.494766] SET = 0, FnV = 0
[ 1956.497806] EA = 0, S1PTW = 0
[ 1956.500933] FSC = 0x04: level 0 translation fault
[ 1956.505788] Data abort info:
[ 1956.508655] ISV = 0, ISS = 0x00000044
[ 1956.512474] CM = 0, WnR = 1
[ 1956.515428] [dead000000000122] address between user and kernel address ranges
[ 1956.522533] Internal error: Oops: 0000000096000044 [#1] SMP
[ 1956.528083] Modules linked in: cfg80211 binfmt_misc nls_iso8859_1 onboard_usb_hub ipmi_ssif hisi_hpre hisi_sec2 ecdh_generic hisi_zip libcurve25519_generic hisi_qm hns_roce_hw_v2 acpi_ipmi ecc uacce authenc ipmi_si ipmi_devintf hisi_uncore_ddrc_pmu arm_spe_pmu ipmi_msghandler hisi_uncore_l3c_pmu hisi_uncore_hha_pmu hisi_uncore_pmu hisi_trng_v2 cppc_cpufreq dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua efi_pstore dmi_sysfs ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor xor_neon raid6_pq libcrc32c raid1 raid0 multipath linear ses enclosure realtek mlx5_ib ib_uverbs ib_core hclge hibmc_drm crct10dif_ce drm_vram_helper polyval_ce drm_ttm_helper polyval_generic ttm i2c_algo_bit ghash_ce mlx5_core drm_kms_helper syscopyarea sm4 mlxfw sysfillrect sha2_ce psample sysimgblt sha256_arm64 hisi_sas_v3_hw tls sha1_ce hns3 hisi_sas_main xhci_pci pci_hyperv_intf drm hnae3 xhci_pci_renesas libsas ahci
[ 1956.528174] scsi_transport_sas spi_dw_mmio spi_dw gpio_dwapb aes_neon_bs aes_neon_blk aes_ce_blk aes_ce_cipher
[ 1956.623600] CPU: 20 PID: 2172 Comm: kworker/u262:1 Tainted: G B 6.2.0-23-generic-64k #23-Ubuntu
[ 1956.633641] Hardware name: Huawei TaiShan 2280 V2/BC82AMDC, BIOS 2280-V2 CS V3.B160.01 01/15/2020
[ 1956.642470] Workqueue: 0000:76:00.0 qm_work_process [hisi_qm]
[ 1956.648206] pstate: 804000c9 (Nzcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 1956.655136] pc : free_pcppages_bulk+0x1ac/0x320
[ 1956.659651] lr : free_pcppages_bulk+0x174/0x320
[ 1956.664160] sp : ffff80008e6afaa0
[ 1956.667460] x29: ffff80008e6afaa0 x28: ffff003ff78e0b80 x27: 0000000000000000
[ 1956.674565] x26: 00000000000001bd x25: 0000000000000000 x24: 000000000000000c
[ 1956.681669] x23: ffff003ff78e0b98 x22: ffff003ff78e0b80 x21: dead000000000100
[ 1956.688773] x20: ffffffc0085fe480 x19: 0000000000000001 x18: ffff80004bea0060
[ 1956.695877] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000010000
[ 1956.702981] x14: 0000000000000000 x13: 0000000000000f38 x12: 9ac69b4cde237290
[ 1956.710085] x11: 7f7f7f7f7f7f7f7f x10: 0000000000000000 x9 : ffffbe146b27be84
[ 1956.717188] x8 : ffff80008e6afb08 x7 : 0000000000000000 x6 : 0000000000000000
[ 1956.724292] x5 : 0000000000000000 x4 : 0000000000000000 x3 : ffffffc0085fe488
[ 1956.731396] x2 : dead000000000122 x1 : ffff003ff78e0b98 x0 : dead000000000122
[ 1956.738500] Call trace:
[ 1956.740937] free_pcppages_bulk+0x1ac/0x320
[ 1956.745102] free_unref_page_commit+0x170/0x270
[ 1956.749611] free_unref_page+0x1e8/0x2e0
[ 1956.753515] __folio_put+0x54/0xc0
[ 1956.756905] tls_decrypt_done+0x190/0x230 [tls]
[ 1956.761424] sec_aead_callback+0x130/0x240 [hisi_sec2]
[ 1956.766544] sec_req_cb+0xc4/0x2f0 [hisi_sec2]
[ 1956.770974] qm_poll_req_cb+0x88/0x1b0 [hisi_qm]
[ 1956.775576] qm_work_process+0x150/0x1cc [hisi_qm]
[ 1956.780350] process_one_work+0x244/0x4f0
[ 1956.784346] worker_thread+0x78/0x450
[ 1956.787992] kthread+0xf0/0x100
[ 1956.791123] ret_from_fork+0x10/0x20
[ 1956.794686] Code: d1002074 a9400061 f9401284 f9000420 (f9000001)
[ 1956.800755] ---[ end trace 0000000000000000 ]---
[ 1956.818443] pstore: crypto_comp_compress failed, ret = -22!
[ 1956.961204] note: kworker/u262:1[2172] exited with irqs disabled
[ 1956.967206] note: kworker/u262:1[2172] exited with preempt_count 3
[ 2016.984213] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 2016.990298] rcu: 35-...0: (4 GPs behind) idle=6ef4/1/0x4000000000000000 softirq=4849/4849 fqs=6370
[ 2016.999307] rcu: 53-...0: (2 GPs behind) idle=c644/1/0x4000000000000000 softirq=1484/1485 fqs=6371
[ 2017.008319] (detected by 0, t=15008 jiffies, g=226269, q=1901 ncpus=128)
[ 2017.015078] Task dump for CPU 35:
[ 2017.018380] task:kworker/35:0 state:R running task stack:0 pid:32035 ppid:2 flags:0x0000000a

Full log attached

The last test run was the net:tls selftest, and the call trace shows that tls_decrypt_done is what triggered the oops.
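For anyone triaging similar traces, the ESR values and the fault address in the logs can be decoded by hand. The snippet below is a minimal sketch, not part of the original report. It assumes the standard arm64 ESR layout (EC in bits 31:26, IL in bit 25, ISS in bits 24:0) and the kernel's list poison constants from include/linux/poison.h with the arm64 default CONFIG_ILLEGAL_POINTER_VALUE of 0xdead000000000000. The function names decode_esr and poison_name are hypothetical helpers made up for illustration.

```python
# Assumption: arm64 default CONFIG_ILLEGAL_POINTER_VALUE (POISON_POINTER_DELTA).
POISON_POINTER_DELTA = 0xDEAD000000000000
LIST_POISON1 = 0x100 + POISON_POINTER_DELTA
LIST_POISON2 = 0x122 + POISON_POINTER_DELTA


def decode_esr(esr):
    """Split an arm64 ESR value into the fields the oops output prints."""
    iss = esr & 0x1FFFFFF          # ISS: bits [24:0]
    return {
        "ec": (esr >> 26) & 0x3F,  # exception class: bits [31:26]
        "il": (esr >> 25) & 0x1,   # instruction length bit: bit 25
        "iss": iss,
        "wnr": (iss >> 6) & 0x1,   # write-not-read, only meaningful for data aborts
        "fsc": iss & 0x3F,         # fault status code: ISS bits [5:0]
    }


def poison_name(addr):
    """Return the list poison constant matching addr, or None."""
    if addr == LIST_POISON1:
        return "LIST_POISON1"
    if addr == LIST_POISON2:
        return "LIST_POISON2"
    return None


if __name__ == "__main__":
    fields = decode_esr(0x96000044)  # ESR from the first trace
    print({name: hex(value) for name, value in fields.items()})
    print(poison_name(0xDEAD000000000122))  # fault address from the first trace
    print(poison_name(0xDEAD000000000100))  # x21 from the first trace
```

Under those assumptions the fault address dead000000000122 matches LIST_POISON2 and x21 matches LIST_POISON1. That would be consistent with a page whose list linkage had already been removed being freed again, though the trace alone does not prove it.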

Roxana Nicolescu (roxanan) wrote :
tags: added: sru-20230515
description: updated
tags: added: ssru-20230515
removed: sru-20230515
Po-Hsu Lin (cypressyew)
tags: added: sru-20230515
removed: ssru-20230515
tags: added: 6.2 lunar ubuntu-kernel-selftests
Roxana Nicolescu (roxanan) wrote :

Similar issue seen on lunar-lowlatency-1008

scobee-kernel login: [ 148.687229] usb 1-1.1: device descriptor read/64, error -110
[ 164.551237] usb 1-1.1: device descriptor read/64, error -110
[ 180.417094] usb 1-1.1: device descriptor read/64, error -110
[ 192.317974] usb 1-1.1: device not accepting address 9, error -110
[ 203.578051] usb 1-1.1: device not accepting address 10, error -110
[ 203.584605] usb 1-1-port1: unable to enumerate USB device
NOTICE: [D06_nopmu_pwr_domain_on]:[68L]

NOTICE: [scpi_set_css_power_state]:[249L] Modify S1 TB Boot Address to 3fc00000

NOTICE: [scpi_set_css_power_state]:[250L] ulpos_mpidr = 0x7 domain_bit = 0x3

NOTICE: [scpi_set_css_power_state]:[299L] ulpos_mpidr = 0x7 domain_bit = 0x3

NOTICE: [193l]Dieid = 0x7 ClusterID = 0x7 domain_bit = 0x3 Value to [ddff973dL]..

NOTICE: [158l]Dieid = 0x7 ClusterID = 0x7 ClustIdx = 0x1 domain_bit = 0x3 Value to [f0f0f0fL]..

[ 2101.024402] BUG: Bad page state in process tls pfn:400cdc8
[ 2101.038665] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 2101.047446] Mem abort info:
[ 2101.050244] ESR = 0x0000000086000004
[ 2101.053987] EC = 0x21: IABT (current EL), IL = 32 bits
[ 2101.053995] SET = 0, FnV = 0
[ 2101.053998] EA = 0, S1PTW = 0
[ 2101.062334] FSC = 0x04: level 0 translation fault
[ 2101.070324] user pgtable: 4k pages, 48-bit VAs, pgdp=0000002096d1d000
[ 2101.076749] [0000000000000000] pgd=0000000000000000
[ 2101.081633] x8 : 0000000000000000 x7 : 0000000000000000 x6 : ffff0040094ab400
[ 2101.081643] Internal error: Oops: 0000000086000004 [#1] PREEMPT SMP
[ 2101.088744] x5 : 0000000000000000 x4 : 0000000000000000 x3 : ffff800053b63d10
[ 2101.094984] Modules linked in:
[ 2101.094985]
[ 2101.094988] x2 : 0000000000001000 x1 : fffffc0100b6c180
[ 2101.102098] cfg80211
[ 2101.105139] x0 : 057fff8000000000
[ 2101.106633] binfmt_misc
[ 2101.111834]
[ 2101.114105] nls_iso8859_1
[ 2101.117493] Call trace:
[ 2101.120021] ipmi_ssif
[ 2101.121510] copy_page_to_iter+0x58/0x2e0
[ 2101.124210] acpi_ipmi
[ 2101.126647] pipe_read+0x150/0x4d0
[ 2101.129006] onboard_usb_hub
[ 2101.132997] vfs_read+0x2bc/0x2e4
[ 2101.135356] ipmi_si
[ 2101.138742] ksys_read+0x110/0x130
[ 2101.141619] hisi_hpre
[ 2101.144921] __arm64_sys_read+0x28/0x50
[ 2101.147105] ecdh_generic
[ 2101.150491] invoke_syscall+0x7c/0x124
[ 2101.152849] ipmi_devintf
[ 2101.156668] el0_svc_common.constprop.0+0x5c/0x1cc
[ 2101.159285] libcurve25519_generic
[ 2101.163018] do_el0_svc+0x38/0x60
[ 2101.165635] hisi_zip
[ 2101.170405] el0_svc+0x30/0xe0
[ 2101.173799] hns_roce_hw_v2
[ 2101.177100] el0t_64_sync_handler+0x11c/0x150
[ 2101.179371] hisi_sec2
[ 2101.182413] el0t_64_sync+0x1a8/0x1ac
[ 2101.185204] ipmi_msghandler
[ 2101.189542] ---[ end trace 0000000000000000 ]---
[ 2101.191898] hisi_qm ecc uacce authenc hisi_trng_v2 hisi_uncore_ddrc_pmu arm_spe_pmu hisi_uncore_hha_pmu hisi_uncore_l3c_pmu hisi_uncore_pmu cppc_cpufreq dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua efi_pstore dmi_sysfs ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor xor_n...

tags: added: 20230612
Po-Hsu Lin (cypressyew)
tags: added: sru-20230612
removed: 20230612
Po-Hsu Lin (cypressyew) wrote :

This also affects node hidon running jammy/linux-nvidia-6.2/6.2.0-1011.11.

tags: added: sru-20231002
Po-Hsu Lin (cypressyew) wrote :

Call trace on node scobee with Mantic 6.5.0-35
[ 172.716063] usb 1-1.1: device descriptor read/64, error -110
[ 184.112021] usb 1-1.1: device not accepting address 9, error -110
[ 194.866274] usb 1-1.1: device not accepting address 10, error -110
[ 194.872609] usb 1-1-port1: unable to enumerate USB device
[ 1192.685953] BUG: Bad page state in process tls pfn:2126e24
[ 1192.692086] BUG: Bad page state in process tls pfn:2086456
[ 1192.700872] Unable to handle kernel paging request at virtual address 0000100000000000
[ 1192.708842] Mem abort info:
[ 1192.711639] ESR = 0x0000000086000004
[ 1192.715413] EC = 0x21: IABT (current EL), IL = 32 bits
[ 1192.720741] SET = 0, FnV = 0
[ 1192.723822] EA = 0, S1PTW = 0
[ 1192.726987] FSC = 0x04: level 0 translation fault
[ 1192.731882] user pgtable: 4k pages, 48-bit VAs, pgdp=0000002086b44000
[ 1192.738339] [0000100000000000] pgd=0000000000000000, p4d=0000000000000000
[ 1192.745152] Internal error: Oops: 0000000086000004 [#1] SMP
[ 1192.750729] Modules linked in: cfg80211 binfmt_misc nls_iso8859_1 ipmi_ssif hisi_hpre ecdh_generic hns_roce_hw_v2 libcurve25519_generic hisi_uncore_hha_pmu hisi_uncore_ddrc_pmu hisi_uncore_l3c_pmu onboard_usb_hub ses ecc hisi_sec2 hisi_zip hisi_uncore_pmu acpi_ipmi authenc hisi_qm arm_spe_pmu enclosure uacce ipmi_si ipmi_devintf hisi_trng_v2 ipmi_msghandler cppc_cpufreq dm_multipath efi_pstore dmi_sysfs ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor xor_neon raid6_pq libcrc32c raid1 raid0 multipath linear mlx5_ib ib_uverbs ib_core realtek crct10dif_ce hclge polyval_ce polyval_generic hibmc_drm ghash_ce drm_vram_helper drm_ttm_helper mlx5_core ttm sm4 i2c_algo_bit hisi_sas_v3_hw mlxfw sha2_ce drm_kms_helper hisi_sas_main psample sha256_arm64 sha1_ce libsas hns3 tls xhci_pci drm pci_hyperv_intf hnae3 xhci_pci_renesas scsi_transport_sas ahci spi_dw_mmio spi_dw gpio_dwapb aes_neon_bs aes_neon_blk aes_ce_blk aes_ce_cipher
[ 1192.779614] hisi_sec2 0000:76:00.0: the number of entries in input scatterlist is bigger than SGL pool setting.
[ 1192.839040] CPU: 38 PID: 0 Comm: swapper/38 Tainted: G B W 6.5.0-35-generic #35-Ubuntu
[ 1192.839063] Hardware name: Huawei TaiShan 2280 V2/BC82AMDC, BIOS 2280-V2 CS V3.B160.01 01/15/2020
[ 1192.839073] pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 1192.839091] pc : 0x100000000000
[ 1192.849192] hisi_sec2 0000:76:00.0: fail to dma map output sgl buffers!
[ 1192.858243] lr : rcu_do_batch+0x194/0x4a8
[ 1192.858278] sp : ffff800080133e50
[ 1192.858285] x29: ffff800080133e70 x28: 0000000000000001 x27: 0000000000000001
[ 1192.898211] x26: ffffc1f77771ac18 x25: ffff005f7fad64f8 x24: 000000000000000a
[ 1192.905343] x23: 0000000000000000 x22: 0000000000000008 x21: ffff005f7fad6480
[ 1192.912473] x20: ffff0040000a2180 x19: 0000000000000009 x18: ffff800080135028
[ 1192.919477] BUG: Bad page state in process tls pfn:208f1c3
[ 1192.919603] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[ 1192.925170] page:00000000efad163a refcount:-1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x208f1c3
[ 1192.925185] flags:...


tags: added: 6.5 mantic sru-s20240401
Po-Hsu Lin (cypressyew)
Changed in linux (Ubuntu Mantic):
assignee: nobody → Ike Panhc (ikepanhc)
Ike Panhc (ikepanhc)
Changed in linux (Ubuntu Mantic):
status: New → In Progress