arm64: linux: stress-ng filename stressor crashes kernel

Bug #2038768 reported by Colin Ian King
This bug affects 1 person

Affects         Status     Importance  Assigned to  Milestone
linux (Ubuntu)  Confirmed  High        Unassigned
  Mantic        Confirmed  High        Unassigned

Bug Description

Running latest Ubuntu mantic (ext4 file system) with kernel: Linux mantic-arm64 6.5.0-7-generic #7-Ubuntu SMP PREEMPT_DYNAMIC Thu Sep 28 19:12:05 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux

How to reproduce:

Fire up a 24-core ARM64 QEMU instance with Ubuntu Mantic Server. Install the latest stress-ng from the git repo:

sudo apt-get update
sudo apt-get build-dep stress-ng
git clone https://github.com/ColinIanKing/stress-ng
cd stress-ng
make clean
make -j 24
make verify-test-all

When we reach the filename stressor the kernel crashes as follows:

[ 902.594715] kernel BUG at fs/dcache.c:2050!
[ 902.598205] Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
[ 902.603127] Modules linked in: dccp_ipv4 dccp atm vfio_iommu_type1 vfio iommufd cmac algif_rng twofish_generic twofish_common serpent_generic fcrypt cast6_generic cast5_generic cast_common camellia_generic blowfish_generic blowfish_common aes_arm64 algif_skcipher algif_hash aria_generic sm4_generic sm4_neon ccm aes_ce_ccm des_generic libdes authenc aegis128 algif_aead af_alg cfg80211 binfmt_misc nls_iso8859_1 dm_multipath drm efi_pstore dmi_sysfs qemu_fw_cfg ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor xor_neon raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_ce polyval_ce polyval_generic ghash_ce sm4 sha2_ce sha256_arm64 sha1_ce arm_smccc_trng xhci_pci virtio_rng xhci_pci_renesas aes_neon_bs aes_neon_blk aes_ce_blk aes_ce_cipher
[ 902.689941] CPU: 1 PID: 91317 Comm: stress-ng-filen Not tainted 6.5.0-7-generic #7-Ubuntu
[ 902.699281] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[ 902.706902] pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 902.715488] pc : d_instantiate_new+0xa8/0xc8
[ 902.720889] lr : ext4_add_nondir+0x10c/0x160
[ 902.725702] sp : ffff80008b6d3930
[ 902.729390] x29: ffff80008b6d3930 x28: 0000000000000000 x27: ffffbd164e51a980
[ 902.738705] x26: ffff6789f3b68f20 x25: 0000000000008180 x24: ffff678a541f7968
[ 902.747003] x23: ffff6789f3b68f00 x22: ffff80008b6d39b0 x21: ffff678a6a25bcb0
[ 902.755776] x20: ffff678a36f8f028 x19: 0000000000000000 x18: ffff80008af45068
[ 902.764647] x17: 0000000000000000 x16: 0000000000000000 x15: ecececececececec
[ 902.773135] x14: ecececececececec x13: ecececececececec x12: ecececececececec
[ 902.781386] x11: ecececececececec x10: ecececececececec x9 : ffffbd164d5990bc
[ 902.789346] x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000
[ 902.798564] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
[ 902.806851] x2 : ffffbd16504e4ce0 x1 : ffff678a36f8f028 x0 : ffff6789f3b68f00
[ 902.815544] Call trace:
[ 902.818870] d_instantiate_new+0xa8/0xc8
[ 902.823523] ext4_create+0x120/0x238
[ 902.827716] lookup_open.isra.0+0x480/0x4d0
[ 902.832480] open_last_lookups+0x160/0x3b0
[ 902.837060] path_openat+0xa0/0x2a0
[ 902.840975] do_filp_open+0xa8/0x180
[ 902.845582] do_sys_openat2+0xe8/0x128
[ 902.850426] __arm64_sys_openat+0x70/0xe0
[ 902.854952] invoke_syscall+0x7c/0x128
[ 902.859155] el0_svc_common.constprop.0+0x5c/0x168
[ 902.864979] do_el0_svc+0x38/0x68
[ 902.869364] el0_svc+0x30/0xe0
[ 902.873401] el0t_64_sync_handler+0x148/0x158
[ 902.878336] el0t_64_sync+0x1b0/0x1b8
[ 902.882513] Code: d2800002 d2800010 d2800011 d65f03c0 (d4210000)
[ 902.890632] ---[ end trace 0000000000000000 ]---
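The two hex values in the oops can be decoded by hand. The sketch below assumes the standard ARMv8 ESR_EL1 layout (exception class in bits 31:26, ISS in bits 24:0) and the A64 BRK encoding (16-bit immediate in bits 20:5); the function names are made up for illustration. Decoding suggests the faulting instruction is a BRK with immediate 0x800, which, as far as I know, is the immediate the arm64 kernel uses for BUG(), so this looks like a deliberate BUG_ON() firing rather than a stray fault.

```shell
#!/bin/sh
# Hypothetical helpers: decode the ESR value and the faulting instruction
# word printed in the oops above. Arguments are hex numbers with a 0x prefix.

# Exception class: ESR bits 31:26.
esr_ec() { printf '0x%x\n' $(( ($1 >> 26) & 0x3f )); }

# Instruction-specific syndrome: ESR bits 24:0.
esr_iss() { printf '0x%x\n' $(( $1 & 0x1ffffff )); }

# Immediate of an A64 BRK instruction: bits 20:5 of the instruction word.
brk_imm() { printf '0x%x\n' $(( ($1 >> 5) & 0xffff )); }
```

For the values in this report, `esr_ec 0xf2000800` gives 0x3c (BRK executed in AArch64 state), and both `esr_iss 0xf2000800` and `brk_imm 0xd4210000` give 0x800.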

Colin Ian King (colin-king) wrote :

Note that just running stress-ng with --filename 0 will reproduce the issue. I'm now testing this on a cleanly formatted ext4 file system.

Changed in linux (Ubuntu):
importance: Undecided → High
description: updated
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 2038768

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Colin Ian King (colin-king) wrote :

I created a 1GB file, made a fresh ext4 file system on it, loopback-mounted it on /mnt, created a test directory /mnt/test, and ran:

./stress-ng --filename 0 --temp-path /mnt/test --klog-check

Managed to trip the kernel crash again. So it appears to occur on a fresh ext4 file system too :-(
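For anyone wanting to repeat the loopback setup, here is a rough sketch of the steps described above. The exact commands used were not posted, so the choice of truncate and mkfs.ext4 and the image path are assumptions; `run` and `setup_loop_ext4` are hypothetical helper names. Setting DRY_RUN=1 only prints the commands.

```shell
#!/bin/sh
# Sketch only: build a scratch ext4 file system on a loopback file.
# With DRY_RUN=1 each command is printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

setup_loop_ext4() {
    img=${1:-/tmp/ext4-test.img}   # hypothetical image path (not from the report)
    mnt=${2:-/mnt}                 # mount point used in the report
    run truncate -s 1G "$img"             # 1GB backing file (sparse; an assumption)
    run mkfs.ext4 -q -F "$img"            # fresh ext4; -F since the target is a regular file
    run sudo mount -o loop "$img" "$mnt"  # loopback mount
    run sudo mkdir -p "$mnt/test"         # directory later passed to --temp-path
}
```

After `setup_loop_ext4 /tmp/ext4-test.img /mnt`, the stress-ng command above can be pointed at /mnt/test. The directory is created by root here, so stress-ng needs enough privileges to write to it.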

Colin Ian King (colin-king) wrote :

Can't seem to trip the issue on a 24-core x86 instance; maybe this is ARM64-specific.

Colin Ian King (colin-king) wrote :

Reproduced this with mainline arm64 kernel https://kernel.ubuntu.com/~kernel-ppa/mainline/v6.5/arm64/linux-image-unsigned-6.5.0-060500-generic_6.5.0-060500.202308271831_arm64.deb

[ 184.853731] pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 184.862627] pc : d_instantiate_new+0xa8/0xc8
[ 184.867973] lr : ext4_add_nondir+0xf0/0x148
[ 184.872959] sp : ffff8000828ab950
[ 184.877059] x29: ffff8000828ab950 x28: 0000000000000000 x27: ffffd975b8b9a6c0
[ 184.885032] x26: ffff7b0094e32c20 x25: 0000000000008180 x24: ffff7b01432e9848
[ 184.893573] x23: ffff8000828aba30 x22: ffff7b0094e32c00 x21: ffff7b0172d574d0
[ 184.902071] x20: ffff7b0089fbc688 x19: 0000000000000000 x18: ffff800082295068
[ 184.910550] x17: 0000000000000000 x16: 0000000000000000 x15: 5e9ca062546ae354
[ 184.919056] x14: 998c9ec3ecc3a882 x13: 24d23ffaf8b470b6 x12: 022485883b51bee2
[ 184.927692] x11: 5c7ac5c18df459ab x10: 6e24d23ffaf8b470 x9 : ffffd975b7c3d730
[ 184.936212] x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000
[ 184.944811] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
[ 184.953651] x2 : ffffd975bab42cf0 x1 : ffff7b0089fbc688 x0 : ffff7b0094e32c00
[ 184.962508] Call trace:
[ 184.965316] d_instantiate_new+0xa8/0xc8
[ 184.969803] ext4_create+0x120/0x238
[ 184.973910] lookup_open.isra.0+0x478/0x4c8
[ 184.978689] open_last_lookups+0x160/0x3b0
[ 184.983374] path_openat+0x9c/0x290
[ 184.987372] do_filp_open+0xac/0x188
[ 184.991444] do_sys_openat2+0xe4/0x120
[ 184.995701] __arm64_sys_openat+0x6c/0xd8
[ 185.000271] invoke_syscall+0x7c/0x128
[ 185.004520] el0_svc_common.constprop.0+0x5c/0x168
[ 185.009977] do_el0_svc+0x38/0x68
[ 185.013775] el0_svc+0x30/0xe0
[ 185.017265] el0t_64_sync_handler+0x148/0x158
[ 185.022183] el0t_64_sync+0x1b0/0x1b8
[ 185.026332] Code: d2800002 d2800010 d2800011 d65f03c0 (d4210000)
[ 185.033606] ---[ end trace 0000000000000000 ]---

Took a while to trigger.

Colin Ian King (colin-king) wrote :

Reproduced this with mainline arm64 kernel https://kernel.ubuntu.com/~kernel-ppa/mainline/v6.5.5/arm64/linux-image-unsigned-6.5.5-060505-generic_6.5.5-060505.202309230703_arm64.deb

[ 219.219042] Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
[ 219.262013] Modules linked in: cfg80211 binfmt_misc nls_iso8859_1 dm_multipath drm efi_pstore dmi_sysfs qemu_fw_cfg ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor xor_neon raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_ce polyval_ce polyval_generic ghash_ce sm4 sha2_ce sha256_arm64 virtio_net sha1_ce arm_smccc_trng virtio_rng net_failover xhci_pci failover xhci_pci_renesas aes_neon_bs aes_neon_blk aes_ce_blk aes_ce_cipher
[ 219.322456] CPU: 13 PID: 1182 Comm: stress-ng-filen Not tainted 6.5.5-060505-generic #202309230703
[ 219.332405] Hardware name: QEMU KVM Virtual Machine, BIOS 2023.05-2 09/23/2023
[ 219.340433] pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 219.348163] pc : d_instantiate_new+0xa8/0xc8
[ 219.352942] lr : ext4_add_nondir+0x10c/0x160
[ 219.357822] sp : ffff8000826ab9d0
[ 219.361517] x29: ffff8000826ab9d0 x28: 0000000000000000 x27: ffffa9b65720a940
[ 219.369535] x26: ffff1ea33582d2e0 x25: 0000000000008180 x24: ffff1ea3c2bb3d48
[ 219.377494] x23: ffff1ea33582d2c0 x22: ffff8000826abab0 x21: ffff1ea3c3344930
[ 219.385428] x20: ffff1ea324bda188 x19: 0000000000000000 x18: ffff800080b4d068
[ 219.393336] x17: 0000000000000000 x16: 0000000000000000 x15: 9afaefe7af176647
[ 219.401279] x14: f302afa80109b8f3 x13: a3469afaefe7af17 x12: 6647f302afa80109
[ 219.409258] x11: b4e7e46bc44fb52e x10: 4e81094291a860ce x9 : ffffa9b6562b1b74
[ 219.417639] x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000
[ 219.426015] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
[ 219.434462] x2 : ffffa9b6591b27e8 x1 : ffff1ea324bda188 x0 : ffff1ea33582d2c0
[ 219.442708] Call trace:
[ 219.445901] d_instantiate_new+0xa8/0xc8
[ 219.450786] ext4_create+0x120/0x238
[ 219.454800] lookup_open.isra.0+0x478/0x4c8
[ 219.459476] open_last_lookups+0x160/0x3b0
[ 219.464060] path_openat+0x9c/0x290
[ 219.468062] do_filp_open+0xac/0x188
[ 219.472175] do_sys_openat2+0xe4/0x120
[ 219.476412] __arm64_sys_openat+0x6c/0xd8
[ 219.481300] invoke_syscall+0x7c/0x128
[ 219.485876] el0_svc_common.constprop.0+0x5c/0x168
[ 219.491561] do_el0_svc+0x38/0x68
[ 219.495523] el0_svc+0x30/0xe0
[ 219.499161] el0t_64_sync_handler+0x148/0x158
[ 219.504139] el0t_64_sync+0x1b0/0x1b8
[ 219.508320] Code: d2800002 d2800010 d2800011 d65f03c0 (d4210000)
[ 219.515430] ---[ end trace 0000000000000000 ]---
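To compare the traces above across kernels, it helps to strip the timestamps and keep only the BUG location, pc and lr lines. This is a minimal sketch that assumes each console log has been saved to a file; `summarize_oops` is a hypothetical helper name, not part of any tool.

```shell
#!/bin/sh
# Sketch: print the BUG location plus the pc and lr lines from a saved
# console/dmesg log, with the leading "[ seconds.micros]" timestamp removed.
summarize_oops() {
    # $1: path to the saved log file
    grep -E 'kernel BUG at|pc : |lr : ' "$1" |
        sed -E 's/^\[ *[0-9]+\.[0-9]+\] *//'
}
```

Running it on each saved log and diffing the output makes it easy to see that pc is d_instantiate_new+0xa8/0xc8 in every case, while the lr offset within ext4_add_nondir varies between builds.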

Colin Ian King (colin-king) wrote :

I can also reproduce this on real hardware: an "SC2A11", a multi-core chip with 24 ARM® Cortex-A53 cores, running Linux 6.5.0-7-generic #7-Ubuntu SMP PREEMPT_DYNAMIC Thu Sep 28 19:12:05 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux

[ 201.075720] EXT4-fs (loop13): mounted filesystem 52e32882-8b3a-47ce-8bf6-ce095960b1e7 r/w with ordered data mode. Quota mode: none.
[ 516.665218] ------------[ cut here ]------------
[ 516.665249] kernel BUG at fs/dcache.c:2050!
[ 516.665279] Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
[ 516.665301] Modules linked in: tls vhost_vsock vmw_vsock_virtio_transport_common vhost vhost_iotlb vsock nf_conntrack_netlink xfrm_user xfrm_algo xt_addrtype br_netfilter xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink bridge stp llc overlay cfg80211 binfmt_misc zfs(PO) nls_iso8859_1 spl(O) snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd soundcore uio_pdrv_genirq uio dm_multipath efi_pstore dmi_sysfs ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor xor_neon raid6_pq libcrc32c raid1 raid0 multipath linear nouveau crct10dif_ce drm_ttm_helper polyval_ce polyval_generic ttm ghash_ce i2c_algo_bit drm_display_helper cec rc_core sm4 drm_kms_helper sha2_ce sha256_arm64 xhci_pci drm r8169 sha1_ce ahci xhci_pci_renesas realtek sdhci_f_sdh30 sdhci_pltfm sdhci gpio_keys
[ 516.665743] netsec gpio_mb86s7x i2c_synquacer aes_neon_bs aes_neon_blk aes_ce_blk aes_ce_cipher
[ 516.665900] CPU: 2 PID: 17292 Comm: stress-ng-filen Tainted: P O 6.5.0-7-generic #7-Ubuntu
[ 516.665927] Hardware name: Socionext SynQuacer E-series DeveloperBox, BIOS build #85 Nov 6 2020
[ 516.665948] pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 516.665974] pc : d_instantiate_new+0xa8/0xc8
[ 516.666006] lr : ext4_add_nondir+0x10c/0x160
[ 516.666029] sp : ffff8000857838d0
[ 516.666043] x29: ffff8000857838d0 x28: 0000000000000000 x27: ffff8000816ea980
[ 516.666076] x26: ffff0008119915e0 x25: 0000000000008180 x24: ffff000856c61ce8
[ 516.666108] x23: ffff0008119915c0 x22: ffff800085783950 x21: ffff00080359e1c0
[ 516.666140] x20: ffff0008561b1ce8 x19: 0000000000000000 x18: ffff800085a6d068
[ 516.666172] x17: 0000000000000000 x16: 0000000000000000 x15: 878b4681cc52c99d
[ 516.666204] x14: d59de2a9feb89dca x13: 85e2878b4681cc52 x12: c99dd59de2a9feb8
[ 516.666236] x11: e3b9eedbdf1c7d27 x10: 732db84fa4ef339b x9 : ffff8000807690bc
[ 516.666268] x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000
[ 516.666299] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
[ 516.666330] x2 : ffff8000836b52e8 x1 : ffff0008561b1ce8 x0 : ffff0008119915c0
[ 516.666362] Call trace:
[ 516.666377] d_instantiate_new+0xa8/0xc8
[ 516.666401] ext4_create+0x120/0x238
[ 516.666422] lookup_open.isra.0+0x480/0x4d0
[ 516.666447] open_last_lookups+0x160/0x3b0
[ 516.666466] path_openat+0xa0/0x2a0
[ 516.666484] do_filp_open+0xa8/0x180
[ 516.666502] do_s...

Colin Ian King (colin-king) wrote :
Changed in linux (Ubuntu Mantic):
status: Incomplete → New
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 2038768

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Colin Ian King (colin-king) wrote :

Unable to collect data via apport-collect due to VPN restrictions.

Changed in linux (Ubuntu Mantic):
status: Incomplete → Confirmed