Activity log for bug #1998738

Date Who What changed Old value New value Message
2022-12-05 05:20:58 Po-Hsu Lin bug added bug
2022-12-05 05:21:18 Po-Hsu Lin tags 5.4 focal ubuntu-stress-smoke-test
2022-12-05 05:40:09 Po-Hsu Lin description

Old value: same as the new value below, up to and including the "---[ end trace bab66edb32cbb4db ]---" line (the test output and strace sections were not yet present).

New value:

This issue can only be reproduced on ZCU106. It leaves some processes running, which eventually causes the Jenkins job to hang.

stress-ng with commit 91ec6bccd7 (V0.15.00)

stress-ng: invoked with './stress-ng -v -t 5 --dev 4 --dev-ops 3000 --ignite-cpu --syslog --verbose --verify --oomable' by user 0 'root'
stress-ng: system: '202008-28164-ZCU106' Linux 5.4.0-1019-xilinx-zynqmp #22-Ubuntu SMP Thu Nov 17 05:04:22 UTC 2022 aarch64
stress-ng: memory (MB): total 3929.76, free 2479.07, shared 4.30, buffer 59.98, swap 0.00, free swap 0.00
stress-ng: info: [3037] setting to a 5 second run per stressor
stress-ng: info: [3037] dispatching hogs: 4 dev
kernel: [ 981.702313] xilinx-multiscaler a00e0000.v_multi: Channel 0 instance created
kernel: [ 981.702829] xilinx-multiscaler a00e0000.v_multi: Channel 0 instance released
kernel: [ 981.708039] xilinx-multiscaler a00e0000.v_multi: Channel 0 instance created
kernel: [ 981.708569] xilinx-multiscaler a00e0000.v_multi: Channel 0 instance released
kernel: [ 981.709027] xilinx-multiscaler a00e0000.v_multi: Channel 0 instance created
kernel: [ 981.709501] xilinx-multiscaler a00e0000.v_multi: Channel 0 instance released
kernel: [ 981.734320] xilinx-multiscaler a00e0000.v_multi: Channel 0 instance created
kernel: [ 981.734859] xilinx-multiscaler a00e0000.v_multi: Channel 0 instance released

Message from syslogd@202008-28164-ZCU106 at Dec 5 05:11:01 ...
 kernel:[ 981.797006] Internal error: Oops: 96000004 [#1] SMP
kernel: [ 981.768878] xilinx-multiscaler a00e0000.v_multi: Channel 0 instance created
kernel: [ 981.768958] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan already opened for minor = 1
kernel: [ 981.768961] Unable to handle kernel access to user memory outside uaccess routines at virtual address 0000087000000f48
kernel: [ 981.768966] Mem abort info:
kernel: [ 981.779704] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan already opened for minor = 1
kernel: [ 981.782475] ESR = 0x96000004
kernel: [ 981.782478] EC = 0x25: DABT (current EL), IL = 32 bits
kernel: [ 981.782480] SET = 0, FnV = 0
kernel: [ 981.782484] EA = 0, S1PTW = 0
kernel: [ 981.785524] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan already opened for minor = 1
kernel: [ 981.790822] Data abort info:
kernel: [ 981.790824] ISV = 0, ISS = 0x00000004
kernel: [ 981.790826] CM = 0, WnR = 0
kernel: [ 981.790830] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000838768000
kernel: [ 981.790833] [0000087000000f48] pgd=0000000000000000
kernel: [ 981.793875] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan already opened for minor = 1
kernel: [ 981.797006] Internal error: Oops: 96000004 [#1] SMP
kernel: [ 981.797010] Modules linked in: xt_conntrack ipt_REJECT nf_reject_ipv4 ip6table_nat xt_CHECKSUM iptable_nat xt_MASQUERADE nf_nat iptable_filter fuse dm_multipath dm_mod al5e al5d allegro xlnx_vcu_clk xlnx_vcu xilinx_hdmi_tx xilinx_hdmi_rx xlnx_vcu_core dp159 xilinx_vphy lm63 ina2xx_adc mali dmaproxy nfsd zocl
kernel: [ 981.805628] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan already opened for minor = 1
kernel: [ 981.808485] CPU: 1 PID: 3044 Comm: stress-ng-dev Not tainted 5.4.0-1019-xilinx-zynqmp #22-Ubuntu
kernel: [ 981.808487] Hardware name: ZynqMP ZCU106 RevA (DT)
kernel: [ 981.808491] pstate: 00400005 (nzcv daif +PAN -UAO)
kernel: [ 981.812321] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan already opened for minor = 1
kernel: [ 981.815269] pc : __mutex_lock.isra.0+0x170/0x510
kernel: [ 981.815273] lr : __mutex_lock_slowpath+0x28/0x38
kernel: [ 981.815276] sp : ffff800017c3bb30
kernel: [ 981.821772] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan already opened for minor = 1
kernel: [ 981.826563] x29: ffff800017c3bb30 x28: ffff00083460ec00
kernel: [ 981.826567] x27: 0000ffffb3f2f000 x26: ffff000855fda500
kernel: [ 981.826571] x25: 0000000000000000 x24: ffff0008498fd400
kernel: [ 981.826574] x23: 0000000000000031 x22: ffff000875878750
kernel: [ 981.826578] x21: 0000000000000002 x20: ffff0008385d4e40
kernel: [ 981.835222] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan already opened for minor = 1
kernel: [ 981.840035] x19: ffff0008758787f0 x18: 0000000000000000
kernel: [ 981.840039] x17: 0000000000000000 x16: 0000000000000000
kernel: [ 981.840042] x15: 0000000000000000 x14: 0000000000000000
kernel: [ 981.840046] x13: 0000000000000000 x12: 0000000000000000
kernel: [ 981.868428] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan already opened for minor = 1
kernel: [ 981.875905] x11: 0000000000000000 x10: 0000000000100000
kernel: [ 981.875909] x9 : 00000000000000fb x8 : 0000000010044400
kernel: [ 981.875912] x7 : 0000000000000000 x6 : ffff00083460e0c0
kernel: [ 981.875915] x5 : 0000000000000015 x4 : 0000000000000014
kernel: [ 981.875919] x3 : 0000087000000f00 x2 : ffff0008385d4e40
kernel: [ 981.875922] x1 : 0000087000000f00 x0 : 0000087000000f00
kernel: [ 981.875926] Call trace:
kernel: [ 981.875933] __mutex_lock.isra.0+0x170/0x510
kernel: [ 981.875939] __mutex_lock_slowpath+0x28/0x38
kernel: [ 981.885784] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan already opened for minor = 1
kernel: [ 981.889485] mutex_lock+0x48/0x58
kernel: [ 981.889491] xm2msc_mmap+0x38/0x68
kernel: [ 981.889497] v4l2_mmap+0x7c/0xb8
kernel: [ 981.889504] mmap_region+0x364/0x5b0
kernel: [ 981.889511] do_mmap+0x294/0x478
kernel: [ 981.894358] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan already opened for minor = 1
kernel: [ 981.902880] vm_mmap_pgoff+0xf4/0x120
kernel: [ 981.902885] ksys_mmap_pgoff+0x1ac/0x240
kernel: [ 981.902891] __arm64_sys_mmap+0x38/0x50
kernel: [ 981.902897] el0_svc_common.constprop.0+0x78/0x180
kernel: [ 981.902903] el0_svc_handler+0x84/0xa0

Message from syslogd@202008-28164-ZCU106 at Dec 5 05:11:01 ...
 kernel:[ 981.912115] Code: a94153f3 a9425bf5 a8c97bfd d65f03c0 (b9404801)
kernel: [ 981.907665] xilinx-multiscaler a00e0000.v_multi: xm2msc_open Chan already opened for minor = 1
kernel: [ 981.912107] el0_svc+0x8/0x1c0
kernel: [ 981.912115] Code: a94153f3 a9425bf5 a8c97bfd d65f03c0 (b9404801)
kernel: [ 981.912121] ---[ end trace bab66edb32cbb4db ]---

Here is the output when running this test:

$ time sudo ./stress-ng -v -t 5 --dev 4 --dev-ops 3000 --ignite-cpu --syslog --verbose --verify --oomable
stress-ng: debug: [3037] invoked with './stress-ng -v -t 5 --dev 4 --dev-ops 3000 --ignite-cpu --syslog --verbose --verify --oomable' by user 0 'root'
stress-ng: debug: [3037] stress-ng 0.15.00 g91ec6bccd7e9
stress-ng: debug: [3037] system: Linux 202008-28164-ZCU106 5.4.0-1019-xilinx-zynqmp #22-Ubuntu SMP Thu Nov 17 05:04:22 UTC 2022 aarch64
stress-ng: debug: [3037] RAM total: 3.8G, RAM free: 2.4G, swap free: 0.0
stress-ng: debug: [3037] temporary file path: '.', filesystem type: ext2
stress-ng: debug: [3037] 4 processors online, 4 processors configured
stress-ng: info: [3037] setting to a 5 second run per stressor
stress-ng: info: [3037] dispatching hogs: 4 dev
stress-ng: debug: [3037] cache allocate: using defaults, cannot determine cache level details
stress-ng: debug: [3037] cache allocate: shared cache buffer size: 2048K
stress-ng: debug: [3037] starting stressors
stress-ng: debug: [3039] dev: started [3039] (instance 0)
stress-ng: debug: [3040] dev: started [3040] (instance 1)
stress-ng: debug: [3037] 4 stressors started
stress-ng: debug: [3041] dev: started [3041] (instance 2)
stress-ng: debug: [3042] dev: started [3042] (instance 3)

Message from syslogd@202008-28164-ZCU106 at Dec 5 05:11:01 ...
 kernel:[ 981.797006] Internal error: Oops: 96000004 [#1] SMP

Message from syslogd@202008-28164-ZCU106 at Dec 5 05:11:01 ...
 kernel:[ 981.912115] Code: a94153f3 a9425bf5 a8c97bfd d65f03c0 (b9404801)
stress-ng: debug: [3042] dev: exited [3042] (instance 3)
stress-ng: debug: [3041] dev: exited [3041] (instance 2)
stress-ng: info: [3039] dev: 19 of 383 devices opened and exercised
stress-ng: debug: [3039] dev: exited [3039] (instance 0)
stress-ng: debug: [3037] process [3039] terminated
(hung here)

You can see that process 3040 did not exit here.

strace output:

$ sudo strace -p 3040
strace: Process 3040 attached
wait4(3044, 0xffffda2c3214, 0, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
getpid() = 3040
setitimer(ITIMER_REAL, {it_interval={tv_sec=0, tv_usec=0}, it_value={tv_sec=1, tv_usec=0}}, {it_interval={tv_sec=0, tv_usec=0}, it_value={tv_sec=0, tv_usec=0}}) = 0
rt_sigreturn({mask=[]}) = -1 EINTR (Interrupted system call)
kill(3044, SIGALRM) = 0
kill(3044, SIGKILL) = 0
clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=1, tv_nsec=0}, {tv_sec=0, tv_nsec=989179}) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
getpid() = 3040
setitimer(ITIMER_REAL, {it_interval={tv_sec=0, tv_usec=0}, it_value={tv_sec=1, tv_usec=0}}, {it_interval={tv_sec=0, tv_usec=0}, it_value={tv_sec=0, tv_usec=0}}) = 0
rt_sigreturn({mask=[]}) = -1 EINTR (Interrupted system call)
wait4(3044, 0xffffda2c3214, 0, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
getpid() = 3040
setitimer(ITIMER_REAL, {it_interval={tv_sec=0, tv_usec=0}, it_value={tv_sec=1, tv_usec=0}}, {it_interval={tv_sec=0, tv_usec=0}, it_value={tv_sec=0, tv_usec=0}}) = 0
rt_sigreturn({mask=[]}) = -1 EINTR (Interrupted system call)
kill(3044, SIGALRM) = 0
kill(3044, SIGKILL) = 0
clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=1, tv_nsec=0}, {tv_sec=0, tv_nsec=505466}) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
(repeats)
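
The call trace in the kernel log above is interleaved with xilinx-multiscaler driver messages, which makes the backtrace hard to follow. As a reading aid, here is a minimal Python sketch that pulls out only the backtrace frames. The function name `extract_call_trace` and the `SAMPLE` list are hypothetical, introduced just for illustration; the sample lines are copied from the log in the description, and the regular expressions assume the "kernel: [ timestamp ] message" layout seen there.

```python
import re

# One syslog-style kernel line: "kernel: [ 981.875933] message"
LINE_RE = re.compile(r"kernel:\s*\[\s*(\d+\.\d+)\]\s*(.*)")
# One backtrace frame: "symbol+0xoffset/0xsize", e.g. "mutex_lock+0x48/0x58"
FRAME_RE = re.compile(r"^([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+$")


def extract_call_trace(lines):
    """Return the symbol names listed between 'Call trace:' and the
    '---[ end trace' marker, skipping unrelated lines mixed in between."""
    frames = []
    in_trace = False
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # not a kernel log line (e.g. a syslogd broadcast header)
        message = match.group(2).strip()
        if message == "Call trace:":
            in_trace = True
        elif message.startswith("---[ end trace"):
            in_trace = False
        elif in_trace:
            frame = FRAME_RE.match(message)
            if frame:  # driver chatter does not match the frame pattern
                frames.append(frame.group(1))
    return frames


# Excerpt copied from the log in the description (illustrative subset only).
SAMPLE = [
    "kernel: [ 981.875926] Call trace:",
    "kernel: [ 981.875933] __mutex_lock.isra.0+0x170/0x510",
    "kernel: [ 981.875939] __mutex_lock_slowpath+0x28/0x38",
    "kernel: [ 981.885784] xilinx-multiscaler a00e0000.v_multi: "
    "xm2msc_open Chan already opened for minor = 1",
    "kernel: [ 981.889485] mutex_lock+0x48/0x58",
    "kernel: [ 981.889491] xm2msc_mmap+0x38/0x68",
    "kernel: [ 981.889497] v4l2_mmap+0x7c/0xb8",
    "kernel: [ 981.912121] ---[ end trace bab66edb32cbb4db ]---",
]

if __name__ == "__main__":
    for symbol in extract_call_trace(SAMPLE):
        print(symbol)
```

On this excerpt the sketch lists the frames from `__mutex_lock.isra.0` down to `v4l2_mmap`, with the driver message dropped. It filters by line shape rather than by timestamp, since the timestamps in the pasted log are not strictly ordered.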
2022-12-05 08:01:25 Po-Hsu Lin bug task added linux-xilinx-zynqmp (Ubuntu)
2022-12-05 08:01:33 Po-Hsu Lin nominated for series Ubuntu Focal
2022-12-05 08:01:33 Po-Hsu Lin bug task added linux-xilinx-zynqmp (Ubuntu Focal)