Comment 6 for bug 1929923

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2021-09-16 04:22 EDT-------
Hi, we have hit the same problem:

vmcore/dvtc2b-2.gpfs.net_202106240332/dmesg.202106240332:
...
[645967.289658] Unable to handle kernel pointer dereference in virtual kernel address space
[645967.289665] Failing address: 001ffc004bf14000 TEID: 001ffc004bf14403
[645967.289668] Fault in home space mode while using kernel ASCE.
[645967.289671] AS:00000001d839c00b R2:000000038bbec00b R3:00000003010c0007 S:0000000302260000 P:0000000000000400
[645967.289715] Oops: 0011 ilc:2 [#1] SMP
[645967.289721] Modules linked in: mmfs26(OE) mmfslinux(OE) tracedev(OE) nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache 8021q garp mrp stp llc bonding binfmt_misc dm_service_time dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua pkey zcrypt s390_trng ghash_s390 prng aes_s390 des_s390 libdes sha3_512_s390 sha3_256_s390 sha512_s390 sha256_s390 sha1_s390 sha_common chsc_sch eadm_sch vfio_ccw vfio_mdev mdev vfio_iommu_type1 vfio sch_fq_codel drm drm_panel_orientation_quirks i2c_core sunrpc ip_tables x_tables btrfs zstd_compress zlib_deflate raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 linear crc32_vx_s390 zfcp scsi_transport_fc qeth_l2 dasd_eckd_mod dasd_mod qeth qdio ccwgroup [last unloaded: tracedev]
[645967.289791] CPU: 4 PID: 1891047 Comm: kgnrdwr_dvtc2b Kdump: loaded Tainted: G OE 5.4.0-74-generic #83-Ubuntu
[645967.289795] Hardware name: IBM 3906 M05 710 (LPAR)
[645967.289798] Krnl PSW : 0404e00180000000 00000001d73e20ce (try_to_wake_up+0x4e/0x700)
[645967.289809] R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
[645967.289814] Krnl GPRS: 0000000370d32488 001ffc0000000000 001ffc0000000005 0000000000000003
[645967.289817] 0000000000000000 ffffffff00000005 041ffbff80bcb9e0 0000000000000003
[645967.289858] 0000000000000003 001ffc004bf141bc 0000000000000000 001ffc004bf13878
[645967.289860] 0000000095190000 00000001d7c1aa40 001ffbff80bcba10 001ffbff80bcb990
[645967.289872] Krnl Code: 00000001d73e20c2: 41902944 la %r9,2372(%r2)
00000001d73e20c6: 582003ac l %r2,940
#00000001d73e20ca: a7180000 lhi %r1,0
>00000001d73e20ce: ba129000 cs %r1,%r2,0(%r9)
00000001d73e20d2: a77401c9 brc 7,00000001d73e2464
00000001d73e20d6: e310b0080004 lg %r1,8(%r11)
00000001d73e20dc: b9800018 ngr %r1,%r8
00000001d73e20e0: a774001f brc 7,00000001d73e211e
[645967.289894] Call Trace:
[645967.289899] ([<0000000000000000>] 0x0)
[645967.289906] [<00000001d785c83a>] rq_qos_wake_function+0x8a/0xa0
[645967.289913] [<00000001d74004c2>] __wake_up_common+0xa2/0x1b0
[645967.289915] [<00000001d74009c4>] __wake_up_common_lock+0x94/0xe0
[645967.289918] [<00000001d7400a3a>] __wake_up+0x2a/0x40
[645967.289923] [<00000001d7873870>] wbt_done+0x90/0xe0
[645967.289925] [<00000001d785c942>] __rq_qos_done+0x42/0x60
[645967.289928] [<00000001d78486c0>] blk_mq_free_request+0xe0/0x140
[645967.289949] [<001fffff801bf18a>] dasd_request_done+0x2a/0x40 [dasd_mod]
[645967.289951] [<00000001d7848938>] blk_mq_complete_request+0xb8/0x160
[645967.289957] [<001fffff801c43c8>] dasd_block_tasklet+0x148/0x470 [dasd_mod]
[645967.289962] [<00000001d73b12d2>] tasklet_action_common.isra.0+0x82/0x160
[645967.289968] [<00000001d7c117b4>] __do_softirq+0x104/0x360
[645967.289971] [<00000001d73b1a4e>] irq_exit+0x9e/0xc0
[645967.289974] [<00000001d733cb28>] do_IRQ+0x78/0xb0
[645967.289977] [<00000001d7c10a20>] io_int_handler+0x12c/0x294
[645967.289985] [<001fffff805f3c30>] _DTrace3+0x10/0xb0 [tracedev]
[645967.290048] ([<001fffff80a8d2ca>] gpfs_f_llseek+0x4a/0x280 [mmfslinux])
[645967.290053] [<00000001d75f5ed2>] ksys_lseek+0x92/0xe0
[645967.290055] [<00000001d7c10498>] system_call+0xdc/0x2c8
[645967.290056] Last Breaking-Event-Address:
[645967.290060] [<00000001d73e278e>] wake_up_process+0xe/0x20

Please let me know whether I should upload the data to this bug or open a new one.

------- Comment From <email address hidden> 2021-09-16 04:27 EDT-------
dvtc2b-2.gpfs.net: OS: Debian Linux: Ubuntu 20.04.2 LTS => Kernel: 5.4.0-74-generic on s390x

------- Comment From <email address hidden> 2021-09-17 09:36 EDT-------
I double-checked with Canonical regarding the current in-service release level of 20.04 LTS. The current level is 20.04.3, more precisely 20.04.3 with Kernel 5.4.0.84. Therefore, you should try to reproduce the bug on the 20.04.3 release.

Apart from the (no longer supported) 20.04.2 release level, a first look at the error messages strongly hints at a GPFS error. ("vmcore/dvtc2b-2.gpfs.net_202106240332/dmesg.202106240332")

@Aleksandra: can you please confirm whether GPFS is running on the system?

------- Comment From <email address hidden> 2021-09-20 05:29 EDT-------
Yes, GPFS is running on the system.

But what you took for an error message is actually the name of the file in which the dmesg output from the crash is stored; the name encodes the machine on which the crash occurred and the date.
vmcore/dvtc2b-2.gpfs.net_202106240332/dmesg.202106240332:
The actual dmesg data starts after the '...' line.

Is there anything else besides that line that makes you think this is a GPFS problem?
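
For anyone triaging a trace like the one quoted in this comment, here is a minimal Python sketch that groups the call-trace frames by the module named in square brackets at the end of each frame line. The function name `frames_by_module`, the regular expression, and the "(core kernel)" label for frames without a module tag are assumptions made for illustration; this is not an official tool, and which modules belong to which product still has to be judged by the reader.

```python
import re
from collections import defaultdict

# Matches a frame such as:
#   [<001fffff801bf18a>] dasd_request_done+0x2a/0x40 [dasd_mod]
# The trailing "[module]" part is optional; frames without it are
# labelled "(core kernel)" below (a label chosen for this sketch).
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def frames_by_module(dmesg_text):
    """Group call-trace symbols by the module named on each frame line."""
    groups = defaultdict(list)
    for line in dmesg_text.splitlines():
        match = FRAME_RE.search(line)
        if match is None:
            # Not a frame line (e.g. the file-name header, '...', registers).
            continue
        module = match.group("module") or "(core kernel)"
        groups[module].append(match.group("symbol"))
    return dict(groups)


if __name__ == "__main__":
    sample = """\
vmcore/dvtc2b-2.gpfs.net_202106240332/dmesg.202106240332:
...
[645967.289906] [<00000001d785c83a>] rq_qos_wake_function+0x8a/0xa0
[645967.289949] [<001fffff801bf18a>] dasd_request_done+0x2a/0x40 [dasd_mod]
[645967.289985] [<001fffff805f3c30>] _DTrace3+0x10/0xb0 [tracedev]
[645967.290048] ([<001fffff80a8d2ca>] gpfs_f_llseek+0x4a/0x280 [mmfslinux])
"""
    for module, symbols in frames_by_module(sample).items():
        print(module, symbols)
```

Lines that are not call-trace frames, including the file-name header and the '...' line, are simply skipped, so the whole saved dmesg file can be passed in unchanged.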