Comment 4 for bug 1776389

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2018-06-21 03:33 EDT-------
Verified on 4.15.0-24-generic: the adapter recovers cleanly after error injection, with no Oops messages.

[ 3473.707228] EEH: PHB#2 failure detected, location: N/A
[ 3473.707308] CPU: 96 PID: 20922 Comm: lspci Not tainted 4.15.0-24-generic #26-Ubuntu
[ 3473.707310] Call Trace:
[ 3473.707321] [c0002038006fbb00] [c000000000ce04bc] dump_stack+0xb0/0xf4 (unreliable)
[ 3473.707328] [c0002038006fbb40] [c00000000003ade4] eeh_dev_check_failure+0x234/0x5b0
[ 3473.707335] [c0002038006fbbe0] [c0000000000adc58] pnv_pci_read_config+0x128/0x160
[ 3473.707340] [c0002038006fbc20] [c00000000075d1ac] pci_user_read_config_dword+0x8c/0x180
[ 3473.707345] [c0002038006fbc70] [c0000000007722f4] pci_read_config+0x104/0x2d0
[ 3473.707350] [c0002038006fbcf0] [c0000000004a05f0] sysfs_kf_bin_read+0x70/0xd0
[ 3473.707354] [c0002038006fbd10] [c00000000049f540] kernfs_fop_read+0xe0/0x290
[ 3473.707358] [c0002038006fbd60] [c0000000003d517c] __vfs_read+0x3c/0x70
[ 3473.707361] [c0002038006fbd80] [c0000000003d526c] vfs_read+0xbc/0x1b0
[ 3473.707364] [c0002038006fbdd0] [c0000000003d5ae4] SyS_pread64+0xc4/0xf0
[ 3473.707369] [c0002038006fbe30] [c00000000000b284] system_call+0x58/0x6c
[ 3473.707381] EEH: Detected error on PHB#2
[ 3473.707384] EEH: This PCI device has failed 8 times in the last hour
[ 3473.707385] EEH: Notify device drivers to shutdown
[ 3473.707402] ixgbe 0002:01:00.0: Adapter removed
[ 3473.730202] ixgbe 0002:01:00.1: Adapter removed
[ 3473.752641] EEH: Collect temporary log
[ 3473.752644] PHB4 PHB#2 Diag-data (Version: 1)
[ 3473.752645] brdgCtl: 00000002
[ 3473.752649] RootSts: 00060040 00402000 c1010008 00100107 00004000
[ 3473.752651] RootErrSts: 00000024 00000020 00000000
[ 3473.752653] sourceId: 01000000
[ 3473.752655] nFir: 0000800000000000 0030001c00000000 0000800000000000
[ 3473.752657] PhbSts: 0000001c00000000 0000001c00000000
[ 3473.752659] Lem: 1001000104300100 0000000000000000 1000000000000000
[ 3473.752661] PhbErr: 00000da000000000 0000010000000000 2148000098000240 a008400000000000
[ 3473.752664] PhbTxeErr: 0000000600000000 0000000200000000 0000000000000000 0000000000000000
[ 3473.752666] RxeArbErr: 0000100030000020 0000000000000020 4000010000000000 0000000000000000
[ 3473.752668] RxeMrgErr: 0000000000000001 0000000000000001 0000000000000000 0000000000000000
[ 3473.752670] RegbErr: 00d0000000000000 0010000000000000 4800012c00000000 0000000007000000
[ 3473.752673] PE[000] A/B: a700000300000000 8101000001010000
[ 3473.752677] PE[100] A/B: 8000000000003bfe 80000000300c3de9
[ 3473.752680] EEH: Reset without hotplug activity
[ 3477.113186] EEH: Notify device drivers the completion of reset
[ 3477.113197] ixgbe 0002:01:00.0: enabling device (0140 -> 0142)
[ 3477.174161] ixgbe 0002:01:00.0: pci_cleanup_aer_uncorrect_error_status failed 0xffffffea
[ 3477.174239] ixgbe 0002:01:00.1: enabling device (0140 -> 0142)
[ 3477.238148] ixgbe 0002:01:00.1: pci_cleanup_aer_uncorrect_error_status failed 0xffffffea
[ 3477.238220] EEH: Notify device driver to resume
[ 3477.669705] ixgbe 0002:01:00.0 enP2p1s0f0: detected SFP+: 3
[ 3478.037802] ixgbe 0002:01:00.1 enP2p1s0f1: detected SFP+: 4
[ 3478.337233] ixgbe 0002:01:00.0 enP2p1s0f0: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[ 3478.705247] ixgbe 0002:01:00.1 enP2p1s0f1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
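
For anyone repeating this verification, the check can be scripted. The following is only a minimal sketch, assuming the dmesg output has been saved as text in the same "[ seconds] message" format as above. The function names (check_eeh_recovery, as_signed32) are hypothetical, and the milestone strings are copied from this log, so other kernels or platforms may word them differently.

```python
import errno
import re

# Kernel log lines look like "[ 3473.707228] message"; capture timestamp and text.
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s(.*)$")

# Milestones copied from the log in this comment, in the order they appear.
# Assumption: other kernel versions may phrase these messages differently.
MILESTONES = [
    "EEH: Detected error on PHB#",
    "EEH: Notify device drivers to shutdown",
    "EEH: Reset without hotplug activity",
    "EEH: Notify device drivers the completion of reset",
    "EEH: Notify device driver to resume",
]


def check_eeh_recovery(log_text):
    """Return (ok, seconds) for a dmesg excerpt (hypothetical helper).

    ok is True when every milestone appears in order and no line mentions
    "Oops"; seconds is the time from the first to the last milestone.
    """
    events = []
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if match:
            events.append((float(match.group(1)), match.group(2)))

    # Any Oops line means the recovery was not clean.
    if any("Oops" in msg for _, msg in events):
        return False, None

    # Walk the log once, advancing through the milestones in order.
    times = []
    for ts, msg in events:
        if len(times) < len(MILESTONES) and msg.startswith(MILESTONES[len(times)]):
            times.append(ts)

    if len(times) < len(MILESTONES):
        return False, None
    return True, times[-1] - times[0]


def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed integer."""
    return value - (1 << 32) if value & 0x80000000 else value


if __name__ == "__main__":
    # The ixgbe lines above report 0xffffffea; read as signed this is a
    # negative errno value.
    code = as_signed32(0xFFFFFFEA)
    print(code, errno.errorcode.get(-code))
```

Run against the log above, the sketch should report a clean recovery taking roughly 3.5 seconds from "Detected error" to "Notify device driver to resume". Read as a signed 32-bit value, the 0xffffffea printed by ixgbe is -22, which corresponds to EINVAL on Linux; the log itself does not say why the call returned that value.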

Thanks, Mauro, for all your support!