contrail vrouter compute nodes shutdown abruptly with kernel call trace
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Juniper Openstack | Status tracked in Trunk | | |
R2.21.x | Fix Committed | Undecided | Unassigned |
R3.0 | Fix Committed | Undecided | Unassigned |
R3.0.3.x | New | Undecided | Unassigned |
R3.1 | Fix Committed | High | Unassigned |
R3.2 | Fix Committed | Undecided | Unassigned |
R3.2.3.x | Fix Committed | Undecided | Unassigned |
Trunk | Fix Committed | Undecided | Unassigned |
Bug Description
A few of the compute nodes in the production setup shut down suddenly.
Contrail Version is V2.21.2 Build 55
A kernel call trace was output in the customer's environment, and the compute node (kw1ap-vscp0107n) shut down.
Below is the time when shutdown occurred.
- kw1ap-vscp0107n
The customer also provided additional information.
To analyze another problem, the following vRouter commands were executed at 10-second intervals (between 18/1/2017 and 15/2/2017):
- vrfstats --dump
- dropstats
- vif --list
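The customer's collection script is not part of this report. As a rough sketch only, assuming the three commands are on PATH and were run back to back every 10 seconds, the loop might have looked like the following. The function names, the injectable `runner`, and the 30-second timeout are illustrative assumptions, not the customer's actual setup.

```python
import subprocess
import time

# The three vRouter commands named in the report.
COMMANDS = [
    ["vrfstats", "--dump"],
    ["dropstats"],
    ["vif", "--list"],
]


def run_command(argv):
    """Run one command and return (exit code, stdout). Hypothetical helper."""
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=30)
    return proc.returncode, proc.stdout


def poll_once(commands=COMMANDS, runner=run_command):
    """Run every command once and return a list of (argv, exit code, stdout)."""
    results = []
    for argv in commands:
        code, out = runner(argv)
        results.append((argv, code, out))
    return results


def poll_forever(interval_s=10, runner=run_command):
    """Repeat poll_once every interval_s seconds (10 s in the report)."""
    while True:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        for argv, code, out in poll_once(runner=runner):
            print(stamp, " ".join(argv), "exit", code, len(out), "bytes")
        time.sleep(interval_s)
```

Passing a fake `runner` makes it possible to exercise the loop on a machine without vRouter installed, which may help when trying to reproduce the command cadence in a lab.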
During this period, 12 compute nodes shut down.
Of these, five reportedly output kernel call traces before shutting down suddenly.
After Feb 15 the customer stopped executing the commands, and since then the issue has occurred only on kw1ap-vscp0107n.
Ubuntu 14.04 : 3 units
Redhat 7.1 : 2 units
Below is the time when shutdown occurred.
- kw1bp-vscp0038n
- kw1ap-vscp0031n
- kw1ap-vscp0027n
- kw1ap-vscp0090n
- os5ap-vscp0023n
According to the customer's analysis, periodically executing the vRouter analysis commands appears to make the compute node unstable.
The customer confirmed that there are no crash files (/var/crashes) on the compute nodes where this issue happened.
I checked /var/log/messages on one of the compute nodes, kw1bp-vscp0038n:
Feb 10 11:10:46 kw1bp-vscp0038n-jp1 kernel: [14735550.241354] WARNING: at lib/list_debug.c:53 __list_
Feb 10 11:10:46 kw1bp-vscp0038n-jp1 kernel: [14735550.241355] list_del corruption, ffffea0012217d6
Feb 10 11:10:46 kw1bp-vscp0038n-jp1 kernel: [14735550.241355] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver vhost_net macvtap macvla
sv3 nfs_acl nfs lockd sunrpc fscache binfmt_misc mptctl mptbase sg xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_
f_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack ipt_REJECT iptable_filter ip_tables tun bridge vrouter(OF) 8021q garp stp mrp llc bonding vfa
owerclamp coretemp iTCO_wdt iTCO_vendor_support sr_mod cdrom ipmi_devintf intel_rapl kvm_intel kvm crct10dif_pclmul crc32_pclmul cr
sni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr sb_edac edac_core lpc_ich i2c_i801 mfd_core hpilo hpwdt ioatdma
cpufreq acpi_power_meter dm_mirror dm_region_hash dm_log dm_mod xfs libcrc32c usb_storage sd_mod crc_t10dif crct10dif
lt i2c_algo_bit drm_kms_helper ttm ixgbe drm tg3 mdio i2c_core dca ptp hpsa pps_core
Feb 10 11:10:46 kw1bp-vscp0038n-jp1 kernel: [14735550.241386] CPU: 6 PID: 31894 Comm: skipcpio Tainted: GF B W O-------------- 3.10.0-
Feb 10 11:10:46 kw1bp-vscp0038n-jp1 kernel: [14735550.241387] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 12/27/2015
Feb 10 11:10:46 kw1bp-vscp0038n-jp1 kernel: [14735550.241387] ffff881186493a80 0000000060fee02e ffff881186493a38 ffffffff81603f36
Logs are in root@10219.48.123, pwd:Jtaclab123
Logs upload path: /home/kannan/
Latest update from customer:
current status:
The issue has occurred more than once even after the customer stopped executing the vRouter commands.
However, no core or crash file has been generated.
The customer is considering an investigation from the OS side.
Regarding the vRouter-side investigation, the customer understands that analysis is difficult without a crash file, but they would like us to continue investigating the vRouter side in parallel.
The following files contain "/var/log/contrail" and the "vrouter --info" output from each node on which the issue occurred recently.
- 20170412_
- 20170412_
- 20170412_
- 20170412_
The timestamps of the issue on each node are as follows.
kw1bp-vscp0039n
kw1bp-vscp0055n
kw1bp-vscp0058n
kw1ap-vscp0010n
-------
<kw1bp-
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.570279] BUG: Bad page state in process vhost-5576 pfn:1fc5bbe
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.600095] page:ffffea007f
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.638139] page flags: 0x2fffff00000000()
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.658229] page dumped because: nonzero _count
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.680159] Modules linked in: fuse btrfs zlib_deflate raid6_pq xor msdos ext4 mbcache jbd2 rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver vhost_net macvtap macvlan softdog nfsv3 nfs_acl nfs lockd sunrpc fscache binfmt_misc mptctl mptbase sg xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.680243] CPU: 0 PID: 5587 Comm: vhost-5576 Tainted: GF B O-------------- 3.10.0-
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.680245] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 12/27/2015
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.680247] ffffea007f16ef80 00000000b93cafda ffff8802e2d63988 ffffffff81603f36
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.680253] ffff8802e2d639b0 ffffffff815ff178 ffff881fffa169e0 0000000000003639
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.680258] 0000000000000001 ffff8802e2d63ab8 ffffffff8115fe68 ffff8802e2d639e8
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.680264] Call Trace:
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.680274] [<ffffffff81603
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.680278] [<ffffffff815ff
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.680284] [<ffffffff8115f
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.680290] [<ffffffff810a0
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.680294] [<ffffffff81160
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.680320] [<ffffffffa0471
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.680334] [<ffffffffa0472
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.680340] [<ffffffff814eb
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.680343] [<ffffffff814ec
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.680348] [<ffffffff8119e
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.680351] [<ffffffff814e6
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.680366] [<ffffffff810a0
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.680372] [<ffffffffa02bc
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.680377] [<ffffffffa02bc
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.680382] [<ffffffffa0558
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.680387] [<ffffffffa0558
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.680391] [<ffffffffa0555
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.680395] [<ffffffffa0555
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.680400] [<ffffffff81097
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.680404] [<ffffffff81097
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.680409] [<ffffffff81613
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.680412] [<ffffffff81097
Mar 24 03:51:32 kw1bp-vscp0039n-jp1 sh[1120]: abrt-dump-oops: Found oopses: 1
<snip>
Mar 24 03:54:32 kw1bp-vscp0039n-jp1 logger: os-prober: debug: /dev/sda4: is active swap
===> shutdown after this log
-------
<kw1bp-
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.858764] BUG: Bad page state in process qemu-system-x86 pfn:188b45f
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.894628] page:ffffea0062
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.937382] page flags: 0x2ffff0000000000()
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.961859] Modules linked in: vhost_net vhost macvtap macvlan rpcsec_gss_krb5 nfsv4 softdog nfsv3 mptctl mptbase ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.961912] CPU: 0 PID: 12824 Comm: qemu-system-x86 Tainted: G OX 3.13.0-40-generic #69-Ubuntu
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.961914] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 09/13/2016
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.961915] 0000000000003639 ffff881fffa03ad0 ffffffff8171f226 ffffea00622d17c0
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.961919] ffff881fffa03ae8 ffffffff81719d40 0000000000000000 ffff881fffa03bc8
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.961922] ffffffff811584ca ffff88207fffce08 0000000200000000 ffff881fffa03b40
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.961925] Call Trace:
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.961926] <IRQ> [<ffffffff8171f
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.961936] [<ffffffff81719
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.961940] [<ffffffff81158
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.961945] [<ffffffff81524
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.961947] [<ffffffff81158
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.961951] [<ffffffff8161c
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.961984] [<ffffffffa02f9
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.961988] [<ffffffff81197
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.961993] [<ffffffff81612
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.961996] [<ffffffff81614
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.962006] [<ffffffffa0132
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.962011] [<ffffffffa0134
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.962016] [<ffffffff81623
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.962020] [<ffffffff8106c
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.962022] [<ffffffff8106d
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.962026] [<ffffffff81732
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.962031] [<ffffffffa1ee1
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.962034] [<ffffffff81727
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.962035] <EOI> [<ffffffff81730
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.962041] [<ffffffffa1ee1
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.962055] [<ffffffffa0462
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.962060] [<ffffffffa1ee6
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.962072] [<ffffffffa0481
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.962080] [<ffffffffa0467
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.962087] [<ffffffffa0451
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.962090] [<ffffffff8172b
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.962094] [<ffffffff81182
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.962098] [<ffffffff811d0
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.962106] [<ffffffffa045c
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.962109] [<ffffffff811d0
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.962111] [<ffffffff8172f
Mar 18 17:30:54 kw1ap-vscp0055n-jp1 kernel: [970355.962112] Disabling lock debugging due to kernel taint
===> shutdown after this log
-------
<kw1bp-
Mar 15 05:19:08 kw1ap-vscp0058n-jp1 kernel: [17631790.061221] BUG: Bad page state in process cat pfn:4785ba
Mar 15 05:19:09 kw1ap-vscp0058n-jp1 kernel: [17631790.087547] page:ffffea0011
Mar 15 05:19:09 kw1ap-vscp0058n-jp1 kernel: [17631790.125715] page flags: 0x2ffff0000000000()
Mar 15 05:19:09 kw1ap-vscp0058n-jp1 kernel: [17631790.146040] Modules linked in: rpcsec_gss_krb5 nfsv4 vhost_net vhost macvtap macvlan softdog nfsv3 ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp bridge mptctl mptbase ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_
Mar 15 05:19:09 kw1ap-vscp0058n-jp1 kernel: [17631790.146090] CPU: 28 PID: 30443 Comm: cat Tainted: G B OX 3.13.0-40-generic #69-Ubuntu
Mar 15 05:19:09 kw1ap-vscp0058n-jp1 kernel: [17631790.146091] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 12/27/2015
Mar 15 05:19:09 kw1ap-vscp0058n-jp1 kernel: [17631790.146093] ffff881fffc17830 ffff8816cbe1db80 ffffffff8171f226 ffffea0011e16e80
Mar 15 05:19:09 kw1ap-vscp0058n-jp1 kernel: [17631790.146096] ffff8816cbe1db98 ffffffff81719d40 0000000000000001 ffff8816cbe1dc78
Mar 15 05:19:09 kw1ap-vscp0058n-jp1 kernel: [17631790.146111] ffffffff811584ca ffff88207fffce08 0000000200000000 ffff880a9a30b000
Mar 15 05:19:09 kw1ap-vscp0058n-jp1 kernel: [17631790.146114] Call Trace:
Mar 15 05:19:09 kw1ap-vscp0058n-jp1 kernel: [17631790.146120] [<ffffffff8171f
Mar 15 05:19:09 kw1ap-vscp0058n-jp1 kernel: [17631790.146124] [<ffffffff81719
Mar 15 05:19:09 kw1ap-vscp0058n-jp1 kernel: [17631790.146140] [<ffffffff81158
Mar 15 05:19:09 kw1ap-vscp0058n-jp1 kernel: [17631790.146148] [<ffffffff81158
Mar 15 05:19:09 kw1ap-vscp0058n-jp1 kernel: [17631790.146165] [<ffffffff810a3
Mar 15 05:19:09 kw1ap-vscp0058n-jp1 kernel: [17631790.146167] [<ffffffff8109d
Mar 15 05:19:09 kw1ap-vscp0058n-jp1 kernel: [17631790.146172] [<ffffffff81012
Mar 15 05:19:09 kw1ap-vscp0058n-jp1 kernel: [17631790.146181] [<ffffffff81197
Mar 15 05:19:09 kw1ap-vscp0058n-jp1 kernel: [17631790.146190] [<ffffffff811c6
Mar 15 05:19:09 kw1ap-vscp0058n-jp1 kernel: [17631790.146197] [<ffffffff811bd
Mar 15 05:19:09 kw1ap-vscp0058n-jp1 kernel: [17631790.146199] [<ffffffff811bd
Mar 15 05:19:09 kw1ap-vscp0058n-jp1 kernel: [17631790.146201] [<ffffffff811bd
Mar 15 05:19:09 kw1ap-vscp0058n-jp1 kernel: [17631790.146206] [<ffffffff811be
Mar 15 05:19:09 kw1ap-vscp0058n-jp1 kernel: [17631790.146209] [<ffffffff8172f
Mar 15 05:19:09 kw1ap-vscp0058n-jp1 kernel: [17631790.146554] BUG: Bad page state in process cat pfn:fffffe787ff
Mar 15 05:19:09 kw1ap-vscp0058n-jp1 kernel: [17631790.177003] page:ffff881fff
Mar 15 05:19:10 kw1ap-vscp0058n-jp1 snmpd[11189]: Cannot statfs /var/lib/
Mar 15 05:19:10 kw1ap-vscp0058n-jp1 snmpd[11189]: Cannot statfs /var/lib/
===> shutdown after this log
-------
<kw1ap-
- Host
kw1ap-vscp0010n
- OS
Ubuntu 14.04 LTS/Linux 3.13.0-40-generic x86_64
- timestamp
Apr 12 05:03:28 call trace occurred
Apr 12 05:14:52 start of collecting kdump (* it was started about 10 minutes after the call trace occurred.)
Apr 12 05:38:12 end of collecting kdump
- kernel dump
dump.201704120524
- Call Trace
Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.382254] BUG: Bad page state in process swapper/3 pfn:1a84c1
Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.429096] page:ffffea0006
Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.481189] page flags: 0x2ffff0000000000()
Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.514108] Modules linked in: vrouter(OX) nfsv3 vhost_net vhost macvtap macvlan softdog rpcsec_gss_krb5 nfsv4 mptctl mptbase ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables cpuid x86_pkg_
Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.514150] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G OX 3.13.0-40-generic #69-Ubuntu
Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.514152] Hardware name: HP ProLiant DL360 Gen9, BIOS P89 07/20/2015
Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.514153] 0000000000003639 ffff881fffa63ad0 ffffffff8171f226 ffffea0006a13040
Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.514157] ffff881fffa63ae8 ffffffff81719d40 0000000000000000 ffff881fffa63bc8
Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.514160] ffffffff811584ca ffff88207fffce08 0000000200000000 ffff883fd09c9868
Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.514163] Call Trace:
Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.514164] <IRQ> [<ffffffff8171f
Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.514174] [<ffffffff81719
Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.514178] [<ffffffff81158
Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.514180] [<ffffffff81158
Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.514185] [<ffffffff8161c
Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.514194] [<ffffffffa03f7
Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.514197] [<ffffffff81197
Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.514202] [<ffffffff81612
Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.514205] [<ffffffff81614
Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.514214] [<ffffffffa0120
Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.514219] [<ffffffffa0122
Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.514223] [<ffffffff81623
Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.514227] [<ffffffff8106c
Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.514229] [<ffffffff8106d
Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.514233] [<ffffffff81732
Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.514236] [<ffffffff81727
Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.514237] <EOI> [<ffffffff815d1
Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.514243] [<ffffffff815d1
Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.514248] [<ffffffff8101d
Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.514251] [<ffffffff810be
Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.514255] [<ffffffff81041
Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.514256] Disabling lock debugging due to kernel taint
-------
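As a worked check of the kw1ap-vscp0010n timeline above, the gaps between the three timestamps can be computed directly. This is only arithmetic on the values quoted in the report; the year 2017 is an assumption taken from the dump file name (dump.201704120524), since syslog timestamps carry no year.

```python
from datetime import datetime

FMT = "%Y %b %d %H:%M:%S"


def parse(stamp, year=2017):
    """Parse a syslog-style timestamp; the year is assumed, not logged."""
    return datetime.strptime(f"{year} {stamp}", FMT)


call_trace = parse("Apr 12 05:03:28")
kdump_start = parse("Apr 12 05:14:52")
kdump_end = parse("Apr 12 05:38:12")

delay = kdump_start - call_trace      # time from call trace to kdump start
duration = kdump_end - kdump_start    # time spent collecting the dump

print("call trace -> kdump start:", delay)
print("kdump collection time:    ", duration)
```

The delay comes out at 11 minutes 24 seconds, consistent with the "about 10 minutes" noted by the customer, and the collection itself took 23 minutes 20 seconds.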
In addition, the customer is upgrading their vRouters sequentially to build 57, so the following versions currently coexist:
2.21.2-36
2.21.3-55
2.21.3-57
However, the "Bad page state" log appears to occur only on 2.21.2-36.
<log of Bad page state>
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.570279] BUG: Bad page state in process vhost-5576 pfn:1fc5bbe
Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.600095] page:ffffea007f
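To check which nodes logged this message, the syslog files could be scanned for it. The following is a minimal sketch, assuming the line layout seen in the excerpts in this report; the regular expression and the function name `find_bad_pages` are illustrative, and other kernels or syslog configurations may format the line differently.

```python
import re

# Matches lines such as:
# "Mar 24 03:51:31 <host> kernel: [18388342.570279] BUG: Bad page state in process vhost-5576 pfn:1fc5bbe"
BAD_PAGE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+ \d\d:\d\d:\d\d) (?P<host>\S+) kernel: "
    r"\[\s*(?P<uptime>\d+\.\d+)\] BUG: Bad page state in process "
    r"(?P<process>.+?)\s+pfn:(?P<pfn>[0-9a-f]+)"
)


def find_bad_pages(lines):
    """Return one dict (stamp, host, uptime, process, pfn) per matching line."""
    events = []
    for line in lines:
        match = BAD_PAGE.match(line)
        if match:
            events.append(match.groupdict())
    return events


# Sample lines copied from this report.
SAMPLE = [
    "Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.570279] BUG: Bad page state in process vhost-5576 pfn:1fc5bbe",
    "Mar 24 03:51:31 kw1bp-vscp0039n-jp1 kernel: [18388342.680264] Call Trace:",
    "Apr 12 05:03:28 kw1ap-vscp0010n-jp1 kernel: [35286576.382254] BUG: Bad page state in process swapper/3 pfn:1a84c1",
]

for event in find_bad_pages(SAMPLE):
    print(event["host"], event["stamp"], event["process"], event["pfn"])
```

On a node, the same function could be fed an open log file, for example `find_bad_pages(open("/var/log/messages"))`; the log path differs between distributions, so adjust it as needed.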
Regarding the "Bad page state" log, the customer's guess is as follows.
An unused memory area normally has "count:0", but here it was "count:-1" due to a "double free", and as a result the log was output.
A search turned up several bugs related to "double free" in Contrail vRouter 2.21.
Are they related to this issue?
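The customer's hypothesis can be illustrated with a toy reference-count model. This is not kernel or vRouter code: the `Page` class and the helper names are hypothetical, and the real kernel frees a page when its count reaches zero rather than checking it later. The sketch only shows how one extra release could turn an expected count of 0 into -1, which a consistency check would then flag as a nonzero count.

```python
class Page:
    """Toy stand-in for a memory page with a reference count."""

    def __init__(self, pfn):
        self.pfn = pfn
        self.count = 0  # a free page is expected to have count 0


def get_page(page):
    """Take a reference."""
    page.count += 1


def put_page(page):
    """Drop a reference. No guard, so an extra call goes negative."""
    page.count -= 1


def check_free_page(page):
    """Return a warning string if a supposedly free page has a nonzero count."""
    if page.count != 0:
        return f"Bad page state pfn:{page.pfn:x} count:{page.count}"
    return None


page = Page(pfn=0x1FC5BBE)      # pfn value taken from the log above
get_page(page)                  # owner takes a reference: count 1
put_page(page)                  # owner releases it: count 0, page is free
ok = check_free_page(page)      # no warning expected here

put_page(page)                  # second (erroneous) release: count -1
warning = check_free_page(page)
print(ok, "|", warning)
```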
===
<search result>
https:/
https:/
===
tags: | added: vrouter |
tags: | added: blocker |
information type: | Proprietary → Public |
Review in progress for https://review.opencontrail.org/31048
Submitter: Anand H. Krishnan (<email address hidden>)