Comment 8 for bug 1385755

Alexei Sheplyakov (asheplyakov) wrote : Re: [Bug 1385755] Re: ubuntu kernel cpu dump for node-1 only

> This stack trace indicates that the path was too deep, so the trace was triggered by a buffer overflow.
This warning happens only if path->dentry references the root of the
filesystem:
<http://lxr.free-electrons.com/source/fs/dcache.c?v=3.11#L2557>

2558    if (IS_ROOT(dentry) &&
2559        (dentry->d_name.len != 1 || dentry->d_name.name[0] != '/')) {
2560            WARN(1, "Root dentry has weird name <%.*s>\n",
2561                 (int) dentry->d_name.len, dentry->d_name.name);
2562    }
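
To make the condition concrete, here is a minimal userspace sketch of the same check. It is only a model: struct dentry_model and root_name_is_weird() are hypothetical names invented for illustration, not kernel API, and the struct carries just the three values the condition reads.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the dentry fields the check looks at. */
struct dentry_model {
    bool is_root;       /* models IS_ROOT(dentry) */
    const char *name;   /* models dentry->d_name.name */
    size_t len;         /* models dentry->d_name.len */
};

/* Returns true when the modeled condition would reach the WARN() call. */
static bool root_name_is_weird(const struct dentry_model *d)
{
    /* A root dentry is expected to be named exactly "/". */
    return d->is_root && (d->len != 1 || d->name[0] != '/');
}
```

With this model, a root entry named "/" passes, a non-root entry never triggers the check, and a root entry with an empty name (which is what "Root dentry has weird name <>" in the log shows) triggers it.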

The warning might indicate a filesystem problem (say, NFS failed to read
its dentries because the network was down).

On the other hand, no warning is issued when the buffer is too short
(which is a fairly normal situation); the corresponding error code is
returned instead.
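
As a rough illustration of the "too short" case, the sketch below models in userspace how a path component can be prepended into a buffer that is filled from the end backwards. The names model_prepend, cursor and remaining are hypothetical and chosen for this example; the use of -ENAMETOOLONG reflects my understanding of what d_path() reports when the path does not fit, so treat it as an assumption rather than a quote from the source.

```c
#include <errno.h>
#include <string.h>

/*
 * Hypothetical userspace model: copy `len` bytes of `str` in front of
 * *cursor, which walks backwards from the end of the caller's buffer.
 * *remaining tracks how many free bytes are left before the cursor.
 */
static int model_prepend(char **cursor, int *remaining,
                         const char *str, int len)
{
    if (len > *remaining)
        return -ENAMETOOLONG;   /* no WARN(), just an error code */
    *remaining -= len;
    *cursor -= len;
    memcpy(*cursor, str, (size_t)len);
    return 0;
}
```

A caller would start with cursor pointing one past the end of its buffer, prepend the terminating NUL first, then each path component, and stop on the first negative return value. Nothing is logged on that path, which is why a path that is merely too long would not produce the trace in this report.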

On Wed, Oct 29, 2014 at 2:13 PM, Sergii Golovatiuk <<email address hidden>> wrote:

> According to the kernel source, this function builds a buffer holding the
> whole path. This stack trace indicates that the path was too deep, so the
> trace was triggered by a buffer overflow. Additionally, this stack trace
> occurs on controllers running mongodb. To mitigate it, I suggest:
>
> 1. Try to deploy 100 nodes without mongo
> 2. Upgrade mongodb to 2.6.5, which has redesigned file system
> algorithms and many bug fixes related to the file system
>
> Title:
> ubuntu kernel cpu dump for node-1 only
>
> Status in Fuel: OpenStack installer that works:
> Confirmed
>
> Bug description:
> 5.1.1
> 1 fuel node
> 3 controller nodes + GRE on Dell 6220 chassis
> 2 Compute Nodes Dell R620.
>
> I think the kernel dump happened during a mass creation of 90 instances
> at the same time, which completed successfully; the instances are still
> running just fine.
>
> /var/log/kern.log of node-1.
> The error appears many times in kern.log, always with the same
> "CPU: 19 PID":
>
> =================================================================================
> [13359.264100] ------------[ cut here ]------------
> [13359.264104] WARNING: CPU: 19 PID: 55147 at
> /build/buildd/linux-lts-saucy-3.11.0/fs/dcache.c:2561
> prepend_path+0x1d4/0x1e0()
> [13359.264104] Root dentry has weird name <>
> [13359.264105] Modules linked in: xt_nat xt_conntrack xt_REDIRECT
> ipt_REJECT xt_tcpudp ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat
> iptable_mangle xt_mark veth xt_state xt_CT iptable_raw xt_comment
> xt_multiport ip6table_filter ip6_tables iptable_filter ip_tables x_tables
> openvswitch(OF) ttm drm_kms_helper drm sysimgblt mei_me sysfillrect
> syscopyarea mei wmi joydev dcdbas shpchp ioatdma lpc_ich mac_hid
> nf_conntrack_ipv6 acpi_pad nf_defrag_ipv6 nf_conntrack_ipv4 nf_conntrack
> nf_defrag_ipv4 lp parport hid_generic usbhid hid ixgbe mpt2sas igb
> scsi_transport_sas raid_class ahci dca libahci i2c_algo_bit ptp mdio
> pps_core
> [13359.264137] CPU: 19 PID: 55147 Comm: lsof Tainted: GF W O
> 3.11.0-18-generic #32~precise1-Ubuntu
> [13359.264138] Hardware name: Dell Inc. PowerEdge C6220 II/09N44V, BIOS
> 2.4.2 04/15/2014
> [13359.264139] 0000000000000a01 ffff880849167d48 ffffffff8173d60f
> 0000000000000007
> [13359.264142] ffff880849167d98 ffff880849167d88 ffffffff8106540c
> ffff880849167df8
> [13359.264144] ffff880849167eb8 0000000000000000 ffff880848ea9340
> ffff880465469320
> [13359.264147] Call Trace:
> [13359.264150] [<ffffffff8173d60f>] dump_stack+0x46/0x58
> [13359.264152] [<ffffffff8106540c>] warn_slowpath_common+0x8c/0xc0
> [13359.264154] [<ffffffff810654f6>] warn_slowpath_fmt+0x46/0x50
> [13359.264156] [<ffffffff811c9c24>] prepend_path+0x1d4/0x1e0
> [13359.264158] [<ffffffff811cb5a6>] d_path+0xf6/0x180
> [13359.264160] [<ffffffff8121c6cb>] proc_pid_readlink+0x9b/0xf0
> [13359.264162] [<ffffffff811ba1d6>] SyS_readlinkat+0xf6/0x120
> [13359.264164] [<ffffffff811ba21b>] SyS_readlink+0x1b/0x20
> [13359.264167] [<ffffffff817521dd>] system_call_fastpath+0x1a/0x1f
> [13359.264168] ---[ end trace 4c606bbadaee0c69 ]---
> [13359.264870] ------------[ cut here ]------------
>
>
> root@node-1:~# crm status
> Last updated: Sat Oct 25 21:17:14 2014
> Last change: Sat Oct 25 21:17:01 2014 via crm_attribute on node-5
> Stack: classic openais (with plugin)
> Current DC: node-1 - partition with quorum
> Version: 1.1.10-42f2063
> 3 Nodes configured, 3 expected votes
> 22 Resources configured
>
>
> Online: [ node-1 node-3 node-5 ]
>
> vip__management_old (ocf::mirantis:ns_IPaddr2): Started node-1
> vip__public_old (ocf::mirantis:ns_IPaddr2): Started node-1
> p_ceilometer-alarm-evaluator
> (ocf::mirantis:ceilometer-alarm-evaluator): Started node-5
> p_ceilometer-agent-central (ocf::mirantis:ceilometer-agent-central):
> Started node-1
> Master/Slave Set: master_p_rabbitmq-server [p_rabbitmq-server]
> Masters: [ node-1 ]
> Slaves: [ node-3 node-5 ]
> Clone Set: clone_p_mysql [p_mysql]
> Started: [ node-1 node-3 node-5 ]
> Clone Set: clone_p_haproxy [p_haproxy]
> Started: [ node-1 node-3 node-5 ]
> p_heat-engine (ocf::mirantis:heat-engine): Started node-1
> Clone Set: clone_p_neutron-plugin-openvswitch-agent
> [p_neutron-plugin-openvswitch-agent]
> Started: [ node-1 node-3 node-5 ]
> Clone Set: clone_p_neutron-metadata-agent [p_neutron-metadata-agent]
> Started: [ node-1 node-3 node-5 ]
> p_neutron-dhcp-agent (ocf::mirantis:neutron-agent-dhcp): Started
> node-5
> p_neutron-l3-agent (ocf::mirantis:neutron-agent-l3): Started
> node-1
>
> controller specs:
> ===============
> vendor_id : GenuineIntel
> cpu family : 6
> model : 62
> model name : Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz
> stepping : 4
> microcode : 0x424
> cpu MHz : 2600.065
> cache size : 15360 KB
> physical id : 1
> siblings : 12
> core id : 5
> cpu cores : 6
> apicid : 43
> initial apicid : 43
> fpu : yes
> fpu_exception : yes
> cpuid level : 13
> wp : yes
> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
> cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
> pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl
> xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor
> ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic
> popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb
> xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep
> erms
> bogomips : 5201.75
> clflush size : 64
> cache_alignment : 64
> address sizes : 46 bits physical, 48 bits virtual
> power management:
>
> Memory: 32 GB