NetworkManager increases CPU utilization

Bug #1111926 reported by Ben Collins
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
network-manager (Ubuntu)
Invalid
High
Unassigned

Bug Description

Observed in systems if gnome gui is installed. NetworkManager will usually one of the cores to 100%. Killing unused services like avahi and cups didn't seem to help. Restarting the dbus service fixes the issue /etc/init.d/dbus restart

Need to find root cause.

Revision history for this message
Ben Collins (ben-collins) wrote :

This appears to only affect PowerPC (for me). It shows up on my Servergy CTS-1000 system. Original bug report linked.

Revision history for this message
Ben Collins (ben-collins) wrote :

One additional, I can confirm this was occurring since precise. I suspect it may be due to our network devices. Anything we can test, let me know.

Changed in network-manager (Ubuntu):
status: New → Confirmed
importance: Undecided → High
Changed in network-manager:
importance: Unknown → Critical
status: Unknown → Confirmed
Revision history for this message
Ben Collins (ben-collins) wrote :

More information. I enabled log level=DEBUG and I see a continuous stream of these messages:

NetworkManager[8461]: <debug> [1361823070.525214] [nm-netlink-monitor.c:167] link_msg_handler(): netlink link message: iface idx 2 flags 0x11043
NetworkManager[8461]: <debug> [1361823070.525492] [nm-netlink-monitor.c:167] link_msg_handler(): netlink link message: iface idx 2 flags 0x11043
NetworkManager[8461]: <debug> [1361823070.525782] [nm-netlink-monitor.c:167] link_msg_handler(): netlink link message: iface idx 3 flags 0x11043
NetworkManager[8461]: <debug> [1361823070.526063] [nm-netlink-monitor.c:167] link_msg_handler(): netlink link message: iface idx 3 flags 0x11043
NetworkManager[8461]: <debug> [1361823070.526410] [nm-netlink-monitor.c:167] link_msg_handler(): netlink link message: iface idx 3 flags 0x11043
NetworkManager[8461]: <debug> [1361823070.526692] [nm-netlink-monitor.c:167] link_msg_handler(): netlink link message: iface idx 5 flags 0x11043
NetworkManager[8461]: <debug> [1361823070.526972] [nm-netlink-monitor.c:167] link_msg_handler(): netlink link message: iface idx 2 flags 0x11043
NetworkManager[8461]: <debug> [1361823070.527251] [nm-netlink-monitor.c:167] link_msg_handler(): netlink link message: iface idx 2 flags 0x11043
NetworkManager[8461]: <debug> [1361823070.527582] [nm-netlink-monitor.c:167] link_msg_handler(): netlink link message: iface idx 2 flags 0x11043
NetworkManager[8461]: <debug> [1361823070.527862] [nm-netlink-monitor.c:167] link_msg_handler(): netlink link message: iface idx 3 flags 0x11043
NetworkManager[8461]: <debug> [1361823070.528142] [nm-netlink-monitor.c:167] link_msg_handler(): netlink link message: iface idx 3 flags 0x11043
NetworkManager[8461]: <debug> [1361823070.528422] [nm-netlink-monitor.c:167] link_msg_handler(): netlink link message: iface idx 3 flags 0x11043
NetworkManager[8461]: <debug> [1361823070.528700] [nm-netlink-monitor.c:167] link_msg_handler(): netlink link message: iface idx 5 flags 0x11043
NetworkManager[8461]: <debug> [1361823070.528990] [nm-netlink-monitor.c:167] link_msg_handler(): netlink link message: iface idx 2 flags 0x11043
NetworkManager[8461]: <debug> [1361823070.529266] [nm-netlink-monitor.c:167] link_msg_handler(): netlink link message: iface idx 2 flags 0x11043
NetworkManager[8461]: <debug> [1361823070.529547] [nm-netlink-monitor.c:167] link_msg_handler(): netlink link message: iface idx 2 flags 0x11043
NetworkManager[8461]: <debug> [1361823070.529826] [nm-netlink-monitor.c:167] link_msg_handler(): netlink link message: iface idx 3 flags 0x11043
NetworkManager[8461]: <debug> [1361823070.530105] [nm-netlink-monitor.c:167] link_msg_handler(): netlink link message: iface idx 3 flags 0x11043
NetworkManager[8461]: <debug> [1361823070.530419] [nm-netlink-monitor.c:167] link_msg_handler(): netlink link message: iface idx 3 flags 0x11043
NetworkManager[8461]: <debug> [1361823070.530700] [nm-netlink-monitor.c:167] link_msg_handler(): netlink link message: iface idx 5 flags 0x11043
NetworkManager[8461]: <debug> [1361823070.530979] [nm-netlink-monitor.c:167] link_msg_handler(): netlink link message: iface idx 2 flags 0x11043

Revision history for this message
Ben Collins (ben-collins) wrote :

Additionally, this time with NLDBG=4:

DBG<3>: Attempting to read from 0x107a28f0
DBG<3>: recvmsgs(0x107a28f0): Read 1028 bytes
DBG<3>: recvmsgs(0x107a28f0): Processing valid message...
DBG<2>: msg 0x107ebc98: Allocated new message, maxlen=1028
lt-NetworkManager[29481]: <debug> [1361825388.37820] [nm-netlink-monitor.c:226] event_msg_ready(): TRACE
DBG<4>: Allocated new object 0x107e2270
DBG<4>: Allocated new object 0x107f2018
lt-NetworkManager[29481]: <debug> [1361825388.37972] [nm-netlink-monitor.c:167] link_msg_handler(): netlink link message: iface idx 2 flags 0x11043
DBG<4>: Returned object reference 0x107f2018, 0 remaining
DBG<4>: Freed object 0x107f2018
DBG<4>: Returned object reference 0x107e2270, 0 remaining
DBG<4>: Freed object 0x107e2270
DBG<4>: Returned message reference 0x107ebc98, 0 remaining
DBG<2>: msg 0x107ebc98: Freed

The above is one complete sequence of which is repeated over and over, constantly. I assume this means it is getting some nl event from the kernel constantly, which would probably make it a kernel bug rather than a NetworkManager bug. I'll continue check in order to confirm this.

Revision history for this message
Ben Collins (ben-collins) wrote :

This is with NLCB=debug

NetworkManager[31274]: <debug> [1362509777.142825] [nm-netlink-monitor.c:167] link_msg_handler(): netlink link message: iface idx 3 flags 0x11043
DBG<4>: Returned object reference 0x10183048, 0 remaining
DBG<4>: Freed object 0x10183048
DBG<4>: Returned object reference 0x10182d08, 0 remaining
DBG<4>: Freed object 0x10182d08
DBG<4>: Returned message reference 0x101bc300, 0 remaining
DBG<2>: msg 0x101bc300: Freed
DBG<3>: Attempting to read from 0x1017c8f0
DBG<3>: recvmsgs(0x1017c8f0): Read 1028 bytes
DBG<3>: recvmsgs(0x1017c8f0): Processing valid message...
DBG<2>: msg 0x101be338: Allocated new message, maxlen=1028
DBG<4>: Allocated new object 0x10182d08
DBG<4>: Allocated new object 0x10183048

Revision history for this message
Ben Collins (ben-collins) wrote :

This is the output of "ip monitor" with NetworkManager running (when it isn't running, "ip monitor" is fairly quiet)

3: eth5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP
    link/ether c8:bc:c8:ee:68:a2 brd ff:ff:ff:ff:ff:ff
5: eth6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UNKNOWN
    link/ether c8:bc:c8:ee:68:a4 brd ff:ff:ff:ff:ff:ff
2: eth7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UNKNOWN
    link/ether c8:bc:c8:ee:68:a1 brd ff:ff:ff:ff:ff:ff
3: eth5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP
    link/ether c8:bc:c8:ee:68:a2 brd ff:ff:ff:ff:ff:ff
3: eth5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP
    link/ether c8:bc:c8:ee:68:a2 brd ff:ff:ff:ff:ff:ff
3: eth5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP
    link/ether c8:bc:c8:ee:68:a2 brd ff:ff:ff:ff:ff:ff
5: eth6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UNKNOWN
    link/ether c8:bc:c8:ee:68:a4 brd ff:ff:ff:ff:ff:ff
2: eth7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UNKNOWN
    link/ether c8:bc:c8:ee:68:a1 brd ff:ff:ff:ff:ff:ff
3: eth5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP
    link/ether c8:bc:c8:ee:68:a2 brd ff:ff:ff:ff:ff:ff
3: eth5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP
    link/ether c8:bc:c8:ee:68:a2 brd ff:ff:ff:ff:ff:ff
3: eth5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP
    link/ether c8:bc:c8:ee:68:a2 brd ff:ff:ff:ff:ff:ff
5: eth6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UNKNOWN
    link/ether c8:bc:c8:ee:68:a4 brd ff:ff:ff:ff:ff:ff
2: eth7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UNKNOWN
    link/ether c8:bc:c8:ee:68:a1 brd ff:ff:ff:ff:ff:ff
3: eth5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP
    link/ether c8:bc:c8:ee:68:a2 brd ff:ff:ff:ff:ff:ff
3: eth5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP
    link/ether c8:bc:c8:ee:68:a2 brd ff:ff:ff:ff:ff:ff
3: eth5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP

Revision history for this message
Mathieu Trudel-Lapierre (cyphermox) wrote :

I'd love to see the full debug logs for NM. As per our IRC discussion, can you please attach those to the bug as well?

sudo stop network-manager
sudo /usr/sbin/NetworkManager --no-daemon --log-level=debug > nm.log 2>&1

On quick glance, I wonder if it's not that it's trying to bring up the devices with IPv6 automatic, and required for the connection to stay up. It looks like what was happening in those cases, and should be very clear in the logs if it's the cause of the issues, and any other cause should be pretty clear from the NM debug logs as well.

At least from the fact that the behavior shows in ip monitor when NM is running, and doesn't show when it's not running, indicates to me that it's likely a bug in NM.

Changed in network-manager (Ubuntu):
assignee: nobody → Mathieu Trudel-Lapierre (mathieu-tl)
Revision history for this message
Ben Collins (ben-collins) wrote :

Here's a link to the log command requested on IRC

http://paste.ubuntu.com/5588747/

Revision history for this message
Ben Collins (ben-collins) wrote :

I believe I've found the issue. When NM sees an interface, it brings it up. It then requests state. If this state is "UNKNOWN" (as opposed to "DOWN" or "UP") it brings the interface up again and requests state again.

This results in an infinite and unrestrained loop.

NetworkManager needs to back-off in this scenario and only poll every so often (like maybe once per second).

Note, this scenario is occurring on an interface that is a fixed 10G (no auto-negotiation). The unknown state is shown when it if is up, but there is no physical link (because the MAC doesn't do AN, it's reports link active, always).

I'm not sure if this state in the ethernet driver is buggy or not. I'm still investigating that. However, NM putting itself into a tight CPU intensive loop is fixable and the right thing to do, either way.

no longer affects: network-manager
Revision history for this message
Mathieu Trudel-Lapierre (cyphermox) wrote :

That means this is probably some issue with carrier detection. I wonder if it's since been fixed in NM?

Changed in network-manager (Ubuntu):
milestone: ubuntu-13.04-beta-1 → none
assignee: Mathieu Trudel-Lapierre (mathieu-tl) → nobody
status: Confirmed → Incomplete
Revision history for this message
Bryan Quigley (bryanquigley) wrote :

This bug was marked Incomplete some time ago and was supposed to autoclose after 90 days. I'm going to close it manually due to no activity (and likely the issue has gone away/been fixed).

Changed in network-manager (Ubuntu):
status: Incomplete → Invalid
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.