Access points' "PropertiesChanged" dbus signals freeze UI on mobile devices

Bug #1480877 reported by Andrea Bernabei
This bug affects 32 people
Affects                             Status        Importance  Assigned to    Milestone
Canonical System Image              Fix Released  High        John McAleely
dbus-cpp (Ubuntu)                   Fix Released  High        Thomas Voß
dbus-cpp (Ubuntu RTM)               Fix Released  Undecided   Unassigned
location-service (Ubuntu RTM)       Fix Released  High        Thomas Voß
network-manager (Ubuntu)            Incomplete    High        Tony Espy
network-manager (Ubuntu RTM)        Incomplete    High        Tony Espy
qtbase-opensource-src (Ubuntu)      Fix Released  High        Tony Espy
qtbase-opensource-src (Ubuntu RTM)  Fix Released  High        Tony Espy
unity8 (Ubuntu)                     Confirmed     Undecided   Unassigned

Bug Description

Krillin, rc-proposed, r83

DESCRIPTION:
I've been trying to track down the cause of the occasional UI freezes on my Krillin device, and I noticed that whenever the UI freezes for 2-4 seconds, I get a burst of "PropertiesChanged" signals in dbus-monitor.

Here's a log of what's shown in dbus-monitor: http://pastebin.ubuntu.com/11992322/

I'd guess the problem is in the code that actually catches the signals and acts accordingly.

HOW TO REPRODUCE:
1) Move to a place where many wifi hotspots are available
2) Connect the device via USB and run "phablet-shell" and then "dbus-monitor"
3) Use the device while keeping an eye on dbus-monitor output

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in network-manager (Ubuntu):
status: New → Confirmed
royden (ryts) wrote :
Andrea Bernabei (faenil) wrote :

It seems I hit an evolved version of this bug.

My phone is right now unusable, dbus-daemon taking 99% of the cpu, and dbus-monitor shows about 30 signals like this:

signal sender=:1.0 -> dest=(null destination) serial=78512 path=/com/ubuntu/Upstart; interface=com.ubuntu.Upstart0_6; member=EventEmitted
   string "dbus"
   array [
      string "SIGNAL=PropertiesChanged"
      string "BUS=system"
      string "INTERFACE=org.freedesktop.NetworkManager.AccessPoint"
      string "OBJPATH=/org/freedesktop/NetworkManager/AccessPoint/2281"
      string "SENDER=:1.8"
   ]

Every 2-3 seconds I get a new burst of 30-ish signals.

The phone is so slowed down it takes about 15-20 seconds (and some patience) to be able to just swipe away the lockscreen.

I'll try not to reboot the device right now so that we can keep debugging it and hopefully get rid of this beast :)

so if there's anyone in the audience who'd like me to get some more useful logs, please shout! :)
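
For anyone who wants to put numbers on these bursts, here is a minimal sketch (not a tool used in this thread) that tallies bridged signals from saved dbus-monitor text, grouped by interface and signal name. It assumes the output has the same shape as the Upstart EventEmitted block quoted above; the function name count_bridged_signals is made up for illustration.

```python
import re
from collections import Counter

# One bridged event, copied from the dbus-monitor output quoted above.
SAMPLE = """signal sender=:1.0 -> dest=(null destination) serial=78512 path=/com/ubuntu/Upstart; interface=com.ubuntu.Upstart0_6; member=EventEmitted
   string "dbus"
   array [
      string "SIGNAL=PropertiesChanged"
      string "BUS=system"
      string "INTERFACE=org.freedesktop.NetworkManager.AccessPoint"
      string "OBJPATH=/org/freedesktop/NetworkManager/AccessPoint/2281"
      string "SENDER=:1.8"
   ]
"""


def count_bridged_signals(text):
    """Count events per (interface, signal) in dbus-monitor text.

    Hypothetical helper: it only understands the KEY=value strings
    that the Upstart bridge puts in its EventEmitted array.
    """
    counts = Counter()
    # Each event starts with a line beginning with "signal ".
    for block in re.split(r"(?m)^signal ", text):
        fields = dict(re.findall(r'string "([A-Z]+)=([^"]*)"', block))
        if "SIGNAL" in fields:
            key = (fields.get("INTERFACE", "?"), fields["SIGNAL"])
            counts[key] += 1
    return counts


if __name__ == "__main__":
    # Pretend the capture contained three identical events.
    for key, n in count_bridged_signals(SAMPLE * 3).items():
        print(n, key)
```

Running it over a capture taken during a freeze and one taken while the phone is idle would give two counts that can be compared directly.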

Andrea Bernabei (faenil) wrote :

syslog doesn't show anything interesting (at least not in the last 1k-ish lines)

Tony Espy (awe) wrote :

@Andrea

A few questions/requests....

1. Please attach your syslog to the bug

2. Are you connected to an access point when the problem occurs?

3. Have you made the phone write-able and installed any packages via apt-get and/or modified the filesystem in any way?

4. Can you recall any changes that occurred between your original report and the new level of problems as mentioned in comment #3?

The current version of NM in our images has a bug which prevents periodic scanning of WiFi access points, so I'm a bit wary to pin this on NM. Also, the DBus message you point out in comment #3 appears to be from Upstart, not NM.

Changed in network-manager (Ubuntu):
status: Confirmed → Incomplete
assignee: nobody → Andrea Bernabei (faenil)
Andrea Bernabei (faenil) wrote :
  • syslog (19.4 MiB, application/octet-stream)

1) syslog attached

2) Yes it thinks it is connected to an access point which doesn't actually exist in the office (it's only available at home). The rest of the access points detected in the network indicator are up to date though. To sum up: it thinks it's connected to an access point that doesn't actually exist where I am now, but the rest of the networks it shows are correct.

3) yes

4) I can't think of any change

Andrea Bernabei (faenil) wrote :

just a hint while reading syslog, I think the current problem started yesterday, not before lunch.

Andrea Bernabei (faenil) wrote :

after more investigation with awe on IRC, I can provide the following info:

here are the packages I have in the apt cache:
http://pastebin.ubuntu.com/12021300/
nothing that could cause this issue, as far as I can tell

More info about the problem:
"nmcli d" reports mobile data as CONNECTED, wifi as
wlan0 wifi connecting (getting IP configuration) Morino

Morino is an AP that doesn't exist in the place where I am right now, it is an access point which was created via Win8.1 at home (infrastructure mode) in an attempt to share the internet connection of a laptop running Win8.

So, it seems NM is stuck trying to connect to an access point which doesn't currently exist.

The network indicator shows the mobile data icon, yet the wifi network list shows "Morino" (the non-existing AP) as green, which means it's connected (or connecting as well, maybe?)

Andrea Bernabei (faenil) wrote :

after reboot, the problem seems to be gone (at least for the moment)

Andrea Bernabei (faenil) wrote :

"Morino" AP was created using instructions very similar to what's shown in
http://www.melbpc.org.au/pcupdate/3006/3006article8.htm

I'm still not sure the fact that Morino wasn't a "physical AP" makes a difference...but who knows :) better safe than sorry

Tony Espy (awe)
Changed in indicator-network (Ubuntu):
status: New → Incomplete
ingo (ingo-randolf) wrote :

hitting the same issue once in a while.
it happened again yesterday. i could record some traffic on dbus:

this pastebin shows the log with high performance impact.
at the end of the log things are getting normal again:
http://pastebin.com/v55umuBF

this pastebin in comparison shows the traffic after things settled and the phone was usable again.
http://pastebin.com/CrQ0Z3K2

both recordings are about 1 minute.

maybe this helps to track it down...

Tony Espy (awe) wrote :

@ingo

Thanks for the comment and DBus logs.

In the original bug report, NetworkManager's WiFi device was stuck connecting to an access point which no longer existed. Also as noted, the access point in question was a softAP configured on a machine running Windows 8.1. Rebooting solved his issue.

In order to make further progress on this bug, what we really need is a concrete set of steps to reproduce the issue.

Andrea Bernabei (faenil) wrote :

@awe I'm sorry if the two things are mixing up :(

I should probably file another bug for the connection loop...

When I filed this bug the softAP didn't exist yet, so that can't have been the cause.

We have two (related) problems here:
1) when you're in an area with many access points, many PropertiesChanged signals are fired and they slow down the device for 2-3 seconds

2) when NM gets stuck connecting to a network which doesn't exist, PropertiesChanged signals are fired in an endless loop (well, it pauses 2-3 secs every time before firing all signals again)

Tony Espy (awe) wrote :

@Andrea

As mentioned on IRC, we still need more information in order to determine what the real problem(s) are.

I did a little more investigation regarding the DBus events you're seeing on the session bus. By default, an upstart bridge is started on the session bus which forwards all signals from the system bus to the session bus. See /usr/share/upstart/sessions/upstart-dbus-system-bridge.conf for details of this job.

Problem 1:

 * We need some way to quantify "many access points". Does "many" mean more than five? ten? Again, without a reproducible scenario, we can't begin to narrow this down... Taking a look at your first pastebin in your description, I only see ~40 access points involved which doesn't seem to be all that many. I routinely see ~25 in my home office.

 * Do you have a location where this problem always occurs? Does the problem happen every time you're in this location? Does the condition persist until you leave this location? What happens if you reboot while at the location? What happens if you disable Wi-Fi while at this location?

 * I will attach a script which will dump some information out about NM's Wi-Fi device, including the number of access points that are currently visible to NM. The next time the problem manifests, can you please run the script and add the output to a comment?

 * This next step may break other things on your system, but it's worth a try... If you can reliably reproduce the problem, before entering the area with the excess of APs, please run the following command as the phablet user on the phone:

% stop upstart-dbus-system-bridge

 * When you're done testing, please remember to start the bridge ( replace "stop" with "start" ), or reboot the device.

 * Please note the times when the problem occurs and make sure to attach your syslog. This will allow me to look at what NM was doing at the time the problem happened. If you want, you can reduce the size of the log by using the following command:

$ grep "NetworkManager" syslog > nm.out

...and attach just the nm.out file.

 * Looking at your syslog from comment #6, I see a couple of notable things:

   - at some point, you appear to be connecting to a Wi-Fi WPA-Enterprise network ( eduroam ). Does this correlate to when you see the problem?

   - I also note that around 11:24 on Aug 6, your device appears to have rapidly roamed between a couple of the office APs in Bluefin. This could also be a potential source of the problem.

 * Finally, before doing any re-test, can you ensure you have the latest updates installed? We just released a new version of NM which fixes a problem with scanning. It's probably not related, but you never know...

Problem 2:

Please try to reproduce and file a separate bug for the issue where NM gets stuck trying to connect to the Windows8.1 AP.

Tony Espy (awe) wrote :
Andrea Bernabei (faenil) wrote :

I'll try to address some questions here, more answers will come in the next comment, sorry about that.

It seems to always happen at least at the office (BlueFin), where your script reports 26 APs.

Yes the problem disappears if I disable WiFi.

After reboot, the problem disappears, at least at the beginning, the number of PropertiesChanged I see in dbus-monitor is much lower. So it seems to be about something which gets worse with time. Even if I didn't move, I'm seeing 1-4 PropertiesChanged at a time instead of 30-ish.

I've been waiting for 10minutes but I still don't have the problem, after reboot.

I'll try disabling the dbus bridge as soon as I get the problem again, which is just a matter of time I guess.

The eduroam is most likely not related.

Yes I have the latest update installed, r94 rc-proposed, krillin.

Andrea Bernabei (faenil) wrote :

I went from one side of the office to the other to trigger AP roaming.

That didn't help reproduce the issue, though.

Now I usually get 1 PropertiesChanged for the AP I'm connected to, and from time to time I get 10-ish PropertiesChanged for other APs

Andrea Bernabei (faenil) wrote :

One detail I remember:
before reboot (i.e. last time I had this issue) WiFi was ON, but the phone wasn't connected to the WiFi network which it should autoconnect to, it was using mobile data.

Because of https://bugs.launchpad.net/ubuntu/+source/ubuntu-system-settings/+bug/1480864
I can't disconnect from a WiFi network without switching the WiFi off.

Tomorrow I'll try deleting the office WiFi from the known networks (so that I can keep WiFi enabled without being necessarily connected to an AP) and see if that helps reproduce the bug.

Andrea Bernabei (faenil) wrote :

I have the bug again.
I'm on rc-proposed r102, krillin, using network manager 0.9.10.0-4ubuntu15.1.7

Some stats:
Every 10-15 seconds I get a burst of 50 PropertiesChanged which lock my phone for about 4 seconds.

So, every 15 seconds, 4 seconds of frozen UI.

"nmcli d" reports WiFi as connected to the office network (as it should be).

this is the "grep NetworkManager /var/log/syslog" since I got to the office (I don't remember having this issue at home this morning)
http://pastebin.ubuntu.com/12123687/

this is the output of "dbus-monitor --system" in one of the 4 seconds freezes.
http://pastebin.ubuntu.com/12123712/

the output of the wifi script is
http://pastebin.ubuntu.com/12123727/

After getting those logs I disabled WiFi and the freezes disappeared (and the flooding in dbus-monitor with them)

Then I reenabled WiFi, it took something like 1 minute before the phone stopped being laggy like hell (while the wifi list was populating in the indicator panel)....AND THE PROBLEM IS BACK :)

Andrea Bernabei (faenil) wrote :

PS even after reenabling wifi and finding the office network, the phone did *not* reconnect to it, but that's probably another bug :)

Tony Espy (awe) wrote :

@Andrea

Thanks for the updates.

So our theory that this is related to a large number of APs doesn't pan out, as you're able to reproduce this at the office with only 25 access points involved ( per the output of my script ).

Also, I'm pretty sure the problem is related to WiFi roaming. It appears when you first enable WiFi at Bluefin, goes away when WiFi is disabled, and re-appears when WiFi is re-enabled while at Bluefin.

Looking at your syslog output, I see your phone associate with the office network around 10:40:05, however I also see a suspicious message right before the connection activation succeeds:

Aug 19 10:40:05 ubuntu-phablet NetworkManager[1417]: <info> (wlan0): roamed from BSSID 18:33:9D:F8:AA:B0 (Canonical-2.4GHz-g) to (none) ((none))

This doesn't appear to affect your connection though, as there's no subsequent disconnect, and I also see multiple DHCP renewals on wlan0 @ 10:48, 10:56, 11:05, and 11:14.

As for dbus-monitor output, we're really more concerned with the traffic on the session bus. Can you please try disabling the upstart bridge as mentioned in comment #14?

> * This next step may break other things on your system, but it's worth a try... If you can reliably reproduce the problem,
> before entering the area with the excess of APs, please run the following command as the phablet user on the phone:
>
> % stop upstart-dbus-system-bridge
>
> * When you're done testing, please remember to start the bridge ( replace "stop" with "start" ), or reboot the device.

Please note whether or not this makes the stuttering go away.

Can you also try disabling indicator-network ( as phablet, just run 'stop indicator-network' ) and re-test too?

Finally, do you have another device ( ie. an arale or mako ) you could compare results against?

I'm actually working on a related bug for BQ, and may have another test package for you to try, but I'd like to see whether disabling the bridge and/or indicator-network has any effect on the issue first.

Andrea Bernabei (faenil) wrote :

More details on the situation in the morning:

when I got to the office I waited more than 1h for it to autoconnect to WiFi, but it didn't.
The WiFi connection at 10:40 was forced by me, I went to settings -> Wifi, and tapped on the office network.

About the bridge, can I disable that after I have the bug? I don't have a way to reproduce this bug on demand, so the only way to make sure it's there is to see the stuttering. The alternative is to disable the bridge and then check the number of PropertiesChanged on the system bus, to know if the bug is being triggered ...

About the session bus, the output is basically a copy of what's in the system bus, as previous logs show.

Andrea Bernabei (faenil) wrote :

New info about some debugging with "awe":

rc-proposed, r102, krillin, nm 0.9.10.0-4ubuntu15.1.7

----I CLOSED ALL APPS, NOTHING IN THE TASK SWITCHER----

After getting home I still have the bug, I enabled and disabled WiFi a couple of times, the problem comes back every time WiFi is enabled.

The frequency, however, is different: I get 3-4secs freezes every 2 mins, which awe says is likely the WiFi scanning interval.

AFTER DISABLING THE DBUS SESSION BRIDGE --> no change

AFTER DISABLING THE NETWORK INDICATOR --> no change

there's still spamming in the system dbus, and the same 3-4secs freeze (maybe a bit less than with bridge and network indicator enabled though)

Tony Espy (awe) wrote :

@Andrea

Thanks for the updates. Your stopping of the bridge and indicator are useful clues.

Again, we still have no proof that there's spamming of the bus yet. Is there a lot of traffic? Yes. But as pointed out before, the bus is meant to handle lots of traffic. I'm not claiming the bug isn't related to bus traffic, it's just that we haven't managed to pinpoint a smoking gun yet.

It'll be interesting to see if you get the same set of JobAdded/Removed messages on your bus when enabling/disabling WiFi after you re-flash.

Andrea Bernabei (faenil) wrote :

@Tony

hey :)

Yeah, I reflashed yesterday (without formatting the user partition).

I'll keep you posted, I think the bug is still there, as I'm seeing 0.5-1secs freezes and bursts of PropertiesChanged signals on the bus (but not as many as before reflashing).

It usually gets worse over time, so let's see how it will behave tomorrow or in a couple of days.

I'm quite sure the problem is still there, just a matter of waiting for it to get annoying enough.

Andrea Bernabei (faenil) wrote :

At the moment, for instance,

the frequency is 1sec UI freeze (with the usual PropertiesChanged burst in dbus monitor) every 1 minute (which I guess is the current WiFi scanning interval).

Andrea Bernabei (faenil) wrote :

I forgot one detail, the phone is currently not connected to WiFi (it sees office network, but it doesn't auto connect to it). WiFi is enabled, of course.

Andrea Bernabei (faenil) wrote :

I'm quite sure I had the same bug this weekend, the phone was really unusable, kept freezing every few seconds, I used the terminal app to see that it was the usual burst of PropertiesChanged.

I don't have stats and logs as I was out, and the phone died before I got home.
I'll have to wait a couple of days more for the phone to become annoyingly slow again.

Pat McGowan (pat-mcgowan) wrote :

I hit this issue while traveling in the states and on arrival to Bluefin. The issue went away as soon as I associated to the AP in the office. I have tried forgetting the network and reproducing it and so far I cannot.

Changed in canonical-devices-system-image:
assignee: nobody → John McAleely (john.mcaleely)
importance: Undecided → High
milestone: none → ww40-2015
status: New → Confirmed
tags: added: connectivity
Michael Terry (mterry) wrote :

You folks mention the session bus. But when I've seen this bug (like my notes from bug 1480844), it's been the system dbus-daemon that takes 100% CPU.

I attached gdb to it once when this was happening. We seemed to be spending a chunk of time in get_recipients_from_list(), which iterates over every system DBus connection and sees if it is listening for a given signal, in order to pass it on if so. I don't *know* that the actual majority of time was in that function yet, I'm still digging there.

Michael Terry (mterry) wrote :

So I added some debugging output to dbus. When it peaked, it spent over a third of the CPU entirely in get_recipients_from_list(). (This is dbus checking if an incoming signal matches any of the rules added by org.freedesktop.DBus.AddMatch -- i.e. signal subscriptions. It does this for every rule, for every connection.)

So we can reduce the time spent there by being more efficient, by reducing the number of subscriptions, or reducing the number of signals.

Looking at which rules it's checking, the majority are subscriptions on NetworkManager. One per access point, one per device, from multiple connections. Could connectivity-cpp be a bit too subscription-greedy?

I'm not sure that get_recipients_from_list is the only (or worst) culprit here. Just something I noticed.
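
To make the cost described above concrete, here is a rough model (not dbus-daemon's actual code) of a dispatcher that checks every rule of every connection for every incoming signal. The MatchRule class, the dispatch and simulate helpers, and the client and AP counts are all made up for illustration; only the interface name and object path shape come from this thread, and the model covers the per-access-point subscriptions only.

```python
from dataclasses import dataclass
from typing import Optional

AP_IFACE = "org.freedesktop.NetworkManager.AccessPoint"
AP_PATH = "/org/freedesktop/NetworkManager/AccessPoint/{}"


@dataclass(frozen=True)
class MatchRule:
    """A simplified AddMatch rule; None means 'match anything'."""
    interface: Optional[str] = None
    member: Optional[str] = None
    path: Optional[str] = None

    def matches(self, signal: dict) -> bool:
        return all(
            want is None or want == signal[key]
            for key, want in (
                ("interface", self.interface),
                ("member", self.member),
                ("path", self.path),
            )
        )


def dispatch(signals, connections):
    """Check every rule of every connection for every signal.

    Returns (deliveries, rule_checks).
    """
    deliveries = 0
    rule_checks = 0
    for signal in signals:
        for rules in connections.values():
            hits = 0
            for rule in rules:
                rule_checks += 1
                hits += rule.matches(signal)
            if hits:
                deliveries += 1  # one copy per interested connection
    return deliveries, rule_checks


def simulate(num_aps=50, num_clients=3):
    # Each hypothetical client subscribes once per access point.
    rules = [MatchRule(AP_IFACE, "PropertiesChanged", AP_PATH.format(i))
             for i in range(num_aps)]
    connections = {f":1.{n}": list(rules) for n in range(num_clients)}
    # One scan: every access point emits one PropertiesChanged signal.
    burst = [{"interface": AP_IFACE, "member": "PropertiesChanged",
              "path": AP_PATH.format(i)} for i in range(num_aps)]
    return dispatch(burst, connections)


if __name__ == "__main__":
    for aps in (50, 100):
        print(aps, simulate(num_aps=aps))
```

In this model the number of rule checks is signals x connections x rules per connection, so with one subscription and one signal per AP, doubling the AP count quadruples the work. That is consistent with the three options listed above: cheaper matching, fewer subscriptions, or fewer signals.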

Jonas G. Drange (jonas-drange) wrote :

Hotspot management code that subscribes to NM events [1] was recently moved from System Settings into indicator-network. The code is quite comprehensive whenever you have previously created a hotspot. Those affected by the bug should 1) see if they have previously created a hotspot and 2) see if by removing the hotspot, the issue is resolved (sudo rm /etc/NetworkManager/system-connections/<hotspot name> && sudo reboot).

This wasn't an issue when the code was bound to System Settings' life cycle, but it may be an issue for the indicator.

[1] http://bazaar.launchpad.net/~indicator-applet-developers/indicator-network/trunk.15.10/view/head:/src/indicator/nmofono/hotspot-manager.cpp#L690 (This code is in vivid+overlay too).

Andrea Bernabei (faenil) wrote :

I have never created a WiFi hotspot yet

Jonas G. Drange (jonas-drange) wrote :

Right, after deleting my hotspot I still saw lockups while moving through London.

Michael Terry (mterry) wrote :

I've been seeing this for a while, before I even updated to a hotspot-capable image.

Tony Espy (awe) wrote :

I installed bustle on my krillin this afternoon to try and glean some more information from the system DBus. A few observations...

1. The top signal on the system bus is:

org.freedesktop.NetworkManager.AccessPoint.PropertiesChanged

2. Most of these seem to be for a specific AP ( instance 0 ), which I'm pretty sure is the current connected AP. The associated property is 'Strength'. This makes sense, as NM constantly reports the varying signal strength of the AP. These are happening all the time, and I typically see them in bursts of two or three signals.

3. As scans typically occur anywhere from 20s to 2m, when they do occur, we see a flurry of AccessPoint.PropertiesChanged signals representing the changing signal strengths of each access point. There also are AccessPoint.PropertiesChanged signals for the 'LastSeen' property. I'm pretty sure this property was added specifically for the LocationService. Note, ideally this signal would be bundled with the associated 'Strength' signal, however due to the way NM's underlying object/property system was architected, this doesn't seem to be an easy change at first glance. Note, these LastSeen signals are consumed by the LocationService's 'connectivity API' ( which I'm pretty sure uses dbus-cpp ).

So... we still need to quantify how many APs trigger the problem, and if it's being triggered whenever scans occur. In comment #29, Pat mentioned that it happened when he initially arrived in Bluefin, but went away once he connected to an AP. As background scanning had recently been restored in NM, I would've assumed he would continue to see the issue each time a scan occurred, even when connected.

It also would be interesting to see whether disabling location service has any effect on the problem ( if we can reproduce reliably of course ).

Perhaps I can dummy up wpa_supplicant to push an artificial number of APs to NM and see if I can trigger the problem locally.

Tony Espy (awe) wrote :

One other interesting tidbit. On my krillin, on average there are roughly ~200 processes running. Looks like almost half (87) of them are connected to the system bus. You can use the following command to see how many sockets exist for any dbus processes, and then you need to figure out which one is the system bus daemon ( usually the one with the lower PID ):

sudo netstat -nap | grep dbus | grep CONNECTED | awk '{print $8}' | sort | uniq -c

Note, this compares to 55 processes on my desktop connected to the system bus.
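
If the netstat output has been saved to a file, the same tally can be done offline. The sketch below mirrors the grep/awk/sort/uniq pipeline above in Python; the function name count_dbus_sockets and the sample lines (PIDs, inode numbers, socket paths) are made up for illustration, and it assumes the "PID/Program name" column is the 8th whitespace-separated field, as the awk '{print $8}' step does.

```python
from collections import Counter

# Made-up lines in the shape of `netstat -nap` unix-socket output.
SAMPLE = """\
unix  3      [ ]         STREAM     CONNECTED     10211    812/dbus-daemon     /var/run/dbus/system_bus_socket
unix  3      [ ]         STREAM     CONNECTED     10388    812/dbus-daemon     /var/run/dbus/system_bus_socket
unix  3      [ ]         STREAM     CONNECTED     15502    2001/dbus-daemon    @/tmp/dbus-AbCdEf
unix  2      [ ACC ]     STREAM     LISTENING     9001     812/dbus-daemon     /var/run/dbus/system_bus_socket
"""


def count_dbus_sockets(netstat_output):
    """Count CONNECTED sockets per 'PID/program' for dbus processes."""
    counts = Counter()
    for line in netstat_output.splitlines():
        # Same filters as `grep dbus | grep CONNECTED`.
        if "dbus" in line and "CONNECTED" in line:
            fields = line.split()
            if len(fields) >= 8:
                counts[fields[7]] += 1  # awk's $8 is index 7 here
    return counts


if __name__ == "__main__":
    for owner, n in sorted(count_dbus_sockets(SAMPLE).items()):
        print(n, owner)
```

As with the shell version, the result has one entry per dbus-daemon process, and it is still up to the reader to work out which PID belongs to the system bus.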

Tony Espy (awe)
Changed in network-manager (Ubuntu):
assignee: Andrea Bernabei (faenil) → Tony Espy (awe)
Tony Espy (awe) wrote :

I managed to hack up wpa_supplicant such that it artificially adds a hard-coded set of non-existent APs every scan. It's not a pretty patch, but it served the purpose of ramping up the DBus traffic generated by NM significantly on each scan.

See: https://code.launchpad.net/~awe/wpasupplicant/add-fake-aps

I was able to get the UI ( the app switcher animation more specifically ) to freeze up for anywhere from 4-6s whenever a Wi-Fi scan occurred. I currently have the patched wpa_supplicant adding 125 APs to the list, which results in a total of 175 APs in the scan results seen in my loft.

I also looked a bit more closely at the touch-specific change which introduced NM's AccessPoint 'LastSeen' property. This was made in response to a private bug that was reported during the development cycle of the HERE code for the phone. The original complaint was about access points becoming stale in the scan results, resulting in skewed locations. wpa_supplicant's AccessPoint object exposes an 'age' property, but it's not exposed by NetworkManager. So a 'LastSeen' property was added to NM's AccessPoint object, and it gets updated by bss_updated_cb() in nm-device-wifi.c. It looks like for every scan_result, bss_updated signal is being generated for every AccessPoint in the scan results ( note, NM doesn't even look at which property changed... so it's hard to tell which property is triggering this on every scan; most likely it's Strength which is randomly assigned in my hack ).

The original HERE bug stated that it queries the available networks, so we'll need to check if it's truly reliant on these signals or not. It might be possible that addition of the property to query results is enough.

In the meanwhile, I've built a test version of network-manager with the property-changed notifications for 'LastSeen' removed. It can be found in my PPA: https://launchpad.net/~awe/+archive/ubuntu/ppa

A quick test shows that this change makes the UI stuttering much less severe ( maybe <= 1s stutter ).

Note, NM is basically propagating the same set of signals as wpa_supplicant sends. In theory, this shouldn't be enough to saturate the bus. If it is, then we're really pushing the limits of the system.

One other theory is that we have applications and/or subsystems that are using overly broad DBus match rules. For more details on this, please see:

https://www.collabora.com/about-us/blog/2014/10/06/improving-the-security-of-d-bus/

It *might* be possible to run dbus-1.10 from wily, which includes some debug tools which would allow us to more easily examine the match rules currently in use on our images.

Tony Espy (awe) wrote :

A bit more info...

After a deeper look at some DBus traces on a vanilla krillin OTA6 image, I noticed a few more things.

After a ScanDone signal gets sent by NM, you typically see 3-5 PropertiesChanged signals for new APs that include all of the properties. Then, you see a PropertiesChanged from Device.Wireless which includes an array of the current APs. Then you usually see a PropertiesChanged signal for every AP except current ( AP=0 ), with the property 'LastSeen'. These signals all typically have the same value for last-seen, although if the list is long, there might be variation of maybe +1-2 across all the signals.

What's unusual, is I'm then seeing what looks like a second set of PropertiesChanged signals for all of the current APs.

Digging deeper in the trace, it turns out wpa_supplicant actually generates two PropertiesChanged signals every time an object changes. It first uses the new ( more correct ) 'org.freedesktop.DBus.Properties' interface, then sends the signal *again* using the deprecated 'fi.w1.wpa_supplicant1.Interface'! I think this is what in turn triggers the second signal for each AP from NM. I haven't proven this 100% yet, but it certainly explains the behavior of NM.

I'll continue to test.

I also had a discussion with one of our location service engineers, and we think that it might be possible to get rid of the 'LastSeen' PropertiesChanged signals altogether and just use the Device.Wireless 'AccessPoints' property, which gets updated after every scan completes.
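
As a back-of-the-envelope check on what dropping 'LastSeen' would buy, the sketch below adds up the signals described in this comment for a single scan. It is only a reading of the observations above, not a measurement: the function name, the default of 4 new APs (the middle of the observed 3-5), and the reuse of the 175-AP figure from the patched wpa_supplicant test are all assumptions.

```python
def signals_per_scan(num_aps, new_aps=4, last_seen=True, duplicated=True):
    """Estimate PropertiesChanged signals NM emits after one scan."""
    total = new_aps            # full-property signals for newly seen APs
    total += 1                 # Device.Wireless update carrying the AP list
    if last_seen:
        total += num_aps - 1   # LastSeen for every AP except the current one
    if duplicated:
        total += num_aps       # apparent second set, one per AP
    return total


if __name__ == "__main__":
    aps = 175  # scan-result size from the fake-AP test in the earlier comment
    print("as observed:     ", signals_per_scan(aps))
    print("without LastSeen:", signals_per_scan(aps, last_seen=False))
    print("without either:  ", signals_per_scan(aps, last_seen=False,
                                                duplicated=False))
```

Under these assumptions the per-AP terms dominate, so removing either the LastSeen notifications or the duplicated set roughly halves the burst, and removing both leaves only a handful of signals per scan.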

Tony Espy (awe)
Changed in location-service (Ubuntu):
status: New → Incomplete
Changed in network-manager (Ubuntu):
status: Incomplete → In Progress
Tony Espy (awe)
no longer affects: indicator-network (Ubuntu RTM)
Changed in network-manager (Ubuntu RTM):
status: New → In Progress
assignee: nobody → Tony Espy (awe)
Changed in network-manager (Ubuntu):
importance: Undecided → High
Changed in network-manager (Ubuntu RTM):
importance: Undecided → High
tags: added: hotfix
Changed in canonical-devices-system-image:
status: Confirmed → In Progress
Tony Espy (awe)
Changed in network-manager (Ubuntu RTM):
status: In Progress → Fix Committed
Changed in network-manager (Ubuntu):
status: In Progress → Fix Released
Changed in canonical-devices-system-image:
status: In Progress → Fix Committed
Changed in network-manager (Ubuntu RTM):
status: Fix Committed → Fix Released
no longer affects: location-service (Ubuntu)
Changed in canonical-devices-system-image:
status: Fix Committed → Fix Released
Changed in canonical-devices-system-image:
milestone: ww40-2015 → ww46-2015
status: Fix Released → Confirmed
Changed in network-manager (Ubuntu):
status: Fix Released → Confirmed
Tony Espy (awe)
Changed in network-manager (Ubuntu RTM):
status: Fix Released → Incomplete
Changed in network-manager (Ubuntu):
status: Confirmed → Incomplete
Tony Espy (awe)
Changed in location-service (Ubuntu RTM):
assignee: nobody → Scott Sweeny (ssweeny)
importance: Undecided → High
Changed in canonical-devices-system-image:
milestone: ww46-2015 → ww02-2016
Tony Espy (awe)
no longer affects: buteo-syncfw (Ubuntu)
Changed in location-service (Ubuntu RTM):
status: New → Triaged
Changed in qtbase-opensource-src (Ubuntu):
status: New → Confirmed
affects: unity8 (Ubuntu RTM) → qtbase-opensource-src (Ubuntu)
Changed in qtbase-opensource-src (Ubuntu):
status: New → Confirmed
tags: added: patch
Changed in location-service (Ubuntu RTM):
assignee: Scott Sweeny (ssweeny) → Thomas Voß (thomas-voss)
status: Triaged → In Progress
Changed in dbus-cpp (Ubuntu):
assignee: nobody → Thomas Voß (thomas-voss)
importance: Undecided → High
status: New → In Progress
Tony Espy (awe)
Changed in qtbase-opensource-src (Ubuntu):
assignee: nobody → Tony Espy (awe)
importance: Undecided → High
status: Confirmed → In Progress
Thomas Voß (thomas-voss) wrote : Re: [Bug 1480877] Re: Access points' "PropertiesChanged" dbus signals freeze UI on mobile devices

Quick update on silo 26: I'm rebuilding location-service to account
for a recent landing.
The one blocking issue is mediascanner2, which requires a rebuild,
too. It is however enabled for dual-landing
to both vivid and xenial, and we cannot easily do a vivid+o landing
with it right now. I'm working on resolving that issue,
but it will likely take a day or two to complete.

On Mon, Nov 23, 2015 at 3:49 PM, Michael Terry
<email address hidden> wrote:
> I tested silo 26 + libqt5network5 from Tony's PPA and did my normal
> routine of walking out of range of my home network.
>
> It seems much better. dbus-daemon CPU usage is down (only ~30% for a
> few seconds after initial switch to 3G and ~15% during scans after).
>
> But more importantly, stuttering is much better. Only a couple tiny
> stutters on initial switch. And none during scans. I'm +100 on trying
> to land this.
>
> (My test was on a relatively-recently-rebooted phone, so maybe not the
> ideal test. I'll try again later after a while of uptime to double
> confirm the results, but this looks good. If I don't post again, assume
> it worked great.)

Revision history for this message
Thomas Voß (thomas-voss) wrote :

Silo 26 is now complete.


Revision history for this message
Tony Espy (awe) wrote :

@Lorn

Thanks for the updated patch.

The only significant difference besides the DeviceModem destructor fix was the change you mentioned about default routes. Is there a touch-specific bug that this fixes? On our system, the modem connection will never be IPv6, whereas it is possible to get an IPv6 connection for WiFi. Perhaps we keep this separate?

Revision history for this message
Lorn Potter (lorn-potter) wrote :

@Tony
I noticed this on touch.
The problem was that the ConnectionActive was returning true for mobile data ipv6 default route when wifi had the actual default route, so it would never get updated when wifi became default.

Revision history for this message
Tony Espy (awe) wrote :

@Lorn

Did you see this recently? If so, what device and image?

As I mentioned, the mobile connection on touch doesn't support IPv6 connections. There was a bug in NM that falsely reported that a default IPv6 route was available, but this was fixed a while back. Please see bug #1444162 for details. Perhaps this is what you were seeing?

Revision history for this message
Michael Terry (mterry) wrote :

Here's a followup on living with silo 26 (pre-comment 97, so an earlier version, but I doubt it matters for testing?) and Tony's libqt5network5.

I tested this morning (so a full day of the phone living on this code) and it was slightly worse -- small stutters instead of tiny ones. But still a vast improvement on what it was before these changes. I assume the degradation was due to other actors on the phone leaking signal watches. Or just random chance.

But I still love the changes.

Revision history for this message
Tony Espy (awe) wrote :

@Mike

Small stutters when transitioning from mobile to WiFi or vice versa, or stutters when scanning? The fixes to location-services and Qt both apply to the latter case only...

Revision history for this message
Michael Terry (mterry) wrote :

@Tony, small stutters when transitioning from wifi to mobile. I don't see stutters when scanning anymore. And the stutters I do see when transitioning from wifi to mobile seem better (presumably because fewer dbus signal watchers are registered now for all the traffic that happens on a switch?).

Revision history for this message
Tony Espy (awe) wrote :

Here's my updated patch. The only difference is that it includes a DBusConnection disconnect() call in the destructor for QNetworkManagerInterfaceDeviceModem(), which I'd punted on for expediency while testing earlier this week.

It doesn't include the IPv6 related fix, as this was fixed in NM already.

Our plan is to get this into silo-026 ASAP for testing as a potential hotfix along with the location-services match rules fixes.

Revision history for this message
Lorn Potter (lorn-potter) wrote :

I changed my phone's image channel, tested that new patch (with a few specific test apps I have for QNAM & friends) and didn't see anything too wrong.

Although I am seeing heaps of GetAll calls that do not need to happen (even for non nm managed interfaces - rmnet). Attached patch simplifies/optimizes more. Reduces the number of blocking calls from about 50 to ~4 when default route changes.

Revision history for this message
Tony Espy (awe) wrote :

@Lorn

Can you add some details regarding what image channel you actually used and device for testing? rc-proposed is probably the best base for you to use.

The "GetAll" behavior you're describing is similar to what I was seeing with the AccessPoint objects. The code would see a "PropertiesChanged" signal for an access point, and directly issue a blocking "GetAll" call. This should only be done at initial object creation. Once initialized, the properties can be monitored directly via "PropertiesChanged" signals and the included payload. I imagine this is probably what you've done with your patch.
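
To make the pattern above concrete, here is a minimal sketch in Python. It is not the Qt bearer plugin code; `AccessPointCache` and `fake_get_all` are hypothetical names, and `fake_get_all` merely stands in for a blocking "GetAll" call. The point is only that the full fetch happens once, and later "PropertiesChanged" payloads are merged into the cache.

```python
class AccessPointCache:
    """Caches one object's properties; fetches everything only once."""

    def __init__(self, get_all):
        # get_all is a hypothetical stand-in for a blocking GetAll call,
        # issued only at object creation.
        self._props = dict(get_all())
        self.get_all_calls = 1

    def on_properties_changed(self, changed):
        # The signal payload already carries the new values, so merge
        # them instead of issuing another blocking round trip.
        self._props.update(changed)

    def get(self, name):
        return self._props.get(name)


def fake_get_all():
    # Invented values, for illustration only.
    return {"Ssid": "example", "Strength": 40}


cache = AccessPointCache(fake_get_all)
cache.on_properties_changed({"Strength": 55})
cache.on_properties_changed({"Strength": 61})
print(cache.get("Strength"), cache.get_all_calls)
```

With this shape, a burst of signals after a scan costs a dictionary update per signal rather than a blocking call per signal.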

I'll review your updates, and confer with some others to see if we defer your additional changes to OTA9, or try to get them in as part of a proposed hot-fix.

Thanks again for your help!

Revision history for this message
Timo Jyrinki (timo-jyrinki) wrote :

The version 4 of the patch (from comment #104) is in https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/landing-026 at the moment.

Revision history for this message
Lorn Potter (lorn-potter) wrote :

@Tony, currently I am on stable/ubuntu-developer. Previously I was testing on stable/ubuntu. I was on rc-proposed for quite a while, but it seemed to have only a few scopes, and not the twitter scope which I wanted to try out.

My updated patch removes actions on device added and removed calls. Most of these were rmnet related (which appears to not be managed by nm anyway), and relies on active connection and settings changes.
I haven't tested it in regards to hotspot or bluetooth tethering.

Revision history for this message
Tony Espy (awe) wrote :

@Lorn

Thanks for the update.

I usually do all my testing on rc-proposed, as it's latest and greatest. I wasn't aware of the "ubuntu-developer" channel. I've tested on rc-proposed/bq-aquaris.en ( #188 ) with the full set of packages from silo-026, and everything seemed good to me.

That said, I've reviewed your latest changes and they look good to me. Not creating QNetworkManagerInterfaceDevice and DeviceWireless instances definitely gets rid of a bunch more DBus signal match rules, and unneeded DBus calls. DeviceWireless still generates PropertiesChanged signals every time a scan finishes...

With this code removed, the plugin operates solely on active and system connection updates.

Regarding the device adds for 'rmnet' devices on mako, in theory these should've only been happening at start-up time and/or whenever flight-mode is turned off.

Anyways, I'm +1 for including your latest revision of the patch. Thanks again for your help!

Revision history for this message
Tony Espy (awe) wrote :

Note, I've also pushed a new version based on your latest patch to my PPA if anyone's interested in testing before it lands in silo-026:

https://launchpad.net/~awe/+archive/ubuntu/ppa

Revision history for this message
Timo Jyrinki (timo-jyrinki) wrote :

Ver 5 in 026 now too (still building), together with the gles counterpart package.

Revision history for this message
Lorn Potter (lorn-potter) wrote :

@Tony I suppose that if statement in QNetworkManagerEngine::requestUpdate() could also be removed, leaving the last line only, since we do not do anything with unknown APs

Revision history for this message
Timo Jyrinki (timo-jyrinki) wrote :

Not related to the bug at hand directly, but applying the signals5.patch on top of Qt 5.5.1 caused a crash to always happen on Unity8 startup:
/usr/bin/unity8:11:/usr/lib/arm-linux-gnueabihf/qt5/plugins/bearer/libqnmbearer.so+7c7c:/usr/lib/arm-linux-gnueabihf/libQt5Core.so.5.5.1+1e4f5a:[stack..3416]+7fd8b8

So my suggestion is to ship whatever is needed for 5.4 quickly but have the longer term goal of shipping connectivity-api bearer so that the 5.5/xenial will wait for that (unless someone wants to look at the 5.5 specific problem with this current patch).

Revision history for this message
Tony Espy (awe) wrote :

@Timo

As previously mentioned, I'd like to get this change out as a hot-fix, and then do proper evaluation of the new connectivity API based plugin with the goal to land it as part of OTA9.

@Lorn

I'll take a look at the line you mentioned, however we need to draw the line at some point and re-focus on the replacement plugin as mentioned above.

Changed in canonical-devices-system-image:
milestone: ww02-2016 → ww50-2015
Revision history for this message
Tony Espy (awe) wrote :

The latest version of the patch is now in silo-026, included as part of a new qtbase version 5.4.1+dfsg-2ubuntu11~vivid1.

Changed in qtbase-opensource-src (Ubuntu):
status: In Progress → Fix Committed
Changed in sync-monitor (Ubuntu RTM):
status: New → Incomplete
Revision history for this message
Tony Espy (awe) wrote :

The fix has landed in silo-026, as part of location-services version: 2.1+15.04.20151202.1-0ubuntu1

Changed in location-service (Ubuntu RTM):
status: In Progress → Fix Committed
no longer affects: sync-monitor (Ubuntu RTM)
no longer affects: maliit-framework (Ubuntu RTM)
no longer affects: indicator-network (Ubuntu)
no longer affects: buteo-syncfw (Ubuntu RTM)
Changed in dbus-cpp (Ubuntu RTM):
status: New → Fix Committed
Revision history for this message
Tony Espy (awe) wrote :

The associated version for the dbus-cpp in silo-026 is: 4.3.0+15.04.20151126-0ubuntu1

Revision history for this message
Łukasz Zemczak (sil2100) wrote :

This bug was fixed in the package location-service 2.1+15.04.20151202.1-0ubuntu1 in https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/stable-phone-overlay

---------------

location-service (2.1+15.04.20151202.1-0ubuntu1) vivid; urgency=medium

  * Ensure that event connections are cleaned up on destruction. (LP:
    #1480877)

 -- Thomas Voß <email address hidden> Wed, 02 Dec 2015 12:12:21 +0000

Changed in location-service (Ubuntu RTM):
status: Fix Committed → Fix Released
Revision history for this message
Łukasz Zemczak (sil2100) wrote :

This bug was fixed in the package dbus-cpp 4.3.0+15.04.20151126-0ubuntu1 in https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/stable-phone-overlay

---------------

dbus-cpp (4.3.0+15.04.20151126-0ubuntu1) vivid; urgency=medium

  [ CI Train Bot ]
  * New rebuild forced.

  [ Thomas Voß ]
  * Ensure that Signal with non-void argument types correctly narrow
    their match rules. (LP: #1480877)

 -- Thomas Voß <email address hidden> Thu, 26 Nov 2015 07:31:37 +0000
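
As a rough illustration of why narrower match rules help, the sketch below compares a broad rule with a narrow one. It is written in Python and is not the dbus-cpp change itself; `build_match_rule` and `matches` are hypothetical helpers, the thirty messages are invented, and the matching logic is a simplification of what the bus daemon does.

```python
def build_match_rule(**keys):
    """Format a D-Bus match rule such as type='signal',member='X'."""
    return ",".join(f"{k}='{v}'" for k, v in keys.items())


def matches(rule_keys, message):
    # Simplified: a message matches when every key named in the rule
    # has the same value in the message.
    return all(message.get(k) == v for k, v in rule_keys.items())


broad = {"type": "signal",
         "interface": "org.freedesktop.NetworkManager.AccessPoint"}
narrow = dict(broad, member="PropertiesChanged",
              path="/org/freedesktop/NetworkManager/AccessPoint/1")

# Invented burst: one PropertiesChanged signal from each of 30 access points.
messages = [
    {"type": "signal", "interface": broad["interface"],
     "member": "PropertiesChanged",
     "path": f"/org/freedesktop/NetworkManager/AccessPoint/{n}"}
    for n in range(1, 31)
]

broad_hits = sum(matches(broad, m) for m in messages)
narrow_hits = sum(matches(narrow, m) for m in messages)
print(build_match_rule(**narrow))
print(broad_hits, narrow_hits)
```

In this toy burst the broad rule receives all 30 signals while the narrow rule receives one, which is the kind of reduction in delivered traffic that a narrowed rule is meant to give each client.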

Changed in dbus-cpp (Ubuntu RTM):
status: Fix Committed → Fix Released
Changed in canonical-devices-system-image:
status: Confirmed → Fix Committed
Revision history for this message
Tony Espy (awe) wrote :

Just a quick update to mention that the change to the NM bearer plugin seems to have caused a regression when qtbase is installed from the PPA onto a desktop ( for UI development work ).

To reproduce, the overlay PPA needs to be added as a software source, then ubuntu-system-settings installed/upgraded ( which pulls in qtbase ). See bug #1523975 for details.

Revision history for this message
Selene ToyKeeper (toykeeper) wrote :

Is it possible that this fix might cause a phone to wake up more often, or to stay awake longer? I'm seeing weird idle power behavior ever since this fix landed in krillin rc-proposed 196.

Before: ~9 mA
http://people.canonical.com/~platform-qa/power-results/2015-12-08_03:19:30-krillin-195-power_usage_idle/graph.png

After: ~35 mA
http://people.canonical.com/~platform-qa/power-results/2015-12-08_07:36:16-krillin-196-power_usage_idle/graph.png

Here's the commit log for 196:

http://people.canonical.com/~lzemczak/landing-team/ubuntu-touch/rc-proposed/196.commitlog

Arale results look fine. I don't know why krillin is different, but I'm only seeing this behavior on krillin when wifi is connected... and this bug fix is a likely suspect. I don't see additional forks or crashes in the test logs, which suggests the issue is probably caused by a long-running process.

Revision history for this message
Selene ToyKeeper (toykeeper) wrote :

I created bug 1524133 to track the issues I'm seeing. I don't know if it's related to this bug, but so far it looks probable.

Revision history for this message
Lorn Potter (lorn-potter) wrote :

patch #6 fixes crash in system-settings (#1523975) and removes wifi scanning updates, which could potentially drain the battery (#1524133)

Revision history for this message
Thomas Voß (thomas-voss) wrote :

Thanks for the graphs. One thing that stands out is the time that the
system stays active when occasionally waking up from deep sleep (right
hand side of the graph).
Do you have the raw data producing those graphs handy?

Thanks,

  Thomas


Revision history for this message
Timo Jyrinki (timo-jyrinki) wrote :

qtbase 5.4.1+dfsg-2ubuntu11~vivid2 is ready for testing at https://requests.ci-train.ubuntu.com/#/ticket/761 with Lorn's net-bearer-nm-disconnect-ap-signals6.patch from bug #1480877 that includes a fix for both u-s-s crasher bug #1523975 and a potential fix to too many wifi scanning updates bug #1524133.

Revision history for this message
Timo Jyrinki (timo-jyrinki) wrote :

The interdiff from signals5 to signals6: http://paste.ubuntu.com/13858649/ - removal of code and one-liner to fix the system-settings.

Revision history for this message
Tony Espy (awe) wrote :

So the latest changes in Lorn's signals6 patch can be summarized as:

1) remove QNetworkManagerEngine::requestUpdate (), which is a public method which can trigger a WiFi scan request to NM.

This method doesn't appear to be called internally by the bearer plugin, so if we think extra WiFi scans are the culprit, there must have been a change to one of our Qt applications which is calling this method. Also as this method wasn't added in the original patch, its presence can't really be considered a regression. Has anyone attempted to determine if we *are* scanning more often? This could be done manually by running wpa_cli ( as root ), and watching for the frequency of scan events output. It can also be accomplished by looking at the wpa_suppl and/or NM log messages in syslog. And finally, you could also just monitor DBus looking for ScanDone signals from NM.

NOTE - I'm pretty sure this patch won't compile unless this method is also removed from qnetworkmanagerengine.h too, but I haven't tried to build it myself...

 2) check that wiredDevice pointer is valid before using it to call a method

This is the crasher fix when USS/Qt from the PPA gets installed on a desktop. Looks fine to me.

 3) remove most of the logic from the engine's parseConnection() method. This method takes a connection path, and creates a private configuration object, and then based upon the underlying device type, may modify the private configuration instance in a device-specific way before returning it to the caller. The device-specific logic in some cases could have side-effects, such as modifying the global accessPointConfigurations hash table and/or the configuredAccessPoint map.

Again, this change looks reasonable, however I don't see how it could have any impact on power. This code only runs during initialization where all existing system connections are loaded, and whenever a new system connection is added ( ie. a user connects to a new AP or APN ).

I'm not sure whether we want to include this last change if we want to keep the delta as small as possible?
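
For the DBus monitoring option mentioned under 1), one possible approach is to save dbus-monitor output to a file and count signals by member name. The Python sketch below assumes the one-line signal header layout that dbus-monitor prints; `count_members` is a hypothetical helper, and the sample lines (including the interface name on the ScanDone lines) are made up for illustration rather than captured from a device.

```python
import re
from collections import Counter

MEMBER_RE = re.compile(r"\bmember=(\w+)")


def count_members(lines):
    """Count D-Bus signals by member name in dbus-monitor style output."""
    counts = Counter()
    for line in lines:
        if not line.startswith("signal "):
            continue  # skip method calls, returns, and argument lines
        m = MEMBER_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts


# Made-up sample lines, not a real capture.
sample = [
    "signal sender=:1.5 -> dest=(null destination) serial=101 "
    "path=/org/freedesktop/NetworkManager/Devices/0; "
    "interface=org.freedesktop.NetworkManager.Device.Wireless; member=ScanDone",
    "signal sender=:1.5 -> dest=(null destination) serial=102 "
    "path=/org/freedesktop/NetworkManager/AccessPoint/3; "
    "interface=org.freedesktop.NetworkManager.AccessPoint; member=PropertiesChanged",
    "   array [",
    "signal sender=:1.5 -> dest=(null destination) serial=140 "
    "path=/org/freedesktop/NetworkManager/Devices/0; "
    "interface=org.freedesktop.NetworkManager.Device.Wireless; member=ScanDone",
]

counts = count_members(sample)
print(counts["ScanDone"], counts["PropertiesChanged"])
```

Comparing the ScanDone count over the same wall-clock interval on images 195 and 196 would show whether scans really became more frequent.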

Revision history for this message
Lorn Potter (lorn-potter) wrote :

@Tony
1) requestUpdate() should be left in, as the backend needs to tell other parts when the update request has been completed. I have tested this, so it does compile. In further testing, it does not look like any clients are requesting updates, so the increased power consumption is coming from somewhere else.

2) The crash is because of dead code, and included change is just a quick fix. Proper fix is to go through this and remove the rest of the dead and unneeded code.

3) If you want this delta as small as possible, it is safe to remove this from the patch and only include #2. There is more dead code in there now that it only relies on settings and active connections, which I will remove in a different patch.

Revision history for this message
Lorn Potter (lorn-potter) wrote :

fyi: requestUpdate is called by the qnetworkconfigurationmanager, when updateConfigurations() gets called.

Revision history for this message
Selene ToyKeeper (toykeeper) wrote :

tvoss: The raw data is available. Simply remove "graph.png" from the URL and it'll give you a directory with all sorts of logs.

The green-shaded section is the active measurement period used for statistics. The red-shaded portions indicate where USB was connected. Everything else is unplugged.

Revision history for this message
Selene ToyKeeper (toykeeper) wrote :

FWIW, I see the weird new behavior on every test with wifi enabled, even when the screen is on and the phone is actively doing something. For example, while playing music:

http://people.canonical.com/~platform-qa/power-results/2015-12-08_09:19:36-krillin-196-power_usage_music/graph.png

... versus how it looked before:

http://people.canonical.com/~platform-qa/power-results/2015-12-08_04:50:02-krillin-195-power_usage_music/graph.png

But in flight mode, both graphs are a flat line barely above zero:

http://people.canonical.com/~platform-qa/power-results/2015-12-08_11:07:01-krillin-196-power_usage_flight_mode_on/graph.png

So, it's not just while sleeping; it's any time wifi is enabled.

Revision history for this message
Tony Espy (awe) wrote :

@Selene

Just wanted to confirm that we don't need any more investigation re: your comment #131 ( I think that may be the largest comment # I've seen in a LP bug ), as your latest update to bug #1524133 indicated that you'd isolated the problem to a SIM with an expired data plan?

@Timo

We should probably close this bug out when OTA8.5 is released, and then push another update to the PPA that contains the updated patch with the desktop crash fix ( bug #1523975 ). I've attached yet another version of the patch which just includes the additional wiredDevice NULL check from Lorn's latest patch.

@Lorn

Thanks again for all the help. Again, once the OTA8.5 update has been released, we can focus on transitioning to the connectivity API based bearer plugin for our next update.

Revision history for this message
Timo Jyrinki (timo-jyrinki) wrote :

Patch signals7 (with just the system settings crash fixed) building now in the PPA.

Changed in canonical-devices-system-image:
status: Fix Committed → Fix Released
Tony Espy (awe)
Changed in qtbase-opensource-src (Ubuntu):
status: Fix Committed → Fix Released
status: Fix Released → Fix Committed
Changed in qtbase-opensource-src (Ubuntu RTM):
status: New → Fix Released
assignee: nobody → Tony Espy (awe)
importance: Undecided → High
Changed in qtbase-opensource-src (Ubuntu):
status: Fix Committed → Triaged
Changed in dbus-cpp (Ubuntu):
status: In Progress → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package qtbase-opensource-src - 5.5.1+dfsg-13ubuntu2

---------------
qtbase-opensource-src (5.5.1+dfsg-13ubuntu2) xenial; urgency=medium

  * Forward-port networking fixes from 5.4 series:
    - net-bearer-nm-disconnect-ap-signals7.patch (LP: #1480877)
    - qnam-ubuntu-fix6.patch (LP: #1528886)
    - xenial would potentially now have fixes for (LP: #1506015)
      (LP: #1507769) (LP: #1533508)

 -- Timo Jyrinki <email address hidden> Tue, 09 Feb 2016 08:19:43 +0000

Changed in qtbase-opensource-src (Ubuntu):
status: Triaged → Fix Released
Aron Xu (happyaron)
tags: added: nm-touch
Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Michał Sawicz (saviq)
affects: unity8 → unity8 (Ubuntu)
Changed in unity8 (Ubuntu):
status: New → Confirmed