Checking for updates never finishes

Bug #1528886 reported by Pat McGowan on 2015-12-23
This bug affects 12 people
Affects                              Importance  Assigned to
Canonical System Image               Critical    John McAleely
Today Scope                          Undecided   Unassigned
connectivity-api (Ubuntu)            Undecided   Pete Woods
qtbase-opensource-src (Ubuntu)       Critical    Lorn Potter
qtbase-opensource-src (Ubuntu RTM)   Undecided   Unassigned
ubuntu-system-settings (Ubuntu)      Critical    Ken VanDine

Bug Description

Tested on krillin with SIM and data enabled
OTA 8.5 stable channel
Also Arale on proposed 201

Ensure phone has latest updates installed
Check for update on Wifi and it reports software is up to date
Turn off wifi and go to updates panel again
Note that the indicator shows an active 3G data connection
Checking for updates ... is shown and I see one of 3 results

In all cases the log reports:

2015-12-23 15:34:19,394 - WARNING - void UpdatePlugin::Network::checkForNewVersions(QHash<QString, UpdatePlugin::Update*>&)

2015-12-23 15:44:56,007 - WARNING - Reply is not valid.

In some instances, but not all, I also see the following, which is logged when the phone locks and resumes:
2015-12-23 15:36:18,625 - WARNING - QObject::killTimer: Timers cannot be stopped from another thread
2015-12-23 15:36:18,625 - WARNING - QObject::startTimer: Timers cannot be started from another thread
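
For anyone triaging similar logs, here is a minimal Python sketch (not part of system-settings) that splits lines of the form shown above into timestamp, level and message, and tallies the warnings. The regular expression assumes the "YYYY-MM-DD HH:MM:SS,mmm - LEVEL - message" layout seen in these excerpts; the function names are hypothetical.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed layout, taken from the excerpts above:
#   2015-12-23 15:44:56,007 - WARNING - Reply is not valid.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - "
    r"(?P<level>[A-Z]+) - (?P<msg>.*)$"
)


def parse_line(line):
    """Return (timestamp, level, message), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
    return ts, match.group("level"), match.group("msg")


def count_messages(lines, level="WARNING"):
    """Tally messages of the given level; lines that do not parse are skipped."""
    counts = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None and parsed[1] == level:
            counts[parsed[2]] += 1
    return counts
```

Feeding the log lines through count_messages() gives a quick view of how often "Reply is not valid." appears compared with the timer warnings.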

One time I saw
Connect to the Internet ...

In another case it displayed this after around 5 mins when I did not allow the phone to turn off:
Software is up to date

In all other cases Checking for updates never finished

description: updated
Pat McGowan (pat-mcgowan) wrote :

I got a call on the arale after around 10 mins, and update checking worked, but the krillin still does not (I called it as well)

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in ubuntu-system-settings (Ubuntu):
status: New → Confirmed
SB (emehntehtt) wrote :

BQ Aquaris E4.5 on OTA-8.5: this is constantly happening on several different wifi networks. On very rare occasions the check is successful, but otherwise it fails constantly.

Tony Espy (awe) wrote :

I flashed my krillin with stable #23, which I'm pretty sure equates to OTA8. I tested with an active US T-Mobile SIM ( Edge only ).

I'm not able to reproduce the bug as described.

Note, I first disabled auto-updates to ensure they weren't getting in the way.

The only peculiar behavior I observed was that the Updates UI would always display the Ubuntu update ( r24 ) as available, and then would display the click updates after some amount of time passed ( usually 5-10s, but sometimes as long as 30s to 1m ). Sometimes though, the click updates would never be listed. This seems to equate to a situation where NM shows that an active connection exists, but the network isn't actually reachable, or DNS isn't behaving correctly ( see related bug #1270189 ).

I only saw "Connect to the Internet" once, and never saw the UI just hang as described.

SB (emehntehtt) wrote :

@Tony Espy: the UI doesn't hang for me, it just continually checks for updates until it gives back an error. And you say OTA-8; it was fine with OTA-8, this issue started with OTA-8.5. At first I thought the update servers were having a problem, but the issue persists.

Tony Espy (awe) on 2015-12-23
Changed in ubuntu-system-settings (Ubuntu RTM):
status: New → Confirmed
Tony Espy (awe) wrote :

Please disregard comment #4, as I didn't correctly flash from the right channel. More test results pending...

Tony Espy (awe) wrote :

So basically what I'm seeing is that system-settings doesn't recognize the change of networking technology, and eventually times out after 4-5m, at which point the "Connect to the Internet" message is displayed. I have yet to see the spinner go forever as Pat suggests. I did however hit a crash in system-settings after the fourth or fifth time I ran my test scenario. I'll paste the backtrace in a subsequent comment.

I tested using a US AT&T SIM, which means I'm limited to 2G only on my krillin. I also disabled automatic updates on both phones before testing.

Here's my exact scenario, which I ran 5 times on krillin ( OTA8 + OTA8.5 ), and arale ( OTA8.5 ):

1. Boot the phone w/WiFi enabled & a previously connected access point available; after booting, verify the phone is connected to WiFi.

2. Launch SystemSettings::Updates

( Note, unlike Pat... I didn't install all available updates. So for OTA8.5 testing run today, 4 click updates are shown as available. For OTA8, an Ubuntu update is visible, along with 8 click updates. )

3. Return to main settings page

4. Disable WiFi, and *wait* for the network indicator to change from WiFi to 'E' ( on Krillin ) or 'H' ( on arale ).

5. Re-open Updates

This scenario results in three different results, partially depending on the speed of my network connection:

 * updates were shown promptly ( < 30s )
 * the spinner appeared, and updates were eventually shown after some delay ( 30s - 4m ); sometimes when this happened, only the Ubuntu update would be shown ( ie. I didn't see the available click updates )
 * the spinner appeared, and somewhere between 4-5m, the "Connect to Internet" message would be displayed

I basically saw little difference between OTA8 and OTA8.5 in my testing on krillin. With OTA8.5, I hit the "Connect..." scenario 3/5 tries. On OTA8, I hit it 2/5 times. On arale running OTA8.5, I hit it 2/5 as well.

Note, during each test I always verified that the network connection was good by running the following two commands:

$ nmcli d
$ ping ubuntu.com
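
A scripted variant of this check, as a rough Python sketch rather than anything used in the testing above: it tries a DNS lookup and then a TCP connection, so that a DNS failure can be told apart from an unreachable network (the two cases mentioned in comment #4). The function name, port 80 and the 5 second timeout are assumptions chosen for illustration.

```python
import socket


def check_reachability(host="ubuntu.com", port=80, timeout=5.0,
                       resolve=socket.getaddrinfo,
                       connect=socket.create_connection):
    """Return 'dns-failure', 'unreachable' or 'ok' for the given host.

    resolve and connect are injectable so the logic can be exercised
    without a real network.
    """
    try:
        resolve(host, port)
    except OSError:  # socket.gaierror is a subclass of OSError
        return "dns-failure"
    try:
        conn = connect((host, port), timeout)
    except OSError:  # covers timeouts and refused connections
        return "unreachable"
    conn.close()
    return "ok"
```

A result of "ok" only shows that the host answered on that port; it says nothing about whether the update servers themselves respond.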

From my point of view, Pat's original scenario doesn't seem to be a regression introduced by OTA8.5.

Tony Espy (awe) wrote :

I also mentioned a crash above. This occurred after my fourth or fifth try on krillin/OTA8.5. When I clicked the back button to return to the main settings page, the page was rendered with stretched icons, then it crashed and I had to relaunch.

The backtrace from the crashfile looks like this:

== Stacktrace =================================
#0 0x00000000 in ?? ()
No symbol table info available.
#1 0xabd3aa46 in ?? () from /usr/lib/arm-linux-gnueabihf/ubuntu-system-settings/libupdate-plugin.so
No symbol table info available.
#2 0xb6dc2ad2 in QMetaObject::activate(QObject*, int, int, void**) () from /usr/lib/arm-linux-gnueabihf/libQt5Core.so.5
No symbol table info available.
#3 0xb5d13a9c in QNetworkConfigurationManager::onlineStateChanged(bool) () from /usr/lib/arm-linux-gnueabihf/libQt5Network.so.5
No symbol table info available.
#4 0xb5d13ef6 in ?? () from /usr/lib/arm-linux-gnueabihf/libQt5Network.so.5
No symbol table info available.
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

@Pat

You mention errors in logs, however I didn't see any messages in syslog. Which log file were you looking at?

Tony Espy (awe) wrote :

@Cerebus

Regarding your comment #3, were you switching back and forth from mobile to WiFi when you saw problems?

The one relevant bit of information in comment #4 is that bug #1270189 may be involved when switching from the mobile connection to a WiFi connection. Did you verify that you actually had a working WiFi connection ( ie. via the browser, ping, ... )?

SB (emehntehtt) wrote :

Switching occasionally happened, but as a rule it almost never works regardless of which network I am connected to and whether it's wifi or 3g. I didn't notice that switching between wifi and 3g changed anything; the problem persists in all scenarios. And yes, browser and ping work when checking for updates doesn't. I thought it was my home Internet acting weird, but I tested it on two other wifi networks with the same results: checking for updates doesn't work on 99.9% of occasions. Today I managed to get it through once.

Lorn Potter (lorn-potter) wrote :

I hate to say it, but this sounds like a bug in qnam/qtbearer I have been chasing where the default configuration does not get switched back from wifi to an already/still connected mobile data connection.

Possibly one of these:
https://bugreports.qt.io/browse/QTBUG-49760
https://bugreports.qt.io/browse/QTBUG-49581

royden (ryts) wrote :

My current situation Arale r202:

Tested working wifi - update checker does not complete in normal time.

Switch off wifi, enable mobile data 3G available, update checker completes.

Switch off mobile data, turn on wifi (working), checker does not complete normally.

SB (emehntehtt) wrote :

For some reason checking for updates works for me since yesterday, I will monitor the situation.

Marcos Lans (markooss) wrote :

For days now I had that bug on my BQ Ubuntu. Suddenly today update feature seems to work again.

JoiHap (astronomy) wrote :

Same problem here on a BQ Aquaris 4.5 after OTA 8.5 update, connected through WiFi or 3G, no updates whatsoever. Additionally, sometimes the system settings crashes on hitting "check for updates".

We also have reproduced this behaviour here in BQ, details below:

Scenario 1 Time 12:04

Krillin: rc-211 --> On Wi-Fi was able to detect update (rc215), on 3G wasn't able to see rc215
Vegeta OTA8.5 --> On wifi was able to detect updates, on 3G too

After the updates available

Scenario 2 Time 12:13

Krillin: rc-215 --> On Wi-Fi says "Connect to the internet to check for updates" after a few minutes, on 3G took around 12 min to say "SW up to date"
Vegeta OTA8.5 --> On Wi-Fi says "Connect to the internet to check for updates" after a few minutes, on 3G took around 12 min to say "SW up to date"

Logs attached from both devices (hope they are useful)

Please note that if you want to detect updates through 3G you should select "on any data connection" in auto download options

royden (ryts) wrote :

Given that "download" appears as the option, if you are correct,
that is a rather poor choice of word. I assume that
"detect an update" and "download" are different. Am I wrong,
because I believe that I detect them on 3G even though I do
not want to auto-download them?

Season's greetings to all

Pat McGowan (pat-mcgowan) wrote :

Possibly related to the issue where the network status is inconsistent

Changed in canonical-devices-system-image:
importance: Undecided → High
milestone: none → backlog
status: New → Confirmed
Pat McGowan (pat-mcgowan) wrote :

Adding a Qt task for the network bearer and a connectivity api task as we don't know where in the stack we are getting confused

Changed in qtbase-opensource-src (Ubuntu):
assignee: nobody → Lorn Potter (lorn-potter)
Changed in connectivity-api (Ubuntu):
assignee: nobody → Pete Woods (pete-woods)
Changed in ubuntu-system-settings (Ubuntu RTM):
assignee: nobody → Ken VanDine (ken-vandine)
Changed in canonical-devices-system-image:
assignee: nobody → Pat McGowan (pat-mcgowan)
milestone: backlog → ww08-2016
Changed in canonical-devices-system-image:
assignee: Pat McGowan (pat-mcgowan) → Bill Filler (bfiller)
Timo Jyrinki (timo-jyrinki) wrote :

There's still the old silo 032, which I updated just before the holidays to be up-to-date. With the recent network manager bearer fix in OTA 8.5 it's no longer obviously blocked (bug #1508945 got fixed as a side effect), but it's still not a patch set I'm comfortable landing before it's more widely tested, because it did cause regressions in our case, even though that was via apparmor blocking DBus Network Manager calls.

Included network related patches (or Lorn's from upstream):
  * debian/patches/Make-sure-to-report-correct-NetworkAccessibility.patch
  * debian/patches/Make-sure-networkAccessibilityChanged-is-emitted.patch
  * debian/patches/Fix-hang-in-qnam-when-disconnecting.patch
  * debian/patches/Make-UnknownAccessibility-not-block-requests.patch

(plus changing the generic plugin skipping to be done via env var as merged in upstream)

The other bugs I've figured might be related are bug #1507769 and bug #1506015.

If there's any doubt, switching to the connectivity-api bearer might be better to do first.

Timo Jyrinki (timo-jyrinki) wrote :

(s/or Lorn's/all Lorn's)

The diff to current overlay is at: https://ci-train.ubuntu.com/job/ubuntu-landing-032-1-build/lastSuccessfulBuild/artifact/qtbase-opensource-src_vivid_content.diff + the ubuntu-touch-session in the PPA that adds the environment variable.

Lorn Potter (lorn-potter) wrote :

I tested landing-032 and system-settings at least does not hang at 'checking for updates' when on mobile data (it now checks and when it completes says 'connect to the internet to check for updates')

landing-032 contains the needed fixes in QNetworkAccessManager (including for this bug); without them, the bug would be present with any bearer plugin.

Lorn Potter (lorn-potter) wrote :

This seems to be working now as expected (using landing-032) without the 'connect to the internet to check for updates'

Lorn Potter (lorn-potter) wrote :

Seems this did not totally fix it: it will still hang at 'checking' if I switch off wifi and then quickly check for updates, but will work if I press back and then check again.

Lorn Potter (lorn-potter) wrote :

ok. I thought it was fixed, but as I have investigated it further in qnetworkaccessmanager, I was somehow fooled and that landing does not fix this.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in connectivity-api (Ubuntu):
status: New → Confirmed
Changed in qtbase-opensource-src (Ubuntu):
status: New → Confirmed
Lorn Potter (lorn-potter) wrote :

I believe I have fixed this against landing-032, patch attached

It needs more testing, including upstream autotests.

The attachment "qnam-ubuntu-fix.diff" seems to be a patch. If it isn't, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of the ~ubuntu-reviewers, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issues please contact him.]

tags: added: patch
Timo Jyrinki (timo-jyrinki) wrote :

Updated to silo 032. Unit tests pass, although the builders run without network manager so it's using the generic bearer.

Tony Espy (awe) wrote :

@Lorn

What was broken in QNetworkAccessManager?

Lorn Potter (lorn-potter) wrote :

1) QNAM was only being signalled when the online state of nm changed. Since mobile data remains online when wifi connects, the online state never changes when the default route moves to wifi. Thus QNAM/QNetworkRequest was not following the correct configuration/session. The session would never signal the connected state (because it was already connected and didn't change). This means that the request would never migrate/resume to the new connection, and would wait for the proper session state to actually start the operation, which would never come.

Haven't looked into why it would not timeout in a reasonable amount of time.

2) Sometimes QNAM would get confused by #1 and set itself to NotAccessible, thus blocking the request and issuing error 99 (UnknownNetworkError)
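
To make point 1 concrete, here is a toy Python model of the signalling problem. It is a deliberately simplified sketch and not the Qt or NetworkManager code: the ToyNetwork and ToyRequest classes, the "wifi preferred over mobile" policy and the notify_on_route_change flag are all assumptions made for illustration. The flag switches between notifying listeners only when the online state flips (the behaviour described above) and also notifying them when the default route moves.

```python
class ToyNetwork:
    """Toy stand-in for the bearer layer; not the Qt or NetworkManager API."""

    def __init__(self, notify_on_route_change):
        self.notify_on_route_change = notify_on_route_change
        self.active = set()        # bearers that are currently up
        self.default_route = None
        self.listeners = []

    def _online(self):
        return bool(self.active)

    def set_bearer(self, name, up):
        was_online = self._online()
        old_route = self.default_route
        if up:
            self.active.add(name)
        else:
            self.active.discard(name)
        # Assumed policy: wifi is preferred over mobile data when both are up.
        if "wifi" in self.active:
            self.default_route = "wifi"
        elif "mobile" in self.active:
            self.default_route = "mobile"
        else:
            self.default_route = None
        online_changed = was_online != self._online()
        route_changed = old_route != self.default_route
        if online_changed or (self.notify_on_route_change and route_changed):
            for listener in self.listeners:
                listener(self.default_route)


class ToyRequest:
    """A request bound to one route; it only re-binds when it is notified."""

    def __init__(self, network):
        self.network = network
        self.route = network.default_route
        network.listeners.append(self._on_change)

    def _on_change(self, new_route):
        self.route = new_route

    def can_proceed(self):
        # Progress is only possible if the bound route is the one in use.
        return self.route is not None and self.route == self.network.default_route


def switch_mobile_to_wifi(notify_on_route_change):
    net = ToyNetwork(notify_on_route_change)
    net.set_bearer("mobile", True)
    request = ToyRequest(net)     # the request starts on mobile data
    net.set_bearer("wifi", True)  # default route moves; online state is unchanged
    return request.can_proceed()
```

In this model, switch_mobile_to_wifi(False) leaves the request bound to the stale mobile route, so it cannot proceed, while switch_mobile_to_wifi(True) lets it follow the route to wifi. The real fix lives in QNAM and the nm bearer plugin; this only illustrates why an "online state changed" signal alone is not enough.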

Lorn Potter (lorn-potter) wrote :

seems #2 in comment #34 still is not fixed.

Lorn Potter (lorn-potter) wrote :

2nd patch here that I am fairly certain fixes this in QNAM as well as in the nm bearer plugin (making the defaultConfiguration switch when default connection actually changes). It also optimizes the bearer plugin a bit.

I have an auto test to be run on a phone to test for this, but I haven't put it anywhere just yet.

Timo Jyrinki (timo-jyrinki) wrote :

I replaced the former patch with the fix2, ~test3 is building in silo 032 with that included.

The only minor changes to the patch: fixing paths, removing the removed "qDebug()" line from the last hunk since it's not in the original code.

Lorn Potter (lorn-potter) wrote :

Just tested the fix2 package; it does not seem to be working entirely correctly yet. Maybe I missed something in that patch.
It seems to report isOnline is false, quickly followed by true, which seems to confuse things...

Lorn Potter (lorn-potter) wrote :

alright, attached is yet another fixup. I used quilt this time, so it should be about good to go.

The only thing I have not really tested is when only connected to mobile data, and the connection is bad or weak, and it keeps disconnecting and disconnects in the middle of an update request.
It seems when I want to test it this way, I will have good reception. I will keep trying and test it when the opportunity arrives.

Timo Jyrinki (timo-jyrinki) wrote :

fix3 patch will be available as ~test4 in silo 032 for vivid-overlay / Qt 5.4 (and independently as ~test3 in silo 057 for xenial / Qt 5.5)

Lorn Potter (lorn-potter) wrote :

grrr... At least I can reliably reproduce this:

Start with wifi off and system settings open.
Have an adb shell or ssh session into the device.

check for updates
quickly run the command: nmcli radio wifi on

Update should either complete or have an error and should not get stuck at "checking for updates..."
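
Because the timing matters here, the wifi toggle can be scripted. The following is a small Python sketch under stated assumptions: the helper names, the delay and the optional "adb shell" prefix are illustrative, and tapping "check for updates" still has to be done by hand on the phone. Only the nmcli command itself comes from the steps above.

```python
import subprocess
import time


def wifi_toggle_command(state, via_adb=False):
    """Build the nmcli command from the steps above, optionally run through adb."""
    if state not in ("on", "off"):
        raise ValueError("state must be 'on' or 'off'")
    cmd = ["nmcli", "radio", "wifi", state]
    return ["adb", "shell"] + cmd if via_adb else cmd


def toggle_wifi_after(delay_s, state="on", via_adb=False,
                      run=subprocess.run, sleep=time.sleep):
    """Wait delay_s seconds, then toggle wifi; returns the command's exit code.

    run and sleep are injectable so the helper can be tested without a device.
    """
    sleep(delay_s)
    result = run(wifi_toggle_command(state, via_adb))
    return result.returncode
```

For example, calling toggle_wifi_after(1.0) right after tapping the check gives a repeatable one second gap before wifi comes up.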

Bill Filler (bfiller) on 2016-01-31
Changed in canonical-devices-system-image:
importance: High → Critical
Changed in ubuntu-system-settings (Ubuntu RTM):
importance: Undecided → Critical
Changed in qtbase-opensource-src (Ubuntu):
importance: Undecided → Critical
status: Confirmed → In Progress
Changed in ubuntu-system-settings (Ubuntu):
importance: Undecided → Critical
assignee: nobody → Ken VanDine (ken-vandine)
no longer affects: ubuntu-system-settings (Ubuntu RTM)
Changed in canonical-devices-system-image:
assignee: Bill Filler (bfiller) → John McAleely (john.mcaleely)
Bill Filler (bfiller) on 2016-01-31
Changed in canonical-devices-system-image:
status: Confirmed → In Progress
Lorn Potter (lorn-potter) wrote :

One more time...
This one works better in that it doesn't sit and spin. It will at least say "connect to the internet" when the connection changes.

Timo Jyrinki (timo-jyrinki) wrote :

Silos 032 (vivid) and 057 (xenial) updated with the patch. I will later join them together to 032 since ubuntu-touch-session was just converted into dual landing.

tags: added: connectivity
Lorn Potter (lorn-potter) wrote :

Still working for me!

How to test this:

Open system settings and click on Updates and then quickly switch wifi off or on

Timo Jyrinki (timo-jyrinki) wrote :

I think we should now gather more tests from others since the networking fixes are plenty and it'd be useful to know if other bugs are properly fixed now too, and especially that there would be no regressions.

I'll ping on the other possibly related bugs to get more test coverage. If we have the fix and find no regressions, QA can start evaluating the silo.

Timo Jyrinki (timo-jyrinki) wrote :

As I started testing on my main daily phone, I noticed that there is trouble with the Today scope when using silo 32. It has problems obtaining data / refreshing itself. It seems reproducible, but it's also the only scope that seems to have problems. Maybe it's doing something peculiar to work around the existing networking problems?

Such a regression would of course not be acceptable. Reproducing of the problem would be welcome.

Lorn Potter (lorn-potter) wrote :

The Euronews scope also seems to have network issues, so maybe the scope scope-aggregator has issues with this. I'll dig into that code to see if I can find something. At least a way to test this condition.

Kyle Nitzsche (knitzsche) wrote :

Hi Lorn, just noting that scopes *generally receive the network connectivity status as an enum passed to them from the unity-scopes-shell as metadata. The only thing that scope-aggregator does with the network status is that, depending on the scope config, it displays a "Hey, no network" type message. Beyond this, responding reasonably to no-network conditions is up to child scopes.

Not sure it is relevant, but I have found an issue with this enum: https://bugs.launchpad.net/ubuntu/+source/unity-scopes-api/+bug/1502282/comments/24

* I say "generally" because scopes *can* take additional steps.

Lorn Potter (lorn-potter) wrote :

Alrighty then! patch #6
This seems to work with Today, euronews scopes, and doesn't hang!

Timo Jyrinki (timo-jyrinki) wrote :

armhf build for vivid just finished in silo 32, version 5.4.1+dfsg-2ubuntu11~vivid4~test6.

gles and xenial packages updated too.

Testing welcome again, and I'll install it on my daily phone.

Changed in today-scope:
status: New → Invalid
Łukasz Zemczak (sil2100) wrote :

This bug was fixed in the package qtbase-opensource-src 5.4.1+dfsg-2ubuntu11~vivid4 in https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/stable-phone-overlay

---------------

qtbase-opensource-src (5.4.1+dfsg-2ubuntu11~vivid4) vivid; urgency=medium

  * Lorn Potter's networking fixes:
    - debian/patches/Make-sure-to-report-correct-NetworkAccessibility.patch
    - debian/patches/Make-sure-networkAccessibilityChanged-is-emitted.patch
    - debian/patches/Fix-hang-in-qnam-when-disconnecting.patch
    - debian/patches/Make-UnknownAccessibility-not-block-requests.patch
    - debian/patches/qnam-ubuntu-fix6.patch (not yet upstreamed)
    (LP: #1470700) (LP: #1506015) (LP: #1507769) (LP: #1528886) (LP: #1533508)
  * debian/patches/Add-an-option-to-skip-the-generic-bearer-engine.patch
    - Backport to replace disable-generic-plugin-when-others-available.patch

 -- Timo Jyrinki <email address hidden> Mon, 14 Dec 2015 12:53:41 +0000

Changed in qtbase-opensource-src (Ubuntu RTM):
status: New → Fix Released
Changed in canonical-devices-system-image:
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package qtbase-opensource-src - 5.5.1+dfsg-13ubuntu2

---------------
qtbase-opensource-src (5.5.1+dfsg-13ubuntu2) xenial; urgency=medium

  * Forward-port networking fixes from 5.4 series:
    - net-bearer-nm-disconnect-ap-signals7.patch (LP: #1480877)
    - qnam-ubuntu-fix6.patch (LP: #1528886)
    - xenial would potentially now have fixes for (LP: #1506015)
      (LP: #1507769) (LP: #1533508)

 -- Timo Jyrinki <email address hidden> Tue, 09 Feb 2016 08:19:43 +0000

Changed in qtbase-opensource-src (Ubuntu):
status: In Progress → Fix Released
Changed in connectivity-api (Ubuntu):
status: Confirmed → Invalid
Changed in ubuntu-system-settings (Ubuntu):
status: Confirmed → Fix Released
Wangtim (tim-bronkhorst) wrote :

Hello, I never had this bug until I upgraded to OTA10. Since OTA 10 checking for updates never ends, but only when the wifi is on. If I use only the 3G network, checking finishes in a few seconds. Any ideas if I can solve this, it's quite annoying...
I have a BQ E4.5. Thanks !

royden (ryts) wrote :

I can confirm that this also occurs on my Mx4, rc-proposed r303. This is a regression, as it was fixed.

Issue of update check not finishing happens only on wifi for me. Completes fine on GPRS.

SB (emehntehtt) wrote :

This is happening to me on BQ Aquaris E4.5 with OTA-10 for a few days now regardless of whether I use wifi or 3g, checking for updates never finishes, I did a tracepath to system-image.ubuntu.com as recommended to me on the IRC, and my network reaches Canonical server fine, but after that nothing happens. To be precise tracepath reaches:

te2-1.jotunn.canonical.com

And then I get "no reply" over and over again. I removed my Ubuntu One account, re-added it, rebooted the phone, and nothing. I can install applications from the Store, but checking for updates never finishes, or to be more precise, it does "finish" (fail) after a long time with an error message saying I should connect to the Internet. Internet otherwise works fine on the phone. The phone has not been made rw, nor does it have any special modifications; I run vanilla OTA-10 with no tinkering whatsoever.

SB (emehntehtt) wrote :

It started working, I checked three times in a row and it was successful, will track this further to see if updating continues working properly.

Changed in canonical-devices-system-image:
status: Fix Committed → Fix Released
royden (ryts) wrote :

Yes, now working again on wifi
