Leaving Wifi does not connect to mobile carrier data (GSM)
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| Canonical System Image | | Critical | Canonical Phone Foundations | |
| ofono (Ubuntu) | | Critical | Tony Espy | |
| ofono (Ubuntu RTM) | | Critical | Tony Espy | |
Bug Description
After leaving my home or work wifi, my Bq phone cannot connect to the data from my mobile carrier. The only thing that works is rebooting the phone.
Steps to reproduce:
1- Connect your phone to a password-secured Wifi network.
2- Suspend the phone.
3- Leave the wifi area and resume the phone.
4- Wait for 3g data to kick in (the 3g indicator appears instead of the wifi one).
5- No connection is established. I have waited for more than 30 mins.
6- A reboot of the phone fixes this issue.
Really annoying to have to reboot phone 3-4 times a day.
Related branches
- Ricardo Salveti: Approve on 2015-04-14
- Alfonso Sanchez-Beato: Approve on 2015-04-14
- Diff: 56 lines (+17/-8), 2 files modified
debian/changelog (+10/-0)
drivers/rilmodem/gprs-context.c (+7/-8)
- Alfonso Sanchez-Beato: Approve on 2015-04-23
- PS Jenkins bot: Approve (continuous-integration) on 2015-04-23
- Diff: 500 lines (+306/-10) (has conflicts), 11 files modified
debian/changelog (+18/-1)
drivers/rilmodem/gprs-context.c (+7/-8)
drivers/rilmodem/sim.c (+61/-0)
drivers/rilmodem/voicecall.c (+11/-1)
gril/grilreply.c (+32/-0)
gril/grilreply.h (+3/-0)
gril/grilrequest.c (+22/-0)
gril/grilrequest.h (+3/-0)
test/create-ia-context (+47/-0)
unit/test-grilreply.c (+56/-0)
unit/test-grilrequest.c (+46/-0)
| Pat McGowan (pat-mcgowan) wrote : | #1 |
| Changed in canonical-devices-system-image: | |
| assignee: | nobody → Canonical Phone Foundations (canonical-phonedations-team) |
| Tony Espy (awe) wrote : | #2 |
@Pat
Unfortunately that's not true as bug #1418077 is for a regression in the new version of network-manager in vivid.
I will give this a try tomorrow. To date, I had always kept the phone alive while walking out of range of an access point. The description here states that the phone was suspended ( or at least the screen turned off ).
| Jean-Baptiste Lallement (jibel) wrote : | #3 |
I confirm the scenario in the description on 14.09-proposed/
I could reproduce it by booting the phone, unplugged, in an area with known wifi so that it auto-connects, then waiting until it deep suspends before leaving the wifi area. When you resume the phone, the cellular data connection never establishes.
I checked that the phone uses the SIM with the data plan.
The active context returned by list-contexts is correct.
The output of ip route is empty.
| Changed in canonical-devices-system-image: | |
| status: | New → Confirmed |
| tags: | added: lt-category-visible |
| Changed in canonical-devices-system-image: | |
| milestone: | none → ww13-ota |
| importance: | Undecided → Critical |
| Tony Espy (awe) wrote : | #4 |
@Leopoldo
A couple of questions...
1. Do you have more than one SIM installed in your phone?
2. If you're familiar with using adb, or have the terminal installed could you run the following command and paste the output into this bug:
$ sudo grep "Suspended for" /var/log/syslog
3. Finally, could you please try to set your phone to '2g' only mode and see if the problem still occurs?
I'm in the States, so I only have 2g available and am not able to reproduce the bug. Likewise, @Jean-Baptiste also told me earlier today that he was unable to reproduce the problem when his phone was set to 2g.
This bug may be related to bug #1436411 which also describes problems with the phone becoming idle with an active 3g connection.
| Tony Espy (awe) wrote : | #5 |
Also, just an aside, even though I waited close to 30m before waking my phone, I confirmed that it never truly suspended as all of the "Suspended for" messages showed 0.000.
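For anyone who wants to check the same thing without reading the log by eye, here is a minimal sketch of that check. It assumes each message has the shape "Suspended for <seconds>"; only the "Suspended for" text and the 0.000 value come from this report, and the function names are made up for illustration.

```python
import re

# Assumed message shape: "... Suspended for <seconds> ...". Only the
# "Suspended for" prefix and the 0.000 value are taken from this bug.
SUSPEND_RE = re.compile(r"Suspended for (\d+(?:\.\d+)?)")


def suspend_durations(lines):
    """Return every 'Suspended for' duration found in the lines, in seconds."""
    durations = []
    for line in lines:
        match = SUSPEND_RE.search(line)
        if match:
            durations.append(float(match.group(1)))
    return durations


def ever_suspended(lines):
    """True if at least one suspend lasted longer than zero seconds."""
    return any(duration > 0.0 for duration in suspend_durations(lines))
```

Passing the open file object for /var/log/syslog (read with sudo) to `ever_suspended` should return False on a device where every message shows 0.000.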
My testing was done with a krillin running image #20 ( ubuntu-
| Tony Espy (awe) wrote : | #6 |
@Jean-Baptiste
Can you attach your syslog, as well as the output from 'list-modems', 'ip route', and 'ifconfig'?
| Jean-Baptiste Lallement (jibel) wrote : | #7 |
| Jean-Baptiste Lallement (jibel) wrote : | #8 |
| Jean-Baptiste Lallement (jibel) wrote : | #9 |
| Jean-Baptiste Lallement (jibel) wrote : | #10 |
| Jean-Baptiste Lallement (jibel) wrote : | #11 |
| Jean-Baptiste Lallement (jibel) wrote : | #12 |
output of ip route is empty.
| Tony Espy (awe) wrote : | #13 |
In discussion with @Jean-Baptiste, when the phone is woken up, the indicator shows a non-connection icon. Other points:
* 'nmcli d' - shows both modems are disconnected
* 'list-modems' - shows the first SIM slot registered to the carrier, however its ConnectionManager interface shows 'Attached' is 0
* 'list-contexts' - shows an active context with valid IP Settings
This looks like an MTK-specific ofono bug, possibly triggered by a suspend-related modem driver problem.
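The inconsistency in the points above (modem not attached, yet a context still reported active with IP settings) can be expressed as a small check. This is only a sketch: it assumes the output of 'list-modems' and 'list-contexts' has already been parsed into plain dictionaries keyed by the ofono property names 'Attached', 'Active' and 'Settings'. The 'Name' key and the sample values in the usage note are hypothetical.

```python
def stale_contexts(connection_manager, contexts):
    """Return contexts that claim to be active while the modem is detached.

    connection_manager: dict of ConnectionManager properties (uses 'Attached').
    contexts: list of dicts of context properties (uses 'Active', 'Settings').
    """
    if connection_manager.get("Attached"):
        # Attached modem: an active context is the expected state.
        return []
    return [
        context
        for context in contexts
        if context.get("Active") and context.get("Settings")
    ]
```

With `{"Attached": 0}` and a context such as `{"Name": "internet", "Active": 1, "Settings": {"Interface": "ccmni0"}}`, the function returns that context, which matches the state described in this comment.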
| Leopoldo Pena (leopenausa) wrote : | #14 |
Hi again,
here are some answers to your questions. This bug just repeated this morning after leaving home wifi. No 3g data connection on the street, and 40 minutes later I still don't have data connection.
I am using the BQ phone with the official repos.
1. Do you have more than one SIM installed in your phone?
No, I only have one SIM card installed.
2. If you're familiar with using adb, or have the terminal installed could you run the following command and paste the output into this bug:
$ sudo grep "Suspended for" /var/log/syslog
Same results as Tony, phone shows that it never truly suspended as all of the "Suspended for" messages showed 0.000.
3. Finally, could you please try to set your phone to '2g' only mode and see if the problem still occurs?
I will give this a try next time I walk out of wifi range.
| Jean-Baptiste Lallement (jibel) wrote : | #15 |
I tried to collect more traces by starting ofono with the following command:
OFONO_RIL_
but I couldn't reproduce the bug.
| Leopoldo Pena (leopenausa) wrote : | #16 |
I am not familiar with the adb tool, but if you can point out how to use it properly I can post my logs when the bug repeats, which is typically every single morning walking from home to work. Once there I reboot the phone and 3g connectivity is back.
I am on r20 image by the way.
| Tony Espy (awe) wrote : | #17 |
@Jean-Baptiste
So the "-d" flag by itself means enable debug messages for every single source file in ofono. As there have been previous 'racy' issues with the mtkmodem, enabling all these messages may be preventing the problem from happening.
Could you instead try this first:
OFONO_RIL_
If that still fails, then try again without the OFONO_RIL_TRACE.
| Tony Espy (awe) wrote : | #18 |
@Leopoldo
Have you installed all available system updates?
adb is a command that you can run from a laptop to pull files from a device. Do you have a laptop available, and if so, what OS does it run?
| Jean-Baptiste Lallement (jibel) wrote : | #19 |
I tried without -d and without OFONO_RIL_TRACE, and each time the problem goes away and I cannot reproduce the bug. Is there a way to set these options directly on boot, so I don't have to restart ofono, just in case it matters?
@Jean-Baptiste, yes, you can modify /etc/init/
| Tony Espy (awe) wrote : | #21 |
@Jean-Baptiste
But again, I would not add "-d" by itself, but use the more qualified version I included in comment #17.
| Changed in canonical-devices-system-image: | |
| milestone: | ww13-ota → ww17-2015 |
| Laryllan (laryllan) wrote : | #22 |
I can reproduce this bug by just disabling WiFi.
The indicator shows a HSDPA network, but the phone remains offline.
When using 2G only, the network connection changes to 2G and is working.
My carrier is o2 germany.
| Ricardo Salveti (rsalveti) wrote : | #23 |
Attached the screenshot of the terminal app when I got to reproduce this issue during the weekend.
The interesting part is that the route was empty: the device showed as connected, but no interface was listed when calling ifconfig.
current build number: 270
device name: krillin
channel: ubuntu-
last update: 2015-04-11 08:14:05
version version: 270
version ubuntu: 20150410.1
version device: 20150408-4f14058
version custom: 20150409-665-29-206
| Ricardo Salveti (rsalveti) wrote : | #24 |
Will keep debug enabled at both ofono and networkmanager and see if I can get to reproduce it again.
| Changed in ofono (Ubuntu RTM): | |
| assignee: | nobody → Tony Espy (awe) |
| importance: | Undecided → Critical |
| status: | New → In Progress |
| Tony Espy (awe) wrote : | #25 |
OK, I think I figured it out after looking at a radio log of the problem reproduced by Alfonso.
First, there are a couple of MTK-specific unsolicited events that are received that look suspicious:
RIL_UNSOL_
UNSOL_RESTRICTE
RIL_UNSOL_
UNSOL_RESTRICTE
UNSOL_DATA_
Turns out the mtkmodem code overrides the normal NETWORK_
That said, I also see a UNSOL_DATA_
01 00 00 00 F2 03 00 00 09 00 00 00 00 00 00 00
01 00 00 00 = unsolicited
F2 03 00 00 = UNSOL_DATA_
09 00 00 00 = version
00 00 00 00 = number of calls
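The breakdown above reads the dump as four little-endian 32-bit words. A minimal sketch of that decoding, using only the bytes quoted in this comment (the variable names are illustrative, not ofono identifiers):

```python
import struct

# Bytes exactly as quoted in this comment.
RAW = "01 00 00 00 F2 03 00 00 09 00 00 00 00 00 00 00"


def decode_words(hex_dump):
    """Decode a hex dump into little-endian unsigned 32-bit words."""
    data = bytes.fromhex(hex_dump)
    count = len(data) // 4
    return struct.unpack("<%dI" % count, data)


response_type, event_code, version, num_calls = decode_words(RAW)
print(response_type, event_code, version, num_calls)  # prints: 1 1010 9 0
```

The last word is the interesting one: the event carries a call list with zero entries.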
The function ril_gprs_
I imagine this may be the same bug as bug #1436427 ( which has been a theory all along ).
I'm working on a fix now, and should have something available to test by tomorrow.
| Changed in ofono (Ubuntu): | |
| status: | New → In Progress |
| importance: | Undecided → Critical |
| assignee: | nobody → Tony Espy (awe) |
@Tony, great catch! I have tested your changes and reproduced the bug. Now list-contexts does not show any active context when ccmni0 is down.
However, NM is not able to recover from this situation. It tries to add routes before we get the UNSOL_DATA_
Please see attached log. To me these lines are interesting:
Apr 14 07:48:54 ubuntu-phablet nm-dispatcher: Dispatching action 'down' for wlan0
...
Apr 14 07:48:56 ubuntu-phablet kernel: [ 58.774157]
...
Apr 14 07:48:56 ubuntu-phablet NetworkManager[
...
Apr 14 07:48:56 ubuntu-phablet ofonod[1842]: [0,UNSOL]< UNSOL_DATA_
So we are near, but more changes are needed.
Some more traces when reproducing the bug:
/system/bin/logcat -v threadtime -b main -b radio
...
04-14 09:31:21.057 1844 1861 D use-Rlog/RLOG-AT: +CGEV: NW DEACT "IP", "100.125.255.239", 1
...
04-14 09:31:21.058 1844 1861 D use-Rlog/RLOG-RIL: configureNetwor
04-14 09:31:21.105 1844 1861 D DHCP : ifc_down(ccmni0) = 0
04-14 09:31:21.106 1844 1861 D DHCP : ifc_set_
...
The "NW DEACT" event means that the network has forced a deactivation. The interesting thing is that I get this when I deactivate WiFi, but only with one of my SIMs. The reason might be that I do not have enough credit to enable the data plan for that SIM. Why this happens exactly when I disable WiFi, I don't know. Maybe the connection had already been dropped by the operator and the modem notices at that time for some weird reason. Anyway, this event is the same you would receive if data is dropped for other reasons, so I hope this reproduces the bug.
| Tony Espy (awe) wrote : | #28 |
So in theory, NM should react to the context 'Settings' property changing to empty, and tear down the connection.
For vivid testing, it's important however that the most recent version of NM which fixes the NM 5m reconnect bug #1418077 is used. This version is currently in the following silo:
https:/
...but as it's been marked PASSED by QA, it should be landing in the archive shortly.
We're currently testing RTM to see whether or not the same issue exists with NM.
| Tony Espy (awe) wrote : | #29 |
Turns out the pending version of NM was used for the vivid testing mentioned in comments #26 and #27.
| tags: | added: bq hotfix |
| Tony Espy (awe) wrote : | #30 |
Here's the latest update on what we know about this bug.
First, we have a working fix for ofono's behavior when a DATA_CALL_
Second, Alfonso has pointed out that there may be additional network-manager changes required. His original testing was performed on krillin/vivid, and as there's still a race between the disconnect from ofono and NM changing the default route, it's still possible for an error to be thrown when NM configures the default route for the modem. When this happens, NM leaves the connection in an inconsistent state. When he tried the same scenario on krillin/RTM plus the patched ofono, it worked. I believe he may have just been lucky, and the disconnect happened before the default route logic ran. This is a guess, so we have some more work to do in figuring this out.
As these fixes are critical enough to warrant a hot-fix to RTM, we're going to concentrate on it first, and then address vivid.
| Tony Espy (awe) wrote : | #31 |
I think the issue is that the nm_system_
I've added a preliminary fix for this, by adding logic to check for NULL and immediately return FALSE, instead of continuing to iterate through any remaining routes to be added. Note, this may need to be re-worked to ensure that route object references are all properly cleaned up, but it should suffice for testing the fix.
A version of this first attempt at patching can be found in my PPA:
https:/
Note, as it's a FIX for RTM, it's the 0.9.8 version.
Side note: I have created bug #1444314 to address a weird thing I have seen in the files attached in comments #7 and #8.
| Tony Espy (awe) wrote : | #33 |
Please disregard comment #31, as it's incorrect and discusses the wrong routing code.
| Tony Espy (awe) wrote : | #34 |
@Alfonso
So looking closer at NM 0.9.8, the routing is updated by nm-policy.c in its device_
I hacked nm_system_
Next I deactivated the context manually, and NM turned around and re-established the connection, so I *think* this proves that the scenario you described in comment #26 isn't happening with NM 0.9.8.
At this point, as changing the NM logic isn't super straight-forward, I would suggest that we don't attempt to patch NM for utopic-RTM, unless someone else is able to prove for certain that NM can get wedged in a bad state when the routing configuration fails.
@Tony, ok, probably after applying the ofono patch it is difficult to have this happening in RTM so it is not worth the effort. But I think we should fix this in vivid / NM 0.9.10, if the routing logic is still the same.
| Tony Espy (awe) wrote : | #36 |
@Alfonso
To be clear, I attempted to simulate the failure by hard-coding a route failure on the 3rd route switch ( when WiFi is toggled off, NM replaces the route with the modem's interface ). This definitely left the connection inconsistent with the actual routing table, but I next was able to deactivate the context from the command-line, and NM successfully restarted the connection. So again, I don't think the scenario you describe in comment #26 is possible on RTM, so we'll just release the ofono patch.
I will try to force the scenario on vivid and see if I can reproduce the problem you described with NM not being able to re-establish the connection after a route failure before disconnect.
| Zygmunt Krynicki (zyga) wrote : Re: [Bug 1435328] Re: Leaving Wifi does not connect to mobile carrier data (GSM) | #37 |
Hey.
I'm using latest RTM and I still routinely reboot after leaving my
wifi network. Turning off wifi through the indicator doesn't help
(that's what I do to save battery life) and just keeps the phone
disconnected for hours.
| Tony Espy (awe) wrote : | #38 |
@Alfonso
Regarding comment #26, to be clear, your exact scenario was this:
1. Install SIM with no data credit in slot1; this connection comes up, but any attempt to send/receive data fails
2. Activate WiFi, and wait till it associates with an AP
3. Disable WiFi
( this forces a disconnect of the modem, and the corresponding route configuration fails )
What I see from the logs is the Netlink failures, at which point I also see the NM device state go from activated to failed ( reason: ip-config-
Looking at the NM ofono activation code, although the error printed is "could not activate context, modem is not registered", the code actually just checks whether the modem is in REGISTERED state when it tries to connect. It's possible that the modem state is still CONNECTED, and somehow this didn't get reset to REGISTERED when the Netlink errors occurred? The modem state is only set back to REGISTERED in disconnect_done () and update_
If you can still reliably reproduce, could you turn on NM debug logging ( sudo nmcli general logging level debug ) and try to reproduce again and provide system logs. I will also try to force a modem disconnect ( via tinfoil wrapping ), and see what happens on vivid.
This theory also provides more weight to the thought that this is a vivid-only issue with NM, as the ofono modem state code is radically different in NM 0.9.8.
@Tony
No, I can no longer reproduce it easily. That SIM now has credit, and even when it did not, the sequence in #26 stopped happening after a couple of days.
What I interpret from syslog in #26 is:
1. Apr 14 07:48:54 ubuntu-phablet nm-dispatcher: Dispatching action 'down' for wlan0
-> We are out of reach of WiFi, wifi is in disconnect state for NM
2. Apr 14 07:48:56 ubuntu-phablet kernel: [ 58.774157]
-> For some reason/coincidence immediately after wifi is disconnected the modem discovers the current cellular data context is not valid anymore and rild shuts down interface ccmni0. ofono still does not know this has happened, so it still considers the context is active.
3. Apr 14 07:48:56 ubuntu-phablet NetworkManager[
-> all ofono properties show that ccmni0 is up and running so NM tries to create a default route that uses ccmni0, but it fails because ccmni0 is actually down.
4. Apr 14 07:48:56 ubuntu-phablet ofonod[1842]: [0,UNSOL]< UNSOL_DATA_
-> rild notifies ofono about a change in the state of the IP contexts, ofono deactivates the context at that moment.
The "coincidence" between WiFi disconnecting and the modem shutting down ccmni0 produces this unfortunate intermix of events. Usually NM would not have tried to establish routes right between ccmni0 being shut down by rild and ofono noticing that. But, anyway, NM should be able to recover after the failure to set up routes, and it should have tried to activate the context once ofono signals there has been a deactivation.
| Tony Espy (awe) wrote : | #40 |
@Alfonso
So I agree that the modem disconnect seen when WiFi is toggled off is rare. That said, per my previous comment #38, it looks like NM definitely gets in a state where it doesn't recover. I didn't see where you specified how long you waited after the disconnect; it's possible that this is a permutation of bug #1418077, but as I didn't see a state message showing the modem transition to "searching", I don't think this is the case.
This may just be a timing issue. In this case, the route failures cause the connection failure, which is different than a connection failing due to the context being de-activated. Maybe it's this case which leaves the modem in 'connected' state?
I will try my force-route failure experiment on krillin/RTM and see if it triggers the same scenario.
Also, this may be related to the FM failures seen in RTM on arale and mako; see bug #1445080 for details.
| Changed in canonical-devices-system-image: | |
| status: | Confirmed → In Progress |
| Launchpad Janitor (janitor) wrote : | #41 |
This bug was fixed in the package ofono - 1.12.bzr6894+
---------------
ofono (1.12.bzr6894+
[ Tony Espy ]
* rilmodem/
If a DATA_CALL_
empty call list, and the gprs-context is not IDLE,
disconnect the active data call.
-- CI Train Bot <email address hidden> Wed, 15 Apr 2015 13:27:00 +0000
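The rule stated in this changelog entry can be modelled in a few lines. This is a toy sketch of the described behaviour only, not the C code in drivers/rilmodem/gprs-context.c; the class, method and state names are hypothetical.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    ACTIVE = "active"


class GprsContext:
    """Toy model of a gprs-context reacting to data call list updates."""

    def __init__(self):
        self.state = State.IDLE
        self.disconnects = 0

    def activate(self):
        self.state = State.ACTIVE

    def on_data_call_list(self, calls):
        # Rule from the changelog: an empty call list while the context
        # is not IDLE means the active data call is gone, so disconnect.
        if not calls and self.state is not State.IDLE:
            self.disconnect()

    def disconnect(self):
        self.state = State.IDLE
        self.disconnects += 1
```

An empty list received while the context is already IDLE is ignored, so repeated events do not trigger repeated disconnects.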
| Changed in ofono (Ubuntu RTM): | |
| status: | In Progress → Fix Released |
| eotakos (eotakos) wrote : | #42 |
I tried to ./configure the package with the fix, giving the parameters specified in the readme file, but I'm still getting a "no such file or directory" error. (I did install GCC, plus the glib and d-bus libraries as asked.)
Anything else I may be doing wrong?
Thanks and sorry if there's something obvious that I'm missing.
| Tony Espy (awe) wrote : | #43 |
@eotakos
Providing instructions for how to build the package from scratch is outside of the scope of this bug. You'd need to install all of the build dependencies for ofono, most of which aren't available from the default root filesystem. The only safe way to do this is to install a chroot on your device, but again this is beyond the scope of this bug.
I think you're much better off holding off for a few more days as the pending fix is currently staged, and has recently been approved by QA. It should hopefully land *soon*, although I can't give you an exact date.
If you've made your device writable, and wish to install the package directly, you could use apt to install it from the RTM archive; however, doing so could prevent future updates from working properly.
You could in theory extract the binary ( ofonod ) from the package, and run it in the foreground in /tmp, however again I think you're better off waiting for the update. The version of ofono with the fix is: 1.12.bzr6894+
| Changed in ofono (Ubuntu): | |
| status: | In Progress → Fix Released |
| Tony Espy (awe) wrote : | #44 |
Marked as FixReleased for vivid, as a new version has landed in our new overlay PPA: 1.12.bzr6894+
| Changed in canonical-devices-system-image: | |
| milestone: | ww17-2015 → ww19-ota |
| Changed in canonical-devices-system-image: | |
| status: | In Progress → Fix Committed |
| Changed in canonical-devices-system-image: | |
| status: | Fix Committed → Fix Released |

I believe this is related to bug 1418077