Wrong WWAN value in saved-states if ofono 'Online' toggle fails

Bug #1321627 reported by Alfonso Sanchez-Beato on 2014-05-21
This bug affects 18 people
Affects: urfkill (Ubuntu); Importance: Critical; Assigned to: Tony Espy
Affects: urfkill (Ubuntu RTM); Importance: Critical; Assigned to: Tony Espy

Bug Description

Sometimes the WWAN value in /var/lib/urfkill/saved-states is inconsistent with the flight mode status. Note that in this case we have dual SIM (I have not yet tried with just one modem; it could happen there too).

Start with flight mode=0. soft=false in all sections in saved-states.
/usr/share/urfkill/scripts/flight-mode 1
# restart to write values in saved-state
restart urfkill
/usr/share/urfkill/scripts/flight-mode 0
# restart to write values in saved-state
restart urfkill

The modem is not onlined. soft=true for WWAN in saved-states, whilst the remaining sections have the right value:

# cat /var/lib/urfkill/saved-states
[WLAN]
soft=false

[BLUETOOTH]
soft=false

[UWB]
soft=false

[WIMAX]
soft=false

[WWAN]
soft=true

[GPS]
soft=false

[FM]
soft=false

[NFC]
soft=false

[ALL]
soft=false
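
The listing above can be checked mechanically. This is a minimal sketch, assuming the saved-states file is plain INI text that Python's configparser accepts; blocked_sections and SAMPLE are illustrative names, not part of urfkill.

```python
import configparser


def blocked_sections(text):
    """Return the section names whose soft switch is saved as blocked."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep key case exactly as written in the file
    parser.read_string(text)
    return [name for name in parser.sections()
            if parser.getboolean(name, "soft", fallback=False)]


# Shortened copy of the file shown in this report.
SAMPLE = """\
[WLAN]
soft=false

[WWAN]
soft=true

[ALL]
soft=false
"""

if __name__ == "__main__":
    # With flight mode off, any section listed here is suspicious.
    print(blocked_sections(SAMPLE))
```

With flight mode off, a non-empty result (here only WWAN) points at the inconsistency described in this bug.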

This bug is triggered by a failure of the ofono SetProperty 'Online' call. In this case, the MTK modem was involved.

syslog attached. urfkilld executed with -d in upstart script.

Workaround:

It should be possible to restore WWAN state by either:

- deleting the saved-states file and rebooting

OR

- stop urfkill
- edit the file and change the [WWAN] soft=true to soft=false
- start urfkill
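
The edit step of the second workaround could be scripted roughly as follows. This is only a sketch: it assumes the file parses as plain INI, it works on text rather than touching /var/lib/urfkill/saved-states itself, and clear_wwan_soft_block is a hypothetical helper name. urfkill still has to be stopped before the file is changed and started afterwards, as listed above.

```python
import configparser
import io


def clear_wwan_soft_block(text):
    """Return saved-states content with [WWAN] soft forced to false."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep key case exactly as written
    parser.read_string(text)
    if parser.has_section("WWAN"):
        parser.set("WWAN", "soft", "false")
    out = io.StringIO()
    # Write "soft=false" rather than "soft = false", matching the original layout.
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


if __name__ == "__main__":
    before = "[WLAN]\nsoft=false\n\n[WWAN]\nsoft=true\n"
    print(clear_wwan_soft_block(before))
```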

An interesting line in syslog is:

URfkill[3014]: <warning> Could not set Online property in oFono: Timeout was reached

In the case of the MTK modem, onlining the modem after powering it on takes ~8 seconds, so the timeout probably has to be increased (maybe to ~20 seconds to be on the safe side).

However, even if there is a timeout, the values written in the file should reflect what the user wanted, as the only way to get back to a normal state is to use /usr/share/urfkill/scripts/block to activate the modem.

Not necessarily.

The saved-states file is only necessary for persistence across reboots. That it's not always 100% up to date is irrelevant, because you should also have UI available to toggle this. Writing on shutdown is sufficient for this purpose. I'm not against writing more often, but it just isn't critical.

There's always the possibility that it takes time to offline or online the modem. If you toggle it and immediately stop urfkill, that would explain the discrepancy. It currently writes the actual, checked and proven state of the modem(s) to file rather than what urfkill believes it should be, since things could happen to the modem outside of urfkill.

Could you do the testing and let me know how the number of modems affect delays?

Setting to triaged, this needs a bit of thought about how we can accommodate multiple modems and the delays that can ensue.

Changed in urfkill (Ubuntu):
status: New → Triaged
assignee: nobody → Mathieu Trudel-Lapierre (mathieu-tl)
importance: Undecided → High

In fact I did not restart urfkill immediately; I waited for around a minute and did some checks between each command in the bug description.

The problem is not that the file is written just on shutdown, although I would argue that it makes sense to write it each time the user modifies a setting, to have the right values in the file in case urfkill crashes or the phone is switched off abruptly (say, removing the battery).

The issue is that you say that you write to the file the current state of the modem, not the *desired* state. IMO this is wrong, as it looks like urfkill uses the values in the file when it starts to set the state for the devices: if that is the case, the file should have the values that the user configured. Otherwise, when there is a problem with one device, we will not try to move it to the state configured by the user, but to the state it had when urfkill stopped. This is what is happening.

So I would say that there are 2 issues:

1.- Timeout when onlining the modem. This happens because MTK modems take more time to power on, and can be fixed easily by increasing the timeout. Answering your question, it is not related to having more than one modem: I tried with one modem and the behavior was the same.

2.- saved-states stores the current state of the devices instead of the configured state, so we do not get the right behavior when rebooting after a problem with a device
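
To illustrate the second point, here is a toy model of the two persistence policies. It is not urfkill code: FakeModem and simulate are made-up names, and the failing 'Online' request is simulated rather than sent to oFono.

```python
class FakeModem:
    """Toy modem whose 'Online' request can fail (hypothetical, not the oFono API)."""

    def __init__(self, fail_online=False):
        self.online = True
        self.fail_online = fail_online

    def set_online(self, value):
        if value and self.fail_online:
            return False  # request timed out, modem stays offline
        self.online = value
        return True


def simulate(persist_desired):
    """Return the WWAN 'soft' value saved after flight mode on, then off."""
    modem = FakeModem(fail_online=True)
    desired_soft = False
    for flight_mode in (True, False):
        desired_soft = flight_mode         # what the user asked for
        modem.set_online(not flight_mode)  # may silently fail
    actual_soft = not modem.online         # what the hardware ended up as
    return desired_soft if persist_desired else actual_soft


if __name__ == "__main__":
    print("persist actual state  -> soft =", simulate(persist_desired=False))
    print("persist desired state -> soft =", simulate(persist_desired=True))
```

In this model, saving the actual state records soft=true after the failed toggle, so the next start would keep the modem blocked, whereas saving the desired state records soft=false.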

Pat McGowan (pat-mcgowan) wrote :

My mako was in this state: the saved-states file had WWAN soft=true and this value never changed despite bringing the modem up, turning off flight mode, etc. This has happened since the original landing of the flight mode capability. I never ran any commands; it just defaulted to having the modem off.

I deleted the file and now I have the modem up at boot

Tony Espy (awe) wrote :

Bumped this to critical as it can result in someone not having use of their phone for emergency calls.

Changed in urfkill (Ubuntu):
importance: High → Critical
Vincent Ladeuil (vila) wrote :

I ran into this issue during the weekend and indeed completely lost carrier selection.

abeato pointed me to /usr/share/ofono/scripts/online-modem that restored the correct carrier.

Thanks for that but this needs a better fix than requiring the user to run a script ;)

Vincent Ladeuil (vila) wrote :

Argh, updated to #77 and didn't realize I need to use the script again :-/ Looks like that will be the case for every reboot, right?

Timeout indeed could be increased, but it's currently supposed to be set at 25 seconds (the default for g_dbus_proxy's).

As for saving the states, I'll think about it a bit, try to figure out why it was set the way it was so far.

Ideally however, there should be two bugs if there are two separate issues, that makes it easier to track what's done and what isn't. It's also better to use "ubuntu-bug <package>" to file bugs so we know what versions are affected, and get extra debugging info (like the syslogs I almost always ask for) added to the bug directly.

Vincent Ladeuil (vila) wrote :

@Mathieu: Not sure what those two bugs are. Should I file a new one with 'ubuntu-bug urfkill' for the part that requires me to run /usr/share/ofono/scripts/online-modem after every reboot ?

Please disregard the timeout part here, as that does not affect nexus 4 (I will open a different bug for that). This bug is for the wrong value for WWAN in saved states.

Iain Lane (laney) wrote :

I filed bug #1329734 as requested

Alexander Sack (asac) wrote :

maybe interesting or maybe red herring:

14:25 < asac> cyphermox: grepping through urfkill its odd that there are places where there is WLAN, but not WWAN in the switch/case codes
14:26 < asac> cyphermox: i guess thats (one of) the reason for the weird WWAN state bugs
14:26 < asac> cyphermox: see http://paste.ubuntu.com/7638647/
14:27 < asac> tests/killswitch-write.c: wlan = urf_killswitch_new (URF_ENUM_TYPE_WLAN);
14:27 < asac> src/urf-arbitrator.c: type = RFKILL_TYPE_WLAN;
14:27 < asac> src/urf-input.c: case KEY_WLAN:
14:27 < asac> src/urf-daemon.c: case KEY_WLAN:
14:27 < asac> src/urf-daemon.c: type = RFKILL_TYPE_WLAN;
14:28 < asac> those are the ones that only exist on WLAN not WWAN
14:28 < Wellark> asac: one reason for that might be that WWAN on n4 is not an actual HW kill switch
14:28 < asac> could be :)
14:28 < asac> just pointing out

Tony Espy (awe) wrote :

I was able to reproduce a similar problem on mako following Alfonso's instructions. I ended up with the following states:

root@ubuntu-phablet:~# cat /var/lib/urfkill/saved-states
[ALL]
soft=false

[WLAN]
soft=true

[BLUETOOTH]
soft=true

[UWB]
soft=false

[WIMAX]
soft=false

[WWAN]
soft=false

[GPS]
soft=false

[FM]
soft=false

[NFC]
soft=false

However, when I run the kernel rfkill tool, neither BT nor WLAN is actually blocked:

root@ubuntu-phablet:~# rfkill list
0: phy0: Wireless LAN
 Soft blocked: no
 Hard blocked: no
1: hci0: Bluetooth
 Soft blocked: no
 Hard blocked: no
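
A mismatch like the one above can be detected by comparing the two outputs. The following is a rough sketch: the mapping from rfkill type names to saved-states sections (SECTION_FOR_TYPE) is an assumption made for illustration, and parse_rfkill_list and mismatches are hypothetical helper names.

```python
import re

# Assumed mapping from `rfkill list` type names to saved-states sections.
SECTION_FOR_TYPE = {"Wireless LAN": "WLAN", "Bluetooth": "BLUETOOTH"}


def parse_rfkill_list(text):
    """Map each device type in `rfkill list` output to its soft-block state."""
    states = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"^\d+:\s*[^:]+:\s*(.+)$", line)
        if header:
            current = header.group(1).strip()
            continue
        soft = re.match(r"^\s*Soft blocked:\s*(yes|no)\s*$", line)
        if soft and current is not None:
            states[current] = soft.group(1) == "yes"
    return states


def mismatches(saved, rfkill_text):
    """Return the sections whose saved soft value differs from the kernel state."""
    kernel = parse_rfkill_list(rfkill_text)
    return sorted(
        SECTION_FOR_TYPE[dev_type]
        for dev_type, blocked in kernel.items()
        if dev_type in SECTION_FOR_TYPE
        and saved.get(SECTION_FOR_TYPE[dev_type]) != blocked
    )


RFKILL_OUTPUT = """\
0: phy0: Wireless LAN
\tSoft blocked: no
\tHard blocked: no
1: hci0: Bluetooth
\tSoft blocked: no
\tHard blocked: no
"""

if __name__ == "__main__":
    saved = {"WLAN": True, "BLUETOOTH": True, "WWAN": False}
    print(mismatches(saved, RFKILL_OUTPUT))
```

WWAN is left out of the comparison because the modem does not appear in the rfkill output shown here.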

Without logs, I can't know what happened. Tony, do you still have the logs for that testing run?

Sayantan Das (sayantan13) wrote :

For me on #87, by default the GSM signals are turned off. I am using the Android/Ubuntu dual boot app provided by Canonical

Also, the following is my saved-states file:

root@ubuntu-phablet:~# cat /var/lib/urfkill/saved-states
[ALL]
soft=false

[BLUETOOTH]
soft=false

[WLAN]
soft=false

[UWB]
soft=false

[WIMAX]
soft=false

[WWAN]
soft=false

[GPS]
soft=false

[FM]
soft=false

[NFC]
soft=false

Sayantan Das (sayantan13) wrote :

I had to carry out the following manually to get the GSM signals to work

/usr/share/urfkill/scripts/flight-mode 1
# restart to write values in saved-state
restart urfkill
/usr/share/urfkill/scripts/flight-mode 0
# restart to write values in saved-state
restart urfkill

taiebot65 (dedreuil) wrote :

I think I am hitting this bug every day while I am at work, because only a hard reboot gives me back my GSM after leaving work.

Anyway, to cut it short, I have poor to really bad GSM reception at work. Two to three times a day I completely lose the ability to call or send texts, and only a hard reboot solves the problem, so I think I am hit by this issue. If you tell me which logs I need to collect, I can provide them to help solve this "annoying" issue.

Tony Espy (awe) on 2014-07-01
summary: - Wrong WWAN value in saved-states
+ Wrong WWAN value in saved-states if ofono 'Online' toggle fails
description: updated
Tony Espy (awe) on 2014-07-07
Changed in urfkill (Ubuntu):
status: Triaged → Fix Committed
Tony Espy (awe) wrote :

Changed Status to FixCommitted via the following two pull requests:

https://github.com/cyphermox/urfkill/pull/4
https://github.com/cyphermox/urfkill/pull/5

Tony Espy (awe) wrote :

@taiebot65

I think the crucial evidence for this bug is the saved-states file per the description.

We should be landing a new version of urfkill this week with fixes for this bug and a separate issue with DBus signals that affected the Touch Flight Mode UI ( which hopefully will land at the same time ). Once a silo is created, we'll add a link to this bug so that you, or others can help test.

Tony Espy (awe) wrote :

A silo has been created today for urfkill. This includes fixes for this bug, and some additional changes needed by the pending Flight-Mode system settings UI.

Here's the silo link, please feel free to download this version of urfkill and report back your results:

https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/landing-014

Elias K Gardner (zorkerz) wrote :

Thanks for getting me here Tony. It looks like I'm getting this too.

Yes it has always worked until this weekend possibly when I flashed #113 with 'ubuntu-device-flash --channel=ubuntu-touch/utopic'

$ adb shell cat /var/lib/urfkill/saved-states -> Everything is false except WWAN

$ adb shell /usr/share/ofono/scripts/list-modems
[ /ril_0 ]
    Powered = 1
    Model = Fake Modem Model
    Lockdown = 0
    Emergency = 0
    Manufacturer = Fake Manufacturer
    Features = sim
    Revision = M9615A-CEFWMAZM-2.0.1700.48
    Interfaces = org.ofono.SimManager
    Serial = 355136053322924
    Type = hardware
    Online = 0
    [ org.ofono.SimManager ]
        Present = 0

In case it is helpful, I am using a pre-paid SIM from Airvoice Wireless using the AT&T network in the US.

Tony Espy (awe) wrote :

So it looks like it's still possible to hit this bug even with the updated urfkill from silo-0014 ( mentioned in comment #21 ).

Martti reported bug #1339794 while testing the pending Flight Mode UI which seems to cause the modem to stop responding to Online/Offline requests if FM is toggled too quickly. I traced this to urfkill not handling InProgress DBus errors from ofono.

It also turns out that when this occurs, it can also end up leaving WWAN soft=true, whereas all the remaining soft switches are false.
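
For illustration only, one conventional way to tolerate a transient "in progress" error is to retry the request after a short delay. This is a generic sketch and not the urfkill fix; InProgressError is a stand-in exception, and set_online is any callable supplied by the caller.

```python
import time


class InProgressError(Exception):
    """Stand-in for an 'operation in progress' D-Bus error (hypothetical name)."""


def set_online_with_retry(set_online, value, attempts=5, delay=0.5, sleep=time.sleep):
    """Call set_online(value), retrying while the modem reports it is busy.

    Returns True on success and False if every attempt hit InProgressError.
    """
    for attempt in range(attempts):
        try:
            set_online(value)
            return True
        except InProgressError:
            if attempt < attempts - 1:
                sleep(delay)  # give the previous request time to finish
    return False


if __name__ == "__main__":
    calls = {"count": 0}

    def flaky_set_online(value):
        # Simulated modem: busy for the first two requests, then accepts.
        calls["count"] += 1
        if calls["count"] <= 2:
            raise InProgressError()

    print(set_online_with_retry(flaky_set_online, True, delay=0.0))
```

If all attempts fail, the caller still has to decide which value to persist, which is the saved-states question discussed earlier in this report.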

That said, the behavior of urfkill is much improved, and the basic FM UI works.

Tony Espy (awe) wrote :

For reference, the silo mentioned in comment #21 landed on 20140711. The first image this package was seeded in was #u127:

http://people.canonical.com/~lzemczak/landing-team/127.commitlog

Also, as I'm still able to re-create the bug ( see comment #23 ), changed Status back to Triaged.

Changed in urfkill (Ubuntu):
status: Fix Committed → Triaged
Tony Espy (awe) on 2014-07-17
description: updated
tags: added: lt-category-visible lt-prio-low
Ricardo Salveti (rsalveti) wrote :

Moving to high as one piece already landed and it's way harder to reproduce the issue with latest image.

Changed in urfkill (Ubuntu):
importance: Critical → High
I Ahmad (iahmad) wrote :

I have experienced the same issue on krillin when I toggled flight mode a couple of times.

Sayantan Das (sayantan13) wrote :

I can reproduce it on Nexus 5 every day, even on a clean install.

Tony Espy (awe) on 2014-08-08
Changed in urfkill (Ubuntu):
assignee: Mathieu Trudel-Lapierre (mathieu-tl) → Tony Espy (awe)
status: Triaged → In Progress
Tony Espy (awe) on 2014-08-09
Changed in urfkill (Ubuntu):
importance: High → Critical
Thomas Strehl (strehl-t) on 2014-08-13
tags: added: rtm14
Tony Espy (awe) on 2014-09-09
Changed in urfkill (Ubuntu):
milestone: none → later
milestone: later → none
Tony Espy (awe) on 2014-09-09
tags: added: touch-2014-09-18
Tony Espy (awe) wrote :

Updated milestone as landing the required fix requires additional fixes to indicator-network.

There also have been some krillin-specific timeout issues that have taken longer to debug than expected.

tags: added: touch-2014-09-25
removed: touch-2014-09-18
Tony Espy (awe) on 2014-09-29
Changed in urfkill (Ubuntu):
status: In Progress → Fix Committed
Changed in urfkill (Ubuntu RTM):
status: New → Fix Committed
importance: Undecided → Critical
Tony Espy (awe) on 2014-09-29
Changed in urfkill (Ubuntu RTM):
assignee: nobody → Tony Espy (awe)
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package urfkill - 0.6.0~20141007.235123.f908aff.1-0ubuntu1

---------------
urfkill (0.6.0~20141007.235123.f908aff.1-0ubuntu1) utopic; urgency=medium

  * New release snapshot:
    - Asynchronous support for rfkill operations. (LP: #1321627, #1339794)
    - Improvements to state persistence. (LP: #1354716)
    - Support for devices driven by libhybris rather than rfkill.
  * debian/patches/ignore_input_monitor_startup.patch: dropped, included
    upstream.
  * debian/control:
    - bump Build-Depends on libglib2.0-dev to >= 2.36 for GTask.
    - add a Build-Depends on libhybris-dev for hybris-driven devices support.
    - bump Standards-Version to 3.9.5.
  * debian/scripts/enumerate: handle the new hybris device type.
  * debian/rules: remove SysV init startup links.
 -- Mathieu Trudel-Lapierre <email address hidden> Tue, 07 Oct 2014 23:53:07 -0400

Changed in urfkill (Ubuntu RTM):
status: Fix Committed → Fix Released
Changed in urfkill (Ubuntu):
status: Fix Committed → Fix Released
Patrick Hetu (patrick-hetu) wrote :

Just got this on rtm16 and Alfonso's workaround worked for me. Not sure if this is a regression.

@Patrick, would you mind attaching /var/log/syslog, and any urfkill crash files in /var/crash? Also, could you please describe what you were doing when this happened?
