Client should exp backoff reconnect even when connect was successful

Bug #1309231 reported by John Lenton
Affects                     Status         Importance   Assigned to   Milestone
Ubuntu Push Notifications   Fix Released   High         John Lenton
ubuntu-push (Ubuntu)        Fix Released   High         John Lenton
Trusty                      Fix Released   High         John Lenton

Bug Description

The client should back off exponentially when reconnecting, even when the previous connect was successful. This would protect us from some situations, one of which has already arisen: a server bug that disconnects us hard after accepting the connection.

[Impact]

ubuntu-push-client reattempts the connection in a tight loop, driving up load and disk usage (through the logs) and draining the battery.

[Test Case]

This test case demonstrates the problem by reproducing lp:1309237. You need:

* a computer capable of running the Ubuntu Push server
* at least two devices that run the stable image and can talk to the computer over the network

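On a computer reachable from the devices, run:

mkdir -p test-case-1309231/src/launchpad.net
cd test-case-1309231/src/launchpad.net
bzr branch lp:ubuntu-push
cd ubuntu-push
make bootstrap
# strip the loopback address from the sample config (a backup is kept as dev.json~)
sed -i~ -e 's/127\.0\.0\.1//g' sampleconfigs/dev.json
make run-server-dev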

On the devices, edit /etc/xdg/ubuntu-push-client/config.json (or copy it to ~phablet/.config/ubuntu-push-client/config.json and edit it there) so that "addr" points to the IP address of the computer and port 9090, for example:

"addr": "192.168.1.1:9090"

(note there is no https:// as the hosts discovery step is being skipped).

Reboot the devices. Tailing ~phablet/.cache/upstart/ubuntu-push-client.log will show a series of rapid disconnects and reconnects; the output of the server will show a series of empty "registered" (as opposed to "registered" followed by a 256-byte hash). Look at the load on the devices. Be amazed.

[Regression potential]

If a device hits the connects-ok-but-is-then-disconnected-early situation and then enters deep sleep (or is otherwise unable to deliver a timely multi-second timeout), the client might see significant delays in reconnecting successfully, and that might mean it misses a notification that it would otherwise have received. It's rather far-fetched, really, and a similar backoff is already in place for other, less pathological paths.


Revision history for this message
Lucio Torre (lucio.torre) wrote :

This is trickier than simple backoff with reset on connect: the delay should grow to some minutes when we are retrying too much, but if we get disconnected only every, say, 30 minutes, the client should still retry at once.

Also, we need to make sure that we don't have any unexpected holes like this somewhere else in the code.
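
The policy described in this comment can be sketched in Go roughly as follows. This is only an illustration, not the ubuntu-push implementation: the function name nextDelay and the constants minDelay, maxDelay and healthyAfter are hypothetical, and their values are assumptions picked to match "some minutes" and "retry at once".

```go
package main

import (
	"fmt"
	"time"
)

// All three values are illustrative assumptions, not taken from ubuntu-push.
const (
	minDelay     = 2 * time.Second // first wait after a short-lived session
	maxDelay     = 5 * time.Minute // cap, i.e. "some minutes"
	healthyAfter = 1 * time.Minute // a session that stayed up this long counts as healthy
)

// nextDelay returns how long to wait before redialing, given the previous
// delay and how long the session that just ended stayed connected.
func nextDelay(prev, sessionLifetime time.Duration) time.Duration {
	if sessionLifetime >= healthyAfter {
		return 0 // the last session was healthy: retry at once
	}
	if prev < minDelay {
		return minDelay
	}
	next := prev * 2
	if next > maxDelay {
		next = maxDelay
	}
	return next
}

func main() {
	delay := time.Duration(0)
	// Five sessions that are dropped immediately, then one that lasts 30 minutes.
	lifetimes := []time.Duration{0, 0, 0, 0, 0, 30 * time.Minute}
	for _, up := range lifetimes {
		delay = nextDelay(delay, up)
		fmt.Printf("session lasted %v, wait %v before redialing\n", up, delay)
	}
}
```

The point of keying the reset on session lifetime rather than on a successful connect is that a server which accepts and then immediately drops the connection never resets the delay.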

Revision history for this message
John Lenton (chipaca) wrote :

We already do backoff with reset on connect.

John Lenton (chipaca)
Changed in ubuntu-push:
status: Confirmed → Fix Committed
affects: ubuntu → ubuntu-push (Ubuntu)
Changed in ubuntu-push (Ubuntu):
status: New → In Progress
assignee: nobody → John Lenton (chipaca)
importance: Undecided → High
John Lenton (chipaca)
description: updated
Revision history for this message
Chris Halse Rogers (raof) wrote : Please test proposed package

Hello John, or anyone else affected,

Accepted ubuntu-push into trusty-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/ubuntu-push/0.2.1+14.04.20140423.1-0ubuntu1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in ubuntu-push (Ubuntu Trusty):
status: In Progress → Fix Committed
tags: added: verification-needed
Revision history for this message
Lucio Torre (lucio.torre) wrote :

verified with http://ports.ubuntu.com/pool/universe/u/ubuntu-push/ubuntu-push-client_0.2.1+14.04.20140423.1-0ubuntu1_armhf.deb

Patched desktop client with:
http://pastebin.ubuntu.com/7323530/

Then ran both clients; the phone logged:
2014/04/24 13:50:00.044713 DEBUG trying to connect to: 192.168.3.10:9090
2014/04/24 13:50:00.156296 DEBUG Connected 192.168.3.10:9090.
2014/04/24 13:50:00.157486 DEBUG Session connected after 1 attempts
2014/04/24 13:50:00.157761 ERROR session exited: EOF
2014/04/24 13:50:10.598628 DEBUG trying to connect to: 192.168.3.10:9090
2014/04/24 13:50:10.721045 DEBUG Connected 192.168.3.10:9090.
2014/04/24 13:50:10.721442 DEBUG Session connected after 1 attempts
2014/04/24 13:50:10.730659 ERROR session exited: EOF
2014/04/24 13:50:28.792463 DEBUG trying to connect to: 192.168.3.10:9090
2014/04/24 13:50:28.874105 DEBUG Connected 192.168.3.10:9090.
2014/04/24 13:50:28.874501 DEBUG Session connected after 1 attempts
2014/04/24 13:50:28.896323 ERROR session exited: EOF
2014/04/24 13:52:33.328767 DEBUG trying to connect to: 192.168.3.10:9090
2014/04/24 13:52:33.410683 DEBUG Connected 192.168.3.10:9090.
2014/04/24 13:52:33.411019 DEBUG Session connected after 1 attempts
2014/04/24 13:52:33.420847 ERROR session exited: EOF

which shows the exp backoff is working.
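
To make the growth visible, the idle time between each "session exited" line and the next "trying to connect" line can be computed from the timestamps in the log above. The following Go snippet is a small sketch for that arithmetic; the names layout, events and gaps are made up for this example. It gives gaps of roughly 10 s, 18 s and 124 s.

```go
package main

import (
	"fmt"
	"time"
)

// layout matches the timestamp format used in the client log.
const layout = "2006/01/02 15:04:05.000000"

// events holds (session exited, next connect attempt) pairs copied from the log.
var events = [][2]string{
	{"2014/04/24 13:50:00.157761", "2014/04/24 13:50:10.598628"},
	{"2014/04/24 13:50:10.730659", "2014/04/24 13:50:28.792463"},
	{"2014/04/24 13:50:28.896323", "2014/04/24 13:52:33.328767"},
}

// gaps returns the idle time between each session exit and the following dial.
func gaps(pairs [][2]string) ([]time.Duration, error) {
	out := make([]time.Duration, 0, len(pairs))
	for _, p := range pairs {
		exited, err := time.Parse(layout, p[0])
		if err != nil {
			return nil, err
		}
		dialed, err := time.Parse(layout, p[1])
		if err != nil {
			return nil, err
		}
		out = append(out, dialed.Sub(exited))
	}
	return out, nil
}

func main() {
	g, err := gaps(events)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for i, d := range g {
		fmt.Printf("gap %d: %v\n", i+1, d)
	}
}
```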

John Lenton (chipaca)
tags: added: verification-done
removed: verification-needed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ubuntu-push - 0.2.1+14.04.20140423.1-0ubuntu1

---------------
ubuntu-push (0.2.1+14.04.20140423.1-0ubuntu1) trusty; urgency=high

  [ Samuele Pedroni ]
  * gave the client the ability to get config from commandline
    ( => easier automated testing) (LP: #1311600)

  [ John Lenton ]
  * Ensure ubuntu-push-client is the only one running in the session.
    (LP: #1309432)
  * Remove spurious numbers in brackets in notifications. (LP: #1308145)
  * Check the server certificate and server name. (LP: #1297969)
  * Loop whoopsie_identifier_generate until it starts working. (LP: #1309237)
  * In the session: set a flag on connect, clear it on successfully
    replying to ping or broadcast messages, check it at the top of
    autoredial. Also track the last autoredial, and set the delay flag if
    autoredial is re-called too quickly. (LP: #1309231)
 -- Ubuntu daily release <email address hidden> Wed, 23 Apr 2014 11:54:00 +0000
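
The last changelog item (LP: #1309231) can be pictured with the Go sketch below. It is not the code from the ubuntu-push branch: the session type, its fields, the method names and the redialWindow threshold are all hypothetical, chosen only to mirror the wording "set a flag on connect, clear it on successfully replying, check it at the top of autoredial".

```go
package main

import (
	"fmt"
	"time"
)

// redialWindow is an assumed threshold; the changelog only says "too quickly".
const redialWindow = 2 * time.Second

// epoch is a fixed reference time so the demo below is deterministic.
var epoch = time.Unix(0, 0)

// session is a hypothetical stand-in for the client session.
type session struct {
	shouldDelay    bool      // the "delay flag" from the changelog
	lastAutoRedial time.Time // when autoRedial last ran
}

// onConnect marks a fresh connection as unproven.
func (s *session) onConnect() { s.shouldDelay = true }

// onReplied is called after successfully replying to a ping or broadcast.
func (s *session) onReplied() { s.shouldDelay = false }

// autoRedial reports whether the caller should back off before redialing.
func (s *session) autoRedial(now time.Time) bool {
	// Being re-called too soon after the previous redial also forces a delay.
	if !s.lastAutoRedial.IsZero() && now.Sub(s.lastAutoRedial) < redialWindow {
		s.shouldDelay = true
	}
	s.lastAutoRedial = now
	return s.shouldDelay
}

func main() {
	s := &session{}

	// The server accepts the connection, then drops it before any ping is answered.
	s.onConnect()
	fmt.Println("dropped right after connect, delay:", s.autoRedial(epoch))

	// A session that answered a ping and was dropped a minute later.
	s.onConnect()
	s.onReplied()
	fmt.Println("dropped after a healthy session, delay:", s.autoRedial(epoch.Add(time.Minute)))
}
```

In this sketch a connection only counts as good once the client has answered something from the server, so the accept-then-disconnect case from the bug description always ends up on the delayed path.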

Changed in ubuntu-push (Ubuntu Trusty):
status: Fix Committed → Fix Released
Revision history for this message
Colin Watson (cjwatson) wrote : Update Released

The verification of the Stable Release Update for ubuntu-push has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.


Changed in ubuntu-push (Ubuntu):
status: In Progress → Fix Released
John Lenton (chipaca)
Changed in ubuntu-push:
status: Fix Committed → Fix Released