High CPU load (softlock) when connecting through SOCKS proxy

Bug #1248106 reported by Benjamin Schmid
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Polly
Invalid
Undecided
Unassigned

Bug Description

I'm on polly daily and I do experience intermittent softlocks of Polly with high cpu load (180% and more on quadcore). Actually I realize them only, because the fan starts to rev up.

In these cases the GUI typically still shows up, but shows some animation icons (twitter avatars never show up there at all), but "Quit" has no effect. So I need to kill the process manually.

I'm sitting behind a SOCKS proxy with a somewhat unstable connection. I suspect that this situation arises, when a connection starves or drops and polly still busy-waits/fetches for results instead of running into a quick timeout, no?!

Can I provide anything to support the bug analysis like extracting some live stacktrace?

Revision history for this message
Conscious User (conscioususer) wrote :

There is a quick timeout, or at least there should be.

I suppose you configured Polly (Edit->Preferences) to use the SOCKS proxy you mentioned, right? What happens when you deliberately enter a wrong proxy configuration?

Changed in polly:
status: New → Incomplete
Revision history for this message
Benjamin Schmid (benbuntu) wrote :

On wrong proxy settings just an empty window occurs and the console reports reconnection trials to my two twitter accounts in a fast-paced, slowly increasing interval. CPU stays low, though.

After fixing settings, tweets appear and many "wait indicator" animation graphics, partly disappearing into blanks, partly staying alive forever. For me it seems that fetching the avatar/icons is not done properly over the configured proxy.

In this state, polly starts to eat CPU (58%).

In most of the times when I open the Polly UI these wait animations just disappears. But not in every case. In these cases I always have to do a restart of the application.

One note: My original report addressed that polly intermittently starts to CPU when it's minimized to icon state. I'm not sure how the problem described in this comment relates to the issue described in the original report.

Revision history for this message
Benjamin Schmid (benbuntu) wrote :

Just a short amend: I'm sitting behind a corporate firewall and use a simple SSH tunnel to proxy my traffic. So a simple 'ssh -D 7070 myhost' is what I use as SOCKS5. Combined with a firewall rule, probably very easy to setup the scenario of a proxy user as me for development purposes, too.

I really love Polly and would like to get this issue fixed. Any further things I can provide to identify & solve this issue?

Revision history for this message
Conscious User (conscioususer) wrote : Re: Avatars not being loaded when connecting through SOCKS proxy

The CPU usage is probably due to the redrawing needed by the throbber animations.

Is it possible for you to test without the proxy?

summary: - Polly regulary gets stucked at high cpu load (180% CPU, softlock)
+ Avatars not being loaded when connecting through SOCKS proxy
Revision history for this message
Benjamin Schmid (benbuntu) wrote :

Just quickly used sshuttle to transparently set up a poor-mans VPN.

After removing the proxy settings in polly it did not work at first. Messages appeared, but icons not.

I had to unset the http/https/ftp_proxy setting in the environment. Then the icons immediately appeared w/o any delay.

Revision history for this message
Benjamin Schmid (benbuntu) wrote :

The problem still exists and got even worse after subscribing to more Twitter feeds. Now I have to kill & restart polly around every 20 to 90min due to eating high cpu even in minimized mode.

Is there a way how I can help / confirm the assumption or assist in fixing it?
I'm even as desperate that I was even thinking about killing/restarting polly automatically using monit https://serverfault.com/a/97566

summary: - Avatars not being loaded when connecting through SOCKS proxy
+ High CPU load (softlock) when connecting through SOCKS proxy
Revision history for this message
Conscious User (conscioususer) wrote :

If the issue is not happening when a proxy is not being used, we should start to investigate if the issue is not on the SocksiPy library. Do you have enough privilege to add some debug lines to it?

I notice I never asked you if something out of the ordinary is printed when you start Polly from the terminal, please check that.

Also, did you check if the CPU usage spikes if the loading animations are running? If that's the case, the bug title should be changed back since the CPU usage is simply a consequence of the main issue.

Revision history for this message
Benjamin Schmid (benbuntu) wrote :

Nothing out of order printed from terminal.

The issue/pain point for me is rather the CPU/softlock, that's why I changed the issue subject. Honestly, I wouldn't care if sometimes Avatars don't load, unless it recovers. Most of the times Avatars do load

I adapted https://code.google.com/r/jonasthiem-glock/source/browse/lib/pychess/System/stacktracer.py and got this stacktrace during a state, where I already quit polly but the process hangs and still consumes 25% CPU.

Please refer to the attached file.

Revision history for this message
Conscious User (conscioususer) wrote :

My point is that the CPU load might be *caused* by the avatars not loading, due to the animations. A bug title should be accurate and reflect the main, base cause, not which of its consequences you specifically dislike the most.

Your stacktrace shows one thread stuck in the SocksiPi library, like I speculated. You didn't answer whether you have enough privileges to mess with it a bit.

Revision history for this message
Benjamin Schmid (benbuntu) wrote :

'Privilege' in the sense of 'Python programming competence' (totally new to it) or 'System privileges' (aka 'root')?
Later: Yes. I assumed that became implicitly clear after modifying the installed Polly source to achieve the stack monitoring.

The root cause clearly seems to be the SOCKS connection, therefore I kept that part in the title ;-).
I didn't want to be inpolite. It's only that in n my view a process intermittently starting to constantly eat CPU, blocking the whole application (no feed updates, no application exit possible, aka softlock) is an important information for triaging a bugs severity and content. But sure - matter of personal taste. Please change whatever you like. Sorry for that.

So - any hints what to do?

My rough guess would be:
a) SOCKS connection attempts time out to fast & don't retry. -> Empty "Avatars"
b) Established SOCKS connection don't time out at all even if stalled/broken -> CPU load due to busy waiting.
c) Blocked application state due to threads blocking/waiting for each other -> Softlock

Revision history for this message
Conscious User (conscioususer) wrote :

Oh, don't get me wrong, it's not about politeness or taste, it's about focusing on the root cause instead of the consequences to keep the reports as organized as possible.

I'm talking about system privileges. I wasn't sure because some of your tests could've been made in the standalone tarball install.

My personal guess right now is that the SOCKS handshake is miscalculating the amount of bytes it needs to receive, thus getting stuck during that recv indicated in the stacktrace. I'll peek the SocksiPy source later today to see if I can be more accurate on this.

Revision history for this message
Conscious User (conscioususer) wrote :

Benjamin, could you debug /usr/lib/python2.7/dist-packages/socks.py to confirm that the code is getting stuck at line 228?

Revision history for this message
Benjamin Schmid (benbuntu) wrote :

Without any further action I can confirm that the code gets stuck at that place, simply because the stacktraces are stable for minutes in that area (attached another example I'm currently experiencing right now).

I'm an new to debugging python apps. If I can assist you with more debugging a short link on how to start would be great (via gdb? or println?)

Revision history for this message
Benjamin Schmid (benbuntu) wrote :
Revision history for this message
Conscious User (conscioususer) wrote :

We could try something simple and simply add some print lines to socks.py. Right now I'd like to see the exact data that is being traded via sendall and __recvall. We need to check if it's following the SOCKS protocol correctly.

Revision history for this message
Benjamin Schmid (benbuntu) wrote :

Great! Can you help me out with a code fragement template? I'm esp. wondering where how to overload 'sendall'?

Revision history for this message
Conscious User (conscioususer) wrote :

I don't think you need to go as far as overriding, wrapping the relevant calls in print lines would be enough.

Revision history for this message
Benjamin Schmid (benbuntu) wrote :

I upgraded from nadia (quantal) to petra (saucy) a hew hours before which upgraded python-socksipy from 1.0-1 to 1.0-2.

I might be wrong but currently everything seems to work great: I'm seeing *all* avatars (rarely happened before) and the code of /usr/lib/python2.7/dist-packages/socks.py looks significantly different now....

Revision history for this message
Benjamin Schmid (benbuntu) wrote :

Upgrading python-socksipy (via distribution upgrade) completely solved this issue.

Changed in polly:
status: Incomplete → Fix Released
Revision history for this message
Conscious User (conscioususer) wrote :

Thanks for the report. I won't use "Fix Released" if the fix is not actually on the Polly codebase, though.

Changed in polly:
status: Fix Released → Invalid
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.