Occasional hang with 100% CPU use when opening dialog windows

Bug #1748793 reported by Adam Greig
This bug affects 1 person

Affects: KiCad
Status: Fix Released
Importance: Medium
Assigned to: Unassigned

Bug Description

Occasionally, when opening a new dialog window in EESchema (e.g. adding or editing a symbol) while using the awesome window manager (awesomewm.org), KiCad, the window manager, and X all go to 100% CPU usage and hang. The simplest reproduction I've found is to open a schematic, click the Place Symbol tool, then keep clicking Cancel on the dialog and clicking again on the schematic underneath to re-open the dialog. The hang is usually triggered within tens of seconds of clicking.

KiCad is constantly hitting this function call:
https://github.com/KiCad/kicad-source-mirror/blob/62ef63501c9a7fb64e1617c4ddec356295995d7e/common/eda_base_frame.cpp#L182
which was introduced in this commit on 9 Jan:
https://github.com/KiCad/kicad-source-mirror/commit/786312b1034c68855b7dc62d5de1525fbb14a20d

It looks like an event occasionally comes in before the window is raised, and KiCad then continuously tries to raise the window but presumably keeps getting new events before that can happen. Removing the call to Raise() prevents the hang without any other apparent side effects. A print statement just before the call to Raise() is triggered a lot even when not hitting the hang, but once the hang is encountered it just keeps looping.
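The suspected feedback loop can be sketched with a toy model. This is not KiCad, wxWidgets, or awesome code: the names `ToyWindowManager` and `runRaiseOnEveryEvent` are hypothetical, and the assumption that the window manager answers each raise with a new event is only the hypothesis from this report, not a confirmed trace.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>

// Toy model only; every name here is hypothetical.
enum class Event { Paint, Restacked };

struct ToyWindowManager {
    bool restackNotifies;            // true: each restack request sends an event back
    std::queue<Event>* clientQueue;  // the client's pending events

    void restack() {
        if (restackNotifies)
            clientQueue->push(Event::Restacked);
    }
};

// Handles events the way the report describes: every event triggers a raise.
// Returns the number of events handled, capped at maxEvents so the demo ends.
std::size_t runRaiseOnEveryEvent(bool restackNotifies, std::size_t maxEvents) {
    assert(maxEvents > 0);
    std::queue<Event> events;
    ToyWindowManager wm{restackNotifies, &events};
    events.push(Event::Paint);       // one event arrives before the dialog is on top

    std::size_t handled = 0;
    while (!events.empty() && handled < maxEvents) {
        events.pop();
        ++handled;
        wm.restack();                // stand-in for the unconditional Raise() call
    }
    return handled;
}
```

With `restackNotifies` set to false the queue drains after a single event. With it set to true the loop only stops at the cap, which would match the observed behaviour of the print statement looping forever.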

I've not observed this bug under Unity. I haven't tested under any other tiling window managers. I'm using Ubuntu 16.04, and awesome version 4.2. It's possible this is a bug in awesome or some interaction between awesome and KiCad, but I've not seen any similar behaviour in any other applications. The issue occurs on the nightly KiCad builds in the Ubuntu PPA and in my development builds after commit 786312b10 linked above, including on the current master branch.

Application: eeschema
Version: (2018-02-11 revision 62ef635)-master, debug build
Libraries:
    wxWidgets 3.0.2
    libcurl/7.47.0 GnuTLS/3.4.10 zlib/1.2.8 libidn/1.32 librtmp/2.3
Platform: Linux 4.13.0-32-generic x86_64, 64 bit, Little endian, wxGTK
Build Info:
    wxWidgets: 3.0.2 (wchar_t,wx containers,compatible with 2.8) GTK+ 2.24
    Boost: 1.58.0
    Curl: 7.47.0
    Compiler: GCC 5.4.0 with C++ ABI 1009

Build settings:
    USE_WX_GRAPHICS_CONTEXT=OFF
    USE_WX_OVERLAY=OFF
    KICAD_SCRIPTING=ON
    KICAD_SCRIPTING_MODULES=ON
    KICAD_SCRIPTING_WXPYTHON=ON
    KICAD_SCRIPTING_ACTION_MENU=OFF
    BUILD_GITHUB_PLUGIN=ON
    KICAD_USE_OCE=OFF
    KICAD_SPICE=ON

Tags: awesomewm
Jeff Young (jeyjey) wrote :

That's the Mac/OSX dialog-falls-behind-main-window bug fix. Did we have MSW occurrences of that as well, or can we make it Mac-specific?

Nick Østergaard (nickoe) wrote :

Could you possibly make a screencast of this, maybe using a tool that overlays the buttons clicked? It may make it easier for anyone else to reproduce and understand.

Nick Østergaard (nickoe) wrote :

Are you by any chance using wayland? Also, I use i3 and I have never noticed this problem. Maybe you can test that if you get the urge?

Adam Greig (adamgreig) wrote :

I'm not using Wayland, just plain Xorg 7.7. I'll try and give i3 a go to see if it happens there too.

As for a screencast, I don't think there's much to show: anything that opens a dialogue will do it, like pressing 'e'/'v' on an existing symbol, or clicking to bring up the symbol select dialogue when in place symbol mode, etc. Most of the time the dialogue comes up fine, but occasionally it doesn't appear and instead the system hangs. The repro I mentioned is just a quick way to keep bringing up dialogues: I keep clicking to place a new symbol, which brings up the symbol selection dialog, and if it doesn't hang, I immediately click cancel and then click on the canvas again to bring up the dialog again.

Jeff Young (jeyjey) wrote :

Putting the pointer over a symbol and hammering away on 'e' / 'esc' is a quick way to bring up / close dialogs. However, I still can't reproduce the bug (on OSX)....

Adam Greig (adamgreig) wrote :

nickoe: I tried using i3 v4.11 and can't reproduce.
jeyjey: For whatever reason I can't repro using 'e'/esc, but 'v'/esc triggers it quickly, as do 'a' and clicking. Maybe they are different types of dialogs? I can still only get this behaviour under awesome.

tags: added: awesomewm
Changed in kicad:
importance: Undecided → Low
Maciej Suminski (orsonmmz) wrote :

It might be hard to debug. I use awesome 4.1 and could not reproduce the problem either. Are you sure it is not a matter of your rc.lua config?

Adam Greig (adamgreig) wrote :

Yep it's looking like it will be! It happens on a fresh Ubuntu 16.04 install with a stock rc.lua as well as on my main desktop with or without my own rc.lua. I'll see if I can try with older and/or development awesome. It looks like many more events are being received while the quasi-modal dialog is open under awesome than under other WMs, but I'm not sure why.

Simon Schubert (corecode) wrote :

I see a more extreme version of this on my system running XMonad: the whole WM locks up and I need to nuke my X session. Bisecting pointed to the same commit as the problem. XMonad developers described it as a DoS for all WMs that use XRestackWindows().

Changed in kicad:
status: New → Confirmed
Wayne Stambaugh (stambaughw) wrote :

@Jeff, would you please take a look at this? If I were guessing, I would say the window raise code you added in the ProcessEvent() handler is the issue. Maybe you can make this a one-shot event or limit this action to a single event. Without the window manager in question, I have no idea what event that would be. The window managers in question are probably restacking the windows, which creates an infinite event loop.
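One way to read the "one-shot" suggestion is a flag that is armed when the dialog is shown and cleared the first time a raise is requested. The sketch below is a self-contained toy, not the merged patch: `ToyDialog`, `onShow`, `onEvent`, and `runOneShotRaise` are hypothetical names, and the window manager is assumed to echo every raise back as one new event.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>

// Hypothetical stand-in for a dialog; not KiCad's actual class.
class ToyDialog {
public:
    // Called when the dialog is shown: arm exactly one raise.
    void onShow() { m_raisePending = true; }

    // Called for every incoming event. Returns true if it asked for a raise.
    bool onEvent() {
        if (!m_raisePending)
            return false;
        m_raisePending = false;  // disarm first so follow-up events do not raise again
        return true;
    }

private:
    bool m_raisePending = false;
};

// Drives the dialog against a window manager model that answers every raise
// with one new event. Returns how many events were handled before the queue
// drained, or maxEvents if it never drains.
std::size_t runOneShotRaise(std::size_t maxEvents) {
    assert(maxEvents > 0);
    std::queue<int> events;
    ToyDialog dialog;
    dialog.onShow();
    events.push(0);              // first event after the dialog is shown

    std::size_t handled = 0;
    while (!events.empty() && handled < maxEvents) {
        events.pop();
        ++handled;
        if (dialog.onEvent())
            events.push(0);      // window manager echoes the restack as an event
    }
    return handled;
}
```

In this model the queue drains after two events: the first requests the raise, and the echoed event finds the flag already cleared.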

Jeff Young (jeyjey) wrote :

@Wayne, we might just want to make it Mac-specific. (Although didn't Windows have some "dialogs falling behind" issues, or am I mis-remembering that?)
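Restricting the workaround to one platform could look roughly like the following compile-time guard. This is a sketch under assumptions, not the patch that was merged: `raiseDialogOnEvent` is a hypothetical helper, and `__WXMAC__` is the wxWidgets preprocessor symbol for Mac builds, which is only defined once the wxWidgets headers have been included.

```cpp
// Sketch only: the helper name is hypothetical.
// __WXMAC__ comes from the wxWidgets headers on Mac builds; without them
// (or on any other platform) the #else branch is compiled.
constexpr bool raiseDialogOnEvent() {
#ifdef __WXMAC__
    return true;   // keep the workaround for dialogs falling behind the main window
#else
    return false;  // elsewhere, leave window stacking to the window manager
#endif
}
```

An event handler would then check `raiseDialogOnEvent()` before calling Raise(), so that builds for other platforms never enter the raise path.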

Wayne Stambaugh (stambaughw) wrote :

@Jeff, I don't recall any on Windows, but possibly Linux with its myriad
of window managers. Although you shouldn't depend on my memory for this
either. :)

On 2/22/2018 12:19 PM, Jeff Young wrote:
> @Wayne, we might just want to make it Mac-specific. (Although didn't
> Windows have some "dialogs falling behind" issues, or am I mis-
> remembering that?)
>

Jeff Young (jeyjey) wrote :
Changed in kicad:
importance: Low → Medium
Jeff Young (jeyjey) wrote :

Bumped the priority. It might be rare, but it's still a hang.

Wayne Stambaugh (stambaughw) wrote :

Patch merged. Thanks.

KiCad Janitor (kicad-janitor) wrote :

Fixed in revision bbad8dc9af0a685fd81325386ae75bbece17c1b1
https://git.launchpad.net/kicad/patch/?id=bbad8dc9af0a685fd81325386ae75bbece17c1b1

Changed in kicad:
status: Confirmed → Fix Committed
Changed in kicad:
status: Fix Committed → Fix Released