kicad crashes again

Bug #1774777 reported by hys
12
This bug affects 1 person
Affects Status Importance Assigned to Milestone
KiCad
Fix Released
Critical
Jeff Young

Bug Description

open /Library/Application Support/kicad/demos/complex_hierarchy/complex_hierarchy.pro
then close kicad

open kicad again
click on eeschema button
hit 'a' key when eeschema is running
wait a few seconds
then crash

try 4-5 time

Application: kicad
Version: (5.0.0-rc2-55-gc5cde53), release build
Libraries:
    wxWidgets 3.0.4
    libcurl/7.54.0 LibreSSL/2.0.20 zlib/1.2.11 nghttp2/1.24.0
Platform: Mac OS X (Darwin 17.5.0 x86_64), 64 bit, Little endian, wxMac
Build Info:
    wxWidgets: 3.0.4 (UTF-8,STL containers,compatible with 2.8)
    Boost: 1.61.0
    OpenCASCADE Community Edition: 6.8.0
    Curl: 7.43.0
    Compiler: Clang 7.3.0 with C++ ABI 1002

Build settings:
    USE_WX_GRAPHICS_CONTEXT=ON
    USE_WX_OVERLAY=ON
    KICAD_SCRIPTING=ON
    KICAD_SCRIPTING_MODULES=ON
    KICAD_SCRIPTING_WXPYTHON=ON
    KICAD_SCRIPTING_ACTION_MENU=ON
    BUILD_GITHUB_PLUGIN=ON
    KICAD_USE_OCE=ON
    KICAD_USE_OCC=OFF
    KICAD_SPICE=ON

Revision history for this message
hys (szpttboy) wrote :
Revision history for this message
hys (szpttboy) wrote :

similar operation to cause another crash

open kicad
click on eeschema button
hit 'a' key when eeschema is running
close "choose symbol" window when progress bar is growing
close eeschema
then crash

Revision history for this message
hys (szpttboy) wrote :
Revision history for this message
Oivind Toien (otoien) wrote :

I could not reproduce this in the Windows nightly downloaded yesterday, even after multiple tries. Could it be library dependent?

Application: kicad
Version: (5.0.0-rc2-60-gbfa89039c), release build
Libraries:
    wxWidgets 3.0.3
    libcurl/7.54.1 OpenSSL/1.0.2l zlib/1.2.11 libssh2/1.8.0 nghttp2/1.23.1 librtmp/2.3
Platform: Windows 7 (build 7601, Service Pack 1), 64-bit edition, 64 bit, Little endian, wxMSW
Build Info:
    wxWidgets: 3.0.3 (wchar_t,wx containers,compatible with 2.8)
    Boost: 1.60.0
    OpenCASCADE Community Edition: 6.8.0
    Curl: 7.54.1
    Compiler: GCC 7.1.0 with C++ ABI 1011

Build settings:
    USE_WX_GRAPHICS_CONTEXT=OFF
    USE_WX_OVERLAY=OFF
    KICAD_SCRIPTING=ON
    KICAD_SCRIPTING_MODULES=ON
    KICAD_SCRIPTING_WXPYTHON=ON
    KICAD_SCRIPTING_ACTION_MENU=ON
    BUILD_GITHUB_PLUGIN=ON
    KICAD_USE_OCE=ON
    KICAD_USE_OCC=OFF
    KICAD_SPICE=ON

Revision history for this message
KiCad Janitor (kicad-janitor) wrote :

Fixed in revision 975e77714972b2bd17ef0788f0b5872f39bbddbf
https://git.launchpad.net/kicad/patch/?id=975e77714972b2bd17ef0788f0b5872f39bbddbf

Changed in kicad:
status: New → Fix Committed
Revision history for this message
Seth Hillbrand (sethh) wrote :

Looks like the fix didn't get it yet.

Changed in kicad:
status: Fix Committed → In Progress
importance: Undecided → Critical
milestone: none → 5.0.0-rc3
Revision history for this message
Seth Hillbrand (sethh) wrote :
Download full text (10.7 KiB)

Two additional data points captured while trying to recreate. They may/may not be the same issue but definitely threading related.

Crashed Thread: 0 Dispatch queue: com.apple.main-thread

Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x000000000000009c
Exception Note: EXC_CORPSE_NOTIFY

Termination Signal: Segmentation fault: 11
Termination Reason: Namespace SIGNAL, Code 0xb
Terminating Process: exc handler [0]

VM Regions Near 0x9c:
-->
    __TEXT 0000000109c42000-0000000109cb6000 [ 464K] r-x/rwx SM=COW /Applications/Kicad/kicad.app/Contents/Applications/eeschema.app/Contents/MacOS/eeschema

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0 _eeschema.kiface 0x000000010eb5ae1c EDA_DRAW_FRAME::GetCrossHairPosition(bool) const + 60
1 _eeschema.kiface 0x000000010eb61fd4 EDA_DRAW_PANEL::MoveCursorToCrossHair() + 20
2 _eeschema.kiface 0x000000010e91b325 SCH_EDIT_FRAME::Load_Component(wxDC*, SCHLIB_FILTER const*, std::__1::vector<SCH_BASE_FRAME::COMPONENT_SELECTION, std::__1::allocator<SCH_BASE_FRAME::COMPONENT_SELECTION> >&, bool) + 901
3 _eeschema.kiface 0x000000010e986962 SCH_EDIT_FRAME::OnLeftClick(wxDC*, wxPoint const&) + 994
4 _eeschema.kiface 0x000000010ea23f50 SCH_EDIT_FRAME::OnSelectTool(wxCommandEvent&) + 5088
5 libwx_baseu-3.0.0.dylib 0x000000010a826e4f wxEventHashTable::HandleEvent(wxEvent&, wxEvtHandler*) + 239
6 libwx_baseu-3.0.0.dylib 0x000000010a828328 wxEvtHandler::ProcessEvent(wxEvent&) + 280
7 _eeschema.kiface 0x000000010eb763e8 EDA_BASE_FRAME::ProcessEvent(wxEvent&) + 88
8 libwx_baseu-3.0.0.dylib 0x000000010a828424 wxEvtHandler::ProcessEventLocally(wxEvent&) + 164
9 libwx_baseu-3.0.0.dylib 0x000000010a8282cb wxEvtHandler::ProcessEvent(wxEvent&) + 187
10 _eeschema.kiface 0x000000010e91f4b0 SCH_EDIT_FRAME::OnHotKey(wxDC*, int, wxPoint const&, EDA_ITEM*) + 560
11 _eeschema.kiface 0x000000010e8eefaf SCH_EDIT_FRAME::GeneralControl(wxDC*, wxPoint const&, unsigned int) + 239
12 _eeschema.kiface 0x000000010eb5ec62 EDA_DRAW_PANEL::OnKeyEvent(wxKeyEvent&) + 1202
13 libwx_baseu-3.0.0.dylib 0x000000010a826e4f wxEventHashTable::HandleEvent(wxEvent&, wxEvtHandler*) + 239
14 libwx_baseu-3.0.0.dylib 0x000000010a8283dd wxEvtHandler::ProcessEventLocally(wxEvent&) + 93
15 libwx_baseu-3.0.0.dylib 0x000000010a8282cb wxEvtHandler::ProcessEvent(wxEvent&) + 187
16 libwx_osx_cocoau_core-3.0.0.dylib 0x000000010a2d80a3 wxScrollHelperEvtHandler::ProcessEvent(wxEvent&) + 35
17 libwx_baseu-3.0.0.dylib 0x000000010a8286df wxEvtHandler::SafelyProcessEvent(wxEvent&) + 15
18 libwx_osx_cocoau_core-3.0.0.dylib 0x000000010a097c74 wxWindow::OSXHandleKeyEvent(wxKeyEvent&) + 292
19 libwx_osx_cocoau_core-3.0.0.dylib 0x000000010a16a645 wxWidgetCocoaImpl::DoHandleKeyEvent(NSEvent*) + 133
20 libwx_osx_cocoau_core-3.0.0.dylib 0x000000010a166c64 wxWidgetCocoaImpl::keyEvent(NSEvent*, NSView*, void*) + 340
21 com.apple.AppKit 0x00007fff2980043d -[NSWind...

Changed in kicad:
assignee: nobody → Seth Hillbrand (sethh)
Revision history for this message
Seth Hillbrand (sethh) wrote :

@hys: Can you post your fp-lib-table file?

Revision history for this message
Seth Hillbrand (sethh) wrote :

I've pushed a couple of guards that were a Good Idea™ for threads. Let's see if these address the bug in tomorrow's nightly.

Revision history for this message
hys (szpttboy) wrote :
Revision history for this message
hys (szpttboy) wrote :

still crashes in 20180605 nightly

Revision history for this message
Seth Hillbrand (sethh) wrote :

@Jeff- Can I get your input here? It appears (though I am not 100%) that the windows are getting re-parented on Mac. When we close Eeschema before the thread completes, it has a false pointer to the main frame and crashes.

I don't fully understand how the re-parenting works and don't have a compiling workstation running Mac to test these theories. If you'd be up for having a swing at this, I'd greatly appreciate it!

Revision history for this message
Jeff Young (jeyjey) wrote :

@Seth, I was never able to reproduce it with the original steps. Your comment makes it look like I should close eeschema while we're loading the library to trigger it?

Revision history for this message
Jeff Young (jeyjey) wrote :

We do re-parent dialogs on the Mac, but only a QuasiModal to its parent document. There's a guard on document close when a QuasiModal is up that "supposedly" protects us from closing it out from under the dialog.

Do you know what window it is that has the stale pointer? Is it a QuasiModal, or a progress reporter or something else?

Revision history for this message
Seth Hillbrand (sethh) wrote :

I can only recreate it with the footprint preview on. Here are the steps:

1) Open Eeschema stand alone app
2) Add a component
3) While the libraries are loading, cancel/close the window
4) Exit Eeschema

At this point, 30s - 1min later, I get a crash reporter notification.

I was suspecting the reparent as the backtrace shows QuasiModal event loop triggering the SCH_EDIT_FRAME destructor. That seemed backwards.

I attempted to prevent this at the splitter tree canvas (that creates a pseudo-window) with a5c213c23. But that was not the issue or not the correct solution to the issue.

Seth Hillbrand (sethh)
Changed in kicad:
assignee: Seth Hillbrand (sethh) → nobody
Revision history for this message
Jeff Young (jeyjey) wrote :

There are several event loops running during the closing of the DIALOG_CHOOSE_COMPONENT: its own event loop which may have only been scheduled for exit rather than actually exited, and even if that is fixed there's the event loop of the progress-reporter from the FOOTPRINT_SELECT_WIDGET.

So I think Seth is on the right track with removing the dialog from its parent so that it can go ahead and die on its own.

Changed in kicad:
assignee: nobody → Jeff Young (jeyjey)
Revision history for this message
hys (szpttboy) wrote :

Seth was right.
i tested again, to recreate it the footprint preview must be on

Revision history for this message
Jeff Young (jeyjey) wrote :

That didn't work.

I can fix it by moving the window closing stuff (and parent window re-enabling) to after the event loop terminates.

But that means I need to figure out how to have a cancel of the dialog also cancel the lazy background footprint loading (or else the dialog stays up until that finishes).

Revision history for this message
Seth Hillbrand (sethh) wrote :

Jeff, nice work, thanks!

There's a cancel flag that the loops check. They'll exit at the next safe point. That will interrupt the full loading and may be sufficiently rapid for cancellation, I think.

Revision history for this message
Jeff Young (jeyjey) wrote :

This turns out to be more complicated than expected. We only check the cancel flag after each library, which still leaves an annoying pause before you can place the selected component (or close the parent document).

I tried allowing the event loop of the dialog to expire on its own, but then we get back to the application's main event loop still processing the end of the add-component command and die because the SCH_BASE_FRAME has been deleted out from under us.

Revision history for this message
Seth Hillbrand (sethh) wrote :

I thought we checked after each footprint. If not, we probably could. We could also skip the list generation / sorting if we cancel. I bet that takes a few ms.

Revision history for this message
Jeff Young (jeyjey) wrote :

The main issue turned out to be that the cancel wasn't working at all. While the footprints are loading the entire dialog is being run from the GAUGE_PROGRESS_REPORTER's onUpdateUI() event loop. Interestingly, the wxDataViewEvents are all considered "UIEvents" and are processed, but needless to say OK, cancel and the other commands that were trying to cancel the lazy loading weren't getting processed in the onUpdateUI() event loop. So cancel only happened when it was finished anyway.

I don't think we're going to be able to do background loading for 5.0. It only really matters the first time you bring up the Component Chooser anyway.

Revision history for this message
KiCad Janitor (kicad-janitor) wrote :

Fixed in revision b9874da3a62bf4d0bae39cb37a796fdb4705517a
https://git.launchpad.net/kicad/patch/?id=b9874da3a62bf4d0bae39cb37a796fdb4705517a

Changed in kicad:
status: In Progress → Fix Committed
Changed in kicad:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.