PCBnew: Crashes when updating footprints

Bug #1851574 reported by SFEN
This bug affects 3 people
Affects: KiCad
Status: Fix Committed
Importance: Critical
Assigned to: Jeff Young

Bug Description

We have a very complex 28 layer board. As part of our tape out procedure, we need to have all footprints updated. However, when we try to update the footprints on this board, KiCad spends time processing and then segfaults.

This is preventing us from getting the design out the door.

Here's the pcbnew version that we're using:

Application: KiCad
Version: (5.99.0-311-g81ce588a0), release build
Libraries:
    wxWidgets 3.0.4
    libcurl/7.61.1 OpenSSL/1.1.1 zlib/1.2.11 libidn2/2.0.5 libpsl/0.20.2 (+libidn2/2.0.4) libssh2/1.8.0 nghttp2/1.33.0
Platform: Linux 4.18.12-arch1-1-ARCH x86_64, 64 bit, Little endian, wxGTK
Build Info:
    Build date: Nov 5 2019 22:09:14
    wxWidgets: 3.0.4 (wchar_t,wx containers,compatible with 2.8) GTK+ 2.24
    Boost: 1.68.0
    OpenCASCADE Community Edition: 6.9.1
    Curl: 7.61.1
    Compiler: GCC 8.2.1 with C++ ABI 1013

Build settings:
    KICAD_SCRIPTING=ON
    KICAD_SCRIPTING_MODULES=ON
    KICAD_SCRIPTING_PYTHON3=OFF
    KICAD_SCRIPTING_WXPYTHON=OFF
    KICAD_SCRIPTING_WXPYTHON_PHOENIX=OFF
    KICAD_SCRIPTING_ACTION_MENU=ON
    BUILD_GITHUB_PLUGIN=ON
    KICAD_USE_OCE=ON
    KICAD_USE_OCC=OFF
    KICAD_SPICE=ON

Tags: pcbnew
Revision history for this message
Rene Poeschl (poeschlr) wrote :

Disclaimer: not a dev.

Are you aware that the version you are using is a development snapshot (a nightly build)?
These are generally a bit more likely to have bugs and are also allowed to change their behavior and feature set from one build to the next. The latter could mean that no older version can open a file generated by such a version (for a stable release, all releases within that series are meant to be able to open each other's files). There is also generally very little documentation available (for things that behave differently compared to the latest stable release), as it would be unreasonable to expect the documentation team to update their docs every day.

Meaning they might not be the best option if your livelihood depends on everything working perfectly all the time.

---

Regarding the bug: could you add a bit more information? Is there any more detail in the error message? Does this only happen if you try to exchange all footprints with one command, or also if you update all instances of the same footprint (for example, all 0603 resistors)?
If it only happens if you run "update all" then maybe one footprint file is damaged in some way. (One could with enough time narrow it down by going through all of them one by one.)
Does the footprint editor report anything wrong when you open it from within your project?

Revision history for this message
SFEN (sfen) wrote :

Here's the resulting dmesg output:

[2866389.519527] kicad[17148]: segfault at 31 ip 00007f801cc09708 sp 000056299abdee30 error 4 in _eeschema.kiface[7f801c8db000+65c000]
[2866389.519536] Code: d0 48 c1 f8 03 48 8b 44 c2 f8 48 85 c0 74 36 48 8b 80 e0 00 00 00 48 85 c0 74 2a 48 8b a8 00 02 00 00 48 85 ed 74 1e 0f 1f 00 <83> 7d 10 1f 0f 84 7e 00 00 00 48 8b 6d 18 48 85 ed 75 ed 4c 8b 43
[2866389.519612] audit: type=1701 audit(1573067651.902:492): auid=1003 uid=1003 gid=1005 ses=2 pid=17148 comm="kicad" exe="/usr/bin/kicad" sig=11 res=1
[2866389.959456] audit: type=1130 audit(1573067652.342:493): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-coredump@7-21561-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[2866397.111826] audit: type=1131 audit(1573067659.495:494): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-coredump@7-21561-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[2869799.294829] /dev/vmmon[25252]: PTSC: initialized at 3702103000 Hz using TSC, TSCs are synchronized.
[2871473.781010] kicad[26475]: segfault at 0 ip 00007efd1c8fcd55 sp 00005603d8a99c30 error 6 in libc-2.28.so[7efd1c89c000+14b000]

Unfortunately, we need specific features provided by the development branch, so we can't use the mainline.

Also, the number of unique footprints makes it very hard to validate them individually.

Revision history for this message
Nick Østergaard (nickoe) wrote :

shiqi, could you get a backtrace with gdb instead? It looks like you can just use coredumpctl gdb to get the backtrace from the latest coredump.

Changed in kicad:
importance: Undecided → Critical
milestone: none → 6.0.0-rc1
tags: added: pcbnew
Revision history for this message
SFEN (sfen) wrote :

coredumpctl gdb
           PID: 7440 (kicad)
           UID: 1003 ()
           GID: 1005 ()
        Signal: 11 (SEGV)
     Timestamp: Fri 2019-11-08 16:51:52 UTC (36s ago)
  Command Line: kicad
    Executable: /usr/bin/kicad
 Control Group: /user.slice/user-1003.slice/session-2.scope
          Unit: session-2.scope
         Slice: user-1003.slice
       Session: 2
     Owner UID: 1003 ()
       Boot ID: 00229591d85b4cfaa3f59e505343a9c6
    Machine ID: 9d894abcbcea4c28a284ee101674a44d
      Hostname: strutt
       Storage: /var/lib/systemd/coredump/core.kicad.1003.00229591d85b4cfaa3f59e505343a9c6.7440.1573231912000000.lz4 (truncated)
       Message: Process 7440 (kicad) of user 1003 dumped core.

                Stack trace of thread 7440:
                #0 0x00007f5d441b7d55 n/a (n/a)

GNU gdb (GDB) 8.2
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/kicad...(no debugging symbols found)...done.
BFD: warning: /var/tmp/coredump-Triej1 is truncated: expected core file size >= 3873062912, found: 2147483648
[New LWP 7440]
[New LWP 7443]
[New LWP 7441]
Cannot access memory at address 0x7f5d4541b0e8
Cannot access memory at address 0x7f5d4541b0e0
Failed to read a valid object file image from memory.
Core was generated by `kicad'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f5d441b7d55 in ?? ()
[Current thread is 1 (LWP 7440)]

Revision history for this message
Nick Østergaard (nickoe) wrote :

You need to type bt and press Enter to get the backtrace.

Revision history for this message
SFEN (sfen) wrote :

(gdb) bt
#0 0x00007f5d441b7d55 in ?? ()
#1 0xffffffffffffff88 in ?? ()
#2 0x1228773b88905d00 in ?? ()
#3 0x0000403000010001 in ?? ()
#4 0x000055598df758b0 in ?? ()
#5 0x0000555990dd4280 in ?? ()
#6 0x1228773b88905d00 in ?? ()
#7 0x0000555959bb7730 in ?? ()
#8 0x00007f5d2c003240 in ?? ()
#9 0x0000555990dd4280 in ?? ()
#10 0x000055595f347040 in ?? ()
#11 0x0000555959bb7730 in ?? ()
#12 0x0000555959bb7670 in ?? ()
#13 0x000055598df758a0 in ?? ()
#14 0x00007f5d29737e35 in ?? ()
#15 0x0000555990dd4380 in ?? ()
#16 0x00007f5d441baada in ?? ()
#17 0x0000037f00001fa5 in ?? ()
#18 0x0000555990dd4660 in ?? ()
#19 0x0000555996bb9130 in ?? ()
#20 0x000055595f1eee90 in ?? ()
#21 0x0000000000100000 in ?? ()
#22 0x0000555990dd4380 in ?? ()
#23 0x00007f5d2c003240 in ?? ()
#24 0x00007f5d2973a2cd in ?? ()
#25 0x0000555990dd4438 in ?? ()
#26 0x0000555959bb7670 in ?? ()
#27 0x000055598c845600 in ?? ()
#28 0x000055598c845700 in ?? ()
#29 0x0000555900000000 in ?? ()
#30 0x0000555990dd4380 in ?? ()
#31 0x0000555959bb7678 in ?? ()
#32 0x000055595f1eee00 in ?? ()
#33 0x0000555990dd4430 in ?? ()
#34 0x00007f5d2c003200 in ?? ()
#35 0x00007f5d4496b001 in ?? ()
#36 0x0000000000000000 in ?? ()

Revision history for this message
Seth Hillbrand (sethh) wrote :

Unfortunately it looks like the coredump was truncated because it was too large.
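
The truncation can be read straight off the BFD warning in the gdb session above. The following is a minimal C++ sketch of that arithmetic; the helper names `missingBytes` and `isExactlyGiB` are hypothetical, and only the two byte counts come from the log. The size found on disk is exactly 2 GiB, which is consistent with a size cap on stored core dumps (for example systemd-coredump's `ExternalSizeMax=` setting), although which limit applied on this machine is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Sizes reported by gdb's BFD warning in the session above.
constexpr std::uint64_t kExpectedBytes = 3873062912ULL;
constexpr std::uint64_t kFoundBytes    = 2147483648ULL;

// Number of bytes that never made it into the stored core file.
constexpr std::uint64_t missingBytes(std::uint64_t expected, std::uint64_t found)
{
    return expected > found ? expected - found : 0;
}

// True when `bytes` is exactly `gib` gibibytes (hypothetical helper).
constexpr bool isExactlyGiB(std::uint64_t bytes, std::uint64_t gib)
{
    return bytes == gib * 1024ULL * 1024ULL * 1024ULL;
}
```

Calling `missingBytes(kExpectedBytes, kFoundBytes)` gives the amount of process memory gdb could not read, which is why the resulting backtrace is unusable.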

If this crash is repeatable, you can launch KiCad from under gdb by running:

gdb kicad

Then type 'r' to run the program. When it crashes, type 'bt' to get the backtrace.

Revision history for this message
SFEN (sfen) wrote :

Thread 1 "kicad" received signal SIGSEGV, Segmentation fault.
0x00007ffff6d95d55 in _int_free () from /usr/lib/libc.so.6
(gdb) bt
#0 0x00007ffff6d95d55 in _int_free () at /usr/lib/libc.so.6
#1 0x00007fffd34d2e35 in () at /usr/bin/_pcbnew.kiface
#2 0x00007fffd34d52cd in () at /usr/bin/_pcbnew.kiface
#3 0x00007fffd34d793e in () at /usr/bin/_pcbnew.kiface
#4 0x00007fffd34d7f5c in () at /usr/bin/_pcbnew.kiface
#5 0x00007fffd3198ae0 in () at /usr/bin/_pcbnew.kiface
#6 0x00007fffd2b41820 in () at /usr/bin/_pcbnew.kiface
#7 0x00007ffff74a289e in wxEvtHandler::ProcessEventIfMatchesId(wxEventTableEntryBase const&, wxEvtHandler*, wxEvent&) () at /usr/lib/libwx_baseu-3.0.so.0
#8 0x00007ffff74a2c1b in wxEvtHandler::SearchDynamicEventTable(wxEvent&) () at /usr/lib/libwx_baseu-3.0.so.0
#9 0x00007ffff74a2cb1 in wxEvtHandler::TryHereOnly(wxEvent&) () at /usr/lib/libwx_baseu-3.0.so.0
#10 0x00007ffff74a2d64 in wxEvtHandler::ProcessEventLocally(wxEvent&) () at /usr/lib/libwx_baseu-3.0.so.0
#11 0x00007ffff74a2e02 in wxEvtHandler::ProcessEvent(wxEvent&) () at /usr/lib/libwx_baseu-3.0.so.0
#12 0x00007ffff74a2ba7 in wxEvtHandler::SafelyProcessEvent(wxEvent&) () at /usr/lib/libwx_baseu-3.0.so.0
#13 0x00007ffff7881a19 in () at /usr/lib/libwx_gtk2u_core-3.0.so.0
#14 0x00007ffff63fa3d5 in g_closure_invoke () at /usr/lib/libgobject-2.0.so.0
#15 0x00007ffff63e6c78 in () at /usr/lib/libgobject-2.0.so.0
#16 0x00007ffff63eb01e in g_signal_emit_valist () at /usr/lib/libgobject-2.0.so.0
#17 0x00007ffff63eba80 in g_signal_emit () at /usr/lib/libgobject-2.0.so.0
#18 0x00007ffff675a895 in () at /usr/lib/libgtk-x11-2.0.so.0
#19 0x00007ffff63fa3d5 in g_closure_invoke () at /usr/lib/libgobject-2.0.so.0
#20 0x00007ffff63e7348 in () at /usr/lib/libgobject-2.0.so.0
#21 0x00007ffff63eb01e in g_signal_emit_valist () at /usr/lib/libgobject-2.0.so.0
#22 0x00007ffff63eba80 in g_signal_emit () at /usr/lib/libgobject-2.0.so.0
#23 0x00007ffff67597ba in () at /usr/lib/libgtk-x11-2.0.so.0
#24 0x00007ffff68027cc in () at /usr/lib/libgtk-x11-2.0.so.0
#25 0x00007ffff63fa2d2 in g_closure_invoke () at /usr/lib/libgobject-2.0.so.0
#26 0x00007ffff63e699f in () at /usr/lib/libgobject-2.0.so.0
#27 0x00007ffff63ea5ed in g_signal_emit_valist () at /usr/lib/libgobject-2.0.so.0
#28 0x00007ffff63eba80 in g_signal_emit () at /usr/lib/libgobject-2.0.so.0
#29 0x00007ffff691d235 in () at /usr/lib/libgtk-x11-2.0.so.0
#30 0x00007ffff6800a0e in gtk_propagate_event () at /usr/lib/libgtk-x11-2.0.so.0
#31 0x00007ffff6800e43 in gtk_main_do_event () at /usr/lib/libgtk-x11-2.0.so.0
#32 0x00007ffff6479d5e in () at /usr/lib/libgdk-x11-2.0.so.0
#33 0x00007ffff62a33cf in g_main_context_dispatch () at /usr/lib/libglib-2.0.so.0
#34 0x00007ffff62a4f89 in () at /usr/lib/libglib-2.0.so.0
#35 0x00007ffff62a5f62 in g_main_loop_run () at /usr/lib/libglib-2.0.so.0
#36 0x00007ffff67ffdf3 in gtk_main () at /usr/lib/libgtk-x11-2.0.so.0
#37 0x00007ffff78231b6 in wxGUIEventLoop::DoRun() () at /usr/lib/libwx_gtk2u_core-3.0.so.0
#38 0x00007ffff736fbae in wxEventLoopBase::Run() () at /usr/lib/libwx_baseu-3.0.so.0
#39 0x00007fffd33efc76 in () at /usr/bin/_pcbnew.kiface
#40 0x00007fffd2e7566c in () at /usr/bin/_pcbn...


Revision history for this message
Seth Hillbrand (sethh) wrote :

Unfortunately, you have compiled your own version and it was built without debug symbols, so the backtrace contains no function names for the KiCad frames.

You can recompile using the option `cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo` to include the function names then follow the same procedure to get usable debug information.

Revision history for this message
Victor W (vicw) wrote :

Dear Seth,

We've compiled using -DCMAKE_BUILD_TYPE=RelWithDebInfo, but it doesn't look like I'm getting the additional debug info you're going to need. I'm going to try this again with -DCMAKE_BUILD_TYPE=Debug and see whether that helps matters.

For completeness, here's the updated back trace.

===

GNU gdb (GDB) 8.2.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from kicad...(no debugging symbols found)...done.
(gdb) r
Starting program: /usr/bin/kicad
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
[Detaching after vfork from child process 20414]
[Detaching after vfork from child process 20416]
[Detaching after vfork from child process 20418]
[Detaching after vfork from child process 20420]
[Detaching after vfork from child process 20422]
[New Thread 0x7ffff1eae700 (LWP 20424)]
[New Thread 0x7ffff16ad700 (LWP 20425)]
[New Thread 0x7ffff0eac700 (LWP 20426)]
[New Thread 0x7fffdfb06700 (LWP 20427)]
[New Thread 0x7fffc9393700 (LWP 20428)]
[Thread 0x7fffc9393700 (LWP 20428) exited]
[New Thread 0x7fffc8b92700 (LWP 20429)]
[Thread 0x7fffc8b92700 (LWP 20429) exited]
[New Thread 0x7fffc8b92700 (LWP 20430)]
[New Thread 0x7fffc9393700 (LWP 20431)]
[Thread 0x7fffc8b92700 (LWP 20430) exited]
[Thread 0x7fffc9393700 (LWP 20431) exited]
[New Thread 0x7fffc8b92700 (LWP 20432)]
[Thread 0x7fffc8b92700 (LWP 20432) exited]
[New Thread 0x7fffc8b92700 (LWP 20433)]
[New Thread 0x7fffc8b92700 (LWP 20434)]
[Thread 0x7fffc8b92700 (LWP 20433) exited]
[New Thread 0x7fffc8b92700 (LWP 20435)]
[Thread 0x7fffc8b92700 (LWP 20434) exited]
[Thread 0x7fffc8b92700 (LWP 20435) exited]
[New Thread 0x7fffc8b92700 (LWP 20436)]
[New Thread 0x7fffc9393700 (LWP 20437)]
[Thread 0x7fffc8b92700 (LWP 20436) exited]
[New Thread 0x7fffc8b92700 (LWP 20438)]
[Thread 0x7fffc9393700 (LWP 20437) exited]
[New Thread 0x7fffc9393700 (LWP 20439)]
[Thread 0x7fffc8b92700 (LWP 20438) exited]
[New Thread 0x7fffc9393700 (LWP 20440)]
[Thread 0x7fffc9393700 (LWP 20439) exited]
[New Thread 0x7fffc9393700 (LWP 20441)]
[Thread 0x7fffc9393700 (LWP 20440) exited]
[New Thread 0x7fffc9393700 (LWP 20442)]
[Thread 0x7fffc9393700 (LWP 20441) exited]
[New Thread 0x7fffc8b92700 (LWP 20443)]
[Thread 0x7fffc9393700 (LWP 20442) exited]
[Thread 0x7fffc8b92700 (LWP 20443) exited]
[New Thread 0x7fffc8b92700 (LWP 20445)]
[New Thread 0x7fffc9393700 (LWP 20446)]
[New Thread 0x7fffc3fff700 (LWP 20447)]
[New Thread 0x7fffc37fe700 (LWP 20448)]
[New Thread 0x7fffc2ffd700 (LWP 20449)]
[New Thread 0x7fffc27fc700 (LWP...

Revision history for this message
Victor W (vicw) wrote :

Hi Seth,

I just compiled a debug build of the very latest version of KiCad:

====

Application: KiCad
Version: (5.99.0-348-gc71202d14), debug build
Libraries:
    wxWidgets 3.0.4
    libcurl/7.64.1 OpenSSL/1.1.1b zlib/1.2.11 libidn2/2.1.1 libpsl/0.20.2 (+libidn2/2.1.1) libssh2/1.8.1 nghttp2/1.36.0
Platform: Linux 5.0.9-arch1-1-ARCH x86_64, 64 bit, Little endian, wxGTK
Build Info:
    Build date: Nov 11 2019 16:48:20
    wxWidgets: 3.0.4 (wchar_t,wx containers,compatible with 2.8) GTK+ 2.24
    Boost: 1.69.0
    OpenCASCADE Community Edition: 6.9.1
    Curl: 7.64.1
    Compiler: GCC 8.3.0 with C++ ABI 1013

Build settings:
    KICAD_SCRIPTING=ON
    KICAD_SCRIPTING_MODULES=ON
    KICAD_SCRIPTING_PYTHON3=OFF
    KICAD_SCRIPTING_WXPYTHON=OFF
    KICAD_SCRIPTING_WXPYTHON_PHOENIX=OFF
    KICAD_SCRIPTING_ACTION_MENU=ON
    BUILD_GITHUB_PLUGIN=ON
    KICAD_USE_OCE=ON
    KICAD_USE_OCC=OFF
    KICAD_SPICE=ON
    KICAD_STDLIB_DEBUG=OFF
    KICAD_STDLIB_LIGHT_DEBUG=OFF
    KICAD_SANITIZE=OFF

====

Opening our project file, then opening PCBnew, then updating all footprints (with all update options checked) yields the following error, which happens around 15min into the update process.

This is the associated back trace from the debug build, but I'm not sure it provides much more insight:

GNU gdb (GDB) 8.2.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from kicad...(no debugging symbols found)...done.
(gdb) r
Starting program: /usr/bin/kicad
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
[Detaching after vfork from child process 25086]
[Detaching after vfork from child process 25088]
[Detaching after vfork from child process 25090]
[Detaching after vfork from child process 25092]
[Detaching after vfork from child process 25094]
[New Thread 0x7ffff1eae700 (LWP 25096)]
[New Thread 0x7ffff16ad700 (LWP 25097)]
[New Thread 0x7ffff0eac700 (LWP 25098)]
[New Thread 0x7fffdfb06700 (LWP 25099)]
[Detaching after vfork from child process 25100]
[Detaching after vfork from child process 25103]
[Thread 0x7ffff0eac700 (LWP 25098) exited]
[New Thread 0x7ffff0eac700 (LWP 25109)]
[New Thread 0x7fffc9094700 (LWP 25110)]
[Thread 0x7ffff0eac700 (LWP 25109) exited]
[Thread 0x7fffc9094700 (LWP 25110) exited]
[New Thread 0x7ffff0eac700 (LWP 25111)]
[New Thread 0x7fffc9094700 (LWP 25112)]
[Thread 0x7ffff0eac700 (LWP 25111) exited]
[New Thread 0x7ffff0eac700 (LWP 25113)]
[Thread 0x7fffc9094700 (LWP 25112) exited]
[New Thread 0x7fffc9094700 (LWP 25114)]
[Threa...

Revision history for this message
Seth Hillbrand (sethh) wrote :

Unfortunately, this does not appear to be a full debug build.

When you rebuild with new CMake options, you will need to use a new directory for the build. Otherwise, some options do not fully clear. This appears to be the case here (and probably in the RelWithDebInfo build), unless your build process has a separate step that strips the files and is not part of the standard build.

Revision history for this message
Victor W (vicw) wrote :

I'm recompiling now, from an entirely new build directory, and I'll try and post the results.

Revision history for this message
Victor W (vicw) wrote :

Dear Seth,

After updating all the project symbols across all the sheets, using the methodology specified here:

https://bugs.launchpad.net/kicad/+bug/1761234

We are now able to successfully update the footprints. I suspect that the issue either relates to not checking whether a footprint exists, or else an invalid or out of date footprint name, or an incorrect assumption as to what the footprint/filename is.

It appears as though this is due to a bad footprint parameter, as progressively deleting footprints results in pcbnew successfully completing the update process.

We're currently trying to figure out exactly which footprint caused the issue, and why, but it looks as though we incompletely adjusted either the footprint or symbol parameters so that some parts of the design had the old symbol/footprint parameters, and the other half of the design had more up to date ones.

By updating all the symbol/footprint pairs, this issue appears to have resolved itself, and we can now update footprints without causing a crash.

Revision history for this message
Victor W (vicw) wrote :

Hi,

Using the debug version of kicad, I obtained the following log - it looks like this one may contain more useful information.

===

Application: KiCad
Version: (5.99.0-360-ga6b94b37e), debug build
Libraries:
    wxWidgets 3.0.4
    libcurl/7.64.1 OpenSSL/1.1.1b zlib/1.2.11 libidn2/2.1.1 libpsl/0.20.2 (+libidn2/2.1.1) libssh2/1.8.1 nghttp2/1.36.0
Platform: Linux 5.0.9-arch1-1-ARCH x86_64, 64 bit, Little endian, wxGTK
Build Info:
    Build date: Nov 12 2019 18:12:47
    wxWidgets: 3.0.4 (wchar_t,wx containers,compatible with 2.8) GTK+ 2.24
    Boost: 1.69.0
    OpenCASCADE Community Edition: 6.9.1
    Curl: 7.64.1
    Compiler: GCC 8.3.0 with C++ ABI 1013

Build settings:
    KICAD_SCRIPTING=ON
    KICAD_SCRIPTING_MODULES=ON
    KICAD_SCRIPTING_PYTHON3=OFF
    KICAD_SCRIPTING_WXPYTHON=OFF
    KICAD_SCRIPTING_WXPYTHON_PHOENIX=OFF
    KICAD_SCRIPTING_ACTION_MENU=ON
    BUILD_GITHUB_PLUGIN=ON
    KICAD_USE_OCE=ON
    KICAD_USE_OCC=OFF
    KICAD_SPICE=ON
    KICAD_STDLIB_DEBUG=OFF
    KICAD_STDLIB_LIGHT_DEBUG=OFF
    KICAD_SANITIZE=OFF

====

double free or corruption (out)

Thread 1 "kicad" received signal SIGABRT, Aborted.
0x00007ffff6e2082f in raise () from /usr/lib/libc.so.6
(gdb)
(gdb)
(gdb)
(gdb) bt
#0 0x00007ffff6e2082f in raise () at /usr/lib/libc.so.6
#1 0x00007ffff6e0b672 in abort () at /usr/lib/libc.so.6
#2 0x00007ffff6e62e78 in __libc_message () at /usr/lib/libc.so.6
#3 0x00007ffff6e6978a in () at /usr/lib/libc.so.6
#4 0x00007ffff6e6b160 in _int_free () at /usr/lib/libc.so.6
#5 0x0000555555739591 in std::default_delete<char []>::operator()<char>(char*) const (this=0x555590682f20, __ptr=0x555578f8a670 "%") at /usr/include/c++/8.3.0/bits/unique_ptr.h:115
#6 0x0000555555736481 in std::unique_ptr<char [], std::default_delete<char []> >::~unique_ptr() (this=0x555590682f20, __in_chrg=<optimized out>)
    at /usr/include/c++/8.3.0/bits/unique_ptr.h:533
#7 0x00007fffc1a66878 in COROUTINE<int, TOOL_EVENT const&>::~COROUTINE() (this=0x555590682f20, __in_chrg=<optimized out>)
    at /usr/src/pacman/kicad-git/src/kicad-git/include/tool/coroutine.h:159
#8 0x00007fffc1a65ab0 in TOOL_MANAGER::TOOL_STATE::Pop() (this=0x55555ba14810) at /usr/src/pacman/kicad-git/src/kicad-git/common/tool/tool_manager.cpp:160
#9 0x00007fffc1a63e20 in TOOL_MANAGER::finishTool(TOOL_MANAGER::TOOL_STATE*) (this=0x55555f489140, aState=0x55555ba14810)
    at /usr/src/pacman/kicad-git/src/kicad-git/common/tool/tool_manager.cpp:822
#10 0x00007fffc1a62a4d in TOOL_MANAGER::dispatchInternal(TOOL_EVENT const&) (this=0x55555f489140, aEvent=...)
    at /usr/src/pacman/kicad-git/src/kicad-git/common/tool/tool_manager.cpp:668
#11 0x00007fffc1a6490c in TOOL_MANAGER::processEvent(TOOL_EVENT const&) (this=0x55555f489140, aEvent=...) at /usr/src/pacman/kicad-git/src/kicad-git/common/tool/tool_manager.cpp:1011
#12 0x00007fffc1a60d93 in TOOL_MANAGER::RunAction(TOOL_ACTION const&, bool, void*) (this=0x55555f489140, aAction=..., aNow=true, aParam=0x55559155b640)
    at /usr/src/pacman/kicad-git/src/kicad-git/common/tool/tool_manager.cpp:314
#13 0x00007fffc12ccbb3 in TOOL_MANAGER::RunAction<std::vector<BOARD_ITEM*, std::allocator<BOARD_ITEM*> >*>(TOOL_ACTION const&, bool, std...


Revision history for this message
Jeff Young (jeyjey) wrote :

After updating all the footprints it applies the commit. During the apply it collects all the deleted items so that it can remove them from the selection. It does the remove by running the deselect action. After dispatching that action, the tool dispatcher attempts to delete the coroutine. It's during the coroutine's destructor that it dies (in std::unique_ptr's destructor).

This sounds like memory corruption to me. We could short-circuit the action (and just call the SelectionTool directly), but that might just delay the memory corruption tripping us up (if that's really what's happening, and the memory corruption isn't itself somehow related to running the deselection through an action).

Hmm.... I wonder if deselecting the objects *after* removing them from the view is the issue? We used to do it before, but evidently that caused issues with screen artefacts left behind while routing. Maybe we need to add a second loop which goes through all the items in the commit list first collecting up the deleted items, deselects them, and then goes through the existing loop to remove them?
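
The two-loop idea floated above can be sketched as follows. This is an illustrative C++ model only: `Item`, `CommitEntry`, `Selection`, `View` and `pushCommit` are hypothetical stand-ins and not KiCad's real types, and the thread never confirmed that this ordering was the cause of the crash.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins; these are not KiCad's real classes.
enum class ChangeType { Add, Remove, Modify };

struct Item { int id; };

struct CommitEntry {
    Item* item;
    ChangeType type;
};

struct Selection {
    std::vector<Item*> items;
    void deselect(const std::vector<Item*>& gone) {
        for (Item* it : gone)
            items.erase(std::remove(items.begin(), items.end(), it), items.end());
    }
};

struct View {
    std::vector<Item*> items;
    void remove(Item* it) {
        items.erase(std::remove(items.begin(), items.end(), it), items.end());
    }
};

// Returns the sequence of steps performed so the ordering can be inspected.
std::vector<std::string> pushCommit(const std::vector<CommitEntry>& entries,
                                    Selection& sel, View& view)
{
    std::vector<std::string> steps;

    // Pass 1: collect everything scheduled for removal and deselect it
    // while the items are still present in the view.
    std::vector<Item*> deleted;
    for (const CommitEntry& e : entries)
        if (e.type == ChangeType::Remove)
            deleted.push_back(e.item);
    if (!deleted.empty()) {
        sel.deselect(deleted);
        steps.push_back("deselect");
    }

    // Pass 2: the removal loop, now operating only on unselected items.
    for (const CommitEntry& e : entries) {
        if (e.type != ChangeType::Remove)
            continue;
        assert(std::find(sel.items.begin(), sel.items.end(), e.item) == sel.items.end());
        view.remove(e.item);
        steps.push_back("remove");
    }
    return steps;
}
```

The point of the model is the ordering guarantee: by the time any item leaves the view, nothing in the selection still points at it.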

Revision history for this message
Jeff Young (jeyjey) wrote :

Or maybe just lock the GAL canvas at the start of the commit and unlock at the end?

@Seth, were the flickering and/or screen garbage bugs logged, or do you remember any more of the details? (See 491098af358a02fc70f44eabda12d562b1f2593e.)

Revision history for this message
Seth Hillbrand (sethh) wrote :

@Jeff-

If I recall, the flicker bug was in the routing and only visible on MSW. I think there was a code comment somewhere at one point. Loop removal maybe?

We always unselected after removing from the view but we did unselect before removing from the board.

Revision history for this message
Jeff Young (jeyjey) wrote :

@Seth, I doubt the removal is the problem then. Removing from the board is pretty much a non-event as far as the data structures are concerned.

@Victor, if it's possible to isolate it to a single (or a few) footprints, that would be great.

Revision history for this message
Victor W (vicw) wrote :

Hi,

We spent the best part of yesterday seeing whether we could isolate it to some footprints, and I think we have. However, they all seem fairly simple. Moreover, we thought we had previously isolated it to a certain footprint, but when we went back and removed that footprint, it still caused an error.

Last night, before I left, I deleted the footprint files of the most used SMD footprints - the 0201/0402/0603 lands that make up over 90% of footprints used on our board.

When I came in this morning, it seemed to have worked. This is a bit unusual because all these footprints appear to be relatively standard. I'm working on further bisecting if this is caused by a specific footprint, and, if so, which one.

Having said that, we're pretty sure that there have been a couple of occasions in which we had those footprints in our design, and it still worked: it just so happened that it worked once, immediately after we updated all the schematic symbols.

I'm not at all sure whether this is possible, but could this issue be caused by the quantity of footprints? I noticed that when committing the change, you're also creating an undo entry:

#14 0x00007fffc1725102 in BOARD_COMMIT::Push(wxString const&, bool, bool) (this=0x55559155cfc8, aMessage=..., aCreateUndoEntry=true, aSetDirtyBit=true)

We're not able to select or move large, complex sections of the board in pcbnew (lots of traces across many layers, with lots of vias and components) without it becoming ridiculously slow for around 30-300s (depending on how much we're selecting and what we're trying to do). If you try to stop or undo the process it sometimes (infrequently) just crashes - you're better off waiting the 5min for it to finish. Speaking very qualitatively, KiCad is similarly slow when updating footprints; it usually takes us around 10-30 minutes.

Is it somehow possible that the cause of this problem isn't due to any specific footprint, but rather the number of them?

I've gotten the number of candidate footprints down to 9 - I'll continue bisecting them and I'll let you know what I find out.
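
The bisection procedure described here can be sketched in C++ as below. It is a minimal illustration under two stated assumptions: exactly one candidate triggers the failure, and the test is deterministic. `bisectCulprit` and the `failsWith` callback are hypothetical; `failsWith` stands in for "run the footprint update with only these footprint files present and report whether it crashed".

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Narrows a candidate list to a single item by repeatedly halving it.
// Assumes exactly one culprit and a deterministic failure test.
std::string bisectCulprit(
    std::vector<std::string> candidates,
    const std::function<bool(const std::vector<std::string>&)>& failsWith)
{
    assert(!candidates.empty());
    while (candidates.size() > 1) {
        const auto mid = candidates.begin()
                         + static_cast<std::ptrdiff_t>(candidates.size() / 2);
        std::vector<std::string> firstHalf(candidates.begin(), mid);
        std::vector<std::string> secondHalf(mid, candidates.end());
        // Keep whichever half still reproduces the failure.
        candidates = failsWith(firstHalf) ? firstHalf : secondHalf;
    }
    return candidates.front();
}
```

With nine candidates this needs at most four test runs. If the failure depends on how many footprints are processed rather than on any one of them, the single-culprit assumption does not hold and the halving can point at an innocent footprint.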

Revision history for this message
Victor W (vicw) wrote :

Hi,

I've narrowed this problem down to two generic footprints:

        deleted: C-0201-NoSilk.pretty/C_0201_NoSilk.kicad_mod
        deleted: C-0402-NoSilk.pretty/C_0402_NoSilk.kicad_mod

As you might guess, these are incredibly generic footprints for all the 0402 and 0201 caps on the board. These footprint files are common across all our designs - I can promise you that nothing is wrong with them. I'm attaching the 0201 footprint as an example.

I'm also including another core dump of what happens if those footprints are included.

I'm pretty sure whatever is causing this is due to the number of footprints, and not what they are actually comprised of.

I'm not sure whether this is relevant, but KiCad's memory usage keeps growing right before it segfaults.

#0 0x00007ffff6e2082f in raise () at /usr/lib/libc.so.6
#1 0x00007ffff6e0b672 in abort () at /usr/lib/libc.so.6
#2 0x00007ffff6e62e78 in __libc_message () at /usr/lib/libc.so.6
#3 0x00007ffff6e6978a in () at /usr/lib/libc.so.6
#4 0x00007ffff6e6b160 in _int_free () at /usr/lib/libc.so.6
#5 0x0000555555739591 in std::default_delete<char []>::operator()<char>(char*) const (this=0x55556060e5b0, __ptr=0x55559993c130 "%") at /usr/include/c++/8.3.0/bits/unique_ptr.h:115
#6 0x0000555555736481 in std::unique_ptr<char [], std::default_delete<char []> >::~unique_ptr() (this=0x55556060e5b0, __in_chrg=<optimized out>) at /usr/include/c++/8.3.0/bits/unique_ptr.h:533
#7 0x00007fffde46e878 in COROUTINE<int, TOOL_EVENT const&>::~COROUTINE() (this=0x55556060e5b0, __in_chrg=<optimized out>) at /usr/src/pacman/kicad-git/src/kicad-git/include/tool/coroutine.h:159
#8 0x00007fffde46dab0 in TOOL_MANAGER::TOOL_STATE::Pop() (this=0x555589b78fa0) at /usr/src/pacman/kicad-git/src/kicad-git/common/tool/tool_manager.cpp:160
#9 0x00007fffde46be20 in TOOL_MANAGER::finishTool(TOOL_MANAGER::TOOL_STATE*) (this=0x555593caf430, aState=0x555589b78fa0) at /usr/src/pacman/kicad-git/src/kicad-git/common/tool/tool_manager.cpp:822
#10 0x00007fffde46aa4d in TOOL_MANAGER::dispatchInternal(TOOL_EVENT const&) (this=0x555593caf430, aEvent=...) at /usr/src/pacman/kicad-git/src/kicad-git/common/tool/tool_manager.cpp:668
#11 0x00007fffde46c90c in TOOL_MANAGER::processEvent(TOOL_EVENT const&) (this=0x555593caf430, aEvent=...) at /usr/src/pacman/kicad-git/src/kicad-git/common/tool/tool_manager.cpp:1011
#12 0x00007fffde468d93 in TOOL_MANAGER::RunAction(TOOL_ACTION const&, bool, void*) (this=0x555593caf430, aAction=..., aNow=true, aParam=0x55558d23d5a0) at /usr/src/pacman/kicad-git/src/kicad-git/common/tool/tool_manager.cpp:314
#13 0x00007fffddcd4bb3 in TOOL_MANAGER::RunAction<std::vector<BOARD_ITEM*, std::allocator<BOARD_ITEM*> >*>(TOOL_ACTION const&, bool, std::vector<BOARD_ITEM*, std::allocator<BOARD_ITEM*> >*) (this=0x555593caf430, aAction=..., aNow=true, aParam=0x55558d23d5a0)
    at /usr/src/pacman/kicad-git/src/kicad-git/include/tool/tool_manager.h:139
#14 0x00007fffde12d102 in BOARD_COMMIT::Push(wxString const&, bool, bool) (this=0x55558d23ef28, aMessage=..., aCreateUndoEntry=true, aSetDirtyBit=true) at /usr/src/pacman/kicad-git/src/kicad-git/pcbnew/board_commit.cpp:274
#15 0x00007fffdd9a675e in DIALOG_EXCHAN...


Revision history for this message
Jeff Young (jeyjey) wrote :

I pushed a change which defers some of the selection event processing to later. I don't think this was causing this crash, but it'd be great if someone can test it with the new bits just to be sure.

Revision history for this message
Jeff Young (jeyjey) wrote :

I pushed another change that should make it something like 100 times faster for the test case.

Revision history for this message
Victor W (vicw) wrote :

Hi,

I just tried this using the current git version and it works amazingly well: not only does it no longer crash, it takes less than 20s, whereas before it would routinely take 30-45min. It's super fast.

Here's the specific build I used:

Application: Pcbnew
Version: (5.99.0-398-g3be1862b0), release build
Libraries:
    wxWidgets 3.0.4
    libcurl/7.64.1 OpenSSL/1.1.1b zlib/1.2.11 libidn2/2.1.1 libpsl/0.20.2 (+libidn2/2.1.1) libssh2/1.8.1 nghttp2/1.36.0
Platform: Linux 5.0.9-arch1-1-ARCH x86_64, 64 bit, Little endian, wxGTK
Build Info:
    Build date: Nov 20 2019 18:09:13
    wxWidgets: 3.0.4 (wchar_t,wx containers,compatible with 2.8) GTK+ 2.24
    Boost: 1.69.0
    OpenCASCADE Community Edition: 6.9.1
    Curl: 7.64.1
    Compiler: GCC 8.3.0 with C++ ABI 1013

Build settings:
    KICAD_SCRIPTING=ON
    KICAD_SCRIPTING_MODULES=ON
    KICAD_SCRIPTING_PYTHON3=OFF
    KICAD_SCRIPTING_WXPYTHON=OFF
    KICAD_SCRIPTING_WXPYTHON_PHOENIX=OFF
    KICAD_SCRIPTING_ACTION_MENU=ON
    BUILD_GITHUB_PLUGIN=ON
    KICAD_USE_OCE=ON
    KICAD_USE_OCC=OFF
    KICAD_SPICE=ON

Revision history for this message
Jeff Young (jeyjey) wrote :

If that fixed the crash then I suspect it's wxWidgets' HTML parser that was corrupting memory (or perhaps just leaking enough that the process ran out of memory). We were re-parsing the entire HTML log each time we added an entry. Order n^2 over thousands of items is rarely a recipe for success.
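
To illustrate why re-parsing the whole log on every append scales so badly, here is a minimal C++ sketch. It does not use wxWidgets: "parsing" is modelled simply as one unit of work per character scanned, and `parsedCharsReparseAll` and `parsedCharsBatched` are hypothetical names, not functions from KiCad.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Strategy 1: after every appended entry, scan the whole accumulated log again.
std::size_t parsedCharsReparseAll(const std::vector<std::string>& entries)
{
    std::size_t logLength = 0;  // characters accumulated so far
    std::size_t parsed = 0;     // total characters scanned by the model parser
    for (const std::string& e : entries) {
        logLength += e.size();
        parsed += logLength;    // the entire log is scanned on each append
    }
    assert(parsed >= logLength);
    return parsed;
}

// Strategy 2: collect all entries first, then scan the log once.
std::size_t parsedCharsBatched(const std::vector<std::string>& entries)
{
    std::size_t logLength = 0;
    for (const std::string& e : entries)
        logLength += e.size();
    return logLength;
}
```

For 1000 entries of 10 characters each, the first strategy scans 5,005,000 characters in this model and the second scans 10,000. The real speedup depends on what the parser does per character, so these counts are not a prediction of the timings reported in this thread.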

I'm going to close this for now. If anyone sees it again, please re-open.

Changed in kicad:
status: New → Fix Committed
assignee: nobody → Jeff Young (jeyjey)