pcbnew: crashes with a failed assertion on i386, starts fine on amd64

Bug #1774316 reported by tijuca
This bug affects 1 person
Affects: KiCad
Status: Fix Released
Importance: Critical
Assigned to: Unassigned

Bug Description

I got a bug report on Debian for KiCad 5.0.0 RC1 some weeks ago, which has now been confirmed for the recently uploaded RC2 too. The issue apparently only exists on i386. I can't really double-check this, as I don't use an i386 setup anywhere. The issue happens when Pcbnew is started. OTOH I can't really understand the message or see what's going wrong here.

pcbnew: /build/kicad-9PGPiJ/kicad-5.0.0~rc1+dfsg1+20180318/include/geometry/rtree.h:1642: void RTree<DATATYPE, ELEMTYPE, NUMDIMS, ELEMTYPEREAL, TMAXNODES, TMINNODES>::Classify(int, int, RTree<DATATYPE, ELEMTYPE, NUMDIMS, ELEMTYPEREAL, TMAXNODES, TMINNODES>::PartitionVars*) [with DATATYPE = KIGFX::VIEW_ITEM*; ELEMTYPE = int; int NUMDIMS = 2; ELEMTYPEREAL = float; int TMAXNODES = 8; int TMINNODES = 4]: Assertion `!a_parVars->m_taken[a_index]' failed.

https://bugs.debian.org/896706

The build information is as follows. It was captured on amd64, so the platform line reads x86_64 rather than i386; this should make no difference here, as all other versions are the same for i386 on experimental.

Application: kicad
Version: 5.0.0-rc2+dfsg1-1, release build
Libraries:
    wxWidgets 3.0.4
    libcurl/7.58.0 OpenSSL/1.0.2o zlib/1.2.11 libidn2/2.0.4 libpsl/0.20.1 (+libidn2/2.0.4) libssh2/1.8.0 nghttp2/1.31.1 librtmp/2.3
Platform: Linux 4.16.0-1-amd64 x86_64, 64 bit, Little endian, wxGTK
Build Info:
    wxWidgets: 3.0.4 (wchar_t,wx containers,compatible with 2.8) GTK+ 2.24
    Boost: 1.62.0
    OpenCASCADE Community Edition: 6.9.1
    Curl: 7.60.0
    Compiler: GCC 7.3.0 with C++ ABI 1011
Build settings:
    USE_WX_GRAPHICS_CONTEXT=OFF
    USE_WX_OVERLAY=OFF
    KICAD_SCRIPTING=ON
    KICAD_SCRIPTING_MODULES=ON
    KICAD_SCRIPTING_WXPYTHON=OFF
    KICAD_SCRIPTING_ACTION_MENU=ON
    BUILD_GITHUB_PLUGIN=ON
    KICAD_USE_OCE=ON
    KICAD_USE_OCC=OFF
    KICAD_SPICE=OFF

I'm happy to provide more details if I know what's needed.

Thanks Carsten

Tags: pcbnew
Changed in kicad:
milestone: none → 5.0.0-rc3
tags: added: pcbnew
Changed in kicad:
importance: Undecided → Critical
Revision history for this message
Maciej Suminski (orsonmmz) wrote :

Hi tijuca,

I have just installed buster/sid in a 32-bit virtual machine and compiled the current master (bfa89039); it runs without issues. Granted, I built it with slightly different options than the ones in the original report (scripting and OCE disabled), but I doubt that these options could affect the assert. I am about to rebuild KiCad with the same options now to be sure.

Seth Hillbrand (sethh) wrote :

Hi Carsten-

I set up a Buster i386 VM and I see the crash you report when I install the KiCad package from the experimental repository.

However, when I build KiCad myself, I don't observe the crash. I note that your dependencies include libwxgtk3 but the build info says GTK 2.24. I wonder if that might be related?

Here's my build info:

Application: kicad
Version: (5.0.0-rc2-58-gfc71fc647), debug build
Libraries:
    wxWidgets 3.0.4
    libcurl/7.60.0 OpenSSL/1.1.0h zlib/1.2.11 libidn2/2.0.4 libpsl/0.20.1 (+libidn2/2.0.4) libssh2/1.8.0 nghttp2/1.32.0 librtmp/2.3
Platform: Linux 4.16.0-1-686-pae i686, 32 bit, Little endian, wxGTK
Build Info:
    wxWidgets: 3.0.4 (wchar_t,wx containers,compatible with 2.8) GTK+ 2.24
    Boost: 1.62.0
    OpenCASCADE Community Edition: 6.9.1
    Curl: 7.60.0
    Compiler: GCC 7.3.0 with C++ ABI 1011

Build settings:
    USE_WX_GRAPHICS_CONTEXT=OFF
    USE_WX_OVERLAY=OFF
    KICAD_SCRIPTING=ON
    KICAD_SCRIPTING_MODULES=ON
    KICAD_SCRIPTING_WXPYTHON=OFF
    KICAD_SCRIPTING_ACTION_MENU=ON
    BUILD_GITHUB_PLUGIN=ON
    KICAD_USE_OCE=ON
    KICAD_USE_OCC=OFF
    KICAD_SPICE=OFF

Maciej Suminski (orsonmmz) wrote :

GTK3 is a source of problems in KiCad. If you have KICAD_SCRIPTING_WXPYTHON disabled, then I recommend using libwxgtk3.0-dev (based on GTK2).

tijuca (c-schoenert) wrote :

@Maciej
The KiCad build uses exactly this library, 'libwxgtk3.0-dev', because of the GTK3+ issues we discovered. The build dependencies in use can be viewed in the file debian/control. But I guess this should be fine now (again).

https://salsa.debian.org/electronics-team/KiCad/kicad/blob/debian/sid/debian/control#L9

I asked the reporter to provide a GDB log if possible; this should show some more details about the root cause of the crash.

@Seth
I will try to find time over the weekend to build an i386 KVM machine with unstable, to see for myself where we really stand. It's difficult, as I can't really test all the provided platforms. But well, it's raining here. :)

tijuca (c-schoenert) wrote :

I found an older laptop on which I could install Debian testing i386, and I can reproduce the crash of pcbnew there.

I did a GDB log with a full backtrace which is attached to this answer.

The environment is a really clean and fresh KiCad installation: no existing projects and no local settings, everything from a shiny new installation.

Jeff Young (jeyjey) wrote :

It's asserting "!a_parVars->m_taken[a_index]" at:

RTree::Classify (this=0x35ec500, a_index=0, a_group=1, a_parVars=0xbfffcab0) at ./include/geometry/rtree.h:1642

Tomasz Wlostowski (twlostow) wrote :

How much RAM does this machine have? Isn't it running out of memory?

Tom

tijuca (c-schoenert) wrote :

This laptop has 2 GB of RAM, and about 60% was in use when I started KiCad; I know this because I had a system monitor open for another reason. I'd be surprised if about 800 MB were used for an empty project. On my other, amd64-based hardware about 200 MB are used when I open a new project. There is also some swap space available (4 GB); the application would then be much slower, but it shouldn't crash.

Fabián Inostroza (fabianinostroza) wrote :

I don't know if this is related or not. I was stepping through include/geometry/rtree.h:1602 and noticed that the values stored in the area array were different on amd64 and i386, even though a_parVars->m_branchBuf[].m_rect had the same values.
So I wrote a program that performs the same calculations as the CalcRectVolume method and ran it on both architectures; it gave different outputs on each.

These are the outputs (notice the last two values, "area" and its hex representation):
i386:
1073741824.000000, 2305843009213693952.000000, 1518500224.000000, 7244020209416142848.000000, 0x5EC90FDC

amd64:
1073741824.000000, 2305843009213693952.000000, 1518500224.000000, 7244019659660328960.000000, 0x5EC90FDB

Also the call to CalcRectVolume in rtree:1621 returned nan.

Both machines are running debian testing.
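
The test program itself is not attached here. As a rough reconstruction, the sketch below assumes that CalcRectVolume computes the area of the circle enclosing the rectangle (half-extents, sum of squares, square root, then radius squared times a unit-sphere constant), which is consistent with the five columns printed above. The function names, the constant 3.141593f, and the rectangle spanning -2^30 .. 2^30 are assumptions chosen to match those columns, not code taken from KiCad.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <cstring>

// Assumed 2D unit-sphere constant (hypothetical; chosen to match the output above).
static const float kUnitSphereVolume2D = 3.141593f;

// Radius of the circle enclosing a 2D rectangle, as a float.
float enclosing_radius(const int min[2], const int max[2])
{
    float sumOfSquares = 0.0f;
    for (int i = 0; i < 2; ++i) {
        assert(max[i] >= min[i]);
        float halfExtent = (static_cast<float>(max[i]) - static_cast<float>(min[i])) * 0.5f;
        sumOfSquares += halfExtent * halfExtent;
    }
    return std::sqrt(sumOfSquares);
}

// Every intermediate is a float; with SSE math each step is rounded to 24 bits.
float volume_float_steps(const int min[2], const int max[2])
{
    float radius = enclosing_radius(min, max);
    float squared = radius * radius;
    return squared * kUnitSphereVolume2D;
}

// Intermediates kept in double and rounded to float once at the end,
// roughly mimicking registers that are wider than float.
float volume_wide_steps(const int min[2], const int max[2])
{
    double radius = enclosing_radius(min, max);
    return static_cast<float>(radius * radius * kUnitSphereVolume2D);
}

// Raw IEEE-754 bit pattern of a float, for comparing results exactly.
std::uint32_t float_bits(float value)
{
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}

// Prints both variants for a hypothetical rectangle spanning -2^30 .. 2^30.
void print_volumes()
{
    const int min[2] = { -(1 << 30), -(1 << 30) };
    const int max[2] = { 1 << 30, 1 << 30 };
    std::printf("float steps: 0x%08X\n",
                static_cast<unsigned>(float_bits(volume_float_steps(min, max))));
    std::printf("wide steps:  0x%08X\n",
                static_cast<unsigned>(float_bits(volume_wide_steps(min, max))));
}
```

On a target that rounds every float operation (such as amd64 with SSE math), the first line should come out as 0x5EC90FDB and the second as 0x5EC90FDC, the same two bit patterns as in the outputs above. The only difference between the two functions is where the rounding to float happens.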

Maciej Suminski (orsonmmz) wrote :

Isn't the problem related to the build process? There are two cases (Seth's and mine) where a self-built binary works fine, but the one from the repository crashes.

@tijuca: could you try to build KiCad on your laptop to verify it?

jean-pierre charras (jp-charras) wrote :

About this test program (on W7 32 bits/msys2):
* compiling with -O0 the result is 0x5EC90FDC (same as i386)
* compiling with -O2 the result is 0x5EC90FDB (same as amd64)

Perhaps there is an issue depending on the optimization level.
When using double instead of float, I do not see a difference between -O0 and -O2, but, as expected, values differ slightly between float and double.
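
One possible explanation for a last bit that depends on the optimization level, not confirmed in this thread, is that 32-bit x86 builds using the x87 unit may keep float intermediates in wider registers, and whether a value gets rounded back to float can depend on whether the compiler spills it to memory. The standard macro FLT_EVAL_METHOD from <cfloat> reports how a given target evaluates such expressions; the helper name below is made up for illustration.

```cpp
#include <cfloat>
#include <string>

// Describes how float expressions are evaluated on the target this is compiled for.
std::string float_eval_description()
{
    switch (FLT_EVAL_METHOD) {
    case 0:
        return "operations are evaluated in the type of their operands";
    case 1:
        return "float and double operations are evaluated as double";
    case 2:
        return "all operations are evaluated as long double";
    default:
        return "indeterminate or implementation-defined";
    }
}
```

GCC typically reports 2 for i386 with x87 math and 0 for amd64, so printing this string on both machines is a cheap way to check whether the two builds even use the same evaluation rules.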

Tomasz Wlostowski (twlostow) wrote :

@tijuca: what exactly are the compiler flags the rtree files are compiled with? Do they by chance include "-ffast-math" or something similar?

Tom

tijuca (c-schoenert) wrote :

Hi Tom,

the full build log is visible here:
https://buildd.debian.org/status/fetch.php?pkg=kicad&arch=all&ver=5.0.0%7Erc2%2Bdfsg1-2&stamp=1527761908&raw=0

Debian typically adds hardening flags to improve "Stack protection", "Fortify Source functions" and "Read-only relocations". This should have an impact here.

https://wiki.debian.org/HardeningWalkthrough
https://salsa.debian.org/electronics-team/KiCad/kicad/blob/debian/sid/debian/rules#L5

Yes, of course I can add additional compiler flags by using 'export DEB_CFLAGS_MAINT_APPEND = -foo' in the debian/rules file. Should this flag be added globally, for all architectures, or only for i386 builds? Today it's too late to build and test anything; hopefully I'll be able to check this tomorrow.

Fabián Inostroza (fabianinostroza) wrote :

Hi, I compiled KiCad on amd64 passing -DCMAKE_CXX_FLAGS=-mfpmath=387 to cmake, and the numeric results of CalcRectVolume and the test program were the same as on i386, but the assertion did not fail.

So it is quite likely that this is not related.

Seth Hillbrand (sethh) wrote :

Hi Carsten-

Can you apply this patch to master and test? On my i386 Buster VM, this resolves the issue.

The problem appears to be an index collision when splitting the tree for very large volumes when the tree only contains those volumes. I see at least two ways to avoid this. The least invasive is the attached patch that sets an expected BBox volume for the origin_viewitem class.

The alternative fix would be to use doubles instead of floats for the RTree volume calculations. This would be more invasive as we change the RTree for everything else but it would also be a more robust fix.
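
A minimal sketch of how such an index collision could arise, assuming a seed-picking loop in the style of a quadratic R-tree split and assuming both seeds default to index 0. This is an illustration under those assumptions, not KiCad's code; Seeds, pick_seeds and seeds_collide are hypothetical names.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical result type: the two entries chosen to start the two groups.
struct Seeds {
    std::size_t first;
    std::size_t second;
};

// Simplified seed picker in the style of a quadratic split (not KiCad's code).
// areas[i] is the volume of entry i. For brevity every pair is assumed to
// combine into the same volume, as when all entries share one huge box.
Seeds pick_seeds(const std::vector<float>& areas, float combinedVolume, float coverVolume)
{
    Seeds seeds = {0, 0};               // assumed default when no pair wins
    float worst = -coverVolume - 1.0f;  // sentinel meant to lie below any real waste
    for (std::size_t a = 0; a + 1 < areas.size(); ++a) {
        for (std::size_t b = a + 1; b < areas.size(); ++b) {
            float waste = combinedVolume - areas[a] - areas[b];
            if (waste > worst) {        // never true when waste is NaN
                worst = waste;
                seeds.first = a;
                seeds.second = b;
            }
        }
    }
    return seeds;
}

// True when both groups would be seeded with the same entry, which is the
// situation an assertion like !m_taken[a_index] rejects.
bool seeds_collide(const Seeds& seeds)
{
    return seeds.first == seeds.second;
}
```

If every computed waste is NaN (a NaN volume was reported earlier in this thread), no pair ever beats the sentinel, both seeds stay at index 0, and classifying index 0 a second time would trip the assertion, which matches a_index=0, a_group=1 in the backtrace. Very large volumes make the sentinel fragile as well: near 7e18, adjacent float values are about 5.5e11 apart, so subtracting 1 may not lower it at all. Whether either effect is the exact trigger here is not established in this thread.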

tijuca (c-schoenert) wrote :

Hello Seth,

I'd say bingo! I first applied your patch on top of RC2 instead of current master (RC2 is the base for the Debian packages right now), and it works. It should work on top of master as well, I guess.
I've rebuilt the packages for i386 completely and installed them on the i386 machine, and I can open Pcbnew without a crash or other issues, whether I start it stand-alone or via the kicad app launcher.

I also checked this the same way for amd64; there, too, the application works as before. Looks good! Great!

Seth Hillbrand (sethh) wrote :

Thanks for the report. I'll push the small patch here and make a note to revisit the RTree volume size calcs during v6.

KiCad Janitor (kicad-janitor) wrote :

Fixed in revision 7d62f14dd0745e79828ed1ce0cfcf2b0a4890c60
https://git.launchpad.net/kicad/patch/?id=7d62f14dd0745e79828ed1ce0cfcf2b0a4890c60

Changed in kicad:
status: New → Fix Committed
Changed in kicad:
status: Fix Committed → Fix Released
antoha-mi (antoha-mi) wrote :

At startup, pcbnew crashes with the message:

pcbnew: /build/kicad-9PGPiJ/kicad-5.0.0~rc1+dfsg1+20180318/include/geometry/rtree.h:1642: void RTree<DATATYPE, ELEMTYPE, NUMDIMS, ELEMTYPEREAL, TMAXNODES, TMINNODES>::Classify(int, int, RTree<DATATYPE, ELEMTYPE, NUMDIMS, ELEMTYPEREAL, TMAXNODES, TMINNODES>::PartitionVars*) [with DATATYPE = KIGFX::VIEW_ITEM*; ELEMTYPE = int; int NUMDIMS = 2; ELEMTYPEREAL = float; int TMAXNODES = 8; int TMINNODES = 4]: Assertion `!a_parVars->m_taken[a_index]' failed.

Program received signal SIGABRT, Aborted.
__GI_raise (sig=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51 }
(gdb) backtrace
#0 __GI_raise (sig=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1 0xb7292c81 in __GI_abort () at abort.c:79
#2 0xb72899ba in __assert_fail_base (
    fmt=0xb3330670 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
    assertion=0xb4c7f961 "!a_parVars->m_taken[a_index]",
    file=0xb4c7d934 "/usr/src/RPM/BUILD/kicad-5.0.0/include/geometry/rtree.h",
    line=1643,
    function=0xb4ca1360 <RTree<KIGFX::VIEW_ITEM*, int, 2, float, 8, 4>::Classify(int, int, RTree<KIGFX::VIEW_ITEM*, int, 2, float, 8, 4>::PartitionVars*)::__PRETTY_FUNCTION__> "void RTree<DATATYPE, ELEMTYPE, NUMDIMS, ELEMTYPEREAL, TMAXNODES, TMINNODES>::Classify(int, int, RTree<DATATYPE, ELEMTYPE, NUMDIMS, ELEMTYPEREAL, TMAXNODES, TMINNODES>::PartitionVars*) [with DATATYPE ="...) at assert.c:92
#3 0xb7289a19 in __GI___assert_fail (
    assertion=0xb4c7f961 "!a_parVars->m_taken[a_index]",
    file=0xb4c7d934 "/usr/src/RPM/BUILD/kicad-5.0.0/include/geometry/rtree.h",
    line=1643,
    function=0xb4ca1360 <RTree<KIGFX::VIEW_ITEM*, int, 2, float, 8, 4>::Classify(int, int, RTree<KIGFX::VIEW_ITEM*, int, 2, float, 8, 4>::PartitionVars*)::__PRETTY_FUNCTION__> "void RTree<DATATYPE, ELEMTYPE, NUMDIMS, ELEMTYPEREAL, TMAXNODES, TMINNODES>::Classify(int, int, RTree<DATATYPE, ELEMTYPE, NUMDIMS, ELEMTYPEREAL, TMAXNODES, TMINNODES>::PartitionVars*) [with DATATYPE ="...) at assert.c:101
#4 0xb49e30a0 in RTree<KIGFX::VIEW_ITEM*, int, 2, float, 8, 4>::Classify (
    this=0x9d09fe0, a_index=0, a_group=1, a_parVars=0xbfffde14)
    at /usr/src/debug/kicad-5.0.0/include/geometry/rtree.h:1643
#5 0xb49e3267 in RTree<KIGFX::VIEW_ITEM*, int, 2, float, 8, 4>::PickSeeds (
    this=0x9d09fe0, a_parVars=<optimized out>)
    at /usr/src/debug/kicad-5.0.0/include/geometry/rtree.h:1634
#6 0xb49e32f2 in RTree<KIGFX::VIEW_ITEM*, int, 2, float, 8, 4>::ChoosePartition (this=0x9d09fe0, a_parVars=0xbfffde14, a_minFill=4)
    at /usr/src/debug/kicad-5.0.0/include/geometry/rtree.h:1485
#7 0xb49e370b in RTree<KIGFX::VIEW_ITEM*, int, 2, float, 8, 4>::SplitNode (
    this=0x9d09fe0, a_node=<optimized out>, a_branch=0xbfffdfec,
    a_newNode=0xbfffe068)
    at /usr/src/debug/kicad-5.0.0/include/geometry/rtree.h:1356
#8 0xb49e38cf in RTree<KIGFX::VIEW_ITEM*, int, 2, float, 8, 4>::AddBranch (
    this=0x9d09fe0, a_branch=0xbfffdfec, a_node=0xb18a680,
    a_newNode=0xbfffe068)
    at /usr/src/debug/kicad-5.0.0/include/geometry/rtree.h:1253
#9 0xb49e3a0f in RTree<KIGFX::VIEW_ITEM*, int, 2, float, 8, 4>::InsertRectRec
    (this=0x9d09fe0...

Seth Hillbrand (sethh) wrote :

@antoha-mi Please open a new bug report as this one doesn't relate to your issue.

However, you are building from source using GTK3. From your build log:

  wxWidgets library has been built against GTK3, it causes a lot of problems
  in KiCad

You will need to use wxWidgets built against GTK2 to compile KiCad v5 for now.

Maciej Suminski (orsonmmz) wrote :

I think the only issue here is that antoha-mi uses rc1 and this bug has been fixed *after* rc2. Please update to rc3 and check again.

Seth Hillbrand (sethh) wrote :

I'm not certain about that. There's something odd with this bug report. The top part appears to be copied from the original i686 report (has the Debian build ID in the file directory) but the stack trace is from a different build that appears to match the x86_64 RPM build from their build log.

antoha-mi (antoha-mi) wrote :
Maciej Suminski (orsonmmz) wrote :

Right, let's clarify this: antoha-mi, can you copy the version information from the About dialog in eeschema (menu Help->About KiCad, click Copy Version Info) and paste it here?
