Crash during zone refill in DRC

Bug #1761450 reported by Jeff Young on 2018-04-05
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
KiCad
Critical
Tomasz Wlostowski

Bug Description

I'm getting intermittent crashes during various GUI toolkit operations inside WX_PROGRESS_DIALOG's update() while running a Fill Zones inside DRC.

Reproducible roughly 20% of the time. Moving the DRC dialog while filling /may/ increase the chances of crashing, but is not required.

Latest source.

Jeff Young (jeyjey) wrote :

@Devs, does this reproduce on Linux and/or Windows?

description: updated
Jeff Young (jeyjey) wrote :

This is another stack trace that I get (not in the progress dialog's update event):

wxStringIteratorNode::clear() string.h:4287
wxStringIteratorNode::~wxStringIteratorNode() string.h:369
wxStringIteratorNode::~wxStringIteratorNode() string.h:369
wxString::const_iterator::~const_iterator() string.h:969
wxString::const_iterator::~const_iterator() string.h:969
wxString::length() const string.h:1434
wxString::Length() const string.h:2391
TEXTE_PCB::TransformBoundingBoxWithClearanceToPolygon(SHAPE_POLY_SET&, int) const board_items_to_polygon_shape_transform.cpp:408
ZONE_FILLER::buildZoneFeatureHoleList(ZONE_CONTAINER const*, SHAPE_POLY_SET&) const zone_filler.cpp:545
ZONE_FILLER::computeRawFilledAreas(ZONE_CONTAINER const*, SHAPE_POLY_SET const&, SHAPE_POLY_SET&, SHAPE_POLY_SET&) const zone_filler.cpp:704
ZONE_FILLER::fillSingleZone(ZONE_CONTAINER const*, SHAPE_POLY_SET&, SHAPE_POLY_SET&) const zone_filler.cpp:788
ZONE_FILLER::Fill(std::__1::vector<ZONE_CONTAINER*, std::__1::allocator<ZONE_CONTAINER*> >)::$_0::operator()() const zone_filler.cpp:134
void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, ZONE_FILLER::Fill(std::__1::vector<ZONE_CONTAINER*, std::__1::allocator<ZONE_CONTAINER*> >)::$_0> >(void*) type_traits:4323
void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, ZONE_FILLER::Fill(std::__1::vector<ZONE_CONTAINER*, std::__1::allocator<ZONE_CONTAINER*> >)::$_0> >(void*) thread:342
void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, ZONE_FILLER::Fill(std::__1::vector<ZONE_CONTAINER*, std::__1::allocator<ZONE_CONTAINER*> >)::$_0> >(void*) thread:352
_pthread_body 0x00007fff6b0756c1
_pthread_start 0x00007fff6b07556d
thread_start 0x00007fff6b074c5d

Jeff Young (jeyjey) wrote :

It would appear to be the multiple workers, rather than the progress dialog. With only 2 threads (main + one worker) I can't make it crash.

Nick Østergaard (nickoe) wrote :

For history it is useful to paste the version info.

Changed in kicad:
assignee: nobody → Tomasz Wlostowski (twlostow)
Tomasz Wlostowski (twlostow) wrote :

@Jeff, could you send the board showing this crash (and some version info about wx and OS)?

Tom

Jeff Young (jeyjey) wrote :

I've reproduced it on several boards, but I've attached the one I've been using for testing.

Application: kicad
Version: (5.0.0-rc2-dev-377-g9db6b3b96-dirty), debug build
Libraries:
    wxWidgets 3.0.4
    libcurl/7.54.0 LibreSSL/2.0.20 zlib/1.2.11 nghttp2/1.24.0
Platform: Mac OS X (Darwin 17.4.0 x86_64), 64 bit, Little endian, wxMac
Build Info:
    wxWidgets: 3.0.4 (UTF-8,STL containers,compatible with 2.8)
    Boost: 1.66.0
    Curl: 7.57.0
    Compiler: Clang 9.1.0 with C++ ABI 1002

Build settings:
    USE_WX_GRAPHICS_CONTEXT=ON
    USE_WX_OVERLAY=ON
    KICAD_SCRIPTING=ON
    KICAD_SCRIPTING_MODULES=OFF
    KICAD_SCRIPTING_WXPYTHON=OFF
    KICAD_SCRIPTING_ACTION_MENU=OFF
    BUILD_GITHUB_PLUGIN=ON
    KICAD_USE_OCE=OFF
    KICAD_SPICE=ON

Jeff Young (jeyjey) wrote :

I found one thread-safety issue in chamferFilletPolygon() (due to its call to RemoveNullSegments()), but I think we only call that on the "owned" zone from each thread, so it shouldn't be the issue here (and indeed fixing it doesn't fix the bug.)

Jeff Young (jeyjey) wrote :

Ugh. This is scary. I found this in wx/string.h:

#if wxUSE_UNICODE_UTF8
// NB: In UTF-8 build, (non-const) iterator needs to keep reference
// to the underlying wxStringImpl, because UTF-8 is variable-length
// encoding and changing the value pointer to by an iterator (using
// its operator*) requires calling wxStringImpl::replace() if the old
// and new values differ in their encoding's length.
//
// Furthermore, the replace() call may invalid all iterators for the
// string, so we have to keep track of outstanding iterators and update
// them if replace() happens.
//
// This is implemented by maintaining linked list of iterators for every
// string and traversing it in wxUniCharRef::operator=(). Head of the
// list is stored in wxString. (FIXME-UTF8)

Note that it says "invalid[sic] all iterators" (not just const ones). So they /also/ keep const iterators in said list (I confirmed this in gdb). And it would appear that inserting and removing them from the list is not thread-safe.

With the first line of TEXTE_PCB::TransformBoundingBoxWithClearanceToPolygon() as:

   if( GetText().Length() == 0 )

I get a crash about every 3 - 5 fills.

With it as:

   if( GetText().IsEmpty() )

I haven't gotten a crash yet in 50 fills.

Jeff Young (jeyjey) wrote :

Here's a proposed fix which is a bit more elegant and resilient than just replacing Length() with IsEmpty().

BTW: does this reproduce on Linux or Windows?

Tomasz Wlostowski (twlostow) wrote :

I didn't manage to reproduce the problem under Linux, but your solution seems OK. Please push it Jeff!
Tom

Jeff Young (jeyjey) wrote :
Changed in kicad:
status: New → Fix Committed
Changed in kicad:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers