Random-seeming crashes in trunk
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
widelands | Fix Released | Critical | Unassigned |
Bug Description
We have been getting crashes recently while testing branches that don't seem to be related to the branches. It must be some form of memory problem, maybe in the graphics system.
Crash attached to this post reported in https:/
Related branches
- Klaus Halfmann: Approve (compiles, test, codereview)
- Diff: 69 lines (+7/-7), 3 files modified
  src/graphic/animation.cc (+4/-4)
  src/graphic/animation.h (+1/-1)
  src/graphic/playercolor.cc (+2/-2)
GunChleoc (gunchleoc) wrote : | #1 |
GunChleoc (gunchleoc) wrote : | #2 |
GunChleoc (gunchleoc) wrote : | #3 |
- crash3.log (3.4 KiB, text/plain)
Crash reported in https:/
This one seems somewhat different.
kaputtnik (franku) wrote : | #4 |
Should we add each crash here? Or only particular ones?
kaputtnik (franku) wrote : | #5 |
- backtrace.txt (7.9 KiB, text/plain)
A crash with branch fh1-multitexture.
Like the other crashes mentioned in the merge proposal, I did nothing special. I ran an old save game, played a bit with it and let it run while doing other stuff (e.g. playing other games).
Don't know where the 8 lines containing just '(gdb)' came from.
The console output always shows that the crashes appear after an autosave, but I am not sure if they are related to autosave, because I don't know how much time passed between autosave and crash.
kaputtnik (franku) wrote : | #6 |
Klaus Halfmann (klaus-halfmann) wrote : | #7 |
Checked "man malloc" once again, there I find:
> The following environment variables change the behavior of the allocation-related functions.
I will try these first:
MallocGuardEdges=y, MallocStackLogg
MallocCheckHeap
See also "leaks" (XCode tool), malloc_history.
gnu/Ubuntu should have similar tools.
Klaus Halfmann (klaus-halfmann) wrote : | #8 |
Had to go to MallocCheckHeap
Now crashed at
load_image_
MallocScribble -> fill memory that has been allocated with 0xaa bytes. This increases the likelihood that a program making assumptions about the contents of freshly allocated memory will fail. Also if set, fill memory that has been deallocated with 0x55 bytes. This increases the likelihood that a program will fail due to accessing memory that is no longer allocated.
KERN_INVALID_
malloc_zone_check -> so the check failed as the malloc structures were broken :-(
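To illustrate the scribbling idea quoted from the man page above, here is a minimal sketch. The helper names (scribble_alloc, scribble_before_free, filled_with) are hypothetical and are not part of macOS malloc or of Widelands; the real allocator applies the patterns inside malloc() and free() itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helpers for illustration only.
constexpr unsigned char kAllocFill = 0xAA;  // pattern for freshly allocated memory
constexpr unsigned char kFreeFill = 0x55;   // pattern for memory about to be released

// Allocate `size` bytes and fill them with 0xAA, so code that wrongly
// assumes zero-initialised memory sees obvious garbage instead.
void* scribble_alloc(std::size_t size) {
    void* p = std::malloc(size);
    if (p != nullptr) {
        std::memset(p, kAllocFill, size);
    }
    return p;
}

// Overwrite a block with 0x55 just before it would be released.
// The caller still owns the buffer afterwards and must free it,
// so the pattern can be inspected without touching freed memory.
void scribble_before_free(void* p, std::size_t size) {
    assert(p != nullptr);
    std::memset(p, kFreeFill, size);
}

// Returns true if every byte of the block holds `pattern`.
bool filled_with(const void* p, std::size_t size, unsigned char pattern) {
    const unsigned char* bytes = static_cast<const unsigned char*>(p);
    for (std::size_t i = 0; i < size; ++i) {
        if (bytes[i] != pattern) {
            return false;
        }
    }
    return true;
}
```

A pointer value such as 0x5555555555555555 or 0xaaaaaaaaaaaaaaaa in a crash log is then a strong hint that freed or uninitialised memory was read.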
SirVer (sirver) wrote : | #9 |
crash1, 2 and 3 all seem to be different things to me.
The crashes in #5 and #6 seem to be the same, but also different to the first three.
The best approaches I know of for figuring out where these memory violations come from are MSAN [1] and ASAN [2]. This requires building widelands, and maybe its dependencies, with these settings turned on in the compiler. Other approaches are:
- Mac OS ships with some memory debugging tools. [3]
- A simple library that replaces malloc and gives some memory feedback is electric fence [4]
- valgrind's memcheck is also an excellent but slow tool [5]
[1] https:/
[2] https:/
[3] https:/
[4] https:/
[5] http://
GunChleoc (gunchleoc) wrote : | #10 |
I just got another one, this one definitely from the font renderer. I was working on converting some markup, so the C++ code is all the current trunk version. I guess the next step is to use unique_ptr for the RenderNodes after we have the multitexture branch finished - I had looked into it there and decided against it in order to keep the diff a bit smaller. Also, the change will be non-trivial.
Thread 1 "widelands" received signal SIGSEGV, Segmentation fault.
0x0000000000dede0f in RT::DivTagRende
at /home/bratzbert
688 delete n;
(gdb) backtrace
#0 0x0000000000dede0f in RT::DivTagRende
at /home/bratzbert
#1 0x0000000000dede9e in RT::DivTagRende
at /home/bratzbert
#2 0x0000000000df87ea in std::default_
at /usr/include/
#3 0x0000000000df5e19 in std::unique_
__in_
#4 0x0000000000dec348 in RT::Renderer:
text=
at /home/bratzbert
#5 0x0000000000dbd053 in (anonymous namespace)
at /home/bratzbert
#6 0x0000000000dbcf5e in (anonymous namespace)
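The unique_ptr idea mentioned above can be sketched as follows. This is only an illustration under stated assumptions: Node and ParentNode are hypothetical stand-ins, since the real RenderNode classes are not shown in this report, and the destruction counter exists purely for demonstration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a render node.
struct Node {
    explicit Node(int* counter) : destroyed(counter) { assert(counter != nullptr); }
    ~Node() { ++*destroyed; }
    int* destroyed;  // counts destructor calls, for demonstration only
};

// Owns its children through unique_ptr. There is no manual `delete n` loop,
// so each child is destroyed exactly once when the parent goes away.
class ParentNode {
public:
    void add_child(std::unique_ptr<Node> child) {
        children_.push_back(std::move(child));
    }
    std::size_t child_count() const { return children_.size(); }

private:
    std::vector<std::unique_ptr<Node>> children_;
};

// Builds a parent with `n` children, destroys it, and returns how many
// child destructors ran.
int destroy_parent_with_children(int n) {
    int destroyed = 0;
    {
        ParentNode parent;
        for (int i = 0; i < n; ++i) {
            parent.add_child(std::make_unique<Node>(&destroyed));
        }
        assert(parent.child_count() == static_cast<std::size_t>(n));
    }  // parent leaves scope here and releases all children
    return destroyed;
}
```

With ownership expressed in the type, a second `delete` on the same node can no longer be written by accident in a destructor.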
Klaus Halfmann (klaus-halfmann) wrote : | #11 |
Found #1690649 while testing; not sure if it is related. Will continue without sound and autosave,
but MallocCheckHeap
Klaus Halfmann (klaus-halfmann) wrote : | #12 |
That's getting worse: I now used
export MallocCheckHeap
export MallocCheckHeap
./widelands --verbose --coredump=true --fullscreen=false --xres=1024 --yres=768
When quitting via the keyboard only, this works; when using the mouse
(or perhaps using some graphics) it crashes when quitting:
Thread 0 Crashed:: Dispatch queue: com.apple.
0 libsystem_
1 libsystem_
2 libsystem_
3 libsystem_
4 com.apple.
5 libGPUSupportMe
6 libGFXShared.dylib 0x00007fff97c3fe3a gfxDestroyPlugi
7 GLEngine 0x00007fff98908801 gleFreeTextureO
8 libGFXShared.dylib 0x00007fff97c415ca gfxReleaseShare
9 GLEngine 0x00007fff987d9621 gliDestroyContext + 175
10 com.apple.opengl 0x00007fff987b3232 CGLReleaseContext + 187
11 com.apple.AppKit 0x00007fff913e5551 -[NSOpenGLContext dealloc] + 58
12 libSDL2-2.0.0.dylib 0x00000001098910fd Cocoa_GL_
13 widelands 0x0000000107e74d84 Graphic::~Graphic() + 1220 (graphic.cc:125)
14 widelands 0x0000000107e751a5 Graphic::~Graphic() + 21 (graphic.cc:128)
15 widelands 0x0000000107d6332d WLApplication:
16 widelands 0x0000000107d62fe8 WLApplication:
17 widelands 0x0000000107d634b5 WLApplication:
18 widelands 0x0000000107d588b1 main + 689 (main.cc:51)
So I would assume the very first drawing on the start screen already causes problems (!).
Can someone confirm with some other tool?
Klaus Halfmann (klaus-halfmann) wrote : | #13 |
SirVer:
Will adding this to CMakeLists.txt enable AddressSanitizer?
if(CMAKE_BUILD_TYPE STREQUAL "Debug")
set(WL_
Klaus Halfmann (klaus-halfmann) wrote : | #14 |
Nah, this is incomplete; the flag must be added to the linker, too, as I get linker errors:
[ 84%] Linking CXX executable test_scripting
Undefined symbols for architecture x86_64:
"___asan_
_
and a lot more
kaputtnik (franku) wrote : | #15 |
SirVer (sirver) wrote : | #16 |
#15 looks like a double free: the texture is owned by the texturecache but somebody else freed it. Something like this:
unique_ptr<...> a(new Texture(...));
texture_
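To spell out the double-free pattern described above and one way to avoid it, here is a minimal sketch. Texture and TextureCache are hypothetical stand-ins; the real Widelands texture cache API is not shown in this thread, so the insert() signature is an assumption.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a texture; counts destructor calls for demonstration.
struct Texture {
    explicit Texture(int* counter) : destroyed(counter) { assert(counter != nullptr); }
    ~Texture() { ++*destroyed; }
    int* destroyed;
};

// Hypothetical cache that is the single owner of every texture it stores.
class TextureCache {
public:
    // Takes ownership and returns a non-owning pointer that is valid
    // only while the cache entry exists.
    const Texture* insert(const std::string& key, std::unique_ptr<Texture> texture) {
        const Texture* raw = texture.get();
        entries_[key] = std::move(texture);
        return raw;
    }

private:
    std::map<std::string, std::unique_ptr<Texture>> entries_;
};

// Returns how many times the texture destructor ran.
int count_destructions_with_single_owner() {
    int destroyed = 0;
    {
        TextureCache cache;
        std::unique_ptr<Texture> a = std::make_unique<Texture>(&destroyed);
        // The bug pattern would be handing a.get() to a second owner:
        // both owners would then delete the same object.
        const Texture* view = cache.insert("hash", std::move(a));
        assert(a == nullptr);  // the local pointer no longer owns anything
        assert(view != nullptr);
        (void)view;
    }
    return destroyed;
}
```

The important design choice is that ownership is transferred with std::move rather than shared through a raw pointer, so exactly one destructor call happens.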
GunChleoc (gunchleoc) wrote : | #17 |
I suspect that the culprit is representative_
Changed in widelands: | |
assignee: | nobody → GunChleoc (gunchleoc) |
GunChleoc (gunchleoc) wrote : | #18 |
I'm talking nonsense, those images are scaled -> new textures that need caching. I still suspect the playercolor though.
kaputtnik (franku) wrote : | #19 |
Don't know if this is related (bzr 8357):
Very rarely, when clicking on a building to attack it, the corresponding window changes to a different window. E.g. it just happened: I was attacking a barbarian tower and clicking a lot, and suddenly the window showed one of my Mills (showing how much wheat is in there) at a far away spot. I couldn't say whether it is the same window with changed content, or whether the Attack window closes and the mill's window opens immediately, because this happens very fast (while clicking a lot).
This has happened to me about twice in the last few months.
GunChleoc (gunchleoc) wrote : | #20 |
#19 is a completely different bug - I have opened a new bug report: https:/
kaputtnik (franku) wrote : | #21 |
Thanks :-)
GunChleoc (gunchleoc) wrote : | #22 |
#2 is this bug: https:/
#5, #6 are particular to the fh1-multitexture branch and should have been fixed there.
#8, #15 have hopefully been fixed with r8363.
This leaves the following crashes still to analyze:
#1
#3
#10
Klaus Halfmann (klaus-halfmann) wrote : | #23 |
Got another one in bzr8371[widelands] when quitting
6 GLEngine 0x000000010d48e621 gliDestroyContext + 175
7 com.apple.opengl 0x00007fffcd74e232 CGLReleaseContext + 187
8 com.apple.AppKit 0x00007fffc656520d -[NSOpenGLContext dealloc] + 58
9 libSDL2-2.0.0.dylib 0x0000000106f730fd Cocoa_GL_
10 widelands 0x0000000105541234 Graphic::~Graphic() + 1220 (graphic.cc:125)
11 widelands 0x0000000105541655 Graphic::~Graphic() + 21 (graphic.cc:128)
Either something basic is broken at the very beginning and crashes when cleaning up,
or some structure from the beginning is corrupted during the game.
kaputtnik (franku) wrote : | #24 |
- backtrace.txt (11.4 KiB, text/plain)
Another one:
Thread 1 "widelands" received signal SIGSEGV, Segmentation fault.
0x0000000000c5818c in std::__
at /usr/include/
524 { return __atomic_
Backtrace attached.
Klaus Halfmann (klaus-halfmann) wrote : | #25 |
I created a branch for testing this on OSX:
https:/
I still get more or less random crashes in std functions when e.g. iterating over
WL objects. I will try to add more and more assertions to that branch
to narrow down the problem.
Klaus Halfmann (klaus-halfmann) wrote : | #26 |
I checked with the malloc code the hard way (takes 3 min up to the splash screen).
What I found:
* If I just quit via CMD-Q all is fine.
* If I click "Exit Widelands" with the "Hand Pointer" Image I get the crash.
I get the crash most times in Graphic::~Graphic() at SDL_GL_
I now assume _every_ image drawing does something potentially bad, we only
get away with it for quite some time.
I pushed some minor optimizations to the osx-malloc-check branch. Found #1697703.
Gun:
* Can we defer executing build_texture_
* Can I drop that code (for debugging) completely?
* Do you plan to put the images directly into the Image cache?
Could someone on Linux/gcc verify this with the glibc malloc?
So much for my findings.
kaputtnik (franku) wrote : | #27 |
- valgrind.txt (47.1 KiB, text/plain)
I got no crash, but I ran valgrind, waited until the screen with buttons appeared and hit "Exit Widelands" (German language).
The command I used is:
valgrind --leak-check=full --track-origins=yes ./widelands
Summary (full log attached):
==1355== LEAK SUMMARY:
==1355== definitely lost: 7,791 bytes in 89 blocks
==1355== indirectly lost: 680 bytes in 85 blocks
==1355== possibly lost: 0 bytes in 0 blocks
==1355== still reachable: 219,010 bytes in 1,872 blocks
==1355== suppressed: 0 bytes in 0 blocks
==1355== Reachable blocks (those to which a pointer was found) are not shown.
==1355== To see them, rerun with: --leak-check=full --show-
==1355==
==1355== For counts of detected and suppressed errors, rerun with: -v
==1355== ERROR SUMMARY: 22 errors from 20 contexts (suppressed: 0 from 0)
Klaus Halfmann (klaus-halfmann) wrote : | #28 |
Thanks for your test. Because of different SDL implementations, our results will vary.
Just found that I can install valgrind via MacPorts.
What I extracted from the logs:
==1355== Syscall param writev(vector[...]) points to uninitialised byte(s)
...
==1355== by 0xC868C2: WLApplication:
...
Uninitialised value was created by a stack allocation
-> this _can_ cause a random crash, hmm
Those memory leaks are not good, but will not cause a crash.
We should open a kind of cleanup bug for those, too.
I'll try valgrind once I have some spare time again; thanks for your checks.
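As a general illustration of how a "points to uninitialised byte(s)" report of this kind is usually avoided, here is a minimal sketch. It assumes a hypothetical fixed-size record (kWireSize, make_record); the actual buffer behind the valgrind report is not identified in this thread.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical fixed record size for illustration.
constexpr std::size_t kWireSize = 8;

// Builds a record whose every byte is initialised before it could be
// handed to a write-style system call.
std::array<unsigned char, kWireSize> make_record(char tag, int value) {
    static_assert(1 + sizeof(int) <= kWireSize, "record buffer too small");
    // `{}` value-initialises the array, so all bytes start as 0. Writing
    // `std::array<unsigned char, kWireSize> buf;` instead would leave the
    // unused tail bytes uninitialised on the stack.
    std::array<unsigned char, kWireSize> buf{};
    buf[0] = static_cast<unsigned char>(tag);
    std::memcpy(buf.data() + 1, &value, sizeof(value));
    return buf;
}
```

Serialising field by field into a zeroed byte buffer also avoids sending struct padding, which is another common source of such reports.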
kaputtnik (franku) wrote : | #29 |
- Boost crash on signals 2 (10.1 KiB, text/plain)
Got another boost crash:
Thread 1 "widelands" received signal SIGSEGV, Segmentation fault.
0x0000000
at /usr/include/
711 BOOST_ASSERT( px != 0 );
Will attach the save game later. This save game crashed two times after a while of playing. Unfortunately I didn't run gdb on the first crash, so I am not sure whether the second crash is the same.
kaputtnik (franku) wrote : | #30 |
- Got two crashes after some time when playing this save game (828.8 KiB, application/octet-stream)
kaputtnik (franku) wrote : | #31 |
- Backtrace of Command::duetime (9.8 KiB, text/plain)
Another one with the previously sent save game, I guess when opening the statistics menu:
Thread 1 "widelands" received signal SIGSEGV, Segmentation fault.
0x0000000000de731e in Widelands:
at /home/kaputtnik
80 return duetime_;
Full backtrace attached.
GunChleoc (gunchleoc) wrote : | #32 |
* Can we defer executing build_texture_
No, unless we want to drop all graphics in it.
* Can I drop that code (for debugging) completely?
Not easily - will probably need some recoding in g_gr->images(), because it's completely built on top of the texture atlas.
* Do you plan to put the images directly into the Image cache?
I do not understand this question.
Klaus Halfmann (klaus-halfmann) wrote : | #33 |
>> * Do you plan to put the images directly into the Image cache?
> I do not understand this question.
Currently the images are first put into some map and from there into
the image cache. I think this could be changed to put them into that
cache directly. (or maybe I still do not understand that cache)
GunChleoc (gunchleoc) wrote : Re: [Bug 1690519] Re: Random-seeming crashes in trunk | #34 |
>>> * Do you plan to put the images directly into the Image cache?
>> I do not understand this question.
>
> Currently the images are first put into some map and from there into
> the image cache. I think this could be changed to put them into that
> cache directly. (or maybe I still do not understand that cache)
SirVer is dropping all the important bits into the first texture atlas
to reduce the frequency of texture swapping. This is an important
performance feature, so no, there are no plans to remove this.
kaputtnik (franku) wrote : | #35 |
- backtrace.txt (14.0 KiB, text/plain)
I don't know if I should post every crash I notice here. If I should stop spamming, please let me know :)
After playing yesterday for about 4 hrs with no crash, now there is another one after playing for half an hour:
widelands: /home/kaputtnik
Thread 1 "widelands" received signal SIGABRT, Aborted.
0x00007ffff510b670 in raise () from /usr/lib/libc.so.6
Yesterday I played mostly in window mode, whereas today I immediately pressed "F" while the save game was loading. The crashes I mentioned before all started this way: start widelands (window mode) -> load saved game -> switch to fullscreen. Since I encounter those bugs only in fullscreen mode, I have the feeling that this may be important.
kaputtnik (franku) wrote : | #36 |
- backtrace.txt (8.1 KiB, text/plain)
Happened again:
Thread 1 "widelands" received signal SIGSEGV, Segmentation fault.
0x0000000
2335 static_cast<void const*>
To be sure that there is no issue with my computer, I will run a memory test soon.
kaputtnik (franku) wrote : | #37 |
- backtrace.txt (17.4 KiB, text/plain)
Now a crash in my video driver:
Thread 1 "widelands" received signal SIGSEGV, Segmentation fault.
0x00007ff
I ran a memory test beforehand, but it found no issues.
GunChleoc (gunchleoc) wrote : | #38 |
Looks like a symptom of a bug in the font renderer.
kaputtnik (franku) wrote : | #39 |
Well, that's really mad... today I played for hours with no crash.
GunChleoc (gunchleoc) wrote : | #40 |
I think we have squashed them all now with the help of ASan. Let's open new bug reports if we should run into any of these again.
Changed in widelands: | |
status: | Confirmed → Fix Committed |
assignee: | GunChleoc (gunchleoc) → nobody |
Klaus Halfmann (klaus-halfmann) wrote : | #41 |
Confirmed, let's clean up all of these.
GunChleoc (gunchleoc) wrote : | #42 |
Fixed in build20-rc1
Changed in widelands: | |
status: | Fix Committed → Fix Released |
Crash reported in https://code.launchpad.net/~widelands-dev/widelands/fh1-winconditions/+merge/323987/comments/849240