2.53 crashes in Qt5Gui.dll when starting conversions

Bug #1557147 reported by Torbjorn Lindgren
This bug affects 1 person
Affects: calibre
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Since upgrading from 2.52.0 to 2.53.0, Calibre has crashed 5 times already, after not crashing for many, many months (or even years!). The crashes started pretty much immediately after 2.53.0 was installed, so it's extremely unlikely that they are unrelated.

I've always been doing the same thing when it hits: starting many conversions in very rapid succession (fast enough that I'm doing it blind). It only crashes if Calibre has been running for an extended period before I do that. I.e., I press c, Enter, Arrow-down and repeat all three 10+ times in rapid succession. If Calibre has been running for a long time, then somewhere around repetition 3-5 Windows brings up the dreaded "application doesn't respond" pop-up and terminates Calibre.

When I start Calibre afterwards I find that none of the files has finished converting, although it normally handles me queuing up 20+ book conversions this way with ease. (I do it this way because batch conversion doesn't record which file was the original SOURCE, so a later batch conversion may try to use the finished product as the source.)

Looking at the Event Viewer I can see that every time Windows terminates calibre this way it's due to exception code 0xc0000005 (Access Violation), and the fault offset is always 0x000cd7bc or 0x000cd7a2, which it says is inside Qt5Gui.dll.

You should be able to load the 2.53.0 modules up in a debugger and find out which Qt function it is in.

OS: Windows 7 Home Premium (SP1, just a few of the latest patches currently pending), 64-bit.

Data from Event Viewer:

Faulting application name: calibre.exe, version: 2.53.0.0, time stamp: 0x56e23d99
Faulting module name: Qt5Gui.dll, version: 5.4.1.0, time stamp: 0x5510e4f9
Exception code: 0xc0000005
Fault offset: 0x000cd7bc
Faulting process id: 0x22bc
Faulting application start time: 0x01d17d71c9d09d8a
Faulting application path: C:\Program Files (x86)\Calibre2\calibre.exe
Faulting module path: C:\Program Files (x86)\Calibre2\DLLs\Qt5Gui.dll
Report Id: 45a7834e-ea05-11e5-aa6f-f816542c8f43

Faulting application name: calibre.exe, version: 2.53.0.0, time stamp: 0x56e23d99
Faulting module name: Qt5Gui.dll, version: 5.4.1.0, time stamp: 0x5510e4f9
Exception code: 0xc0000005
Fault offset: 0x000cd7bc
Faulting process id: 0x40e4
Faulting application start time: 0x01d17d70d350bf6d
Faulting application path: C:\Program Files (x86)\Calibre2\calibre.exe
Faulting module path: C:\Program Files (x86)\Calibre2\DLLs\Qt5Gui.dll
Report Id: 04fe7c2f-e965-11e5-aa6f-f816542c8f43

Faulting application name: calibre.exe, version: 2.53.0.0, time stamp: 0x56e23d99
Faulting module name: Qt5Gui.dll, version: 5.4.1.0, time stamp: 0x5510e4f9
Exception code: 0xc0000005
Fault offset: 0x000cd7a2
Faulting process id: 0x276c
Faulting application start time: 0x01d17cb4f25c1af7
Faulting application path: C:\Program Files (x86)\Calibre2\calibre.exe
Faulting module path: C:\Program Files (x86)\Calibre2\DLLs\Qt5Gui.dll
Report Id: 0974b993-e964-11e5-aa6f-f816542c8f43

Faulting application name: calibre.exe, version: 2.53.0.0, time stamp: 0x56e23d99
Faulting module name: Qt5Gui.dll, version: 5.4.1.0, time stamp: 0x5510e4f9
Exception code: 0xc0000005
Fault offset: 0x000cd7bc
Faulting process id: 0x3f74
Faulting application start time: 0x01d17c937d492e42
Faulting application path: C:\Program Files (x86)\Calibre2\calibre.exe
Faulting module path: C:\Program Files (x86)\Calibre2\DLLs\Qt5Gui.dll
Report Id: 2b191b57-e8a8-11e5-aa6f-f816542c8f43

Faulting application name: calibre.exe, version: 2.53.0.0, time stamp: 0x56e23d99
Faulting module name: Qt5Gui.dll, version: 5.4.1.0, time stamp: 0x5510e4f9
Exception code: 0xc0000005
Fault offset: 0x000cd7a2
Faulting process id: 0x297c
Faulting application start time: 0x01d17be853ba08e2
Faulting application path: C:\Program Files (x86)\Calibre2\calibre.exe
Faulting module path: C:\Program Files (x86)\Calibre2\DLLs\Qt5Gui.dll
Report Id: b850bb25-e886-11e5-aa6f-f816542c8f43
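The records above can be summarised mechanically. The following is a minimal Python sketch, not part of the original report, that tallies the fault offsets and estimates how long calibre had been running before each crash. It rests on two assumptions: that "Faulting application start time" is a Windows FILETIME (100 ns ticks since 1601-01-01 UTC), and that each Report Id is a version-1 UUID whose embedded timestamp is close to the time of the crash. The `RECORDS` text is copied from the data above; the function names are illustrative.

```python
import uuid
from collections import Counter
from datetime import datetime, timedelta

# Three fields per crash, copied from the Event Viewer records above.
RECORDS = """\
Fault offset: 0x000cd7bc
Faulting application start time: 0x01d17d71c9d09d8a
Report Id: 45a7834e-ea05-11e5-aa6f-f816542c8f43

Fault offset: 0x000cd7bc
Faulting application start time: 0x01d17d70d350bf6d
Report Id: 04fe7c2f-e965-11e5-aa6f-f816542c8f43

Fault offset: 0x000cd7a2
Faulting application start time: 0x01d17cb4f25c1af7
Report Id: 0974b993-e964-11e5-aa6f-f816542c8f43

Fault offset: 0x000cd7bc
Faulting application start time: 0x01d17c937d492e42
Report Id: 2b191b57-e8a8-11e5-aa6f-f816542c8f43

Fault offset: 0x000cd7a2
Faulting application start time: 0x01d17be853ba08e2
Report Id: b850bb25-e886-11e5-aa6f-f816542c8f43
"""

FILETIME_EPOCH = datetime(1601, 1, 1)   # FILETIME: 100 ns ticks from here (UTC)
UUID1_EPOCH = datetime(1582, 10, 15)    # version-1 UUID: 100 ns ticks from here (UTC)


def ticks_to_datetime(ticks, epoch):
    """Convert a count of 100 ns ticks since `epoch` to a datetime."""
    return epoch + timedelta(microseconds=ticks // 10)


def parse_crashes(text):
    """Return one dict per record with the offset, start time and uptime."""
    crashes = []
    for block in text.strip().split("\n\n"):
        fields = dict(line.split(": ", 1) for line in block.splitlines())
        started = ticks_to_datetime(
            int(fields["Faulting application start time"], 16), FILETIME_EPOCH)
        report = uuid.UUID(fields["Report Id"])
        # ASSUMPTION: a version-1 Report Id was generated around crash time.
        uptime = None
        if report.version == 1:
            uptime = ticks_to_datetime(report.time, UUID1_EPOCH) - started
        crashes.append(
            {"offset": fields["Fault offset"], "started": started, "uptime": uptime})
    return crashes


crashes = parse_crashes(RECORDS)
offset_counts = Counter(c["offset"] for c in crashes)
for c in crashes:
    print(c["offset"], c["started"], c["uptime"])
print(offset_counts)
```

The tally shows three crashes at offset 0x000cd7bc and two at 0x000cd7a2. The uptime figures are only as trustworthy as the Report Id assumption.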

Kovid Goyal (kovid) wrote : Re: calibre bug 1557147

What's a long time? i.e. how long does it have to be running for the
crash to happen? I cannot reproduce this crash simply leaving calibre
running for half an hour and then queueing up 20 books for conversion as
you describe.

Torbjorn Lindgren (torbjorn-lindgren) wrote :

It's all over the place. In one case the crashes were 7 minutes apart, which limits how many conversions I could have done.
But most are many hours apart, which most likely means somewhere between 30 and 200 conversions before it hit.

So far it's crashed 5 times when I'm queuing many conversions and zero times when I'm doing one at a time. Considering the number of conversions, it OUGHT to have crashed a few times during one-at-a-time conversions: I'm fairly confident I've started more conversions normally during this period than in rapid batches.

But... sometimes weird things happen with statistics, so it's not proof of anything. It would also explain why you don't have more reports, if it requires special circumstances to trigger.

It does feel a bit like a race condition somewhere. Since I made the report I've tried not queuing things up quite so fast, and it's not crashed yet, but it's much too early to be sure (random bugs suck). If it doesn't crash in the next few hours it starts to get suggestive.

Do you have any suggestions for how I can collect more data that could help pin it down?
Do you want me to downgrade to 2.52.0 and see if it goes away? It's always possible that something else changed, not calibre...
Or should I run it from source? I've only run pre-made versions since 1.3x something, but I've done it before, and if there's a chance it would help pin this down I'll put in the time to get it running from source again.

Other information:
Hardware: Intel Core i7-4710MQ, 16GB RAM, Nvidia GTX 860M & Intel integrated video. All SSD storage.
This is a quad-core, I checked and I've not changed the max simultaneous jobs (3).

Torbjorn Lindgren (torbjorn-lindgren) wrote :

And it just crashed, but not during conversion. I switched libraries, then switched back, and got a GUI message about "failed to start thread". I switched libraries one more time, and it hung and the OS killed it. Very different crash report too.

I'm starting to lean towards either "roll back to 2.52 to verify it isn't something else" OR "reboot because it's 1+ months since last time" (and almost two months since it installed security updates that required reboots). I'll try one or the other tomorrow if you haven't suggested something specific for me to test.

Faulting application name: calibre.exe, version: 2.53.0.0, time stamp: 0x56e23d99
Faulting module name: MSVCR90.dll, version: 9.0.30729.6161, time stamp: 0x4dace5b9
Exception code: 0x40000015
Fault offset: 0x0005beae
Faulting process id: 0x2a50
Faulting application start time: 0x01d17e120a490019
Faulting application path: C:\Program Files (x86)\Calibre2\calibre.exe
Faulting module path: C:\Windows\WinSxS\x86_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.30729.6161_none_50934f2ebcb7eb57\MSVCR90.dll
Report Id: 129f1f46-eaf4-11e5-aa6f-f816542c8f43

Torbjorn Lindgren (torbjorn-lindgren) wrote :

Just to clarify, the crash came after 30+ conversions and then switching libraries back and forth. I've now tried just switching around between libraries on a freshly started Calibre: no crash. I hate crashes without any obvious traceable reason.

Kovid Goyal (kovid) wrote :

I've left calibre running ~24hrs and done ~100 conversions in the manner you describe, switched libraries 10 times, and I simply don't get any crashes. And given that there are no other reports of random crashes, it does seem like something on your system changed.

So some things to try:

1) Reboot in safe mode and see if you can reproduce the crashes
2) Downgrade to 2.52 and see if the crashes go away.

If they do, then you can run from source as described here: http://manual.calibre-ebook.com/develop.html (only takes a few minutes to set up) and I will try a few things (only a few changes were made to GUI-related code in 2.53, so it should be relatively easy to track down which one is the cause).

Torbjorn Lindgren (torbjorn-lindgren) wrote :

2.53.0 crashed again after I installed all OS updates and rebooted (in normal mode), the official 2.52.0 version has not crashed yet but I'm not counting that as "working" until tomorrow evening!

This is because MOST of the crashes have come after starting it up one evening, doing a number of conversions, then leaving it running until the next evening, when I continue with more conversions. But there's at least one that crashed quickly (7 minutes), so I really don't have a fixed pattern.

This means it will take me a long time to be sure, especially since I have no sure way to provoke it and I expect there's little reason for you to try it until I have more data.

The git clone is done. If 2.52 survives long enough I'll start the git bisect process; it says 29 revisions and roughly 5 steps. Just bisecting "src" drops it to 20 revisions (~4 steps), but saving just one bisect step might not be worth the admittedly very small risk of missing the real reason.
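To illustrate why 29 revisions need only about 5 test builds, here is a minimal Python sketch of the binary search that bisecting performs. It is an illustration only, not git's own implementation or its step estimator; the revision names and the `crashes` callback are hypothetical stand-ins.

```python
def first_bad_commit(commits, crashes):
    """Return (first bad commit, number of builds tested).

    commits: ordered oldest to newest; the last entry is known bad and the
             parent of the first entry is known good.
    crashes: callable returning True if that checkout reproduces the crash.
    """
    lo, hi = 0, len(commits) - 1
    tested = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tested += 1
        if crashes(commits[mid]):
            hi = mid          # culprit is mid or something earlier
        else:
            lo = mid + 1      # culprit is after mid
    return commits[lo], tested


# HYPOTHETICAL data: 29 numbered revisions, crash introduced at r17.
revisions = [f"r{i}" for i in range(1, 30)]
culprit, steps = first_bad_commit(revisions, lambda rev: int(rev[1:]) >= 17)
print(culprit, steps)
```

With an intermittent crash like this one, a "good" verdict is only as reliable as the time spent trying to provoke the crash on that checkout, so each step may need a long soak test.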

Kovid Goyal (kovid) wrote :

Just bisecting src/calibre/gui2 should be enough. There are really only
three commits that affect the conversion dialog:

61ae37d726
941f395ca3
fa0533ec0d

Torbjorn Lindgren (torbjorn-lindgren) wrote :

I'm now close to sure that 2.52 is stable and doesn't crash for me... Non-deterministic crashes suck.

All three commits you mention are near 2.53 and close together. It looks like starting by testing 2da3c9d30ba99 should cut down the necessary testing a lot if the problem is in one of those commits.

Kovid Goyal (kovid) wrote :

Yes, just bracket those commits. And the most likely commit causing the crash is IMO 61ae37d726

Torbjorn Lindgren (torbjorn-lindgren) wrote :

With the 2.52 "base" installed (.exe) and running from source, checkout 2da3c9d is confirmed to be OK and edd3e56 (2.53) is confirmed bad (got lucky and crashed it quickly).
So it's down to 4-5 commits; trying with 61ae37d726 now as discussed.

Torbjorn Lindgren (torbjorn-lindgren) wrote :

Yes, it's 61ae37d726 that introduces the crash...

And I think I've figured out how to trigger it very quickly on demand, which will help further testing...

So far I've crashed 61ae37d726 4 times in a few minutes; anything earlier handles the exact same files and commands fine. I can only crash Calibre if I queue up a number of conversions very quickly by sending in keypresses blind (it usually crashes around the 4-8 mark). If I wait < 1 second between starting each conversion it won't crash, even though I'm queuing up 12 conversions in total. I've tried that 5 times and it's just not crashing when I go slower.

I LOVE the CSS transform stuff in 2.53, so any ideas on how to progress?
Do you have any test commits you want me to try?
I see the Visual Studio guide; do you think it'll give more information if I run it that way? Or with the remote debugger (either locally on Windows or from my CentOS 7 box, I guess)?
I've always been more of a C/C++/Perl coder, and it's been a long time since I programmed professionally, but I've done some casual Python coding, so I guess I'll also look at the code. Without familiarity with it, though, it's going to be uphill.
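The reproduction described in this comment (c, Enter, Arrow-down, repeated quickly) could be scripted so the fast and slow runs are comparable. The following is a rough Python sketch under stated assumptions: `send_key` is a hypothetical callable that delivers one key press to the calibre main window, and wiring it to a real input-injection tool is deliberately left out. The key names and the `queue_conversions` function are illustrative, not part of calibre.

```python
import time


def queue_conversions(send_key, count=12, delay=0.0, sleep=time.sleep):
    """Send the c, Enter, Down sequence `count` times.

    send_key: HYPOTHETICAL callable taking a key name; it stands in for
              whatever tool actually injects key presses into calibre.
    delay:    pause in seconds after each conversion is started;
              0.0 mimics typing the sequence blind as fast as possible.
    """
    for _ in range(count):
        for key in ("c", "enter", "down"):
            send_key(key)
        if delay:
            sleep(delay)


# Dry run: record the keys instead of sending them anywhere.
sent = []
queue_conversions(sent.append, count=12, delay=0.0)
print(len(sent), "key presses generated")
```

Running the same sequence once with `delay=0.0` and once with a non-zero delay would mirror the fast and slow cases compared above.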

Kovid Goyal (kovid) wrote : Fixed in master

Fixed in branch master. The fix will be in the next release. calibre is usually released every Friday.

 status fixreleased

Kovid Goyal (kovid) wrote : Re: calibre bug 1557147

I have committed a possible fix.

Changed in calibre:
status: New → Fix Released
Torbjorn Lindgren (torbjorn-lindgren) wrote :

I've confirmed that this change fixes my crash.
701bec4 crashes but 10ed729 survives anything I can throw at it; the only difference is the QPlainText commit.
If you want anything else tested please don't hesitate to assign me testing tasks.
