ControlObjectSlave deletion race condition

Bug #1406124 reported by Daniel Schürmann on 2014-12-28
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Mixxx
Critical
Daniel Schürmann

Bug Description

This has happens one time during testing the spinny assertion, but seams to be unrelated.

Debug [Main]: m_sTracks.count() = 1
Debug [Main]: WCoverArt::slotCoverFound WCoverArt(0xd167c98) "CoverInfo(METADATA,GUESSED,,11181,/home/daniel/Musik/deichkind_-_leider_geil_%28leider_geil%29.mp3)" QSize(300, 187)
Debug [Main]: WCoverArt::slotCoverFound WCoverArt(0xba4e998, name = "DeckCoverArt") "CoverInfo(METADATA,GUESSED,,11181,/home/daniel/Musik/deichkind_-_leider_geil_%28leider_geil%29.mp3)" QSize(300, 187)
Debug [AnalyserQueue 1]: Prioritizing "Leider Geil (Leider Geil)" "/home/daniel/Musik/deichkind_-_leider_geil_%28leider_geil%29.mp3"
Debug [AnalyserQueue 1]: Analyzing "Leider Geil (Leider Geil)" "/home/daniel/Musik/deichkind_-_leider_geil_%28leider_geil%29.mp3"
Debug [Main]: WSpinny::slotCoverFound WSpinny(0xbc67558) "CoverInfo(METADATA,GUESSED,,11181,/home/daniel/Musik/deichkind_-_leider_geil_%28leider_geil%29.mp3)" QSize(300, 187)
Debug [AnalyserQueue 1]: AnalysisDAO fetched 2 analyses, 1495575 bytes for track 2 in 55 ms
Debug [AnalyserQueue 1]: Reading waveform from byte array: allSignalSize 168746 visualSampleRate 441 audioVisualRatio 100
Debug [AnalyserQueue 1]: Reading waveform from byte array: allSignalSize 3842 visualSampleRate 10.0355 audioVisualRatio 4394.4
Debug [AnalyserQueue 1]: AnalyserWaveform::loadStored - Stored waveform loaded
Debug [AnalyserQueue 1]: Beat calculation will not start
Debug [AnalyserQueue 1]: Keys version/sub-version unchanged since previous analysis. Not analyzing.
Debug [AnalyserQueue 1]: Key calculation will not start.
Debug [AnalyserQueue 1]: Skipping track analysis because no analyzer initialized.
Debug [Main]: Committing transaction on "qt_sql_default_connection" result: true
Debug [Main]: guess the size of the window decoration
Debug [Main]: Now in rebootMixxxView...

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x2211fb40 (LWP 6569)]
0xb6d2cfc5 in QMetaObject::activate(QObject*, QMetaObject const*, int, void**) () from /usr/lib/i386-linux-gnu/libQtCore.so.4
(gdb) bt
#0 0xb6d2cfc5 in QMetaObject::activate(QObject*, QMetaObject const*, int, void**) ()
   from /usr/lib/i386-linux-gnu/libQtCore.so.4
#1 0x086674b0 in valueChanged (_t1=0,944671630859375, this=0xba2fee8) at lin32_build/moc_controlobjectslave.cc:112
#2 slotValueChanged (pSetter=<optimized out>, v=<optimized out>, this=0xba2fee8)
    at lin32_build/../src/controlobjectslave.h:103
#3 ControlObjectSlave::qt_static_metacall (_o=0xba2fee8, _c=QMetaObject::InvokeMetaMethod, _id=5, _a=0x2211ecc4)
    at lin32_build/moc_controlobjectslave.cc:63
#4 0xb6d2d0f7 in QMetaObject::activate(QObject*, QMetaObject const*, int, void**) ()
   from /usr/lib/i386-linux-gnu/libQtCore.so.4
#5 0x08144ed3 in ControlDoublePrivate::valueChanged (this=this@entry=0x8d35068, _t1=_t1@entry=0,944671630859375,
    _t2=_t2@entry=0x8d353d8) at lin32_build/control/moc_control.cc:101
#6 0x08139da1 in ControlDoublePrivate::setInner (this=0x8d35068, value=0,944671630859375, pSender=0x8d353d8)
    at src/control/control.cpp:197
#7 0x0813a2ca in ControlDoublePrivate::set (this=this@entry=0x8d35068, value=0,944671630859375,
    pSender=pSender@entry=0x8d353d8) at src/control/control.cpp:184
#8 0x08498127 in set (value=<optimized out>, this=0x8d353d8) at src/controlobject.h:94
#9 EngineVuMeter::process (this=0x8d0de50, pIn=0xac762008, iBufferSize=2048) at src/engine/enginevumeter.cpp:74
#10 0x084913fc in EngineMaster::process (this=0x8d2d1a0, iBufferSize=2048) at src/engine/enginemaster.cpp:448
#11 0x086e77b4 in SoundManager::onDeviceOutputCallback (this=<optimized out>, iFramesPerBuffer=<optimized out>)
    at src/soundmanager.cpp:483
#12 0x086e1d02 in SoundDevicePortAudio::callbackProcessClkRef (this=0x9407d78, framesPerBuffer=1024, out=0xbe3d340, in=0x0,
    timeInfo=0x2211f2b8, statusFlags=0) at src/sounddeviceportaudio.cpp:817
#13 0xb7f99791 in ?? () from /usr/lib/i386-linux-gnu/libportaudio.so.2
#14 0xb7f9b714 in ?? () from /usr/lib/i386-linux-gnu/libportaudio.so.2
#15 0xb7fa4397 in ?? () from /usr/lib/i386-linux-gnu/libportaudio.so.2
#16 0xb675ef70 in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
#17 0xb62e647e in clone () from /lib/i386-linux-gnu/libc.so.6

RJ Ryan (rryan) on 2015-01-05
Changed in mixxx:
milestone: none → 1.12.0
status: New → Confirmed
importance: Undecided → Critical
Max Linke (max-linke) wrote :

From the stack trace it seems like there was a problem with the pointer that you tried to send. Can't we make a switch to smart pointers in the future to avoid error like this or at least to get the backtrace at places where the problem actually occured?

Owen Williams (ywwg) wrote :

Yeah we are definitely using more smart pointers lately... you'll see more QScopedPointer<ControlObjectSlave> in new code. But of course this is on the list of things to do after a release.

RJ Ryan (rryan) wrote :

The backtrace seems to indicate that when the VU meter CO was set by the engine, the CDP then rebroadcasted the valueChanged signal to a COSlave that was concurrently being deleted by the main thread. That's consistent with changing the skin since the VU meter CO slave was just deleted. I suspect if Daniel had posted a backtrace of all threads we would see that the main thread was in the middle of ~WVuMeter.

(I'm not sure what smart pointers would tell you here? In this particular case, the control is held by ControlWidgetConnection and it's a ControlObjectSlave wrapped in a QScopedPointer)

So, shoot. That's a legit race condition.

Using an AutoConnection for CDP <-> COSlave signals would solve the issue for COSlaves that are deleted from the main thread. But if a CO slave's life cycle is not controlled by the main thread and that thread does not have a Qt event loop then we're kind of stuck.

One option is to use QObject::deleteLater for COSlaves. But if a COSlave is not created in the main thread then it won't be deleted until that thread ends. We could moveToThread(mainThread) and then call deleteLater? Thoughts?

Max Linke (max-linke) wrote :

Please correct me if I'm wrong but I think this wouldn't happen using a QSharedPointer. Passing the SharedPointer by value will increase the reference counter, then if the SharedPointer goes out of scope in the main thread it won't be deleted until the copy of the SharedPointer in the engine thread goes out of scope.

Also just to make sure I get it right. Is the receiver being deleted or the QObject that was send to the receiver?

RJ Ryan (rryan) wrote :

Oh, I see what you're suggesting. I don't think we want a ControlObjectSlave in the GUI to live as long as the ControlDoublePrivate does.

A COSlave is like a client in a client/server setup and ControlDoublePrivate is the server. COSlave listens for Qt signals from the CDP to get updates and it calls methods on the CDP (like set() or reset()) to request changes. The server doesn't hold references to the clients and clients should be able to come and go as different parts of Mixxx that need access to the control value are created and destroyed. We could make COSlaves live forever (or as long as the server does) but that isn't really necessary -- it would create more work to do (slot invocations) every time the CDP changes and it would be a tiny waste of memory.

RJ Ryan (rryan) on 2015-04-22
summary: - segfault switching to LateNight skin
+ ControlObjectSlave deletion race condition
Daniel Schürmann (daschuer) wrote :

The issue is this
https://github.com/mixxxdj/mixxx/blob/679ffc1876ddd1f7209857cf984692a88d22bf59/src/controlobjectslave.cpp#L47
direct connection.

The calling thread, changing the co is not aware that the Slave CO is about to be deleted.

> Deleting a QObject while pending events are waiting to be delivered can cause a crash. You must not delete the QObject directly if it exists in a different thread than the one currently executing. Use deleteLater() instead, which will cause the event loop to delete the object after all pending events have been delivered to it.

I have also dig though the deleteLater() code. It just makes sure that the Object is finally deleted by the same thread that runs the event queue for it.
It does not help in this special case since the Slave CO is already deleted in the correct thread, but it may help in other cases.

What does it mean for us:
* We must not delete Slave COs form a thread that does not drive its event queue, like our engine thread, which has none.
** Every QObject has a event Queue. In case of our engine thread, it is the main queue, since it is inherited during initializing the engine. This means we have no issue here, since the engine is also deleted by the main thread.
* We must not delete a CO when we have a direct connection from a different thread. The thing becomes worse because we always have an internal direct connection.

I think the later can be fixed by putting some more logic into the connection setup. We probably handle the connection types separate and setup the internal connections accordingly.
Draft:
* connectValueChanged Auto -> COP = Auto / SCO = Auto
* connectValueChanged Direct -> COP = Direct / SCO = Direct
* connectValueChanged Queued -> COP = Auto / SCO = Queued
* connectValueChanged BlockingQueued -> Assert(false)

Any thoughts?

Changed in mixxx:
assignee: nobody → Daniel Schürmann (daschuer)
Daniel Schürmann (daschuer) wrote :
Changed in mixxx:
status: Confirmed → In Progress
milestone: 2.0.0 → 2.1.0
Changed in mixxx:
status: In Progress → Fix Committed
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers