Analyser queue signal rework

Bug #1085613 reported by Daniel Schürmann on 2012-12-02
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Mixxx
Medium
Daniel Schürmann

Bug Description

The attached patch is a rework of the signals emitted from Analyser queue patch.

* Removes unused Data from the signals
* It puts only one signal on the Qt event Queue of the GUI thread each progress step, instead of two.
* It ensures that only one process signal is on the GUI Event queue a time. Before the patch, the queue could get poluted with progress updates in case of a long process in the GUI thread.
* No priority inversion
* Small analysis speed up from 7.5 s to 7 s for a 3:30 track on my system.

In general we should prevent to fire timer driven signals between threads, to prevent the GUI Thread from process outdated data when there is high load anyway.
This patch is an improved solution from what was rejected in Bug #1008721.

Daniel Schürmann (daschuer) wrote :
Changed in mixxx:
assignee: nobody → Daniel Schürmann (daschuer)
status: New → In Progress
importance: Undecided → Medium
Changed in mixxx:
milestone: none → 1.11.0
RJ Skerry-Ryan (rryan) wrote :

Thanks for putting the patch together. I am trying to understand why it gives a speed improvement.

The code in lp:mixxx/1.11 emits 2 signals per progress update:
AnalyserQueue trackProgress(TrackPointer, int) and
TrackInfoObject analyserProgress(int)

Both are queued in the event queue so their slots are invoked by the main thread.

The code in your patch emits:
AnalyserQueue updateProgress()
AnalyserQueue trackProgress(int) and
TrackInfoObject analyserProgress(int)

But only updateProgress should be queued in the event queue while the latter 2 are direct invoked.
Just making sure I understand the intended benefit of doing it. Correct me if I said something wrong.

The progressUpdateInhibitTimer delays at least 60ms between progress updates so we are really talking about the difference between 2 queued events every 60 ms and 1 queued event every 60ms.

My question is does this really have a measurable benefit on analysis time and Qt event queue length? The semaphore signalling increases the complexity of this section of code by a lot so I'd rather not do it if we don't get a benefit.

You measured 7.5s -> 7s. Did you measure it many times and take an average or is this just a one-off? I wonder what the average and variance of the distribution of analysis times for a single track is if you analyse it over and over. It's possible the 0.5s speed increase was just noise or within the variance of the measurement. I wonder how we could easily test if this is a statistically significant change.

Also there are many other factors at play. The engine is always calculating while you run these tests so it's adding to the measurement noise. We could make a standalone binary that instantiates the analyser queue and tries to analyze a track over and over.

It seems very odd that doing 1 extra queued event every 60ms is enough to cause 0.5s of overall analysis time difference.

Some other minor code comments:

* How does WOverview::slotTrackLoaded get called? I see no connections.

* How is WOverview::m_trackLoaded different from checking if WOverview::m_pCurrentTrack is not null?

* In DlgPrepare, it looks like slotTrackAnalysisProgress and trackAnalysisFinished are just rebroadcasts so they could just be a SIGNAL to SIGNAL connection instead of having a SLOT.

Daniel Schürmann (daschuer) wrote :

Hi RJ

Thank you for your review. I Think you read the code right.

The speed improvement is not the cause for this patch. I just put the argument in to show that the patch does not put additional delay to the analyser process. I measure it about five time with a manual stopwatch, so not that reliable.
Also for me it is not that odd, because an additional benefit is that the new updateProgress() signal does not put any additional data on. Transferring data by the QT event queue leads finally would to a malloc call, which is not a high performance task.

The main benefit of this patch is that it stops the Analyser from overload the QT event Queue, if the GUI thread is already busy.
The semaphore stops emitting and discards the signals and allows it again until the previous emit is processes.
In my tests on my Atom hardware I can see, depending on the Waveform settings extreme jerking when the Analyser is running.
GUI Control like waveform scratch is hard to do in this case.

With patch I have still jerking but not that extreme. The original solution with the priority inversion was slightly better but it puts additional delay on the analyser process for all users.

This patch is a tiny part of the whole performance issue. I can live without it, but ...

---

WOverview::slotTrackLoaded is connected in legacyskinparser to bastrackplayer, this is missing in the patch.

WOverview::m_trackLoaded is needed for my upcoming patch which will show "Ready to play, analyzing .." when the track is loaded but the analysis has not started.

RJ Skerry-Ryan (rryan) wrote :

OK -- if you see it improve jerking repeatably on your Atom then I can't argue with that :). Go ahead and commit it.

But I still don't understand. It's true malloc isn't free and the reference counting from copy-constructing TrackPointers isn't cheap but when we talk about 'expensive' and 'cheap' we're talking about the nanosecond timescale.

For something like jerking to happen I think we're having a much more serious problem than an extra malloc here and there every 60ms.

Daniel Schürmann (daschuer) wrote :

Hi RJ,
Yes you are right. We will have a timing improvement by not transferring data though the Qt events but that will not gain 500 ms improvement. In my test I have about 125 progress events. Saving 4 ms per progress call due to that is unrealistic.

There is no measurable Timing improvement on an idle System when doing mass analysis with this patch.
The improvement was measured on deck analysis with the other deck playing. So the benefit is due to not processed progress updates or simply a measurement fault.

I will check that.

Daniel Schürmann (daschuer) wrote :

Ah, here it is:
On my Netbook tryAquire returns 27 times false. So this should be the main performance gain.

Daniel Schürmann (daschuer) wrote :

Committed to lp:mixxx/1.11 #3572

Changed in mixxx:
status: In Progress → Fix Committed
RJ Skerry-Ryan (rryan) on 2013-05-09
Changed in mixxx:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers