New waveform doesn't show mids/his very well

Bug #976650 reported by Owen Williams on 2012-04-08
14
This bug affects 3 people
Affects Status Importance Assigned to Milestone
Mixxx
Medium
Owen Williams

Bug Description

The new waveform is wonderful and silky, but it seems to smooth out mid and high-end detail too much. I have a track that has a breakdown in which the old code shows plenty of detail (picture attached), but the new waveform shows a completely flat line. We should tweak the rendering of the new waveform so it can pick up low volume / high frequency detail better.

Thomas Vincent (vrince) wrote :

Yeah thing is to prevent flickering due to waveform strech (pitch), the range of value taken into acount to place waveform amplitude had to be increased. In other words, usings more consecutive values to compute waveform amplitude "smooth" the overall shape of the waveform.

For the moment low/mid/high are handled with the exact same code but in my opinion mid/high should be treated differently ... first there "power" is almost all the time too low to be efficiently display, then small peaks are flatten down by using a too wide "range".

I know it's not the answer your are waiting for but I started to adress those issues with a GLSL version of the waveform, but I don't have enougth time to make it stable enougth for 1.11 ...

Nervertheless I'll try to "tweak" the actual rendering code but keep in mind that small detail will produce some flickering.

Is the flicker due to the sub-pixel scrolling? Can you try removing that? (The fluttering effect it causes especially on LCD monitors looks horrible.)

Thomas Vincent (vrince) wrote :

Indeed it's link to sub pixel scrolling & scalling. I experiented a non subpixel version of it but all other renders beased on position [0;1] were displayed to their nearest integral positions in pixel world (no-subpixel) so they didn't moved in block and some little offset appear between those and the signal due to float to int rounding ... First feedback was not so good ...

You mean things like hot cue and beat markers? Yes, all renderers would have to be changed, then checked to make sure they are in sync with the actual audio being output. But to confirm, does removing sub-pixel rendering solve the flicker problem you mentioned in #2?

In many cases we have more visual samples than screen pixels so we must
combine the visual samples 'underneath' a given pixel. Today that is done
via a max() function on that data. The flickering issue Tom is referring to
in #2 is because as the maximum visual sample in any given pixel moves from
pixel to pixel, the pixels 'jump' or flicker.

This is actually caused by the fact that we do /not/ do sub-pixel rendering
as we did in the old waveform code. If we did subpixel rendering as we did
in <=1.10.x then we would draw every single visual sample we have and have
OpenGL take care of drawing them at a particular zoom level.

To solve this flickering problem, we sample slightly more visual samples
than lie under a given a screen pixel. The downside to this is that as Tom
says, it has a smoothing effect on the waveform which is part of the
problem that Owen reports above (high/mid detail is reduced).

On Sun, Apr 8, 2012 at 3:33 PM, Sean M. Pappalardo <
<email address hidden>> wrote:

> You mean things like hot cue and beat markers? Yes, all renderers would
> have to be changed, then checked to make sure they are in sync with the
> actual audio being output. But to confirm, does removing sub-pixel
> rendering solve the flicker problem you mentioned in #2?
>
> --
> You received this bug notification because you are a member of Mixxx
> Development Team, which is subscribed to Mixxx.
> https://bugs.launchpad.net/bugs/976650
>
> Title:
> New waveform doesn't show mids/his very well
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/mixxx/+bug/976650/+subscriptions
>

RJ Skerry-Ryan (rryan) on 2012-04-30
Changed in mixxx:
milestone: none → 1.11.0
tags: added: waveform
Owen Williams (ywwg) wrote :

So what's the recommended way of fixing this bug? I realized that lack of waveform detail is mostly why I'm not using 1.11 right now

Thomas Vincent (vrince) wrote :

I have a version of the filtered waveform based on triangle stripe that mostly reproduce the details we had have in 1.10.
Problem is this version is quite dependent of your OpenGL configuration and if antialising is switch off in your driver panel it should blink like hell. I don't know if it's possible to introduce this variation for 1.11.

Owen Williams (ywwg) wrote :

I did a lot of poking at the waveform code tonight and I think I figured out the biggest reason for this bug.

The waveform analyzer uses unsigned chars to collect the data:

m_stride.m_filteredData[Right][Low] += m_buffers[Low][i]*m_buffers[Low][i];

However, the butterworth filters work on and return CSAMPLEs, not
unsigned char. Values less than 1 gets rounded down to zero, which
means that the filtered waveform data for mids and highs is often just a
long string of zeros. (the actual numbers are often 10^-6 magnitude)

I was able to fix most of the problems by converting all of the unsigned
char's to CSAMPLES. I've attached a patch that shows that work. You'll need to delete your existing ~/.mixxx/analysis files to see the improvement.

While the waveforms look much nicer now, I've completely broken the serialization code. When mixxx tries to load the waveform from the file it's junk data. Also, all the other waveforms have to be rewritten since I've put things on a 0.0-1.0 scale instead of 0-255. Sorry!

Should I continue working on this fix, or should it be fixed some other way? (maybe I should pre-scaling the values to unsigned chars and pre-apply the power function?)

Owen Williams (ywwg) wrote :

Ok I took my own idea and did all the work in the analyzer. Now we don't have to rewrite the waveform viewers and the serialization code all works.

Thomas Vincent (vrince) wrote :

Using unsigned char was intended to reduce memory space and disk usage too. Perhaps linear storage with a 0-255 range is not a good idea since value are really low, you right on this one. But in this case instead of using CSAMPLE (with is basically float ?) we should store data with a well chosen non linear scale. What about a logarithmic scaling ? or a polynomial one ? If we stick we unsigned char we can pre-compute both scale transformation in a look-up table to limit run time overhead ...

Owen Williams (ywwg) wrote :

Yeah I agree that 8 bits should be plenty for storing the waveform data. My latest patch (comment 10) does precompute the scaling and I do convert from CSAMPLE in analyserwaveform.h (which is where the 127 scaling factor comes in).

Can you test the patch and see if the waveform scaling works well for you? If so, I am satisfied that this particular bug is solved. I see separate detail for bass, mids, and treble, and the new waveform is now much more useful to me.

However the CPU of the waveform analyzer is a big problem, mostly due to the filters and not helped by my addition of a power function. That's a different bug however.

Thomas Vincent (vrince) wrote :

I'll push a branch tomorrow with an other waveform based on pure OpenGL lines.
Really small details are quite easy to see and it look much more like 1.10. You will tell me if it suits you.
I'll try to make a simpler version in the actual "simple" version cause perf with pure OpenGl calls is way better.
It should solve a bit the performances issue too ...

Thomas Vincent (vrince) wrote :

As RJ said this is highly related to the way data is accumulated in waveform analyser and it seems he implemented a better way to do it. This plus the "line version" of te waveform signal (really similiar to 1.10) should be plenty enougth to solve this issue.

Changed in mixxx:
status: New → In Progress
assignee: nobody → Thomas Vincent (vrince)
Thomas Vincent (vrince) wrote :

What is the status of this one are the proposed accumulation fix already committed ?

Owen Williams (ywwg) wrote :

It's not committed yet, it's a big/delicate enough patch that I didn't want to just push it without approval.

Thomas Vincent (vrince) on 2012-05-23
Changed in mixxx:
assignee: Thomas Vincent (vrince) → nobody
RJ Skerry-Ryan (rryan) wrote :

I've adapted Owen's patch from #10 to fix the case where the user has waveforms generated from trunk or a previous 1.11.0 alpha release.

Now that we're max()'ing across the strides, I'm not sure that a pow(..., 0.316) is necessary. For me, the pow() caused the waveform to be somewhat muddled while I think the max() alone looks pretty good.

I also removed the factor of 0.5 from the replaygain in EnginePregain. Not sure why that was there (maybe Vittorio will comment).

Owen Williams (ywwg) wrote :

I found that having the pow() nicely accentuated low-level information in mids and highs, but if people think it looks ok without it I have no problem getting rid of it.

Owen Williams (ywwg) wrote :

Maybe the 0.5 gain explains why I've always needed a +5db pregain in my replaygain settings? It looks like this fix will cause everyone's levels to go up -- I guess that's ok because it's fixing a bug (cutting data in half and then doubling it can't be good for sound quality! That's like 15 bit sound instead of 16).

Thomas Vincent (vrince) wrote :

Nice, I have cross check with and independent pulse audio vumeter and output mixxx sound is just at the right level without the 0.5 factor. So for me it's definitely a fix.

Changed in mixxx:
importance: Undecided → Low
RJ Skerry-Ryan (rryan) wrote :

I opted for not removing the 0.5 gain because removing it caused clipping on many of my tracks at unity gain (no gain applied at all). I haven't dug into the history but I thought this was added when we added RG. It seems to have been hiding that we are causing clipping without any gain adjustments on a track (which shouldn't be the case if we just decode a track and play it). Instead I adjusted the waveform gain to compensate for that fixed gain.

As a minor performance optimization since we are now max()'ing instead of summing squares it suffices to take the max of fabs() without the squaring and then apply the square only to the max value of the stride in scaleSignal(). Owen -- I think your pow 0.316 works nicely for the high band. For the low and mid I think 0.5 works best. Since this is just sqrt it works out to a no-op for the mid/low/all bands so that makes the analysis even faster.

I'm committing your patch w/ my changes now. I think it looks nice for now and we can always iterate on this since the analyses are now versioned (so we have the ability to update and people's versions of Mixxx will automatically generate the latest version of our waveforms).

Thanks for looking into this!

Changed in mixxx:
assignee: nobody → Owen Williams (ywwg)
importance: Low → Medium
status: In Progress → Fix Committed
jus (jus) wrote :

This patch may need a little tweaking. Using the latest trunk with fresh DB and default settings
* Waveforms are clipped on top/bottom, even if visual gain is changed from 1.5 (default) to 1.0
* It makes the high/mid/low visual gain options useless cause you cant go below 1.0

Generally i find the high frequency visuals overemphasized, cant say much about the middles cause they are mostly hidden behind. This is mostly visible with non-EDM music where it is now hard to make out the peaks (beat).

Thomas Vincent (vrince) wrote :

I don't know exactly waveform version is handled but I think removing ~/.mixxx/analysis folder could be a good idea too. Perhaps RJ can give us more information here.

RJ Skerry-Ryan (rryan) on 2013-05-09
Changed in mixxx:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers