Wireless blocks USB-Audio, leading to dropouts

Bug #579117 reported by Chris Hermansen
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
Fix Released
Medium
Unassigned

Bug Description

This problem did not occur in 9.10 or earlier. It may be related to bug 388081.

I have an external USB audio device, a HeadRoom Total BitHead.

In 10.04, while playing music with Rhythmbox, every few minutes the sound will stutter. I have added the following line to /etc/pulse/client.conf:

extra-arguments = --log-target=syslog --realtime=1 --high-priority=1 --fail=1 -vvv

which makes messages like the following appear in the system messages file:

May 11 19:40:42 temuko pulseaudio[1311]: alsa-sink.c: Underrun!
May 11 19:40:42 temuko pulseaudio[1311]: alsa-sink.c: Increasing wakeup watermark to 30.00 ms
May 11 19:40:42 temuko pulseaudio[1311]: alsa-sink.c: Underrun!
May 11 19:40:42 temuko pulseaudio[1311]: alsa-sink.c: Increasing wakeup watermark to 34.99 ms

I also added my userid to the pulse groups and the sound group (this plus the extra arguments noted above are suggested as a workaround but do not seem to have much effect).

I have also tried the following in /etc/pulse/default.pa:

load-module module-udev-detect tsched=0

which again seems to have no effect, although it may be there is a subtle difference in the stutter sound.

There is another sound card (built in) that I could try if that seems like a useful experiment. I would also be willing to try other debugging.

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: alsa-base 1.0.22.1+dfsg-0ubuntu3
ProcVersionSignature: Ubuntu 2.6.32-22.33-generic 2.6.32.11+drm33.2
Uname: Linux 2.6.32-22-generic i686
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.21.
Architecture: i386
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC2: clh 1311 F.... pulseaudio
Card0.Amixer.info:
 Card hw:0 'IXP'/'ATI IXP rev 0 with ALC250 at 0xd0004400, irq 17'
   Mixer name : 'Realtek ALC250 rev 2'
   Components : 'AC97a:414c4752'
   Controls : 33
   Simple ctrls : 21
Card1.Amixer.info:
 Card hw:1 'Modem'/'ATI IXP Modem rev 1 at 0xd0004800, irq 17'
   Mixer name : 'Silicon Laboratory Si3036,8 rev 7'
   Components : 'AC97m:53494c27'
   Controls : 3
   Simple ctrls : 3
Card1.Codecs.codec97.0.mc97.1.1:
 1-1/0: Silicon Laboratory Si3036,8 rev 7

 Extended modem ID: codec=1 LIN1
 Modem status : GPIO MREF ADC1 DAC1 PRE(ADC2) PRF(DAC2) PRG(HADC) PRH(HDAC)
 Line1 rate : 8000Hz
Card2.Amixer.info:
 Card hw:2 'default'/'Burr-Brown from TI USB Audio CODEC at usb-0000:00:13.0-3, full s'
   Mixer name : 'USB Mixer'
   Components : 'USB08bb:2900'
   Controls : 2
   Simple ctrls : 1
Date: Tue May 11 21:09:57 2010
InstallationMedia: Ubuntu 10.04 LTS "Lucid Lynx" - Release i386 (20100429)
PackageArchitecture: all
ProcEnviron:
 PATH=(custom, no user)
 LANG=en_CA.utf8
 SHELL=/bin/bash
SelectedCard: 2 default USB-Audio - USB Audio CODEC
SourcePackage: alsa-driver
Symptom: audio
Title: [USB-Audio - USB Audio CODEC ] Playback problem
dmi.bios.date: 11/16/2004
dmi.bios.vendor: TOSHIBA
dmi.bios.version: V1.50
dmi.board.name: EDW10
dmi.board.vendor: TOSHIBA
dmi.board.version: Null
dmi.chassis.asset.tag: *
dmi.chassis.type: 10
dmi.chassis.vendor: TOSHIBA
dmi.chassis.version: N/A
dmi.modalias: dmi:bvnTOSHIBA:bvrV1.50:bd11/16/2004:svnTOSHIBA:pnSatelliteA70:pvrPSA70C-RX100E:rvnTOSHIBA:rnEDW10:rvrNull:cvnTOSHIBA:ct10:cvrN/A:
dmi.product.name: Satellite A70
dmi.product.version: PSA70C-RX100E
dmi.sys.vendor: TOSHIBA

Revision history for this message
Chris Hermansen (c-hermansen) wrote :
Revision history for this message
David Henningsson (diwic) wrote :

This might be a long shot, but you have these messages in your dmesg:

[ 5391.021483] ath5k phy0: noise floor calibration timeout (2412MHz)
[ 5391.733816] ath5k phy0: noise floor calibration timeout (2417MHz)
[ 5392.445978] ath5k phy0: noise floor calibration timeout (2422MHz)
[ 5511.022563] ath5k phy0: noise floor calibration timeout (2412MHz)
[ 5511.751707] ath5k phy0: noise floor calibration timeout (2417MHz)

Could it be that the wireless blocks the kernel for a long time, and that is causing the underruns? If you disable the wireless, does that make the dmesg messages and/or the underruns go away?

Changed in alsa-driver (Ubuntu):
status: New → Incomplete
Revision history for this message
Chris Hermansen (c-hermansen) wrote :

You may be onto something here.

I do have those messages in my syslog.

I disabled wireless (hmm the switch doesn't seem to work anymore! I wonder how long that's been the case? had to use ifconfig). Then I started rhythmbox, played two songs, with no stutter and no underrun messages.

Then I quit rhythmbox, started wireless with ifconfig, connected to my router, started rhythmbox again, played the same first song - and got two stutters in the the middle.

However, I did not see any fresh noise floor messages, or any other intervening ath5k messages. They may be around somewhere, though. I find ath5k messages in syslog, some alsa messages there too, others in messages.

Anyway, should I try something else?

Revision history for this message
David Henningsson (diwic) wrote :

Good question. Perhaps you can verify that the wireless actually blocks everything for a long while, by running latencytop (install the latencytop package) and see if you get something that confirms the suspicion?

summary: - [USB-Audio - USB Audio CODEC ] Playback problem
+ Wireless blocks USB-Audio, leading to dropouts
affects: alsa-driver (Ubuntu) → linux (Ubuntu)
Changed in linux (Ubuntu):
importance: Undecided → Medium
Revision history for this message
Chris Hermansen (c-hermansen) wrote : Re: [Bug 579117] Re: [USB-Audio - USB Audio CODEC ] Playback problem

Ok, so I installed latencytop and ran it. I've been looking at the
display but can't see anything specifically related to wireless
operation.

I started up rhythmbox and ran it, waiting for an underrun. When I got
one, I waited a few seconds and then hit refresh on the latencytop
display, then freeze.

On the left hand side under Global I see

events/1 194.3
kondemand/0 167.5
kondemand/1 102.4
jbd2/sda1-8 57.9
compiz 39.8
events/0 33.1
sundaemon 22.1
Xorg 18.1

everything else is below 10.0 ms.

Looking under events/1, I see that the cause is Scheduler: waiting for
CPU. Same for the two kondemand's. The jdb2/sda1-8 is doing three
things: writing buffer to disk (synchronous) which takes 57.9 ms,
Scheduler: waiting for cpu which is 2.1 ms, and writing a page to disk
which is 0.7 ms.

Well, none of that speaks to me, so I've un-frozen latencytop and now
I'm watching it refresh itself every 30s.

Generally under global I see that the scheduler: waiting for cpu is at
the top. I just saw an underrun in the log now I'm waiting for a
refresh.

Under global there is:

Scheduler: waiting for cpu 274.5 ms
Waiting for TTY data 268.0 ms
opening cdrom device 6.8 ms

everything else well below 10ms.

There is a waiting for event (poll) at 5.0ms that is 36.1%. Is that
meaningful?

The backtrace for that is
poll_schedule_timeout
do_poll
do_sys_poll
sys_poll
syscall_call

When I go back down into the detailed targets under Global, I can't find
any big percentage used up by any one of the targets.

I'm a bit stuck here, any more specific advice? Please and thanks!

On Sun, 2010-05-16 at 11:25 +0000, David Henningsson wrote:

> Good question. Perhaps you can verify that the wireless actually blocks
> everything for a long while, by running latencytop (install the
> latencytop package) and see if you get something that confirms the
> suspicion?
>
>
> ** Summary changed:
>
> - [USB-Audio - USB Audio CODEC ] Playback problem
> + Wireless blocks USB-Audio, leading to dropouts
>
> ** Package changed: alsa-driver (Ubuntu) => linux (Ubuntu)
>
> ** Changed in: linux (Ubuntu)
> Importance: Undecided => Medium
>

--
Regards,

Chris Hermansen · mailto:<email address hidden>

Revision history for this message
David Henningsson (diwic) wrote :

Oh, Latencytop has got a new UI for Lucid, nice :-) Although it is still missing decent documentation :-(

Anyway, I guess the most important thing would be to look at the "pulseaudio" process after an underrun and see if anything seems strange there, or if you get anything above 20 ms.

Revision history for this message
Chris Hermansen (c-hermansen) wrote : Re: [Bug 579117] Re: Wireless blocks USB-Audio, leading to dropouts

Now that I've watched the messages file enough, I see there's a pattern;
I get several successive underruns fairly quickly with the "wakeup
watermark" increased from 30ms through 60ms, then things level off for
awhile.

Anyway, after the first underrun, here's what I see in pulseaudio:

scheduler: waiting for cpu 19.8ms 2.5%
userspace lock contention 4.0ms 0.1%
waiting for event (poll) 1.5ms 0.0%

ok I've unfrozen latencytop and now I'm waiting for another underrun...

ok there's one, now we have given up on increasing wakeup watermark and
we are increasing minimal latency to 1.00 ms

here's what I see now in pulseaudio:

scheduler: waiting for cpu 2.6ms 7.9%
waiting for event (poll) 2.1ms 0.1%

another unfreeze, another underrun where minimal latency is increased to
2.00ms

in pulseaudio, now we only have

scheduler: waiting for cpu 2.3ms 11.2%

ok no new underruns, but the last automatic refresh in latencytop showed
a big value - 22.5 ms - in Scheduler: waiting for cpu... hmmmm

and then there was an underrun

now as I watch, I see typically 3.0ms...

another underrun, the max is showing at 4.6ms... now it's 1.9ms

I've had 10 underruns since starting rhythmbox up 20 minutes ago.

Does any of this help?

On Wed, 2010-05-19 at 00:52 +0000, David Henningsson wrote:

> Oh, Latencytop has got a new UI for Lucid, nice :-) Although it is still
> missing decent documentation :-(
>
> Anyway, I guess the most important thing would be to look at the
> "pulseaudio" process after an underrun and see if anything seems strange
> there, or if you get anything above 20 ms.
>

--
Regards,

Chris Hermansen · mailto:<email address hidden>

Revision history for this message
David Henningsson (diwic) wrote :

I'm afraid I don't know much more than you do at this point. Theoretically, if the watermark is 60 ms, anything below approx 50 ms shouldn't cause an underrun.

Btw, are the "calibration timeout" and the "underrun" messages synchronized? You should be able to see both if you look in /var/log/syslog.

Revision history for this message
Chris Hermansen (c-hermansen) wrote :

David, /var/log/syslog contains the calibration timeout
messages; /var/log/messages contains the Underrun! messages... but
anyway, that's what "sort" is for...

In answer to your question, they are often synchronized:

Starting with a sample at 07:54 yesterday:

the 1st Underrun happens at 07:54:11, same time as a calibration
timeout.
the 2nd Underrun happens at 18:23:44, some 4 minutes after the last
calibration timeout and 1.5 minutes before the next one
the 3rd Underrun happens at 18:25:03, same time as a calibration timeout
the 4th Underrun and 5th happen at 18:25:04, same time as a calibration
timeout
the 6th Underrun happens at 18:26:43, same time as a calibration timeout
the 7th Underrun happens at 18:28:43, same time as a calibration timeout
the 8th Underrun happens at 18:28:44, same time as two calibration
timeouts

Hmm maybe I should have said they are most often synchronized!

On Wed, 2010-05-19 at 06:13 +0000, David Henningsson wrote:

> I'm afraid I don't know much more than you do at this point.
> Theoretically, if the watermark is 60 ms, anything below approx 50 ms
> shouldn't cause an underrun.
>
> Btw, are the "calibration timeout" and the "underrun" messages
> synchronized? You should be able to see both if you look in
> /var/log/syslog.
>

--
Regards,

Chris Hermansen · mailto:<email address hidden>

Revision history for this message
Chris Hermansen (c-hermansen) wrote :
  • log.txt Edit (17.0 KiB, text/plain; name="log.txt"; charset="UTF-8")

In case it's useful, I attach a merge of syslog and messages (sort -u)
that covers from just before a startup of Rhythmbox, through a few
underruns, to terminating Rhythmbox partway through the first song.

On Thu, 2010-05-20 at 02:14 +0000, Chris Hermansen wrote:

> David, /var/log/syslog contains the calibration timeout
> messages; /var/log/messages contains the Underrun! messages... but
> anyway, that's what "sort" is for...
>
> In answer to your question, they are often synchronized:
>
> Starting with a sample at 07:54 yesterday:
>
> the 1st Underrun happens at 07:54:11, same time as a calibration
> timeout.
> the 2nd Underrun happens at 18:23:44, some 4 minutes after the last
> calibration timeout and 1.5 minutes before the next one
> the 3rd Underrun happens at 18:25:03, same time as a calibration timeout
> the 4th Underrun and 5th happen at 18:25:04, same time as a calibration
> timeout
> the 6th Underrun happens at 18:26:43, same time as a calibration timeout
> the 7th Underrun happens at 18:28:43, same time as a calibration timeout
> the 8th Underrun happens at 18:28:44, same time as two calibration
> timeouts
>
> Hmm maybe I should have said they are most often synchronized!
>
> On Wed, 2010-05-19 at 06:13 +0000, David Henningsson wrote:
>
> > I'm afraid I don't know much more than you do at this point.
> > Theoretically, if the watermark is 60 ms, anything below approx 50 ms
> > shouldn't cause an underrun.
> >
> > Btw, are the "calibration timeout" and the "underrun" messages
> > synchronized? You should be able to see both if you look in
> > /var/log/syslog.
> >
>
>
> --
> Regards,
>
> Chris Hermansen · mailto:<email address hidden>
>

--
Regards,

Chris Hermansen · mailto:<email address hidden>

Revision history for this message
David Henningsson (diwic) wrote :

First, the logging does not make sense to me; in the default configuration (/etc/rsyslog.d/50-default.conf) says that what goes to /var/log/messages is a subset of what goes to /var/log/syslog.

I feel somewhat out on deep water here (I should really learn more about this), but there is a test called "cyclictest", mostly used for real-time kernels, see https://rt.wiki.kernel.org/index.php/Worstcase_Latency_Test_Scenario
I have never used that test myself, but I believe it will show you some "spikes" in latency, and it would be interesting to know the maximum values of these.

Revision history for this message
Chris Hermansen (c-hermansen) wrote :

Ok I have tried out cyclictest, here are the results.

I used the version from the repositories, run with the command:

sudo time cyclictest --histogram=1000000 > histogram.txt

Prior to starting cyclictest, I opened the log file viewer and went to the end of the "messages". Then I started cyclictest, then Rythmbox; then I started a song playing.

174 seconds (by the messages time stamp) into the song, there was an Underrun! message. I quit out of Rhythmbox, then quit out of cyclictest.

There is considerable output in histogram.txt :-) :-) so I deleted the run messages in front of the histogram, then all histogram buckets with a zero count. There are still 12876 lines in the reduced histogram.txt file. I took the same file and sorted it in decreasing number of counts, and I've attached that.

You can see there are a lot of hits in the 999.999 bucket (that's one second, I think?) - that seems weird to me. Over and above that, there are a significant number of hits in the 944.013 bucket (that's 0,944 seconds, I think?). And lots of other high numbers.

Does this help?

Revision history for this message
David Henningsson (diwic) wrote :

Right, your results look way ahead of mine, I was listening to webradio for a minute (over wired network) and got a max latency of 3409 - 3,4 ms, i e nothing in any bucket above that. To conclude that it is the wireless causing the problem, could you try the same test without the wireless activated and see if the results look better?

Revision history for this message
Chris Hermansen (c-hermansen) wrote :

I certainly can; I hope to get to that tomorrow evening (GMT-8 right
now).

On Tue, 2010-05-25 at 01:52 +0000, David Henningsson wrote:

> Right, your results look way ahead of mine, I was listening to webradio
> for a minute (over wired network) and got a max latency of 3409 - 3,4
> ms, i e nothing in any bucket above that. To conclude that it is the
> wireless causing the problem, could you try the same test without the
> wireless activated and see if the results look better?
>

--
Regards,

Chris Hermansen · mailto:<email address hidden>

Revision history for this message
Chris Hermansen (c-hermansen) wrote :

Ok no one is in a hurry to get to bed here, so I tried it.

I disabled wireless by using the "disconnect" in NetworkManager applet; then I used sudo ifconfig wlan0 down to disable it.

I don't have a wired connection, though I notice eth0 is up as well. Anyway, I didn't touch it.

So, the results!

There are still some big latencies! I don't know if this is actually much of an improvement, though I guess it's a bit better in terms of how many larger latencies there are.

I've attached the histogram, sorted once again by decreasing # of instances.

For reference, the cyclictest command I used was:

sudo time cyclictest --histogram=1000000 > his_without_wireless.txt

I played about 8 minutes of music with no underruns.

Revision history for this message
Chris Hermansen (c-hermansen) wrote :

David, just a thought here, but please remember this is a USB audio device. Does that mean anything, in terms of latency issues?

Revision history for this message
David Henningsson (diwic) wrote :

Hmm...is there a way we can make those high latencies go away? I mean, if the wireless is down, usb-audio is unplugged, no music playing, the computer is just idle for a minute - are there still high latencies? Given you find such a state, it would be easier for you to try to find what is causing the latencies.

Revision history for this message
Chris Hermansen (c-hermansen) wrote :

Ok I made them go away a bit.

Clicked on "disconnect" in NetworkManager. ifconfig down on wlan0 and eth0 interfaces, leaving only lo0. Started up a terminal, nothing else (save what Ubuntu starts up on its own). Ran the same cyclictest as above.

Here are the first few lines of the histogram, sorted in decreasing order of hits:

# Histogram
107011 008137
058011 007286
036011 007205
073011 005618
047011 004095
007011 004067
071011 004026
087011 003693
107012 003457
103011 003336
032011 002941
030011 002642
038011 002626
022011 002561
075011 002515
058012 002502
006011 002399
036012 002334
051011 002294

Conversely, here are the worst delays and their counts:

126012 000076
126013 000027
126014 000009
126016 000005
126017 000001
126018 000002
126020 000001
126021 000002
126022 000001
126023 000001
126027 000002
126033 000001
126057 000001
126101 000001
126221 000001
126303 000001
126353 000001
126940 000001
# Total: 000259446
# Max Latency: 126940 / 1000000

So, not sure what to think about this, 127ms still seems like a fairly large latency.

Does this help - my laptop is a hmm five year old Toshiba with a dual-core 3.2Ghz P-4 and 1Gb of memory, a 5400rpm 120Gb EIDE hard drive, Phoenix BIOS. Maybe a 127ms latency is "normal" on one of these babies! Though it sounds like a big latency to me.

Revision history for this message
David Henningsson (diwic) wrote :

It sound like a big latency to me as well. You could be suffering from SMI (see http://en.wikipedia.org/wiki/System_Management_Mode ) in which case there is not that much one can do about it (try updating BIOS). I'm still new at this and learn as I go, so I'm not sure if 127 ms is something that a SMI could cause.

Latencytop doesn't say anything about where the 127 ms comes from?

Revision history for this message
Chris Hermansen (c-hermansen) wrote :

So I decided to run latencytop here while not doing much. The wireless is on and I'm reading your e-mail. After the first refresh, here's what I see under "Global":

Scheduler: waiting for cpu 255.7ms
fsync() on a file (type 'F' for details) 29.3ms
Opening file 5.5ms

others are smaller.

Seems that's a lot of "waiting for cpu"... that was all in a process called kondemand/0

Letting it run for awhile longer...

Scheduler: waiting for cpu 289.2 ms, also from kondemand/0

Awhile longer, and it's jdb2/sda1-8 atg the top with 121.1 ms
longer still, and its firefox-bin at the top with 56.2 ms (in an fsync())
longer still and another firefox-bin with 46.9 in an fsync()
now kondemeand/0 is back at the top with 307.7ms in scheduler: waiting for cpu
now jdb2/sda1-8 at the top with 58.6, 44.1 of were in fsync()
back to firefox at 46.2 in fsync()
firefox again at 70.1 in fsync()

So just looking at the "causes", it seems that thw two biggies are

scheduler: waiting for cpu
fsync() on a file

fsync often in the 40-80 range
schedule from low tens to in the hundreds

hmm just see an ACPI hardware access at 38.6 and reading ext3 directory htree at 27.

All else seems to be below 10ms.

Does this help?

Revision history for this message
Chris Hermansen (c-hermansen) wrote :

David, I was doing some further digging on the "scheduler: waiting for cpu" thing. There is this thread that is interesting:

https://patchwork.kernel.org/patch/52509/

where it seems others are seeing high-ish latencies in relation to this. I tried rebooting with the nohz=off but I saw no appreciable difference in the latencies. I'm completely reluctant to try to follow the kernel patching thing "on the off chance".

There also seem to be comments about big latencies with kondemand in relation to power management.

And I wonder if power management / ACPI might get into the problem here as most things ACPI-related don't seem to work very well on this Toshiba (eg suspend to RAM, and as I mentioned above my switch no longer seems to turn off the wireless network).

Does this just seem like a bunch of red herring?

Thanks for all your patience and thought on this problem.

I'm checking on the idea of a bios upgrade.

Revision history for this message
Chris Hermansen (c-hermansen) wrote :

On the bios upgrade: my bios is 1.5 which seems to be the latest on Toshiba's website. So I guess there's nothing there.

Revision history for this message
Chris Hermansen (c-hermansen) wrote :

Three more things to add.

First, I have a very similar, slightly older laptop than this one. Same model number etc but about an 8 month earlier build with a slightly slower processor. It's a kind of "spare". It has 10.04 on it, but otherwise nothing much in the way of installed stuff. So I installed latencytop and ran it; very similar (long) latencies on it.

Second, I disconnected the USB audio card, activated the internal sound card, played some music, and a few minutes later got an underrun. So it's not related to the USB external sound card thing. Here's syslog around one of those. The shame of an underrun on Bob Marley.

May 29 15:09:16 temuko pulseaudio[1261]: alsa-sink.c: Rewound 8796 bytes.
May 29 15:09:16 temuko pulseaudio[1261]: sink.c: Processing rewind...
May 29 15:09:16 temuko pulseaudio[1261]: sink-input.c: Have to rewind 8796 bytes on render memblockq.
May 29 15:09:16 temuko pulseaudio[1261]: sink-input.c: Have to rewind 8796 bytes on implementor.
May 29 15:09:16 temuko pulseaudio[1261]: source.c: Processing rewind...
May 29 15:09:17 temuko pulseaudio[1261]: alsa-sink.c: Wakeup from ALSA!
May 29 15:09:17 temuko kernel: [ 172.142568] ath5k phy0: noise floor calibration timeout (2422MHz)
May 29 15:09:17 temuko pulseaudio[1261]: alsa-source.c: Wakeup from ALSA!
May 29 15:09:17 temuko pulseaudio[1261]: alsa-sink.c: Underrun!
May 29 15:09:17 temuko pulseaudio[1261]: alsa-source.c: Overrun!
May 29 15:09:17 temuko pulseaudio[1261]: alsa-source.c: Increasing minimal latency to 16.00 ms
May 29 15:09:17 temuko pulseaudio[1261]: alsa-sink.c: Increasing wakeup watermark to 50.00 ms
May 29 15:09:17 temuko pulseaudio[1261]: alsa-source.c: latency set to 20.00ms
May 29 15:09:17 temuko pulseaudio[1261]: alsa-source.c: hwbuf_unused=62008
May 29 15:09:17 temuko pulseaudio[1261]: alsa-source.c: setting avail_min=442
May 29 15:09:17 temuko pulseaudio[1261]: protocol-native.c: Underrun on ''Exodus' by 'Bob Marley & the Wailers'', 0 bytes in queue.
May 29 15:09:17 temuko pulseaudio[1261]: protocol-native.c: Requesting rewind due to rewrite.
May 29 15:09:17 temuko pulseaudio[1261]: alsa-sink.c: Requested to rewind 42612 bytes.
May 29 15:09:17 temuko pulseaudio[1261]: alsa-sink.c: Limited to 5832 bytes.
May 29 15:09:17 temuko pulseaudio[1261]: alsa-sink.c: before: 1458
May 29 15:09:17 temuko pulseaudio[1261]: alsa-sink.c: after: 1458
May 29 15:09:17 temuko pulseaudio[1261]: alsa-sink.c: Rewound 5832 bytes.
May 29 15:09:17 temuko pulseaudio[1261]: sink.c: Processing rewind...

Third, looking at the messages file, thinking of that ath5k involvement, look at the following messages. Does this indicate some fundamental problem that might be making the ath5k interfere somehow?

May 23 10:03:55 temuko kernel: [ 2616.230801] WARNING: at /build/buildd/linux-2.6.32/drivers/net/wireless/ath/ath5k/base.c:1142 ath5k_tasklet_rx+0x53b/0x5
a0 [ath5k]()
May 23 10:03:55 temuko kernel: [ 2616.230801] Hardware name: Satellite A70
May 23 10:03:55 temuko kernel: [ 2616.230801] invalid hw_rix: 1a

Revision history for this message
David Henningsson (diwic) wrote :

There is another way we could try. You began this entire thread by saying that this did not occur in 9.10. Can you verify that with latencytop and/or cyclictest, what are the maximum latencies there?

Assuming 9.10 latencies are way lower, it is possible to try to find the exact kernel patch responsible for this problem.

Revision history for this message
Chris Hermansen (c-hermansen) wrote :

Ok so fresh hard drive fresh 9.10 install with latencytop running alongside Bob Marley.

Here's what I see.

This is the original disk that came with the laptop, a 100gb 4400 rpm drive; it's pretty slow.

I see two causes with higher latency: fsync() on a file, and scheduler: waiting for cpu

fsync() on a file just reported 67.6ms, but I have seen it over 1 second.
Scheduler: waiting for cpu just reported 34.1ms but I have seen it bulge to 300+ when one of those darn ath5k phy0: noise floor calibration timeout (2412MHz) messages come up.

I have heard no stuttering. There are no "underrun" messages in syslog or messages, mind you I haven't turned on the verbose mode on the pulseaudio daemon.

ok there was another ath5k message and a 273.1ms Scheduler: waiting for CPU.

now scheduler is 38.9ms.

now scheduler is 46.6ms

now scheduler is 44.2ms

So I am concluding that the latencies are lower except when the ath5k noise floor thing happens. And generally I'm not hearing / seeing evidence of underruns.

David, it's convenient for me to do other testing in both versions of Ubuntu, so if you have any other ideas, please let me know.

Thanks!

Revision history for this message
Jeremy Foshee (jeremyfoshee) wrote :

This bug report was marked as Incomplete and has not had any updated comments for quite some time. As a result this bug is being closed. Please reopen if this is still an issue in the current Ubuntu release http://www.ubuntu.com/getubuntu/download . Also, please be sure to provide any requested information that may have been missing. To reopen the bug, click on the current status under the Status column and change the status back to "New". Thanks.

[This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: kj-expired
Changed in linux (Ubuntu):
status: Incomplete → Expired
Revision history for this message
Chris Hermansen (c-hermansen) wrote :

I have been carrying out tests as per David H.'s instructions but we have not found any workarounds nor fixes. I don't believe David is waiting for more info so I marked it as "new"; sorry if that's not right.

Changed in linux (Ubuntu):
status: Expired → New
Revision history for this message
Chris Hermansen (c-hermansen) wrote :

I feel stuck at this point. I would like to test some more but I don't really have any good ideas.

Is there some kind of pulse audio reconfiguration that I should try?

Should I try to remove pulse audio?

Should I try a fresh install of Kubuntu 10.04 on my spare hard drive to see if it causes problems?

Revision history for this message
David Henningsson (diwic) wrote :

Well, I'm also out of really good ideas, but if you have the time, I guess you just have to continue investigating and trying combinations of what causes high latencies and what don't. E g trying Karmic with a different kernel, trying a preemptive kernel, a mainline kernel, etc.
If you find one kernel that works and one kernel that doesn't (and nothing else is changed), you can continue using "git bisect" to find the exact commit causing the error. I'm quite certain the error is kernel related.

As for pulseaudio, if tsched=0 does not help, I don't think anything else will - but don't take my word for it, feel free to try whatever you want.

Revision history for this message
Chris Hermansen (c-hermansen) wrote :

Here's an interesting thing!!!!

I was reading over "First Steps" on the pulseaudio website, and I
noticed the pulseaudio command "play-file".

I fooled around a bit trying to kill pulseaudio and start it up from the
command line with pulseaudio -nC but it seems some system service is
restarting it automagically.

Aaaaaaaaanyway, then I noticed the use of "pacmd" to connect to the
daemon, so I tried that, and sure enough, connected to the daemon.

Then I copied a couple of .flac files from my Music directory to my home
directory as "test.flac" and "test2.flac", and I used "play-file" to
play them.

And I could not get any stuttering!!!

Thinking maybe it was something to do with restarting the daemon a few
times, I started up Exaile and played a file; it stuttered within a
minute or so of the beginning of the song.

Am I onto something here?

BTW I am back to a vanilla pulseaudio configuration.

On Sat, 2010-06-19 at 06:11 +0000, David Henningsson wrote:

> Well, I'm also out of really good ideas, but if you have the time, I guess you just have to continue investigating and trying combinations of what causes high latencies and what don't. E g trying Karmic with a different kernel, trying a preemptive kernel, a mainline kernel, etc.
> If you find one kernel that works and one kernel that doesn't (and nothing else is changed), you can continue using "git bisect" to find the exact commit causing the error. I'm quite certain the error is kernel related.
>
> As for pulseaudio, if tsched=0 does not help, I don't think anything
> else will - but don't take my word for it, feel free to try whatever you
> want.
>

--
Regards,

Chris Hermansen · mailto:<email address hidden>

Revision history for this message
Chris Hermansen (c-hermansen) wrote :

Ok, more experimentation with play-file in pacmd, and there is no stuttering. Doesn't seem to be affected if I am running e-mail, web browser, etc simultaneously.

I've jumped back to Exaile to play the same music. Sure enough, some ath5k messages and a stutter.

I see a message cluster in syslog I haven't noticed before during this test; here it is:

 [ 8842.443712] ath5k phy0: noise floor calibration timeout (2422MHz)
Jun 22 21:12:56 temuko pulseaudio[1267]: alsa-sink.c: ALSA woke us up to write new data to the device, but there was actually nothing to write!
Jun 22 21:12:56 temuko pulseaudio[1267]: alsa-sink.c: Most likely this is a bug in the ALSA driver 'snd_usb_audio'. Please report this issue to the ALSA developers.
Jun 22 21:12:56 temuko pulseaudio[1267]: alsa-sink.c: We were woken up with POLLOUT set -- however a subsequent snd_pcm_avail() returned 0 or another value < min_avail.

Any clues from this?

Revision history for this message
David Henningsson (diwic) wrote :

Clues? No, rather the opposite...

It sounds like you're getting an underrun on the front end of PA (the application does not supply PA with sound data quickly enough), but that should have showed up in the PA log as "0 bytes on memblock renderq" or something like that (don't remember the exact wording).

Underruns on the backend (PA does not supply the driver with sound data quickly enough) show up as "alsa-sink: Underrun!" in the PA log, and that was what you had, right?

But backend underruns should not depend on what application is supplying PA with data, so that does not make sense.

Revision history for this message
Chris Hermansen (c-hermansen) wrote :

Tonight I had a few minutes to experiment.

I decided to try VLC as it is a different kind of audio client, lots of things to tweak, and it knows about pulse audio.

So I installed it, and picked a directory, and started playing it. Below is the log file info related to a stutter. Note the first audible stutter occurred where the pulseaudio messages happened. Since then I have had stutters that seem only to be linked to the ath5k message. See pulseaudio complaining about being woken up for no reason (I get that, happens to dads, too).

Aaaaaaaaanyway... David, is there something I can do to get rid of this ath5k problem, since it seems to be so linked to the whole mess?

Jun 30 19:24:32 temuko kernel: [ 1901.549765] ath5k phy0: noise floor calibration timeout (2412MHz)
Jun 30 19:24:33 temuko kernel: [ 1902.270582] ath5k phy0: noise floor calibration timeout (2417MHz)
Jun 30 19:24:33 temuko kernel: [ 1902.989994] ath5k phy0: noise floor calibration timeout (2422MHz)
Jun 30 19:26:32 temuko pulseaudio[1298]: alsa-sink.c: ALSA woke us up to write new data to the device, but there was actually nothing to write!
Jun 30 19:26:32 temuko kernel: [ 2021.549973] ath5k phy0: noise floor calibration timeout (2412MHz)
Jun 30 19:26:32 temuko pulseaudio[1298]: alsa-sink.c: Most likely this is a bug in the ALSA driver 'snd_usb_audio'. Please report this issue to the ALSA developers.
Jun 30 19:26:32 temuko pulseaudio[1298]: alsa-sink.c: We were woken up with POLLOUT set -- however a subsequent snd_pcm_avail() returned 0 or another value < min_avail.
Jun 30 19:26:33 temuko kernel: [ 2022.274829] ath5k phy0: noise floor calibration timeout (2417MHz)
Jun 30 19:26:33 temuko kernel: [ 2022.629104] ath5k phy0: noise floor calibration timeout (2417MHz)
Jun 30 19:26:34 temuko kernel: [ 2023.294463] ath5k phy0: noise floor calibration timeout (2422MHz)

Revision history for this message
Chris Hermansen (c-hermansen) wrote :

Today's experiment - try it with Kubuntu.

I still get noise floor calibration timeout messages.

However, I have listened to 10 songs or so now and no skipping. Also no alsa messages in the log.

Watching latencytop run, I still see the big latencies for "Scheduler: waiting for cpu" from time to time (317ms just now; three refreshes later with nothing greater than 50ms, there was just a 266ms). But no skipping/stuttering, and still no error messages in dmesg.

David, if you're still watching... above you mention trying a different kernel. Is this just a matter of using Synaptic to get a different kernel from the repos (like the rt kernel)? Or is there more to it than that?

Is there any way to try it without pulseaudio to see if the problem appears to go away?

Revision history for this message
David Henningsson (diwic) wrote :

I'm still watching, but I lack both time and ideas. Kubuntu 10.04 does not run pulseaudio so that could be the reason for differences in behavior. (Kubuntu 10.10 will run Pulseaudio, btw.)
That there are no "noise floor calibration timeout"s under kubuntu seems interesting though. I'm not sure what it is that Kubuntu does not do in that case.
This is a case when there is hard to know the real root cause and what messages are just other victims of the latency, it could be that the calibration timeouts are not the real causes.

As for trying different kernels, I think it would be interesting to try linux-preempt (available in Lucid, use Synaptic to get it) and see if it makes a difference.

Revision history for this message
Chris Hermansen (c-hermansen) wrote :

I see in reading your response I have mislead you slightly - the Kubuntu test DID SHOW noise floor calibration timeouts; just no (apparent) disturbance of alsa / pulse.

I do not see a linux-preempt in lucid synaptic... I changed to the main server in case there was something weird there... but no preempt. Is there some other repo I need to enable? According to the wiki page on this kernel, it should be there...

So I tried linux-rt. This is a bit older kernel 2.6.31-11-rt. The machine booted ok; I logged in, started the log viewer, started rhythmbox (which crashed, uh-oh). So I then started Exaile (which worked) and started playing some music.

In the log file: no messages whatsoever about noise floor calibration timeouts. On about the 5th song, I saw this:

Jul 4 09:58:34 temuko pulseaudio[4344]: alsa-sink.c: ALSA woke us up to write new data to the device, but there was actually nothing to write!
Jul 4 09:58:34 temuko pulseaudio[4344]: alsa-sink.c: Most likely this is a bug in the ALSA driver 'snd_usb_audio'. Please report this issue to the ALSA developers.
Jul 4 09:58:34 temuko pulseaudio[4344]: alsa-sink.c: We were woken up with POLLOUT set -- however a subsequent snd_pcm_avail() returned 0 or another value < min_avail.

Watching latencytop, I get no long latencies for waiting for cpu; they're all in the 25-35ms range now. I started up a bunch of stuff (web browser, etc) and got some longer latencies then (writing a page to disk 232ms, reading from file 139ms, page fault 102ms, fsync on a file 100ms).

Revision history for this message
David Henningsson (diwic) wrote :

Linux-preempt should be in the main repositories, and I have it here. Never tried it though. The rt kernel is not officially supported by the kernel team and is known to be unstable on some machines.

Revision history for this message
Chris Hermansen (c-hermansen) wrote :

hmm. maybe it's only an amd64 kernel? That's what I see at

http://packages.ubuntu.com/hu/lucid/linux-image-preempt

In any case, when I try apt-get here are the results:

clh@temuko:~$ sudo apt-get install linux-preempt
[sudo] password for clh:
Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Couldn't find package linux-preempt
clh@temuko:~$

This is with all Ubuntu Software enabled (Canonical-supported, Community-maintained, Proprietary drivers, Software restricted) except for source code, and downloading from Main server.

Revision history for this message
Chris Hermansen (c-hermansen) wrote :

Sorry, but one other thought / question.

Since I'm getting an underrun, doesn't that kind of imply that my client is not able to reliably fill pulseaudio's buffers? Maybe I should run the client at a higher priority?

Revision history for this message
David Henningsson (diwic) wrote :

Guess you're right, for some (to me unknown) reason linux-preempt is not built for i386...

As for the underrun question: if you get "alsa-sink.c: Underrun!", that's the alsa driver telling PA it didn't get any buffers from PA in time.
If you get something like "protocol-native.c: Underrun on 'Rythmbox', 0 bytes in queue.", that's PA didn't get any buffers from Rythmbox in time.
Roughly, everything starting with "alsa-sink.c" or "alsa-source.c" in the log is related to PA - Alsa driver communication, and everything else is related to Client - PA communication.
The PA - alsa connection should work even if the client misbehaves, but that's a simplification - in some cases the client can affect the PA - alsa communication by e g specifying what latency it needs.

Revision history for this message
Chris Hermansen (c-hermansen) wrote :

Well, renice-ing rhythmbox down to -9 did not seem to help. I got stuttering every two-three minutes. Besides the ath5k noise floor calibration message, the only message I received during the test was

Jul 4 15:37:15 temuko pulseaudio[1273]: ratelimit.c: 848 events suppressed

I don't always get the alsa-sink.c: Underrun! message, but I don't ever recall getting the protocol-native message.

This group of messages which is occasionally but not always reported by pulseaudio also puzzles me:

alsa-sink.c: ALSA woke us up to write new data to the device, but there was actually nothing to write!
alsa-sink.c: Most likely this is a bug in the ALSA driver 'snd_usb_audio'. Please report this issue to the ALSA developers.
alsa-sink.c: We were woken up with POLLOUT set -- however a subsequent snd_pcm_avail() returned 0 or another value < min_avail.

I have to say I'm pretty stuck at this point.

Brad Figg (brad-figg)
tags: added: kj-triage
Revision history for this message
Chris Hermansen (c-hermansen) wrote :

Some new, and I think good, news!

Bug #427439 morphs at end into someone having a problem with "noise floor calibration timeout" messages. He repairs this problem by trying a new kernel, located at http://kernel.ubuntu.com/~kernel-ppa/mainline/v2.6.34-lucid/

Well now that I am an expert :-) in installing new kernels, I decided to try it too.

First of all, my machine has been up for about 1/2 hour now since the install / reboot, and I see no more "noise floor calibration timeout" messages.

Secondly, I'm into about 25 minutes worth of music playback with no stuttering so far. There was one pulseaudio message displayed shortly after starting up Exaile, which read "pulseaudio[1404]: ratelimit.c: 2 events suppressed", but other than that, no more complaining from pulseaudio, either.

So, 1/2 hour of audio bliss doesn't completely convince me, but with the old kernel I would have had 10 or so stutters by now. I am getting somewhere at least!!!

Revision history for this message
Chris Hermansen (c-hermansen) wrote :

Ok, after some few hours more of listening, I can report that, if not fixed, the problem is diminished by several orders of magnitude at least.

For clarity, the fix is to install the 2.6.34-lucid kernel from the kernel-ppa's.

I could mark this as solved but I don't know if that's appropriate. Any comments?

Revision history for this message
David Henningsson (diwic) wrote :

I haven't seen anyone with similar problem, and you seem happy, so I'm closing the bug. If installing a later kernel works, we can assume it is fixed in Maverick, so marking as "fix released".
Thanks for the cooperation, at least I learned something in the process :-)

Changed in linux (Ubuntu):
status: New → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.