snd-hda-intel: distorted sound after resume, until the module is reloaded
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
linux-source-2.6.20 (Ubuntu) |
Won't Fix
|
Undecided
|
Unassigned | ||
linux-source-2.6.22 (Ubuntu) |
Fix Released
|
Undecided
|
Unassigned |
Bug Description
Binary package hint: linux-source-2.6.20
The ICH7 sound card on this Sony VAIO VGN-FE41E notebook works fine, until it is suspended to RAM and resumed again. When a sound is played, it seems to take about half a second of sound and repeats it lots of times. Then it takes the next half a second and does it again, etc. until the sound is finished.
Unloading/reloading the snd-hda-intel module makes the soundcard work again correctly.
Inspired by bug #55071, I looked at the contents of /proc/asound/
--- card0codec0pin_
+++ card0codec0pin_
@@ -40,7 +40,7 @@
0x09
Node 0x09 [Audio Selector] wcaps 0x300903: Stereo Amp-In
Amp-In caps: N/A
- Amp-In vals: [0x0f 0x0f]
+ Amp-In vals: [0x80 0x80]
Connection: 1
0x15
Node 0x0a [Pin Complex] wcaps 0x400181: Stereo
This happens
wvengen (wvengen) wrote : | #1 |
Changed in linux-source-2.6.20: | |
assignee: | nobody → ubuntu-kernel-team |
status: | Unconfirmed → Confirmed |
I'm hearing the same problem on a Sony Vaio VGN-FE41Z.
I read where you say "Unloading/
$ sudo modprobe -r snd-hda-intel
FATAL: Module snd_hda_intel is in use.
wvengen (wvengen) wrote : | #3 |
What about the volume control? I removed the volume control applet from my panel at the top of the screen; this is no problem since there are special keys for the volume anyway. Running an audio application, or even Firefox with a flash object may also prevent the module from unloading; this is something I haven't solved.
Re: unloading the module. It was in-use even after I stopped GDM and was working from a console.
I've been debugging this issue and have made a little progress, albeit not the kind we want.
The difference in the outputs reported here and in bug #55071 pre- and post-suspend of codex#0 is not of consequence. It represents the stereo-input device amplifier state, where 0x80 is mute. If after resume you use the volume control to alter the Recording->Capture state or level you'll see the change reflected in the Amp-In vals of Node 9.
You can obtain the same value during normal operations by toggling 'audio recording from capture' and reducing the level to zero; Node 9 will show val 0x80.
The user experience of this issue sounds to me like a problem I dealt with back in the 1980's and it is to do with handling of DMA buffers. To my ear at least, our current issue sounds like the DMA buffer isn't being updated when the device exhausts its contents, and the DMA transfer to the device is looping several times from the same buffer. What should happen is the device should raise an interrupt which hda_intel.
Some observations:
The patch in use on these Sony Vaio FE4x series is Sigmatel.
In hda_codec.
if (codec-
codec-
and during resume hda_codec.
if (codec-
codec-
However (this may or may-not be significant) the Sigmatel patch only defines a resume function (checking the other patches they all seem to omit a suspend function). In patch_sigmatel.
#ifdef CONFIG_PM
.resume = stac92xx_resume,
#endif
There is a note in hda_codec.
/* FIXME: should handle power widget capabilities */
and another at hda_intel.c:1056 in struct snd_pcm_hardware azx_pcm_hw.info
/* No full-resume yet implemented */
/* SNDRV_PCM_
Which, according to include/
#define SNDRV_PCM_
When I find some time I am going to build a debug-kernel with hda_intel.c outputting debug info to test if hda_codec.
I decided to track the interrupt/DMA enabling/disabling. I'll put my notes here in case others want to investigate.
I analysed the sequence of operations in hda_intel.
I noticed that azx_suspend() calls snd_hda_suspend() then azx_free_cmd_io() which does:
/* disable ringbuffer DMAs */
azx_writeb(chip, RIRBCTL, 0);
azx_writeb(chip, CORBCTL, 0);
}
I then looked at the corresponding resume operations. Immediately prior to azx_resume() calling snd_hda_resume() it calls
azx_init_chip(). It in turn does:
/* initialize the codec command I/O */
if (!chip->single_cmd)
azx_init_
Which does:
/* enable corb dma */
azx_writeb(chip, CORBCTL, ICH6_RBCTL_DMA_EN);
/* enable rirb dma and response irq */
azx_writeb(chip, RIRBCTL, ICH6_RBCTL_DMA_EN | ICH6_RBCTL_IRQ_EN);
I then looked at the Sigmatel patch. On the Sony Vaio VGN-FE41Z the chipset reports as:
Codec: SigmaTel STAC9872AK
Address: 0
Vendor Id: 0x83847662
Subsystem Id: 0x104d0c00
Revision Id: 0x100201
which corresponds with patch_sigmatel.c struct hda_codec_preset snd_hda_
{ .id = 0x83847662, .name = "STAC9872AK", .patch = patch_stac9872 },
In patch_stac9872() the following code is executed:
switch (board_config) {
case CXD9872RD_VAIO:
case STAC9872AK_VAIO: // *** this is the matching config ***
case STAC9872K_VAIO:
followed by:
codec->patch_ops = stac9872_patch_ops;
stac9872_patch_ops is defined as:
static struct hda_codec_ops stac9872_patch_ops = {
.build_controls = stac92xx_
.build_pcms = stac92xx_
.init = stac92xx_init,
.free = stac92xx_free,
#ifdef CONFIG_PM
.resume = stac92xx_resume,
#endif
};
stac9872_
stac92xx_resume() calls the same function as is used on device init, stac92xx_init() so there doesn't appear to be any difference between cold-start and resume.
wvengen (wvengen) wrote : | #6 |
- int-trace-01.txt Edit (3.4 KiB, text/plain)
Great to see some work on this! I'm not too familiar with DMA, but I'll try to help a bit. I started to trace the DMA transfer process by verifying that the interrupts are correctly setup after resume. It seems this is working: azx_interrupt() is called a number of times proportional to the length of playback. The attached file explains this in more detail.
By the way, the length of playback after resume is not constant for me; a sound of less than a second may take a couple of seconds or may seem to have no end.
Oh cool! Someone else that knows how to hack :)
I've been reading the Intel HDA Programmers Reference Manual (http://
ICH6 HDA/AC'97 Programmers Reference Manual
1.2.6.3 Resume Context Recovery
When the system is placed in a power-down state (S3 or greater) power is removed from the ICH6
AC '97 controller. In this state, DMA registers lose information regarding current position and
pointer values. The device driver must be able to save the context of all DMA engines before
system power lost therefore context for DMA engines. Upon system power resume, the device
driver must restore the DMA Engine register accordingly to saved values prior the suspend event.
4.6.6 Stream Steady State Operation
Once the stream has been started, the hardware will continue fetching data from the Cyclic Buffer
described by the Buffer Descriptor List. For an output stream, software is responsible for making
sure that valid data is present in the buffers before it is fetched by the hardware. For an input
stream, data must be removed from the buffers before being overwritten by hardware.
*** When the hardware has reached the end of the Cyclic Buffer, it will automatically wrap back around to the
beginning of the buffer and continue to process the stream data until the stream is stopped by the
software by clearing the RUN bit. ***
Software can either use interrupts at the end of selected buffers by setting the IOC bit in the BDL
entry or can poll the stream position to determine when to process the stream data.
So it seems to me we should focus our attention on how the BDL and memory allocation is handled, and specifically what is setup prior to a suspend (even when no output is active) that is subsequently used when outputting sound. I suspect what we will discover is that during suspend/resume one or more structures and/or memory relating to Buffer Descriptor Lists or page memory buffers is not being restored, or else the IOC bit isn't being restored or acted upon. Note the text I've highlighted above with *** ... *** which would on the face of it describe the symptoms we are hearing (is the ISR countdow.
We should try to divide our labours: can I suggest that you continue hacking at the kernel, focusing in on the buffer code whilst I continue reading up on HDA and analysing the source-code? For various reasons I'm not keen to be disturbing the kernel right now because of other things I'm working on.
wvengen (wvengen) wrote : | #8 |
Allright, I can see the DMA pointer not being updated. az_interrupt() calls snd_pcm_
I'm getting to the borders of my DMA knowledge, where is posbuf actually updated, because that's the source of the pointer?
Funny how you can arrive at the same point from different directions. I noticed in your tracing that int_cnt had far more interrupts (for the same file) just after resume than when the module had been freshly loaded (389 vs 11) and had traced through to the posbuf from struct azx_dev:
hda_intel.c:257
struct azx_dev {
u32 *posbuf; /* position buffer pointer */
hda_intel.c:334
struct snd_dma_buffer posbuf;
/include/
struct snd_dma_buffer {
struct snd_dma_device dev; /* device type */
unsigned char *area; /* virtual pointer */
dma_addr_t addr; /* physical address */
size_t bytes; /* buffer size in bytes */
void *private_data; /* private for allocator; don't touch */
};
In azx_init_chip():
/* program the position buffer */
azx_writel(chip, DPLBASE, (u32)chip-
azx_writel(chip, DPUBASE, upper_32bit(
and in azx_setup_
/* enable the position buffer */
if (! (azx_readl(chip, DPLBASE) & ICH6_DPLBASE_
azx_writel(chip, DPLBASE, (u32)chip-
Notice how that code is conditional? I've been looking for conditional code because its the only place where something different could be done during resume compared to device start-up. I'm not suggesting this particular code is the culprit but this or something similar is probably the root cause of the issue.
Can you tell me precisely where in azx_pcm_pointer() you are reporting the value of pos, and also which code-path is being executed (there's an if...else in there) ?
Also, could you show a sample of the pos values for the good and bugged scenarios from a printk (kernel times could be useful) ?
hda_intel.c:1392:
azx_init_stream() {
...
azx_dev->posbuf = (u32 __iomem *)(chip-
azx_setup_
snd_printdd(
azx_
azx_setup_
azx_setup_
Now, we know from your earlier trace that azx_pcm_prepare() is being called 4 times and that the code above must be being executed since it comes directly after the snd_printdd() call.
So I think we're safe to focus on the code in and around azx_setup_
Reading this extract from the HDA PRM in conjunction with azx_setup_
4.6.6
After the RUN bit has been set, the buffer described by the BDL should not be changed by the
software. The hardware may pre-fetch and/or cache an arbitrary number of BDL Entries from the
list, so there is no way to ensure when or if any changes to the BDL list would be visible to the
hardware.
*** Even when the RUN bit has been cleared to pause the stream after it has been running,
the hardware may still have pre-fetched descriptors that will not be flushed when the stream is
restarted. Therefore, the software should only modify the BDL before the RUN bit has been set for
the first time after a Stream Reset. ***
/* make sure the run bit is zero for SD */
azx_sd_
/* reset stream */
azx_sd_
udelay(3);
timeout = 300;
while (!((val = azx_sd_
--timeout)
;
val &= ~SD_CTL_
azx_sd_
udelay(3);
timeout = 300;
/* waiting for hardware to report that the stream is out of reset */
while (((val = azx_sd_
--timeout);
wvengen (wvengen) wrote : | #13 |
- posbuf-trace-01.txt Edit (111.1 KiB, text/plain)
Wow, you've replied quite a lot already, here the answers to questions you posed.
posbuf was reported in azx_pcm_pointer(), just after the line
pos = le32_to_
( http://
I had another printk at read_lpib, but that one wasn't triggered at all. One of the higher functions does some alignment (snd_pcm_
You can also see a printk in pcm_lib.c, that's at the end of snd_pcm_
I think azx_dev is an internal structure used to access the different streams. Hardware equivalent is accessible by memory mapped registers (ICH6 paragraph 18.2), beginning at address 80h and 20h bytes long. The macros azx_sd_* are used to access them. You probabely already found this out.
For posbuf values, please see attachment. Before suspend, one can see the values increasing. Output is usually the same (although I have seen a difference sometime I don't understand). After suspend, values are generally different for different suspend/resumes, but equal within one play. There is a point however when the value changes.
Thanks for that report, it was just what I needed. I was already tracking the various code responsible for updating posbuf and your trace has confirmed some of what I was expecting. There are several reasons why posbuf/pos won't be updated so we should try eliminating possibilities.
In snd_pcm_
pos = substream-
if (pos == SNDRV_PCM_POS_XRUN)
return pos; /* XRUN */
Because if the function didn't return at that point it would update pos:
pos -= pos % runtime->min_align;
Using CONFIG_
#ifdef CONFIG_
if (runtime->periods > 1 && substream-
snd_
if (substream-
dump_stack();
}
#endif
assuming that the function didn't return before it got to that point:
pos = snd_pcm_
if (pos == SNDRV_PCM_POS_XRUN) {
xrun(substream);
return -EPIPE;
}
Alternatively, we need to watch snd_pcm_
if (runtime-
goto __next_buf;
new_hw_ptr = runtime-
hw_ptr_interrupt = runtime-
wvengen (wvengen) wrote : | #15 |
- posbuf-trace-02.txt Edit (30.5 KiB, text/plain)
I don't think the line
pos -= pos % runtime->min_align;
updates the position buffer each time. You can see in posbuf-trace-01.txt that runtime-
To verify this, I printed the lpib position (value of SD_LPIB register for the stream) in azx_pcm_pointer(), in addition to azx_dev->posbuf (see attachment). One can see that the lpib values are equal before and after suspend, while the computed posbuf is wrong. This probabely means that the hardware and software have a different idea about the dma buffer.
- pre/post resume pos, hw_ptr, hw_ptr_interrupt Edit (4.6 KiB, text/plain)
Yes, I agree. I found some time yesterday to build the alsa-core and snd-hda modules myself and have been watching the updating of the hw_ptr_interrupt. I've attached the log. Also, the line
pos -= pos % runtime->min_align;
is, if you read it carefully, not intended to update pos but simply to ensure pos is aligned on the correct byte/word/dword boundary. That is why in normal circumstances it doesn't change at that point.
I was comparing the resume operations to initial setup to see if I could spot any hardware actions (such as writel()) that aren't done during resume. So far, I've not found anything.
I also checked where azx_dev->posbuf is set, and it is only in one place and the way the code reads azx_dev->posbuf should always point to the base address so at the moment I've confused myself as to why pos returned from azx_pcm_pointer() advances.
If the DMA harware registers were responsible for updating it the declaration of the structure should be different, or we should find an azx_dev->posbuf = azx_readl(); type construct.
wvengen (wvengen) wrote : | #17 |
A short overview of where posbuf is used in hda_intel.c:
In azx_init_chip():
azx_writel(chip, DPLBASE, (u32)chip-
azx_writel(chip, DPUBASE, upper_32bit(
In azx_setup_
// enable the position buffer
if (! (azx_readl(chip, DPLBASE) & ICH6_DPLBASE_
azx_
In azx_init_stream():
azx_dev->posbuf = (u32 __iomem *)(chip-
In azx_pcm_pointer():
pos = le32_to_
In azx_create():
snd_dma_
So, chip->posbuf is a dma buffer that contains stream pointers (DMA Position Base Address), which are initialised in azx_init_stream(). Addresses must be 128-byte aligned. When the lowest bit is set, "the controller will write the DMA positions of each of the DMA engines to the buffer in the main memory periodically (typically once per frame). Software can use this value to know what data in memory is valid data." (18.2.31 DPLBASE) - it seems that the hardware does report back the position.
Note 1 mentioned in 18.2.6 GCTL struck me (about reset bit): "The CORB/RIRB RUN bits and all stream RUN bits must be verified cleared to 0 before writing a 0 to this bit in order to assure a clean re-start." I don't see this happening in/before azx_reset().
wvengen (wvengen) wrote : | #18 |
- posbuf-trace-03.txt Edit (16.7 KiB, text/plain)
Ok, I tracked DPLBASE and posbuf before and after suspend/resume and they look pretty similar. In both cases, the lowest bit of DPLBASE is set, so it should report back the position. Before:
posbuf 0x340, lpibpos 0x294, posbuf=0x1b881020 DPLBASE=0x1b881001
posbuf 0xfe0, lpibpos 0xf40, posbuf=0x1b881020 DPLBASE=0x1b881001
posbuf 0x1fe0, lpibpos 0x1f40, posbuf=0x1b881020 DPLBASE=0x1b881001
and after:
posbuf 0x340, lpibpos 0x294, posbuf=0x1b881020 DPLBASE=0x1b881001
posbuf 0x340, lpibpos 0xf40, posbuf=0x1b881020 DPLBASE=0x1b881001
posbuf 0x340, lpibpos 0x1f40, posbuf=0x1b881020 DPLBASE=0x1b881001
(another test showed that DPUBASE remains zero) So the issue remains: how is the re-initialisation different on resume?
wvengen (wvengen) wrote : | #19 |
- hda-config-01.txt Edit (118.9 KiB, text/plain)
I hope I'm not spamming this bug with information; as the cause is somewhat unclear, all these things may or may not be important in the end. There's two kinds of config data: pci and memory mapped. PCI can be checked using sysfs,
About memory mapped registers. Find attached a dump of these before and after resume (and also a dump of the stream's BDL). Differences:
* pci 0x044 TCSEL traffic class select
* mem 0x030-0x033 wall clock
* mem 0x048-0x04a reserved
* mem 0x058-0x059 RIRB write pointer (constant during stream, differs before/after resume)
* mem 0x104-0x107 LPIB0 is different within stream, that was known already (differences before/after resume probably due to timing)
I shouldn't have missed anything, but I'm a bit tired at the moment.
And now for the really good news: try this after suspend ...
# cd /sys/bus/
# dd if=config of=/tmp/foo
# printf "\x0" | dd of=/tmp/foo bs=1 seek=68
# dd if=/tmp/foo of=config
and play a sound file ...
Replace "\x0" with "\x7" to go back to the non-working state.
I don't know yet why TCSEL is 7 after resume. This note about TCSEL in the HDA document may have an impact: "These bits are not reset on D3HOT to D0 transition; however, they are reset by PLTRST#." Its default is zero however.
I'm afraid I'm seeing no difference here. I had previously checked TCSEL because of the comment in the head of azx_init_chip(), and I had also made sure that azx_resume() was called, and in-turn was calling azx_init_chip().
I also put a check of the value of TCSEL in azx_init_chip() to see what value it had upon resume, and another at the end of azx_resume() to ensure the reset had been done.
... start-up ...
Jul 11 22:46:03 bagoas kernel: [ 17.736000] hda-intel: Allocated posbuf size=64
Jul 11 22:46:03 bagoas kernel: [ 17.736000] hda-intel: azx_init_chip: TCSEL=7
... Resume ...
Jul 11 22:50:33 bagoas kernel: [ 291.960000] hda-intel: azx_init_chip: TCSEL=7
Jul 11 22:50:33 bagoas kernel: [ 292.024000] hda-intel: azx_resume: TCSEL=0
With TCSEL confirmed as 0 I am still hearing the same problem after resume.
00000060 05 70 80 00 00 00 00 00 00 00 00 00 00 00 00 00 |.p..............|
static void azx_init_
{
unsigned char reg;
/* Clear bits 0-2 of PCI register TCSEL (at offset 0x44)
* TCSEL == Traffic Class Select Register, which sets PCI express QOS
* Ensuring these bits are 0 clears playback static on some HD Audio codecs
*/
pci_read_
pci_write_
#ifdef DEBUG_RESUME
snd_printk(
#endif
static int azx_resume(struct pci_dev *pci)
{
struct snd_card *card = pci_get_
struct azx *chip = card->private_data;
#ifdef DEBUG_RESUME
unsigned char reg;
#endif
...
#ifdef DEBUG_RESUME
pci_read_
snd_printk(
#endif
return 0;
}
wvengen (wvengen) wrote : | #21 |
- tcsel-trace-01.txt Edit (84.1 KiB, text/plain)
Hmm I didn't clearly indicate the base of numbers, but I meant 68d=44h. And the value of TCSEL is correctly put back at resume, but somehow it changes between initialisation and start of playback; see attachment for tracing of TCSEL at various places in hda_intel.c. It is interesting how its value is reset to zero, but pops up as zeven later on:
[ 3.585793] HDA Intel 0000:00:1b.0: resuming
[ 3.601572] ALSA sound/pci/
(...)
[ 3.665477] ALSA sound/pci/
(...)
[ 39.396663] ALSA sound/pci/
If you compare your pci registers before and after resume, do you see any difference?
Sorry - I pasted the wrong line from 'config'!
I see what you mean about TCSEL being reset to 07 when a sound is played, and when I do 'hexedt config' and change 0x44 to 0, the sound is corrected - nice one.
So the question is, is there another chip reset operation between azx_resume() and playback that causes TCSEL to set to 0x07, or is something else accidentally writing values to that location?
wvengen (wvengen) wrote : | #23 |
azx_reset() is only called from azx_init_chip(), to which I added printk's; so I don't think the hda driver does another initialisation. I'm currently writing a kprobe on pci_bus_
I'm relieved that we have the same problem, that keeps it less complicated :)
wvengen (wvengen) wrote : | #24 |
- tcsel-trace-02.txt Edit (5.7 KiB, text/plain)
kprobe results: nothing to see. The module (for source see attachment) just monitors calls to pci_bus_
Hypothesis: a(nother) module or the kernel corrupts the pci configuration of this device. Can a pci configuration write go out of bounds and end up in another device's configuration space?
OK, it seems we have identified the reason for this problem.
Willem and I worked together on tracing the value of TCSEL after resume. The code that finally pinpointed the issue was a kernel timer started in azx_resume() and stopped in azx_pcm_open() with a jiffie interval of 10, which appears to equate to 0.04 seconds. The timer writes the value of TCSEL to kern.log, so we simply look through the log for when the value of TCSEL changed from 0 to 7, and what happened in between the two timer interrupts.
It appears the cause is the 2nd CPU being brought online in the SMP kernel. We are now working on *why* this is - is it a kernel problem or something specific to the Sony Vaio or the ICH7 chipset:
[ 1630.936000] hda-intel: timer: TCSEL=0
[ 1630.944000] Calibrating delay using timer specific routine.. 3990.08 BogoMIPS (lpj=7980178)
[ 1630.944000] CPU: After generic identify, caps: bfebfbff 20100000 00000000 00000000 0000e3bd 00000000 00000001
[ 1630.944000] monitor/mwait feature present.
[ 1630.944000] CPU: L1 I cache: 32K, L1 D cache: 32K
[ 1630.944000] CPU: L2 cache: 4096K
[ 1630.944000] CPU: Physical Processor ID: 0
[ 1630.944000] CPU: Processor Core ID: 1
[ 1630.944000] CPU: After all inits, caps: bfebfbff 20100000 00000000 00003940 0000e3bd 00000000 00000001
[ 1630.944000] CPU1: Intel(R) Core(TM)2 CPU T7200 @ 2.00GHz stepping 06
[ 1630.948000] CPU1 is up
[ 1630.952000] sdb : READ CAPACITY failed.
[ 1630.952000] sdb : status=0, message=00, host=7, driver=00
[ 1630.952000] sdb : sense not available.
[ 1630.964000] sdb: Write Protect is off
[ 1630.964000] sdb: Mode Sense: 00 00 00 00
[ 1630.964000] sdb: assuming drive cache: write through
[ 1630.976000] hda-intel: timer: TCSEL=7
wvengen (wvengen) wrote : | #26 |
Here's another log that pinpoints to enabling CPU1 beyond doubt:
[ 9.868250] ALSA sound/pci/
[ 9.908187] ALSA sound/pci/
[ 9.924171] usb 5-8: new high speed USB device using ehci_hcd and address 4
[ 9.948127] ALSA sound/pci/
[ 9.966250] done.
[ 9.966292] Enabling non-boot CPUs ...
[ 9.988065] ALSA sound/pci/
[ 10.028003] ALSA sound/pci/
[ 10.059394] usb 5-8: configuration #1 chosen from 1 choice
this time booted with maxcpus=1
What we should have made clear in the two comments above is that we found by monitoring the PCI config space when the ALSA drivers weren't loaded that the TCSEL value was still being changed to 0x07 from 0x0. That was when we decided to implement a timer to find out when the value was changing.
- Comparison of resume and cold-start configuration order Edit (3.6 KiB, text/plain)
Is it significant that upon resume the 2nd CPU is resumed *after* PCI config has been written back to the device and it has been reinitialised by its driver? When cold-starting the CPU has been started before the PCI device is configured.
See the attached log-file that compares the relevant sections of kern.log for a resume and cold-start.
- patch for hda_intel.c Edit (867 bytes, text/plain)
I've tested the latest Gutsy kernel from git (2.6.22-8.18) and found it doesn't suffer from this bug. In that, the 2nd CPU is started before the PCI devices are resumed. After resume the HDA device PCI config shows the TCSEL register is 0x00.
Therefore I'm attaching a patch against Feisty's 2.6.20-16 kernel source. In sound/pci/
Changed status to "Fix Committed" now cause is known and patch is available.
Changed in linux-source-2.6.20: | |
status: | Confirmed → Fix Committed |
wvengen (wvengen) wrote : | #31 |
I'd agree with you on the CPU start order.
To pinpoint the offending code, I instrumented some resume code with printk's and an msleep(10) after them (HZ=1000), so the TCSEL timer can jump in and report (it reports only when it detects a difference with the previous value). Result from this kernel which has lots of debugging info too:
[ 1.240004] CPU1 is up
[ 1.240008] kernel/
[ 1.262189] ALSA sound/pci/
[ 1.322099] usb 5-8: new high speed USB device using ehci_hcd and address 3
(...)
[ 1.759403] ACPI: EC: acpi_ec_wait timeout, status = 0, expect_event = 1
(...)
[ 1.761131] kernel/
[ 1.771415] kernel/
This tells us that suspend_finish() in kernel/power/main.c has the culprit:
pm_ops-
http://
Looking at the source, the only relevant call to pm_set_ops() is found in ACPI, so we need to look at acpi_pm_finish() http://
wvengen (wvengen) wrote : | #32 |
There is also a workaround that doesn't require patching the kernel, maybe this is of help to someone experiencing the problem right now.
Create a file /etc/acpi/
#!/bin/sh
# workaround resume bug on 2.6.20 (fixed in .22)
# see https:/
# this requires xserver-xorg-core to be installed, which is usually the case
pcitweak -w 00:1b:0 0x44 0
and make it executable (`chmod a+x /etc/acpi/
Please check that your audio controller is indeed on address 00:1b:0 using ` lspci | grep 'High Definition Audio Controller' `; if not, adjust accordingly.
Nice work-around Willem!
I've enhanced your script to make it more modular. Users can install this script and not have to check and set the PCI device ID as the script will figure it out.
#!/bin/bash
# this file is /etc/acpi/
# workaround for HDA resume bug on 2.6.20 (fixed in .22)
# see https:/
# this requires xserver-xorg-core to be installed, which is usually the case
DEVICEPATH="`ls -d /sys/module/
if [ "x$DEVICEPATH" != "x" ]; then
# extract PCI device ID from path
PCI=`expr match "$DEVICEPATH" '.*\/[0-
# replace dot with colon
PCI=`echo $PCI | tr "." ":"`
# read the register
TCSEL=`pcitweak -r $PCI 0x44`
echo "TCSEL=$TCSEL"
# test TCSEL against required value
if [ "$TCSEL" != "0x00000000" ]; then
echo "HDA TCSEL register needs correcting"
pcitweak -w $PCI 0x44 0
else
echo "HDA TCSEL register okay"
fi
fi
wvengen (wvengen) wrote : | #34 |
Thanks TJ, that's much better :)
Changed in linux-source-2.6.22: | |
status: | New → Fix Released |
Changed in linux-source-2.6.20: | |
status: | Fix Committed → Won't Fix |
MattBlack (oli-ellis) wrote : | #35 |
I've got Ubuntu 8.04 with AC97 sound, this seems to be happening to me. If I run the above script as root after resume from standby it reports
HDA TCSEL register needs correcting
But pcitweak doesn't seem to be available with hardy; neither is the above script installed
Launchpad Janitor (janitor) wrote : Kernel team bugs | #36 |
Per a decision made by the Ubuntu Kernel Team, bugs will longer be assigned to the ubuntu-kernel-team in Launchpad as part of the bug triage process. The ubuntu-kernel-team is being unassigned from this bug report. Refer to https:/
ALSA bug https:/ /bugtrack. alsa-project. org/alsa- bug/view. php?id= 2187 may be related