Comment 159 for bug 1803179

mfulz (mfulz-linux-kernel-bugs) wrote:

(In reply to arpie from comment #142)
> (In reply to Matthias Fulz from comment #141)
> > (In reply to arpie from comment #140)
> [snip]
> > > If I completely disable the audio card using :
> > > echo 1 | sudo tee /sys/bus/pci/devices/0000:01:00.1/remove
> > >
> > > Then the system hangs are completely cured
> >
> > Not working for me. Still freezing with this.
>
> Any chance of more details? When and how is it freezing? Is it any
> different from before? What are your machine/card details (looks like you
> haven't posted these anywhere above)?
>

I've got an HP OMEN 15 with an NVIDIA GTX 1050, running Arch Linux.

> Also, are you absolutely sure you've disabled the audio card during boot
> *before the kernel notices it is there*? The only reliable way I've found
> to check if this is the case, is to run powertop, and look in the 'Device
> Status' tab for listings of 'Audio codec hwXXXXX: nvidia'. If that is
> showing up, then the nvidia sound card is still active and will cause hangs.
> My solution only works if the audio card is removed/disabled before the
> audio system initialises during boot (hence the WantedBy=sysinit.target in
> my service file).
>

I've used your service file together with bumblebee and bbswitch.
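For anyone who wants a quick check besides powertop: a rough sketch that only tests whether the audio function is still registered on the PCI bus. It assumes the audio function sits at 0000:01:00.1 as in arpie's comment; the helper name and the SYSFS_ROOT override are made up for illustration and are not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: report whether a PCI function is still registered.
# SYSFS_ROOT is an illustrative override so the check can be tried
# against a fake directory tree; it defaults to the real /sys.
pci_function_present() {
    addr="$1"
    root="${SYSFS_ROOT:-/sys}"
    if [ -e "$root/bus/pci/devices/$addr" ]; then
        echo "present"
    else
        echo "absent"
    fi
}
```

Running `pci_function_present 0000:01:00.1` after boot should print "absent" if the remove step ran. This only shows the state at the time of the check, not whether the removal happened before the audio system initialised.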

> I think I should have also mentioned that in order for the kernel to do the
> PM, you need to do something like :
>
> echo auto | sudo tee /sys/bus/pci/devices/0000:01:00.0/power/control
>
> I have TLP installed, which does this for me.
>

OK, I was missing this step.
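To confirm that the setting actually took effect, the runtime PM files in sysfs can be read back. A minimal sketch, assuming the GPU is at 0000:01:00.0 as in the quoted command; the function name and the SYSFS_ROOT override are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the runtime PM policy and current state
# of a PCI device. SYSFS_ROOT is an illustrative override (default /sys).
pci_pm_state() {
    dev="${SYSFS_ROOT:-/sys}/bus/pci/devices/$1/power"
    if [ ! -r "$dev/control" ]; then
        # Device is gone or has no runtime PM files.
        echo "control=unknown"
        return 1
    fi
    control=$(cat "$dev/control")
    # runtime_status may be missing or unreadable; fall back to "unknown".
    status=$(cat "$dev/runtime_status" 2>/dev/null || echo unknown)
    echo "control=$control status=$status"
}
```

With `pci_pm_state 0000:01:00.0`, a control value of "auto" means the kernel is allowed to suspend the device, while "on" keeps it powered.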

> Now a few days have passed, I admit I have had a few freezes when using
> bbswitch. But if I disable bbswitch and just use bumblebee with no power
> management, all is well (so far). If I want to power down the nvidia GFX
> card I just manually modprobe -r nvidia and the kernel does the rest.
> Using this solution, I see a drop from about 20W to 10W when the card powers
> off, with no ACPI calls at all (or, rather, none that I am aware of - I have
> no idea what the kernel is actually doing behind the scenes).
>

Ah, I see.
Then I think this is basically similar to my workaround using the snd_hda_intel module parameter.
The NVIDIA card is simply left completely "powered off" by not using it in any way (no module loaded).
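A small sketch for checking the "no module loaded" condition, assuming the driver shows up under the name nvidia in /proc/modules. The helper name and the optional second argument (an alternative module list file) are illustrative only.

```shell
#!/bin/sh
# Hypothetical helper: report whether a kernel module is currently loaded.
# The second argument is an illustrative override for the module list file.
module_loaded() {
    name="$1"
    list="${2:-/proc/modules}"
    # /proc/modules has one module per line, with the name in the first field.
    # Match the name exactly so that e.g. nvidia_drm does not count as nvidia.
    if awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$list"; then
        echo "loaded"
    else
        echo "not loaded"
    fi
}
```

After `modprobe -r nvidia`, `module_loaded nvidia` should print "not loaded".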

> I am sure that there must be a 'proper' solution where the correct ACPI
> commands are used to power off/on both the nvidia video and audio at the
> same time but finding such a solution is far beyond me...

I think some ACPI / PM people should definitely check the audio part of the GPU, as there could be some issues there related to this bug.

I may try it once more and report back here.
But honestly, these tests are really risky for me: very often some files are randomly truncated to zero length during the crash, and I then have to restore them from backups...