(In reply to comment #11)
> BTW, while a suspend-resume should reset the gpu, I see this:
>
> [31055.564022] [drm] Manually setting wedged to 0
> [31055.564022] [drm:i915_reset] *ERROR* Failed to reset chip.
> Why does it fail?
It fails because we have not yet found a way to reset that chipset successfully. It may well be that the only way is to power cycle the PCI device. Meh.
> The units are not busy anymore according to intel_gpu_top, so I'd expect "echo
> 0 > /sys/kernel/debug/dri/0/i915_wedged" should unwedge it, but it doesn't
The units are idle because the chip hit a fatal error and disabled those units.