Comment 3 for bug 1284874

Bandan Das (2bsd) wrote :

The vfio code checks whether an FLR (function-level reset) is possible and attempts one before and after device assignment. If the FLR is replaced with a bus reset, we get past the stuck option ROM loading phase and are able to boot into the guest successfully. This means that the first initialization (by the hardware) changes something in the NVRAM that has to be reset to its default by a hard (bus) reset.

We could add an ugly hack to vfio to do a bus reset for this specific card, but it should be noted that an FLR, if supported, should be able to take care of this condition.
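
For anyone wanting to confirm whether the card advertises FLR at all, here is a rough sketch. It assumes the DevCap line in `lspci -vv` output carries the `FLReset+` / `FLReset-` flag and that lspci is run as root so the capabilities are readable. The `has_flr` helper and the `check` argument are made up for illustration.

```shell
#!/bin/sh
# Sketch only: has_flr and the "check" argument are hypothetical names.
BDF="${BDF:-03:00.0}"

# Read `lspci -vv` output on stdin; succeed if the device advertises FLR.
has_flr() {
    grep -qF 'FLReset+'
}

# Only query the hardware when explicitly asked: ./flr-check.sh check
if [ "${1:-}" = "check" ]; then
    # Needs root, otherwise lspci cannot read the capability list.
    if lspci -vv -s "$BDF" | has_flr; then
        echo "$BDF advertises FLR"
    else
        echo "$BDF does not advertise FLR (or capabilities were not readable)"
    fi
fi
```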

Note that it is really the FLR that messes up the config space when it is attempted after the sequence of events leading up to the hang.

This is easy to reproduce with setpci writes to the card followed by an FLR, as follows:

#!/bin/bash
setpci -v -s 03:00.0 4.w=2
setpci -v -s 03:00.0 4.w
setpci -v -s 03:00.0 4.w=103
setpci -v -s 03:00.0 4.w
setpci -v -s 03:00.0 78.l=1
setpci -v -s 03:00.0 78.l
setpci -v -s 03:00.0 80.l=9430
setpci -v -s 03:00.0 80.l
setpci -v -s 03:00.0 78.l=a30c
setpci -v -s 03:00.0 78.l
setpci -v -s 03:00.0 80.l=7fffffff
setpci -v -s 03:00.0 80.l
setpci -v -s 03:00.0 78.l=a5dc
setpci -v -s 03:00.0 78.l
setpci -v -s 03:00.0 80.l=0
setpci -v -s 03:00.0 80.l
setpci -v -s 03:00.0 78.l=a2ec
setpci -v -s 03:00.0 78.l
setpci -v -s 03:00.0 80.l=3
setpci -v -s 03:00.0 80.l
setpci -v -s 03:00.0 78.l=a408
setpci -v -s 03:00.0 78.l
setpci -v -s 03:00.0 78.l=86420
setpci -v -s 03:00.0 78.l
setpci -v -s 03:00.0 80.l=4
setpci -v -s 03:00.0 80.l

echo 1 > reset #flr then completely corrupts the config space
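
To see exactly what the reset changes, one option is to snapshot the config space before and after it and compare the two. The following is a minimal sketch, not something I have run against this card. It assumes the device lives at 0000:03:00.0 (PCI domain 0000 is a guess) and that the sysfs `config` and `reset` files are available. The `changed_bytes` helper and the `run` argument are hypothetical.

```shell
#!/bin/sh
# Sketch only: the BDF/domain and helper names are assumptions.
BDF="${BDF:-0000:03:00.0}"
DEV="${DEV:-/sys/bus/pci/devices/$BDF}"

# Count the bytes that differ between two config-space snapshots.
changed_bytes() {
    cmp -l "$1" "$2" | wc -l | tr -d ' '
}

# Only touch the hardware when explicitly asked: ./flr-diff.sh run
if [ "${1:-}" = "run" ]; then
    before=$(mktemp)
    after=$(mktemp)
    # Run the setpci sequence first, then take the "before" snapshot.
    cat "$DEV/config" > "$before"
    echo 1 > "$DEV/reset"
    cat "$DEV/config" > "$after"
    echo "bytes changed by reset: $(changed_bytes "$before" "$after")"
    # cmp -l prints 1-based decimal offsets and octal byte values.
    cmp -l "$before" "$after"
    rm -f "$before" "$after"
fi
```

Run as root, since non-root reads of `config` only return the first 64 bytes.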