outl 0xcf8 0x8000fa24
outl 0xcfc 0xe106c000 (Writes e106c00 to BAR5 for 0:31:2)

outl 0xcf8 0x8000fa04
outw 0xcfc 0x7 (Enables BM, Memory IO and PIO for 0:31:2)

outl 0xcf8 0x8000fb20 (Enables 0:31:3, I guess? My PCI knowledge is iffy. We set the enable bit and select BAR4, but then we don't actually write to 0xcfc again to set anything.) 


write 0x0 0x3 0x2780e7
- write these three bytes to addr 0 in memory.

write 0xe106c22c 0xd 0x1130c218021130c218021130c2
- ahci_port_write ahci(0x555c950f71a0)[2]: port write [reg:PxSCTL] @ 0x2c: 0x18c23011
- ahci_port_write ahci(0x555c950f71a0)[2]: port write [reg:PxSERR] @ 0x30: 0xc2301102
- ahci_port_write ahci(0x555c950f71a0)[2]: port write [reg:PxSACT] @ 0x34: 0x30110218
- ahci_port_write ahci(0x555c950f71a0)[2]: port write [reg:PxCI] @ 0x38: 0x000000c2

write 0xe106c218 0x15 0x110010110010110010110010110010110010110010

- ahci_port_write ahci(0x555c950f71a0)[2]: port write [reg:PxCMD] @ 0x18: 0x11100011
- ahci_port_write ahci(0x555c950f71a0)[2]: port write [reg:Reserved] @ 0x1c: 0x00111000
- ahci_port_write ahci(0x555c950f71a0)[2]: port write [reg:PxTFD] @ 0x20: 0x10001110
- ahci_port_write ahci(0x555c950f71a0)[2]: port write [reg:PxSIG] @ 0x24: 0x11100011
- ahci_port_write ahci(0x555c950f71a0)[2]: port write [reg:PxSSTS] @ 0x28: 0x00111000
- ahci_port_write ahci(0x555c950f71a0)[2]: port write [reg:PxSCTL] @ 0x2c: 0x00000010

Not all of those register writes are actually important for the bug, so I simplified them to the fewest writes and fewest bits.

outl 0xcf8 0x8000fa24
outl 0xcfc 0xe106c000
outl 0xcf8 0x8000fa04
outw 0xcfc 0x7
outl 0xcf8 0x8000fb20
write 0x0 0x3 0x2780e7
write 0xe106c22c 0x4 0x01000000
write 0xe106c238 0x2 0x02
write 0xe106c218 0x4 0x11000000
write 0xe106c22c 0x1 0x00


1. PxSCTL write arms the DET bit. It isn't intended to be left on when PxCMD.ST (Start) is issued. It's not clear what should happen if this DOES occur. (Undefined behavior, at the very least.)
See AHCI 1.3 section 3.3.1.1 "Offset 2Ch: PxSCTL – Port x Serial ATA Control (SCR2: SControl)"

This bit is intended to send a reset signal to attached SATA drives. QEMU just synchronously resets the drive because we can.


2. PxCI is not intended to be written to when PxCMD.ST is unset. The spec suggests that when ST transitions from '1' to '0' that this field is cleared, but it does not suggest what happens when it transitions from '0' to '1'. QEMU will happily set the register.


3. PxCMD write: This sets PxCMD.ST and PxCMD.FRE, which engages the AHCI device in earnest.

At this point, AHCI sees outstanding commands and tries to process them. The FIS receive address is never programmed, so it's at zero. We start reading a FIS there:

15712@1590789960.784835:handle_cmd_fis_dump ahci(0x55b4c56621a0)[2]: FIS:
0x00: 27 80 e7 00 00 00 00 00 00 00 00 00 00 00 00 00 
0x10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
0x20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
0x30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
0x40: 34 40 70 01 01 14 eb 20 00 00 00 00 01 00 00 00 
0x50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
0x60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
0x70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 

This is translated as:
0x27 SATA_FIS_TYPE_REGISTER_H2D
0x80 SATA_FIS_REG_H2D_UPDATE_COMMAND_REGISTER
0xe7 command = FLUSH CACHE

This will engage ide_flush_cache() (core.c)


At this point I get a little confused. I wouldn't think we'd have a BlockBackend here for ide_flush to work on, but it seems to think we do and allows the flush command to proceed. We then immediately try to cancel this flush, but bdrv_aio_cancel can't tolerate an aiocb with a null BDS and panics.

...Hm, it should be the case that blk_do_flush detects this as ENOMEDIUM, but are we maybe just canceling this request too fast? I actually can't trigger this through the console, but I can trigger it by redirecting input from a .txt file.

Yup, OK: if you look in blk_aio_prwv, we schedule a oneshot to invoke the callback on a synchronous failure, but we are managing to inject the reset command before the oneshot gets a chance to run.

I think either blk_aio_cancel or bdrv_aio_cancel needs to check that there isn't a dangling BH callback -- it seems wrong to make it as far as bdrv_aio_cancel when there's no BDS, but the IDE device no longer has any idea why its callback hasn't returned yet, and blk_aio_cancel is the only mechanism it has to kick the state machine forward.