Comment 11 for bug 1765232

Seth Forshee (sforshee) wrote :

I've been investigating, and I strongly suspect these commits for bug 1759723 are to blame.

 f0aff9ccc834 genirq/affinity: assign vectors to all possible CPUs
 9403a13fd07e blk-mq: simplify queue mapping & schedule with each possisble CPU

As far as I can tell, there are 8 other fixes we would need to consider including. Each either addresses the same original commit that these two patches addressed or fixes a bug related to these two patches:

 16ccfff28976 nvme: pci: pass max vectors as num_possible_cpus() to pci_alloc_irq_vectors
 8b834bff1b73 scsi: hpsa: fix selection of reply queue
 adbe552349f2 scsi: megaraid_sas: fix selection of reply queue
 b5b6e8c8d3b4 scsi: virtio_scsi: fix IO hang caused by automatic irq vector affinity
 7bed45954b95 blk-mq: make sure hctx->next_cpu is set correctly
 a1c735fb7907 blk-mq: make sure that correct hctx->next_cpu is set
 bffa9909a6b4 blk-mq: don't keep offline CPUs mapped to hctx 0
 d3056812e7df genirq/affinity: Spread irq vectors among present CPUs as far as possible

Alternatively, we can revert those two patches. I'm considering both options and will provide one or two test kernels soon.