So this problem seems to have been hardware-related ultimately.
On further investigation, after Tim swapped identical cards between two servers and the failure followed the older card, I looked closer and found that the firmware on the failing card was several revs behind the firmware on the newer card, which was itself one rev behind the most recent firmware for the LSI 9240-4i.
So I flashed the firmware on both cards to the latest rev (20.10.1-0061) and this seems to have cured the problem.
I can manually cat the vpd file in sysfs repeatedly without failure and I have also been able to run checkbox (which initially triggered the issue) again without failure.
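For anyone who wants to repeat the check without typing cat over and over, here is a minimal sketch of the same idea in Python. The function name and the attempt count are my own choices for illustration, and the device path in the usage comment is a placeholder; on a typical Linux system the file lives under /sys/bus/pci/devices/<address>/vpd, but the address for a given card has to be looked up (e.g. with lspci).

```python
import sys


def count_read_failures(path, attempts=100):
    """Read `path` in full `attempts` times; return how many reads failed.

    Hypothetical helper, not part of checkbox or any existing tool.
    """
    failures = 0
    for i in range(attempts):
        try:
            # Reopen the file on every pass so each read is a fresh
            # request to the device, just like a separate `cat` call.
            with open(path, "rb") as f:
                f.read()
        except OSError as exc:
            failures += 1
            print(f"read {i + 1} failed: {exc}", file=sys.stderr)
    return failures


# Example usage (placeholder path, substitute the card's PCI address):
#   failed = count_read_failures("/sys/bus/pci/devices/<address>/vpd")
#   print(f"{failed} failed reads")
```

Reading the vpd file usually needs root, so a permission error will show up as a failed read too; the error message printed for each failure should make it clear which case you are looking at.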
So I think we can call this a firmware issue, and I'm closing it as such. If this recurs for some reason, we can reopen this bug or just open a new one then.