Comment 4 for bug 2061362

Adam Rozman (rozzix) wrote:

Thanks for the great feedback. For reference, this is the implementation:
https://review.opendev.org/c/openstack/ironic-python-agent/+/915825

In my particular case I was (indirectly) using a set of lab machines attached to an FCoE SAN, but the faulty "disk" was not actually part of the SAN.
The machines also had local disks, plus some strange devices whose origin I have not managed to pin down. My suspicion is that these were simulated USB devices attached by the BMC:
```
model: SPI Flash LUN
name: /dev/disk/by-path/pci-0000:00:14.0-usb-0:4:1.0-scsi-0:0:0:2
```
By the way, they had no LUN, WWN, or serial number.

These devices were picked up by the host OS on both SLES 15 SP4/SP5 and CentOS 9 Stream, but they caused different issues on each OS.
On CentOS the Linux kernel threw errors related to the devices (dmesg, journal), but IPA skipped over them during cleanup. My guess is that, because of the kernel errors, the devices were not properly presented to IPA via sysfs, but I am not sure what exactly caused the exclusion. With the same IPA version on SLES the kernel managed to handle the bad disks and there were no kernel errors, but accessing these devices caused I/O errors and failed cleanups.

When metadata cleanup was turned off during the deployment, IPA cleaned only the single disk designated by the root device hint, so that worked as expected. However, my requirements state that the stakeholder wants cleanup to work for all the disks, and that a faulty disk should be "ignored/skipped".
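
To illustrate the behavior I am asking for, here is a minimal sketch of "clean every disk, skip the ones that fail". This is not the code from the review linked above and it does not use IPA's real classes or hooks: `Disk`, `erase_all_skipping_faulty` and `fake_erase` are hypothetical names made up for this example, and the I/O failure is simulated.

```python
import errno
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class Disk:
    # Hypothetical stand-in for a block device record (not an IPA class).
    name: str
    model: str = ""


def erase_all_skipping_faulty(
    disks: Iterable[Disk],
    erase: Callable[[Disk], None],
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Try to erase every disk; record I/O failures instead of aborting."""
    cleaned: List[str] = []
    skipped: List[Tuple[str, str]] = []
    for disk in disks:
        try:
            erase(disk)
        except OSError as exc:
            # I/O errors surface as OSError in Python; note the disk and move on.
            skipped.append((disk.name, str(exc)))
            continue
        cleaned.append(disk.name)
    return cleaned, skipped


def fake_erase(disk: Disk) -> None:
    # Assumption for the demo only: the BMC-attached device fails with EIO.
    if disk.model == "SPI Flash LUN":
        raise OSError(errno.EIO, "Input/output error")


disks = [Disk("/dev/sda", "local disk"), Disk("/dev/sdb", "SPI Flash LUN")]
cleaned, skipped = erase_all_skipping_faulty(disks, fake_erase)
print("cleaned:", cleaned)
print("skipped:", skipped)
```

The point of the design is that a failure on one device is reported but does not fail the whole cleanup, which is what "ignored/skipped" means in my requirements.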

I completely understand your view that, in principle, a machine with a faulty disk is a faulty machine, but as I stated in the original issue text, this assumption does not hold in every case (some users just don't care).

Honestly, I have not tried overriding the cleanup step, because I am using Ironic via Metal3 and I don't think it is possible to override IPA/Ironic steps in Metal3 yet. Or at least I haven't figured out how to do it. That is why I was looking for a simpler approach that would work for every type of Ironic deployment.

EDITS:
grammar + typos