Comment 11 for bug 1947403

Harald Jensås (harald-jensas) wrote :

I think the "ages to boot" assessment is not quite accurate; the node boots around:
2021-10-19 08:17:26.152 7 INFO ironic_inspector.introspect [-] [node: b981374c-3f42-4e99-8f85-c0b3109f4e38 state starting] Introspection started successfully

But we have no logs from that boot. We can see DHCP requests, and the HTTP transfers of inspector.ipxe and the agent kernel/ramdisk succeed, which indicates the node is booting. But the node never calls back to report the inspection as failed or successful, and it does not upload ramdisk logs either.

There is a timeout, and inspection restarts - this time the node boots and inspection completes successfully. This time the ramdisk log is uploaded, which explains why we see "Ramdisk logs start from Oct 19 09:19:40"
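
To put a number on the gap between the first introspection attempt and the ramdisk log we actually have, here is a rough sketch. It assumes both timestamps are in the same timezone and that the ramdisk log line is from 2021 (the year is not in the quoted message), so treat the result as an estimate:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# "Introspection started successfully" in the ironic-inspector log.
first_attempt = datetime.strptime("2021-10-19 08:17:26.152", FMT + ".%f")

# "Ramdisk logs start from Oct 19 09:19:40", rewritten numerically.
# Assumption: year 2021 and the same timezone as the inspector log.
ramdisk_start = datetime.strptime("2021-10-19 09:19:40", FMT)

gap = ramdisk_start - first_attempt
print(f"ramdisk log starts {gap} after the first introspection attempt")
```

Under those assumptions the uploaded ramdisk log starts a little over an hour after the first attempt, so it cannot be from the first boot.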

I've pushed some patches to try to improve the amount of data we capture in CI for inspection:

https://review.opendev.org/c/openstack/python-tripleoclient/+/814803
https://review.opendev.org/c/openstack/openstack-virtual-baremetal/+/814800
https://review.rdoproject.org/r/c/config/+/36355

That would give us the last 100 KiB of the console.log of each OVB baremetal instance whenever it is powered off. 100 KiB because that is the hard-coded max console size in Nova: (MAX_CONSOLE_BYTES = 100 * units.Ki)
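
To illustrate what "the last 100 KiB" means in practice, here is a minimal sketch of tail truncation. It is not the Nova code or the code in the patches above; tail_console is a hypothetical helper, and only the limit value mirrors MAX_CONSOLE_BYTES:

```python
# Same value as Nova's MAX_CONSOLE_BYTES = 100 * units.Ki (units.Ki == 1024).
MAX_CONSOLE_BYTES = 100 * 1024


def tail_console(data: bytes, limit: int = MAX_CONSOLE_BYTES) -> bytes:
    """Return at most the last `limit` bytes of a console log.

    Hypothetical helper for illustration only.
    """
    if limit <= 0:
        # data[-0:] would return everything, so handle this case explicitly.
        return b""
    return data[-limit:]


if __name__ == "__main__":
    console = b"boot line\n" * 50_000  # roughly 500 KB of fake console output
    print(len(tail_console(console)))
```

Anything the instance printed before the final 100 KiB is lost, so early boot messages may be missing if the agent is chatty later on.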

Since we can see that ironic-python-agent is booting, based on the successful DHCP REQ/ACKs, it would be interesting to see what ironic-python-agent logs and why it is not able to call back to report success/failure.