quickstart should collect information on the BMC node and baremetal console

Bug #1785055 reported by Gabriele Cerami
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: Gabriele Cerami

Bug Description

When introspection or IPMI fails, we don't have much information to work with for debugging.
We should add the following to the log collection:
- relevant IPMI logs from the BMC node
- Console transcript from the unprovisioned nodes.

Tags: ci quickstart
Gabriele Cerami (gcerami) wrote :

I did some analysis of what is required to get this additional information.

Relevant IPMI logs from the BMC node:

We need to add the BMC node to the inventory, then add the relevant paths to the collect logs role for the BMC. To do this we need the BMC IP and SSH credentials.
The IP can be recovered from the pm_addr field in instackenv.json.
The credentials are a bit more complex. In OVB the BMC node is created by the underlying cloud, using tenant credentials and a generic tenant private key. If we want to access it from the undercloud instead, we should use the undercloud key, so we need to change BMC creation slightly to use a different key. This is difficult with te-broker, easier with direct provisioning.
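
As a rough illustration of the pm_addr lookup mentioned above, here is a minimal Python sketch. It assumes the common instackenv.json layout with a top-level "nodes" list (a bare list of nodes is also accepted); the function name bmc_addresses is hypothetical and not part of quickstart or the collect logs role.

```python
from typing import Any, List


def bmc_addresses(instackenv: Any) -> List[str]:
    """Return the distinct pm_addr values, in first-seen order."""
    # Assumption: nodes live under a top-level "nodes" key.
    # A bare list of node dicts is accepted as well.
    if isinstance(instackenv, dict):
        nodes = instackenv.get("nodes", [])
    else:
        nodes = instackenv

    seen: List[str] = []
    for node in nodes:
        addr = node.get("pm_addr")
        # Skip nodes without a pm_addr and avoid duplicates.
        if addr and addr not in seen:
            seen.append(addr)
    return seen
```

The input would come from something like json.load() on the instackenv.json file; the resulting addresses could then be added to the inventory as BMC hosts, once the key problem described above is solved.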

Console transcript:

Here too, the situation is complex: overcloud nodes are created by the underlying cloud using iPXE images. To access their console, we need the tenant credentials.
te-broker has those and can recover the logs before deleting the node, but the difficult part is returning these logs to the undercloud node for collection.
Direct provisioning again could make this a lot easier.

Gabriele Cerami (gcerami) wrote :

Working on https://review.openstack.org/588488.
The idea is to then retrieve these logs with a curl command from collect_logs in the job itself.
We are still missing the BMC non-console log gathering.
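
For illustration only, the retrieval step could look roughly like the Python sketch below, as an alternative to the curl command. This is not what the review above implements: the function name fetch_console_log and the idea of naming the local file after the last URL component are assumptions, and the actual URL layout of the published logs is not specified here.

```python
import shutil
import urllib.request
from pathlib import Path


def fetch_console_log(url: str, dest_dir: str, timeout: float = 30.0) -> Path:
    """Download one console log into dest_dir and return the local path."""
    target_dir = Path(dest_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    # Assumption: the last URL path component is a usable file name.
    name = url.rstrip("/").rsplit("/", 1)[-1] or "console.log"
    target = target_dir / name

    # Stream the response to disk instead of loading it into memory.
    with urllib.request.urlopen(url, timeout=timeout) as response, \
            open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target
```

A collection task would call this once per node, passing whatever URL the log server exposes for that node and the directory that collect_logs already gathers.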

wes hayutin (weshayutin) wrote :

@Gabriele, this is quite critical atm.
Note the BMC console log from a manual run:
https://bugs.launchpad.net/tripleo/+bug/1785342

Updating DNS did not resolve the issue

Gabriele Cerami (gcerami) wrote :

Console logs are available for now at http://38.145.34.41/console-logs/. As I mentioned, the baremetal node logs show only the last reboot.

Changed in tripleo:
milestone: rocky-rc1 → rocky-rc2
Alex Schultz (alex-schultz) wrote :

Moving milestone to Stein-1 as this is not required for Rocky RC2.

Changed in tripleo:
milestone: rocky-rc2 → stein-1
Changed in tripleo:
milestone: stein-1 → stein-2
Changed in tripleo:
milestone: stein-2 → stein-3
Changed in tripleo:
status: Triaged → Fix Released