Comment 0 for bug 1587143

Lucas Alvares Gomes (lucasagomes) wrote : [RFE] Collect logs from IPA on deploy failure

Problem
~~~~~~~

Currently, there are few ways to access the logs from the IPA ramdisk when a deployment fails, and none of them is easy to use or intended for production. For example:

One could keep a console session open and watch the logs there. While this works, it is hard to use because we don't know which node the scheduler will pick at deployment time. Also, not all drivers support console access.

Another way is to disable powering off a node upon a deployment failure [0]. This method has some problems of its own:

0) It does not work in conjunction with nova: upon a failure, nova calls destroy() on the virt driver, which powers the node off in Ironic.

1) Leaving the nodes powered on after a failure is not desirable in some deployments.

Proposal
~~~~~~~~

This RFE introduces the work to retrieve the IPA system logs via its API and upload them to Swift.

Changes in IPA
~~~~~~~~~~~~~~

A new extension called "log" would be added to IPA. This extension will introduce a new command called "collect_system_logs", which will collect the logs from the system, gzip them, base64-encode the resulting binary and return the encoded string.
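
To illustrate the gzip and base64 steps described above, here is a minimal sketch. The function names encode_logs and decode_logs are hypothetical and not part of IPA; the sketch also assumes the collected logs are already available as a single text string.

```python
import base64
import gzip


def encode_logs(log_text):
    """Gzip the log text and return it as a base64-encoded ASCII string.

    Hypothetical helper: stands in for the final step of the proposed
    "collect_system_logs" command.
    """
    compressed = gzip.compress(log_text.encode("utf-8"))
    return base64.b64encode(compressed).decode("ascii")


def decode_logs(encoded):
    """Reverse encode_logs(): base64-decode, then gunzip back to text."""
    return gzip.decompress(base64.b64decode(encoded)).decode("utf-8")
```

Returning a base64 string rather than raw bytes keeps the result safe to embed in a JSON API response, which is presumably why the proposal encodes the binary.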

The logs will be collected from journald; if it is not present, we should fall back to getting the logs from the /var/log/* folder as well as dmesg and so on.
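
The fallback logic could look roughly like the sketch below. It assumes that the presence of the journalctl binary is a good enough signal that journald is in use, which may not hold on every ramdisk. The function name choose_log_commands and the exact fallback commands are illustrative assumptions only.

```python
import shutil


def choose_log_commands(which=shutil.which):
    """Return the list of commands to run to gather system logs.

    `which` is injectable so the choice can be exercised without
    depending on the host; it defaults to shutil.which.
    """
    if which("journalctl"):
        # journald is assumed to be present: dump the journal.
        return [["journalctl", "--no-pager"]]
    # Fallback (assumption): kernel ring buffer plus an archive of /var/log.
    return [["dmesg"], ["tar", "-cf", "-", "/var/log"]]
```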

Changes in Ironic
~~~~~~~~~~~~~~~~~

The new IPA method will be invoked upon a node deployment failure. If the command is not supported, a warning message will be logged to alert the operator about it.

Two new configuration options will be added to Ironic:

0) "agent_retrieve_logs_on_deploy_failure": (Boolean) If True, retrieve the logs from IPA when the deployment fails. Defaults to False.

1) "agent_logs_swift_container": (String) Name of the Swift container to store the deployment logs.
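
The Ironic-side flow described above can be sketched as follows. Everything here is an assumption made for illustration: the CommandNotSupported exception, the agent object with a collect_system_logs() method, the store callable standing in for a Swift upload, and the object naming scheme are all hypothetical and do not reflect Ironic's actual code. The enabled and container parameters play the role of the two proposed configuration options.

```python
import base64
import logging

LOG = logging.getLogger(__name__)


class CommandNotSupported(Exception):
    """Hypothetical error: the ramdisk does not know the command."""


def collect_logs_on_failure(node_uuid, agent, store, enabled, container):
    """Fetch the ramdisk logs after a failed deployment and store them.

    enabled   -- value of "agent_retrieve_logs_on_deploy_failure"
    container -- value of "agent_logs_swift_container"
    store     -- callable(container, object_name, data); stands in for
                 the Swift upload (assumption)
    Returns the stored object name, or None if nothing was stored.
    """
    if not enabled:
        return None
    try:
        encoded = agent.collect_system_logs()
    except CommandNotSupported:
        # Older ramdisks lack the command: warn the operator and move on.
        LOG.warning("Node %s: the ramdisk does not support "
                    "collect_system_logs; no logs retrieved", node_uuid)
        return None
    # Undo only the base64 layer; the payload stays gzip-compressed.
    payload = base64.b64decode(encoded)
    object_name = "%s.gz" % node_uuid  # hypothetical naming scheme
    store(container, object_name, payload)
    return object_name
```

Catching the unsupported-command case separately means an older ramdisk only produces a warning rather than masking the original deployment failure.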

[0] https://review.openstack.org/#/c/259119/