[RFE] Collect deployment logs from IPA
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Ironic | Fix Released | Wishlist | Lucas Alvares Gomes | |
ironic-python-agent | Fix Released | Wishlist | Lucas Alvares Gomes | |
Bug Description
Link to the spec: https:/
Problem
-------
Currently, there are few ways to access the logs from the IPA ramdisk when a deployment fails. None of them is easy to use or intended for production use, e.g.:
One could keep a console session open and watch the logs there. While this works, it's hard to use because we don't know which node will be picked by the scheduler at deployment time. Also, not all drivers support a console.
Another way is to disable powering off a node upon a deployment failure [0]. This method has some problems per se:
0) It does not work in conjunction with nova: nova will call destroy() on the virt driver upon a failure, which will power the node off in Ironic.
1) Leaving the nodes powered on after a failure is not desirable in some deployments.
Proposal
--------
This RFE introduces the work to retrieve the IPA system logs via its API and upload them to Swift.
Changes in IPA
~~~~~~~~~~~~~~
A new extension called "log" would be added to IPA, this extension will introduce a new command called "collect_
The logs will be collected from journald; if it is not present, we should fall back to getting the logs from the /var/log/* folder as well as dmesg and so on.
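A minimal sketch of the fallback described above, not the actual IPA extension code. The function names (`run`, `collect_logs`) and the returned dictionary layout are assumptions made for illustration only; the only behavior taken from this RFE is "use journald if present, otherwise read /var/log/* and dmesg".

```python
import glob
import shutil
import subprocess


def run(cmd):
    """Return the command's stdout as text, or '' if it cannot be run."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        return ""
    return result.stdout


def collect_logs(which=shutil.which, run_cmd=run, log_glob="/var/log/*"):
    """Gather system logs, preferring journald when journalctl is available.

    Hypothetical helper: the keys of the returned dict are illustrative.
    """
    logs = {}
    if which("journalctl"):
        logs["journal"] = run_cmd(["journalctl", "--no-pager"])
    else:
        # Fallback for non-systemd ramdisks: kernel ring buffer + log files.
        logs["dmesg"] = run_cmd(["dmesg"])
        for path in sorted(glob.glob(log_glob)):
            try:
                with open(path, errors="replace") as f:
                    logs[path] = f.read()
            except OSError:
                continue  # skip directories and unreadable files
    return logs
```

The `which` and `run_cmd` parameters exist only so the sketch can be exercised without a real journal; a real extension would also need to package the result for transfer over the IPA API.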
Changes in Ironic
~~~~~~~~~~~~~~~~~
The new IPA method will be invoked upon a node deployment failure. If the command is not supported, a warning message will be logged to alert the operator.
Two new configuration options will be added to Ironic:
0) "agent_
1) "agent_
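The failure-handling behavior proposed in this section could look roughly like the sketch below. It is not Ironic's implementation: `CommandNotSupported`, `collect_on_failure`, `fetch_logs` and `store_logs` are all hypothetical names standing in for the IPA API call and the storage step, and the configuration options listed above are deliberately not modeled.

```python
import logging

LOG = logging.getLogger(__name__)


class CommandNotSupported(Exception):
    """Hypothetical error: the ramdisk does not know the log command."""


def collect_on_failure(node_uuid, fetch_logs, store_logs):
    """Try to fetch and store ramdisk logs after a failed deployment.

    fetch_logs(node_uuid) and store_logs(node_uuid, data) are hypothetical
    callables. Returns True on success; never raises, so that log
    collection cannot interfere with the rest of the failure handling.
    """
    try:
        data = fetch_logs(node_uuid)
    except CommandNotSupported:
        # Older ramdisks: warn the operator instead of failing harder.
        LOG.warning("Ramdisk of node %s does not support log collection",
                    node_uuid)
        return False
    except Exception:
        LOG.exception("Failed to collect logs from node %s", node_uuid)
        return False

    try:
        store_logs(node_uuid, data)
    except Exception:
        LOG.exception("Failed to store logs of node %s", node_uuid)
        return False
    return True
```

Passing the fetch and store steps in as callables keeps the storage choice (Swift or anything else) out of the failure path itself.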
Changed in ironic: | |
assignee: | nobody → Lucas Alvares Gomes (lucasagomes) |
importance: | Undecided → Wishlist |
tags: | added: rfe |
description: | updated |
Mathieu Mitchell (mat128) wrote : | #1 |
Lucas Alvares Gomes (lucasagomes) wrote : | #2 |
Hi Mathieu,
Thanks for reading the RFE.
So yes, no credentials are passed to IPA: the logs will be passed from the ramdisk to Ironic via the IPA API, and Ironic is responsible for uploading them to Swift. This way we could even extend this behavior in the future, e.g., we may want to just save the logs locally on the conductor instead of uploading them to Swift.
Mathieu Mitchell (mat128) wrote : | #3 |
Very interesting then :) I was concerned with the security aspect. Thanks for your quick reply.
By the way (haven't mentioned it in my first reply) but this is a very interesting feature, and would really save us in production from getting onto the machines to look at the logs :)
Lucas Alvares Gomes (lucasagomes) wrote : | #4 |
Link to the spec: https:/
description: | updated |
Changed in ironic: | |
status: | New → Confirmed |
Lucas Alvares Gomes (lucasagomes) wrote : | #5 |
setting rfe-approved since the spec is already merged
summary: |
- [RFE] Collect logs from IPA on deploy failure
+ [RFE] Collect deployment logs from IPA |
tags: |
added: rfe-approved removed: rfe |
Changed in ironic-python-agent: | |
assignee: | nobody → Lucas Alvares Gomes (lucasagomes) |
importance: | Undecided → Wishlist |
Changed in ironic-python-agent: | |
status: | New → Confirmed |
status: | Confirmed → In Progress |
Fix proposed to branch: master
Review: https:/
Changed in ironic: | |
status: | Confirmed → In Progress |
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: master
commit af81914ce7309b2
Author: Lucas Alvares Gomes <email address hidden>
Date: Mon May 30 17:39:13 2016 +0100
Add a log extension
The log extension is responsible for retrieving logs from the system.
If journalctl is present the logs will come from it; otherwise we
fall back to getting the logs from the /var/log directory + dmesg logs.
In the coreos ramdisk, we need to bind mount /run/log in the container
so the IPA service can have access to the journal.
For the tinyIPA ramdisk, the logs from IPA are now being redirected to
/var/
stdout.
Inspector now shares the same method of collecting logs, extending its
capabilities for non-systemd systems.
Partial-Bug: #1587143
Change-Id: Ie507e2e5c58cff
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: master
commit cd7507f04b30938
Author: Lucas Alvares Gomes <email address hidden>
Date: Wed Jun 29 16:47:16 2016 +0100
Collect deployment logs from IPA
This patch adds the code to collect the deployment logs from the IPA
ramdisk. The logs can be collected for every deployment, upon a failure or
never. By default, logs are collected upon a failure.
After collection, logs can be stored either in the local filesystem
(default) or in Swift.
If an error occurs when the logs are being collected or stored, or if the
ramdisk does not support the collect_system_logs command, Ironic will log
an error message, but the deployment will proceed.
Documentation on how to enable and other configuration will be done on a
subsequent patch.
Partial-Bug: #1587143
Change-Id: I6da1110daa94ea
Fix proposed to branch: master
Review: https:/
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: master
commit 00df0890f00e408
Author: Lucas Alvares Gomes <email address hidden>
Date: Mon Aug 8 16:26:59 2016 +0100
Document retrieving logs from the deploy ramdisk
This patch is documenting how operators can configure Ironic to be able
to retrieve the logs from the deploy ramdisk (or disable it).
Closes-Bug: #1587143
Change-Id: I233e925f4dd9a1
Changed in ironic: | |
status: | In Progress → Fix Released |
Changed in ironic-python-agent: | |
status: | In Progress → Fix Released |
Would there be a possibility to avoid providing IPA with credentials? Ironic currently gives IPA a tempurl for image download via Swift and uses unauthenticated means of doing the lookup and heartbeat.
Is there a possibility to provide a tempurl to upload a Swift object?
Could Ironic query IPA for its logs and upload them to Swift, or simply log them itself, avoiding Swift? What volume of logs are we looking at?