Here is the question: why does openstackbmc fail to actually power off the machines?
[ 3718.893431] openstackbmc[14225]: Powered off bd155666-3488-42b0-a39e-28f726399af7
[ 3721.038759] openstackbmc[14258]: Reporting power state "True" for instance 29945b59-a1a4-4ea8-b651-89cced710943
[ 3724.459366] openstackbmc[14225]: Reporting power state "True" for instance bd155666-3488-42b0-a39e-28f726399af7
[ 3726.658522] openstackbmc[14258]: Powered off 29945b59-a1a4-4ea8-b651-89cced710943
[ 3729.990707] openstackbmc[14225]: Reporting power state "True" for instance bd155666-3488-42b0-a39e-28f726399af7
[ 3732.823212] openstackbmc[14258]: Reporting power state "True" for instance 29945b59-a1a4-4ea8-b651-89cced710943 <-- Why is the instance *still* running after we told it to power off 6 seconds prior?
[ 3735.625339] openstackbmc[14225]: Reporting power state "True" for instance bd155666-3488-42b0-a39e-28f726399af7 <-- Why is the instance *still* running when we told it to power off 17 seconds prior?
[ 3737.995547] openstackbmc[14258]: Reporting power state "True" for instance 29945b59-a1a4-4ea8-b651-89cced710943
[ 3741.144174] openstackbmc[14225]: Reporting power state "True" for instance bd155666-3488-42b0-a39e-28f726399af7
[ 3743.521301] openstackbmc[14258]: Reporting power state "True" for instance 29945b59-a1a4-4ea8-b651-89cced710943
[ 3747.360741] openstackbmc[14225]: Reporting power state "True" for instance bd155666-3488-42b0-a39e-28f726399af7
[ 3749.052793] openstackbmc[14258]: Reporting power state "True" for instance 29945b59-a1a4-4ea8-b651-89cced710943
[ 3753.872864] openstackbmc[14225]: Reporting power state "True" for instance bd155666-3488-42b0-a39e-28f726399af7
[ 3755.096879] openstackbmc[14258]: Reporting power state "True" for instance 29945b59-a1a4-4ea8-b651-89cced710943
[ 3763.672337] openstackbmc[14225]: Reporting power state "True" for instance bd155666-3488-42b0-a39e-28f726399af7
[ 3765.150681] openstackbmc[14258]: Reporting power state "True" for instance 29945b59-a1a4-4ea8-b651-89cced710943
[ 3777.503347] openstackbmc[14258]: Reporting power state "True" for instance 29945b59-a1a4-4ea8-b651-89cced710943
[ 3779.204387] openstackbmc[14225]: Reporting power state "True" for instance bd155666-3488-42b0-a39e-28f726399af7
[ 3784.885856] openstackbmc[14225]: Reporting power state "False" for instance bd155666-3488-42b0-a39e-28f726399af7 <-- Finally "off" 66 seconds later.
[ 3794.906170] openstackbmc[14258]: Reporting power state "False" for instance 29945b59-a1a4-4ea8-b651-89cced710943 <-- Finally "off", 68 seconds later.
[ 3800.534717] openstackbmc[14258]: Reporting power state "False" for instance 29945b59-a1a4-4ea8-b651-89cced710943
[ 3806.221619] openstackbmc[14258]: Powered on 29945b59-a1a4-4ea8-b651-89cced710943
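The delays annotated in the log above can be checked mechanically. The following is a minimal sketch, assuming the log format shown here; the function name `power_off_latency` and the regular expressions are my own and not part of openstackbmc. It measures, per instance, the time from the "Powered off" line to the first report of power state "False".

```python
import re

# Matches lines such as:
# [ 3718.893431] openstackbmc[14225]: Powered off bd155666-...
LINE_RE = re.compile(
    r'\[\s*(?P<ts>\d+\.\d+)\]\s+openstackbmc\[\d+\]:\s+(?P<msg>.*)'
)
OFF_RE = re.compile(r'Powered off (?P<uuid>[0-9a-f-]{36})')
STATE_RE = re.compile(
    r'Reporting power state "(?P<state>True|False)" '
    r'for instance (?P<uuid>[0-9a-f-]{36})'
)


def power_off_latency(log_text):
    """Return {uuid: seconds from 'Powered off' to the first 'False' report}."""
    requested = {}  # uuid -> timestamp of the power-off request
    latency = {}    # uuid -> seconds until the state first read "False"
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        ts = float(m.group('ts'))
        msg = m.group('msg')
        off = OFF_RE.match(msg)
        if off:
            requested[off.group('uuid')] = ts
            continue
        state = STATE_RE.match(msg)
        if state and state.group('state') == 'False':
            uuid = state.group('uuid')
            # Only the first "False" after a request counts.
            if uuid in requested and uuid not in latency:
                latency[uuid] = ts - requested[uuid]
    return latency


# Four lines taken from the excerpt above (annotations removed).
SAMPLE = """
[ 3718.893431] openstackbmc[14225]: Powered off bd155666-3488-42b0-a39e-28f726399af7
[ 3726.658522] openstackbmc[14258]: Powered off 29945b59-a1a4-4ea8-b651-89cced710943
[ 3784.885856] openstackbmc[14225]: Reporting power state "False" for instance bd155666-3488-42b0-a39e-28f726399af7
[ 3794.906170] openstackbmc[14258]: Reporting power state "False" for instance 29945b59-a1a4-4ea8-b651-89cced710943
"""

for uuid, seconds in power_off_latency(SAMPLE).items():
    print(uuid, round(seconds))
# bd155666-3488-42b0-a39e-28f726399af7 66
# 29945b59-a1a4-4ea8-b651-89cced710943 68
```

Both instances take longer than 60 seconds to report "off".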
It is almost as if somewhere along the way "power off" is being translated to "soft power off".
The challenge is that openstackbmc calls os-stop, and os-stop is an asynchronous action in nova.
Per discussion with Sean Mooney, the stop command with nova is a graceful shutdown, which then falls back to a forceful shutdown. https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.graceful_shutdown_timeout
There is also ironic.conf's [conductor]power_state_change_timeout, which defaults to 60 seconds. Tuning either option so that we don't exceed Ironic's setting should be fine. We could also just extend Ironic's setting.
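To illustrate why a 60 second wait loses against a power-off that takes about 66 seconds, here is a minimal sketch of a poll-until-off loop with a deadline. It is not Ironic's implementation. `wait_for_power_off`, `FakeGuest`, the 2 second poll interval and the 90 second alternative timeout are all hypothetical choices for illustration, and the clock is simulated so the example runs instantly.

```python
import time


def wait_for_power_off(get_power_state, timeout=60.0, interval=2.0,
                       clock=time.monotonic, sleep=time.sleep):
    """Poll until get_power_state() returns False or `timeout` seconds pass.

    Returns the elapsed seconds on success, or None if the deadline passed.
    """
    start = clock()
    while True:
        if not get_power_state():
            return clock() - start
        if clock() - start >= timeout:
            return None
        sleep(interval)


class FakeGuest:
    """Simulated instance that only reports 'off' after `off_after` seconds."""

    def __init__(self, off_after):
        self.now = 0.0
        self.off_after = off_after

    def clock(self):
        return self.now

    def sleep(self, seconds):
        # Advance simulated time instead of really sleeping.
        self.now += seconds

    def power_state(self):
        # True means "still on", mirroring the log output above.
        return self.now < self.off_after


# A guest that needs 66 seconds to stop, as observed in the log.
guest = FakeGuest(off_after=66.0)
print(wait_for_power_off(guest.power_state, timeout=60.0,
                         clock=guest.clock, sleep=guest.sleep))
# None  (the 60 second deadline expires first)

guest = FakeGuest(off_after=66.0)
print(wait_for_power_off(guest.power_state, timeout=90.0,
                         clock=guest.clock, sleep=guest.sleep))
# 66.0  (a longer deadline sees the guest reach "off")
```

Either shortening the time the guest is allowed for a graceful stop or lengthening the waiter's deadline makes the second outcome the normal one.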
The root cause is likely that some process is hanging on these images, but to understand what is going on we will most likely need the system console output, or logs, from the window before the power-off action completes, which we do not currently capture.