This might not affect many "production deployments", since we expect they are probably not running the internal API, external, control plane, etc. networks over a bridge that is configured in the neutron-ovs-agent. Test environments (like the one featured in this BZ) very possibly would be affected, though, and people evaluating POC deployments that are affected would also get a poor impression.
To work this out I think we need to:
1 - reproduce the nova try-restart hang in the absence of the API network and get the nova team to look at it. I *think* this is the broken link that explains why this no longer works, so if we can resolve this ...
2 - show that a puppet apply will bring the ovs agent back and resolve the network connection issues if a service is stuck (as in 1.)
3 - investigate if it's possible to mitigate the situation where the neutron ovs agent is stopped as a side-effect and not restarted. Nova might not be the only service that this happens with.
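As a starting point for item 3, here is a rough sketch of one possible mitigation: check whether the ovs agent is still running after another service has been restarted, and start it again if it is not. This is only an illustration, not something that exists today. The unit name `neutron-openvswitch-agent` and the helper names are assumptions and would need to match the actual deployment.

```python
import subprocess
from typing import Callable, List

# Assumed systemd unit name for the neutron ovs agent; adjust as needed.
OVS_AGENT_UNIT = "neutron-openvswitch-agent"


def run_systemctl(args: List[str]) -> int:
    """Run systemctl with the given arguments and return its exit code."""
    return subprocess.run(["systemctl", *args], check=False).returncode


def ensure_unit_running(
    unit: str, runner: Callable[[List[str]], int] = run_systemctl
) -> str:
    """Start `unit` if it is not active (hypothetical helper).

    Returns "already-active", "started" or "failed".
    """
    # `systemctl is-active --quiet` exits 0 only when the unit is active.
    if runner(["is-active", "--quiet", unit]) == 0:
        return "already-active"
    # The unit was stopped (possibly as a side-effect), so try to start it.
    if runner(["start", unit]) == 0:
        return "started"
    return "failed"
```

Something like `ensure_unit_running(OVS_AGENT_UNIT)` could then be called after the restart step. The `runner` argument is only there so the logic can be exercised without touching systemd.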