Comment 2 for bug 1910946

norman shen (jshen28) wrote :

Thank you for the reply. OVS might have other problems that make a command like
`ovs-ofctl dump-flows` hang, and when this happens, subsequent operations that modify the flow
tables will time out and fail.

Since we can already detect the OVS problem early, why not stop sending heartbeats and fail quickly,
rather than leave the cloud operator to figure out why live migration or spawning a new instance times out?
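
To make the idea concrete, here is a rough sketch of what I mean, not the agent's actual code. It only assumes that `ovs-ofctl dump-flows <bridge>` hangs when OVS is stuck; the function names, the default bridge `br-int`, and the 5 second timeout are all made up for illustration.

```python
import subprocess


def command_responds(cmd, timeout=5.0):
    """Return True if cmd exits with status 0 within `timeout` seconds."""
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # The command hung; subprocess.run kills the child before raising.
        return False
    except OSError:
        # The binary is missing or cannot be executed.
        return False
    return result.returncode == 0


def ovs_is_responsive(bridge="br-int", timeout=5.0):
    """Hypothetical probe: treat a hung or failing dump-flows as 'OVS down'."""
    return command_responds(["ovs-ofctl", "dump-flows", bridge], timeout)


def maybe_send_heartbeat(send_heartbeat, probe=ovs_is_responsive):
    """Call send_heartbeat() only if the probe says OVS is responsive.

    `send_heartbeat` stands in for whatever the agent uses to report state.
    Returns True if a heartbeat was sent, False if it was skipped.
    """
    if not probe():
        return False
    send_heartbeat()
    return True
```

With something like this, a hung OVS would make the agent miss its heartbeats and show up as down, instead of looking healthy while flow operations time out.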

As for replacing libc, we did that in some environments, but since Ubuntu 18.04 isn't going to fix the problem,
we took the risk of using libc6 from 20.04. Upgrading is also pretty painful, because we have to live migrate instances first. The hang itself can be easily resolved by something like a gdb attach, so I think it would be beneficial to report OVS as down as soon as the OVS agent detects it.

Anyway, my point is that we might need to treat OVS and the OVS agent as a single unit.