Ok, so after digging a bit more, I think I've found out what's going on.
Because nova-compute registers with the nova DB using the my_ip directive from its configuration file, the nova service host_ip ends up pointing at the internal_api_interface.
A live-migration doesn't fail here because the migration traffic goes directly between the hosts.
When doing a cold-migration (as in resizing instances), the flow is slightly different: for the ssh_execute() function to build the ssh subprocess call, it uses a dest argument, and this argument comes from the compute_host table in the nova DB. Unlike my first assumption, this argument isn't resolved by the OS, so /etc/hosts isn't consulted in this flow.
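To illustrate why /etc/hosts never comes into play: something like the sketch below is roughly what happens (this is a simplified, hypothetical illustration, not nova's actual ssh_execute() code). Since dest is a literal IP address pulled from the DB, ssh connects to it directly and no name resolution, hosts file or otherwise, ever happens.

```python
# Hypothetical sketch of how the cold-migration ssh command gets built.
# `dest` is whatever host_ip nova-compute registered (i.e. my_ip),
# taken straight from the DB -- NOT a hostname looked up via /etc/hosts.
def build_ssh_command(dest, cmd):
    # dest is a literal IP string, so ssh dials it directly;
    # the resolver (and therefore /etc/hosts) is never consulted
    return ["ssh", "-o", "BatchMode=yes", dest] + list(cmd)

# Example: the command that would be run against a compute node
print(build_ssh_command("192.0.2.11", ["mkdir", "-p", "/var/lib/nova/instances/tmp"]))
```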
So I don't really know if that's something we can actually fix on our side of things.
For now, I've used the same subnet/interface for both migration_interface and api_internal_interface in order to get my clusters working correctly.
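In other words, the workaround boils down to making sure the address nova-compute registers is on the network you want migrations to use. A minimal sketch of the relevant nova.conf fragment (addresses are made-up examples; my_ip is a real [DEFAULT] option):

```ini
# /etc/nova/nova.conf on each compute node
[DEFAULT]
# my_ip is what nova-compute registers as host_ip in the DB,
# so it must sit on a network reachable for cold-migration ssh
my_ip = 10.20.0.11
```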