Comment 15 for bug 1012144

James Troup (elmo) wrote :

Hi,

I'm also seeing the same problem on Folsom from the Ubuntu Cloud
Archive running on Ubuntu 12.04 LTS.

Instance A cannot talk to instance B's floating IP address. Instance
A can talk to instance B's fixed IP address.

I see traffic go from instance A, hit the cloud controller, at which
point it hits the DNAT rule in nova-network-PREROUTING and gets
dropped on the floor¹. I assume this is because the bridge isn't
running in hairpin mode but it's a production cloud and I didn't want
to experiment to confirm that.

In any event, turning on hairpin mode wouldn't actually help me
because at that point we'd have traffic going:

 * [A] -> src=A-fixed, dest=B-floating -> [Cloud controller]
 * [Cloud controller] -> src=A-fixed, dest=B-fixed -> [B]

But [A] and [B] are on the same subnet, so B's reply would be direct,
i.e.:

 * [B] -> src=B-fixed, dest=A-fixed -> [A]

Unfortunately [A] is expecting a reply from B's floating IP (i.e. via
the [Cloud controller]), not from B's fixed IP, so it would throw the
packet from [B] away.
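
The mismatch described above can be modelled with a toy shell sketch. It
only compares addresses and touches no real network state. B's addresses
are the ones used in the example in this comment; A's fixed address is
hypothetical, as is the helper name reply_accepted.

```shell
#!/bin/sh
# A's fixed address is a hypothetical value for illustration only.
# B's addresses are taken from the example in this comment.
A_FIXED=10.33.16.100
B_FIXED=10.33.16.157
B_FLOAT=91.189.92.32

# reply_accepted SENT_DST REPLY_SRC
# Toy model: A only accepts a reply whose source address matches the
# destination address it originally sent to.
reply_accepted() {
    if [ "$1" = "$2" ]; then
        echo accepted
    else
        echo dropped
    fi
}

# A sent to B's floating IP, but B answers directly from its fixed IP.
direct_reply=$(reply_accepted "$B_FLOAT" "$B_FIXED")

# If the reply passed back through the host that did the DNAT, its
# source would be rewritten to B's floating IP again.
natted_reply=$(reply_accepted "$B_FLOAT" "$B_FLOAT")

echo "reply sent directly by B:        $direct_reply"
echo "reply rewritten on the way back: $natted_reply"
```

This is only a model of the address check, not of conntrack itself.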

This could be fixed with SNAT on the cloud controller but that would
mess with the ability to restrict access by IP to floating IPs via
security groups.

I think the correct solution is for the floating IP DNAT rules to also
be run on the hypervisors/compute hosts; doing so fixed the problem
for me.

| ubuntu@juju-machine-A:~$ nc -w1 -v -q0 -z 91.189.92.32 443
| nc: connect to 91.189.92.32 port 443 (tcp) timed out: Operation now in progress
| ubuntu@juju-machine-A:~$

If I then go onto the compute host and add a fake DNAT rule similar to
what is on the cloud controller:

| root@leuce:~# iptables -t nat -I PREROUTING -d 91.189.92.32/32 -j DNAT --to-destination 10.33.16.157

Machine A can then talk to machine B on its floating IP:

| ubuntu@juju-machine-A:~$ nc -w1 -v -q0 -z 91.189.92.32 443
| Connection to 91.189.92.32 443 port [tcp/https] succeeded!
| ubuntu@juju-machine-A:~$
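
As a rough sketch of how the manual rule above might be generalised to
several floating IPs on a compute host, the following prints (rather than
runs) one DNAT command per mapping. This is not what nova-network does;
the helper name dnat_rule and the mappings variable are hypothetical, and
the only mapping listed is the one from this comment.

```shell
#!/bin/sh
# dnat_rule FLOATING FIXED
# Prints (does not execute) an iptables command that rewrites traffic
# addressed to FLOATING so that it is delivered to FIXED.
dnat_rule() {
    printf 'iptables -t nat -I PREROUTING -d %s/32 -j DNAT --to-destination %s\n' "$1" "$2"
}

# Hypothetical mapping list, one "floating fixed" pair per line.
# The single pair here is the one used in this comment.
mappings='91.189.92.32 10.33.16.157'

# Build one rule per mapping; review the output before piping it to sh.
rules=$(printf '%s\n' "$mappings" | while read -r floating fixed; do
    dnat_rule "$floating" "$fixed"
done)

printf '%s\n' "$rules"
```

Printing the commands first keeps the sketch safe to try on a production
host; nothing changes until the output is deliberately run as root.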

From a quick check of one of our development clouds that runs Grizzly,
it looks like this isn't fixed there either.

--
James

¹ This is an assumption based on the debug output of an iptables -j
  TRACE rule.
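
The exact TRACE rule used is not given above. A minimal sketch of what
such a rule could look like, assuming the floating IP and port from the
example, is below. The TRACE target is only valid in the raw table; the
helper name trace_rule is hypothetical, and the commands are printed
rather than run.

```shell
#!/bin/sh
# trace_rule ACTION DEST PORT
# ACTION is -I (insert) or -D (delete). Prints the iptables command
# instead of executing it, so nothing changes on the host.
trace_rule() {
    printf 'iptables -t raw %s PREROUTING -d %s/32 -p tcp --dport %s -j TRACE\n' "$1" "$2" "$3"
}

# Command to start tracing packets sent to the floating IP on port 443.
add_cmd=$(trace_rule -I 91.189.92.32 443)

# Matching command to remove the rule once debugging is finished.
del_cmd=$(trace_rule -D 91.189.92.32 443)

echo "$add_cmd"
echo "$del_cmd"
```

Traced packets are reported through the kernel log, one entry per rule
they traverse, so the rule is best kept narrow and removed promptly.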