iptables NAT rules set by openstack-l3-agent are incomplete for AiO setups

Bug #1079926 reported by Martin Gerhard Loschwitz
This bug affects 3 people
Affects   Status    Importance   Assigned to   Milestone
neutron   Expired   Undecided    Unassigned

Bug Description

In order to allow access to the metadata service (169.254.169.254), quantum-l3-agent sets NAT rules for the affected router namespace:

-t nat -A quantum-l3-agent-PREROUTING -d 169.254.169.254/32 -p tcp -m tcp --dport 80 -j DNAT --to-destination 192.168.122.111:8775

For setups where all services are running on the same host, this is insufficient. The rule above is simply skipped for packets that were generated by local processes. To make it work, the following rule is required:

-t nat -A quantum-l3-agent-PREROUTING -s 0.0.0.0/0 -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 8775

With that rule in place, VMs can reach the metadata service nicely.
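
To see which NAT rules the agent has actually installed, the chain can be listed from inside the router namespace. The helper below is only a sketch under assumptions: show_nat_chain_cmd is a hypothetical name, the namespace is assumed to be named qrouter-<router-id>, and the helper prints the command instead of running it because the real command needs root.

```shell
#!/bin/sh
# Sketch: print (not run) the command that lists one NAT chain inside a
# router namespace. show_nat_chain_cmd is a hypothetical helper name.
show_nat_chain_cmd() {
    router_id=$1
    # Default to the chain that carries the metadata DNAT rule.
    chain=${2:-quantum-l3-agent-PREROUTING}
    printf 'ip netns exec qrouter-%s iptables -t nat -S %s\n' "$router_id" "$chain"
}

# Example (run the printed command as root):
#   show_nat_chain_cmd <router-id>
#   show_nat_chain_cmd <router-id> quantum-l3-agent-OUTPUT
```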

tags: added: folsom-backport-potential
Robert Collins (lifeless) wrote :

That would redirect all port 80 traffic to port 8775, which seems improbably broad.

Martin Gerhard Loschwitz (martin-loschwitz) wrote :

The proposed fix was the same fix that was once implemented in OpenStack nova-network. There may be better solutions to work around the problem, I just don't know them. :)

yong sheng gong (gongysh) wrote :

Hello,
I cannot wget http://169.254.169.254/ from within my VM with your rule, but I found that it works if I run:
sudo ip netns exec qrouter-922c8fe9-5297-4327-aefe-03b785b03eb6 iptables -t nat -D quantum-l3-agent-POSTROUTING -s 10.0.1.0/24 -d 9.0.1.10/32 -j ACCEPT

9.0.1.10/32 is the metadata server.
10.0.1.0/24 is the fixed_ip subnet.

Martin Gerhard Loschwitz (martin-loschwitz) wrote :

After some more fiddling, I think what is really required is this:

-t nat -A quantum-l3-agent-OUTPUT -d 169.254.169.254/32 -p tcp -m tcp --dport 80 -j DNAT --to-destination 192.168.122.111:8775

Note that this is for OUTPUT, not PREROUTING.
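
To make the difference concrete: in the nat table, PREROUTING sees packets arriving on an interface, while OUTPUT sees packets generated by local processes in the same network stack. The sketch below prints the same DNAT rule for either case. metadata_dnat_rule is a hypothetical helper name, and the rule text is printed rather than applied.

```shell
#!/bin/sh
# Sketch: print the metadata DNAT rule for the chain that matches where
# the request originates. metadata_dnat_rule is a hypothetical helper.
metadata_dnat_rule() {
    origin=$1   # "forwarded" (e.g. from a VM) or "local" (same network stack)
    target=$2   # metadata API address as ip:port
    case "$origin" in
        forwarded) chain=quantum-l3-agent-PREROUTING ;;
        local)     chain=quantum-l3-agent-OUTPUT ;;
        *)
            echo "usage: metadata_dnat_rule forwarded|local ip:port" >&2
            return 1
            ;;
    esac
    printf '%s\n' "-t nat -A $chain -d 169.254.169.254/32 -p tcp -m tcp --dport 80 -j DNAT --to-destination $target"
}

# Example:
#   metadata_dnat_rule local 192.168.122.111:8775
```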

dan wendlandt (danwent) wrote :

Are you trying to access metadata from the hypervisor directly, rather than from a VM? All VM traffic should traverse the PREROUTING chain, unless I'm missing something. The OUTPUT chain should only be needed if you're trying to access 169.254.169.254 from the same IP stack and namespace as the router.

yong sheng gong (gongysh) wrote :

Dan:
I am running wget http://169.254.169.254/ from a VM started with nova boot.

yong sheng gong (gongysh) wrote :

These are my current iptables rules in the router namespace (10.0.1.0/24 is the fixed net, 9.0.1.0/24 is pubnet, 9.0.1.10 is the metadata server):
*nat
:PREROUTING ACCEPT [1:304]
:INPUT ACCEPT [1:304]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:quantum-l3-agent-OUTPUT - [0:0]
:quantum-l3-agent-POSTROUTING - [0:0]
:quantum-l3-agent-PREROUTING - [0:0]
:quantum-l3-agent-float-snat - [0:0]
:quantum-l3-agent-snat - [0:0]
:quantum-postrouting-bottom - [0:0]
-A PREROUTING -j quantum-l3-agent-PREROUTING
-A OUTPUT -j quantum-l3-agent-OUTPUT
-A POSTROUTING -j quantum-l3-agent-POSTROUTING
-A POSTROUTING -j quantum-postrouting-bottom
-A quantum-l3-agent-POSTROUTING ! -i qg-1114f0ea-29 ! -o qg-1114f0ea-29 -m conntrack ! --ctstate DNAT -j ACCEPT
-A quantum-l3-agent-POSTROUTING -s 10.0.1.0/24 -d 9.0.1.10/32 -j ACCEPT
-A quantum-l3-agent-PREROUTING -d 169.254.169.254/32 -p tcp -m tcp --dport 80 -j DNAT --to-destination 9.0.1.10:8775
-A quantum-l3-agent-snat -j quantum-l3-agent-float-snat
-A quantum-l3-agent-snat -s 10.0.1.0/24 -j SNAT --to-source 9.0.1.2
-A quantum-postrouting-bottom -j quantum-l3-agent-snat

I think it is because the rule '-A quantum-l3-agent-POSTROUTING -s 10.0.1.0/24 -d 9.0.1.10/32 -j ACCEPT' short-circuits the rule '-A quantum-l3-agent-snat -s 10.0.1.0/24 -j SNAT --to-source 9.0.1.2'.
So removing the '-A quantum-l3-agent-POSTROUTING -s 10.0.1.0/24 -d 9.0.1.10/32 -j ACCEPT' rule helps.

The metadata server is running on pubnet at 9.0.1.10. Removing that ACCEPT rule lets the VM reach the metadata server, but with the SNATed source IP 9.0.1.2, so the metadata server cannot return the right metadata to the VM.

I have no way to get the metadata server working correctly.
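
The short-circuit comes from how iptables verdicts work: ACCEPT is a terminating target, so once a packet matches an ACCEPT rule in quantum-l3-agent-POSTROUTING, the later jump to quantum-postrouting-bottom (and the SNAT rule behind it) is never evaluated for that packet. A small sketch that filters iptables-save output for such rules, where accept_rules_in_chain is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: read iptables-save output on stdin and print the rules appended
# to the given chain whose target is ACCEPT. Prints nothing if none match.
accept_rules_in_chain() {
    chain=$1
    grep -E -e "^-A ${chain} .*-j ACCEPT\$" || true
}

# Example (as root, inside the router namespace):
#   iptables-save -t nat | accept_rules_in_chain quantum-l3-agent-POSTROUTING
```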

dan wendlandt (danwent) wrote :

The metadata server works for me with existing code on my single-node devstack setup. Is this not the case for others? Or is there a difference that I'm missing?

Martin Gerhard Loschwitz (martin-loschwitz) wrote :

Hello again,

Sorry for the long delay; it took me a while to figure out what exactly was going wrong here. I think it is reasonable to assume that the problem was rp_filter not being set properly on the host system, which led to problems because the kernel forbade asymmetric routing. I guess we can set this bug to invalid.

dan wendlandt (danwent) wrote :

Odd... I have rp_filter = 1 on what seems to be all of my interfaces, but I do not have trouble. If a particular rp_filter value is required for this to work correctly, we should make sure that is documented.
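
For reference, rp_filter takes three values in current kernels: 0 (no source validation), 1 (strict reverse-path check) and 2 (loose check), and for each interface the kernel applies the higher of the "all" and per-interface settings. A sketch for dumping the values on a host follows; both function names are hypothetical.

```shell
#!/bin/sh
# Sketch: translate an rp_filter value into a readable mode name.
rp_filter_mode() {
    case "$1" in
        0) echo "off" ;;
        1) echo "strict" ;;
        2) echo "loose" ;;
        *) echo "unknown" ;;
    esac
}

# Print the rp_filter mode for every entry under /proc/sys/net/ipv4/conf
# (this includes the special "all" and "default" entries).
report_rp_filter() {
    for f in /proc/sys/net/ipv4/conf/*/rp_filter; do
        [ -r "$f" ] || continue
        iface=${f%/rp_filter}
        iface=${iface##*/}
        printf '%s: %s\n' "$iface" "$(rp_filter_mode "$(cat "$f")")"
    done
}

# Example:
#   report_rp_filter
```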

dan wendlandt (danwent) wrote :

Marking this as incomplete until we understand whether something was actually broken here, as I think a single-node setup works in most cases.

Changed in quantum:
status: New → Incomplete
Byron McCollum (byron-mccollum) wrote :

I'm having the same problem with my AIO install. The metadata server is inaccessible...

Florian Haas (fghaas) wrote :

This appears to be broken in non-all-in-one configurations too.

If you're following http://docs.openstack.org/trunk/openstack-network/admin/content/connectivity.html, and you're running

- quantum-server on the box that also runs nova-api;
- nova-compute on a different box that also runs quantum-plugin-openvswitch-agent;
- quantum-l3-agent and quantum-dhcp-agent on yet another box that also runs quantum-plugin-openvswitch-agent;

Then even though all hosts have perfect management network connectivity _and_ established GRE tunnels, and a booting instance can reach its DHCP server just fine on the Open vSwitch network and gets an IP address as it should, the router in the qrouter namespace responds with a destination host unreachable for 169.254.169.254.

And that appears to be completely irrespective of rp_filter configurations or OUTPUT NAT rules.

dan wendlandt (danwent) wrote : Re: [Bug 1079926] Re: iptables NAT rules set by openstack-l3-agent are incomplete for AiO setups

To confirm, have you set things up as described here:
http://docs.openstack.org/trunk/openstack-network/admin/content/adv_cfg_l3_agent_metadata.html?

We have a patch in progress to the admin docs to include some metadata
troubleshooting steps, can you try those:
https://bugs.launchpad.net/openstack-manuals/+bug/1078528

Dan

On Tue, Nov 27, 2012 at 9:46 AM, Florian Haas <email address hidden> wrote:

> This appears to be broken in non-all-in-one configurations too.
>
> If you're following http://docs.openstack.org/trunk/openstack-network/admin/content/connectivity.html,
> and you're running
>
> - quantum-server on the box that also runs nova-api;
> - nova-compute on a different box that also runs
> quantum-plugin-openvswitch-agent;
> - quantum-l3-agent and quantum-dhcp-agent on yet another box that also
> runs quantum-plugin-openvswitch-agent;
>
> Then even though all hosts have perfect management network connectivity
> _and_ established GRE tunnels, and a booting instance can hit its DHCP
> server just fine on the OpenVSwitch network and gets an IP address as it
> should, the router in the qrouter namespace responds with a destination
> host unreachable for 169.254.169.254.
>
> And that appears to be completely irrespective of rp_filter
> configurations or OUTPUT NAT rules.
>

--
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Dan Wendlandt
Nicira, Inc: www.nicira.com
twitter: danwendlandt
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Florian Haas (fghaas) wrote :

Thanks Dan. While we're at it, the documentation could use some clarification there:

"Accessing from VMs to Nova metadata service is forwarded to an external network through Quantum L3 router. Nova metadata service must be reachable from the external network."

Does that mean the L3 agent _takes care_ of making it reachable from the outside, or does it have to have an external IP address before Quantum ever interacts with it?

"metadata_ip = 10.56.51.210
metadata_port = 8775"

That's clearly a private IP, does that mean you set the management network IP address of the metadata service here, or the external, publicly accessible one?

"In addition, a routing setting on the host running the metadata service is required. For example, when VM launched on a network 172.18.1.0/24 accesses the Nova metadata service, the source IP address is in the above subnet, so we need to add an additional routing entry by the following command. You need to configure routing entries like this for each subnet on which VMs will be launched.

route add -net 172.18.11.0/24 gw $ROUTER_GW_IP
where $ROUTER_GW_IP is an IP address of the interface of the Quantum router connected to the external network."

So a newly launched VM goes out of the private network, takes a detour through the external network, before it hits the nova metadata service that may be running on the very same physical host that the new guest has just been spawned on?
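
For what it's worth, the per-subnet routing entries that the guide asks for could be generated along these lines. This is only a sketch: print_metadata_routes is a hypothetical helper, it prints the route commands instead of running them, and the gateway address and second subnet in the example are made up for illustration.

```shell
#!/bin/sh
# Sketch: print one "route add" command per VM subnet, all pointing at the
# router's external-network address. Nothing is executed.
print_metadata_routes() {
    router_gw_ip=$1
    shift
    for subnet in "$@"; do
        printf 'route add -net %s gw %s\n' "$subnet" "$router_gw_ip"
    done
}

# Example (192.0.2.1 is a placeholder for $ROUTER_GW_IP):
#   print_metadata_routes 192.0.2.1 172.18.11.0/24 172.18.12.0/24
```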

dan wendlandt (danwent) wrote :

On Tue, Nov 27, 2012 at 10:38 AM, Florian Haas <email address hidden> wrote:

> Thanks Dan. While we're at it, the documentation could use some
> clarification there:
>
> "Accessing from VMs to Nova metadata service is forwarded to an external
> network through Quantum L3 router. Nova metadata service must be
> reachable from the external network."
>
> Does that mean the L3 agent _takes care_ of making it reachable from the
> outside, or does it have to have an external IP address before Quantum
> ever interacts with it?
>

The l3-agent only takes care of making sure it has the default route set to
the gateway-ip of the external subnet. It's up to whoever sets up the
physical network to make sure that the default gateway IP is capable of
routing to your metadata server IP. This requirement is the same as if you
are running nova-network and nova-api on separate physical hosts.

>
> "metadata_ip = 10.56.51.210
> metadata_port = 8775"
>
> That's clearly a private IP, does that mean you set the management
> network IP address of the metadata service here, or the external,
> publicly accessible one?
>

This is why we used the term "external" and not "public". If openstack is
running open to tenants anywhere in the world, the external network would
likely be using public IP space. If openstack is running as a private
cloud, the "external" network may be "private" IP space reachable only
within the tenant's private address space.

>
> "In addition, a routing setting on the host running the metadata service
> is required. For example, when VM launched on a network 172.18.1.0/24
> accesses the Nova metadata service, the source IP address is in the
> above subnet, so we need to add an additional routing entry by the
> following command. You need to configure routing entries like this for
> each subnet on which VMs will be launched.
>
> route add -net 172.18.11.0/24 gw $ROUTER_GW_IP
> where $ROUTER_GW_IP is an IP address of the interface of the Quantum
> router connected to the external network."
>
> So a newly launched VM goes out of the private network, takes a detour
> through the external network, before it hits the nova metadata service
> that may be running on the very same physical host that the new guest
> has just been spawned on?
>

I believe this is exactly the same as if nova-network and nova-api were
running on two different servers, is it not? And if you wanted to run
nova-api and the quantum-l3-agent on the same server, it would be the same
as running nova-network and nova-api on the same server. And besides, the
amount of data fetched from the metadata service seems likely to be pretty
small (config values, SSH keys), so even if this detour were abnormal, I'm
not sure it's the worst thing in the world. Am I missing something here?


tags: added: sg-fw
Launchpad Janitor (janitor) wrote :

[Expired for quantum because there has been no activity for 60 days.]

Changed in quantum:
status: Incomplete → Expired
Hyunsun Moon (hyunsun-moon) wrote :

In my case metadata was not working because of the bridge-nf-call-iptables setting, which was enabled by default.
I used Ubuntu 12.04 server and Folsom with namespaces disabled.

By setting /proc/sys/net/bridge/bridge-nf-call-iptables to 0, metadata works fine with the existing l3-agent settings.
This also affects SNAT rules for floating IPs.
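
A minimal sketch for checking this setting on a host follows. bridge_nf_status is a hypothetical helper name, and the proc file only exists when the bridge netfilter code is loaded, so the helper reports that case separately.

```shell
#!/bin/sh
# Sketch: report whether bridged traffic is passed through iptables.
# The optional argument overrides the proc path (useful for testing).
bridge_nf_status() {
    f=${1:-/proc/sys/net/bridge/bridge-nf-call-iptables}
    if [ ! -r "$f" ]; then
        echo "unavailable"
        return 0
    fi
    case "$(cat "$f")" in
        0) echo "disabled" ;;
        1) echo "enabled" ;;
        *) echo "unknown" ;;
    esac
}

# Example:
#   bridge_nf_status
# To switch it off until the next reboot (as root):
#   sysctl -w net.bridge.bridge-nf-call-iptables=0
```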
