Configuring the DVR FIP router fails when the L3 agent restarts, and it cannot be recovered

Bug #1564757 reported by RaoFei on 2016-04-01
This bug affects 1 person
Affects: neutron
Importance: Undecided
Assigned to: RaoFei

Bug Description

* Issue Description
While a floating IP (FIP) is being created, the L3 agent may be restarted after the veth devices (rtr and fip) have been created but before their IP addresses are configured. Once that happens, the IP addresses can never be configured again, even if the L3 agent is restarted.

* Pre-conditions
Create a FIP and confirm that its status is ACTIVE. The FIP can be reached from the external public network.

* Step-by-step and Result
1. Create and delete FIPs in a loop on the neutron-server side.
2. Restart the L3 agent in a loop.
3. The FIP status becomes ERROR, and the FIP can no longer be reached from the external public network.

* Version
  From Juno through the latest release

* Analysis
 The link from rtr to fip is processed as follows: 1) if the veth pair does not exist, create it and then configure its IP addresses; 2) if the veth pair already exists, skip that step and go straight to configuring the route.

if the veth device does not exist:
    create veth devices -------the L3 agent stopped here, so the IP addresses were never configured
    configure veth device IP addresses

configure route -----this fails because the veth interfaces have no IP addresses
....

    def create_rtr_2_fip_link(self, ri):
        """Create interface between router and Floating IP namespace."""
        LOG.debug("Create FIP link interfaces for router %s", ri.router_id)
        rtr_2_fip_name = self.get_rtr_ext_device_name(ri.router_id)
        fip_2_rtr_name = self.get_int_device_name(ri.router_id)
        fip_ns_name = self.get_name()

        # add link local IP to interface
        if ri.rtr_fip_subnet is None:
            ri.rtr_fip_subnet = self.local_subnets.allocate(ri.router_id)
        rtr_2_fip, fip_2_rtr = ri.rtr_fip_subnet.get_pair()
        ip_wrapper = ip_lib.IPWrapper(namespace=ri.ns_name)
        device_exists = ip_lib.device_exists(rtr_2_fip_name,
                                             namespace=ri.ns_name)
        if not device_exists:
            int_dev = ip_wrapper.add_veth(rtr_2_fip_name,
                                          fip_2_rtr_name,
                                          fip_ns_name)
            self._internal_ns_interface_added(str(rtr_2_fip),
                                              rtr_2_fip_name,
                                              ri.ns_name)
            self._internal_ns_interface_added(str(fip_2_rtr),
                                              fip_2_rtr_name,
                                              fip_ns_name)
            if self.agent_conf.network_device_mtu:
                int_dev[0].link.set_mtu(self.agent_conf.network_device_mtu)
                int_dev[1].link.set_mtu(self.agent_conf.network_device_mtu)
            int_dev[0].link.set_up()
            int_dev[1].link.set_up()

        # add default route for the link local interface
        device = ip_lib.IPDevice(rtr_2_fip_name, namespace=ri.ns_name)
        device.route.add_gateway(str(fip_2_rtr.ip), table=FIP_RT_TBL)
        #setup the NAT rules and chains
        ri._handle_fip_nat_rules(rtr_2_fip_name)
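
The failure mode in the analysis above can be reproduced with a small toy model. This is only a sketch and not neutron code or the patch proposed for this bug: `FakeHost`, `FakeDevice`, `AgentStopped`, `link_as_reported` and `link_idempotent` are hypothetical names, the device names and link-local addresses are illustrative, and the model assumes that adding a gateway route fails when the interface has no address. It contrasts the "configure only when the device is new" logic with one possible alternative that re-applies the address configuration on every call.

```python
class AgentStopped(Exception):
    """Simulates the L3 agent being restarted in the middle of the setup."""


class FakeDevice:
    """Hypothetical stand-in for one end of the veth pair."""

    def __init__(self, name):
        self.name = name
        self.cidrs = set()   # a set makes re-adding the same address a no-op
        self.up = False
        self.gateway = None


class FakeHost:
    """Toy model of the router and FIP namespaces: device name -> FakeDevice."""

    def __init__(self):
        self.devices = {}

    def device_exists(self, name):
        return name in self.devices

    def add_veth(self, name1, name2):
        self.devices[name1] = FakeDevice(name1)
        self.devices[name2] = FakeDevice(name2)

    def add_gateway(self, dev_name, gateway):
        # Assumption of this model: a gateway route needs an address on the device.
        if not self.devices[dev_name].cidrs:
            raise RuntimeError("%s has no address; cannot reach %s"
                               % (dev_name, gateway))
        self.devices[dev_name].gateway = gateway


def _configure(host, rtr_name, fip_name, rtr_cidr, fip_cidr):
    for name, cidr in ((rtr_name, rtr_cidr), (fip_name, fip_cidr)):
        dev = host.devices[name]
        dev.cidrs.add(cidr)
        dev.up = True


def link_as_reported(host, rtr_name, fip_name, rtr_cidr, fip_cidr, gateway,
                     stop_after_create=False):
    """Mirrors the reported flow: addresses are set only for a new device."""
    if not host.device_exists(rtr_name):
        host.add_veth(rtr_name, fip_name)
        if stop_after_create:
            raise AgentStopped()
        _configure(host, rtr_name, fip_name, rtr_cidr, fip_cidr)
    host.add_gateway(rtr_name, gateway)


def link_idempotent(host, rtr_name, fip_name, rtr_cidr, fip_cidr, gateway):
    """One possible alternative: create if missing, always (re)configure."""
    if not host.device_exists(rtr_name):
        host.add_veth(rtr_name, fip_name)
    _configure(host, rtr_name, fip_name, rtr_cidr, fip_cidr)
    host.add_gateway(rtr_name, gateway)


def demo():
    args = ("rfp-demo", "fpr-demo",
            "169.254.31.28/31", "169.254.31.29/31", "169.254.31.29")
    host = FakeHost()
    try:
        link_as_reported(host, *args, stop_after_create=True)
    except AgentStopped:
        pass  # the agent restarts; the veth pair is left without addresses
    try:
        link_as_reported(host, *args)  # retry after the restart
        stuck = False
    except RuntimeError:
        stuck = True  # the device exists, so the addresses are never set
    link_idempotent(host, *args)
    recovered = host.devices["rfp-demo"].gateway == "169.254.31.29"
    return stuck, recovered
```

Calling `demo()` runs the interrupted setup, retries it with the reported logic, and then retries it with the idempotent variant. In this model the choice is to separate the "does the device exist" check from the "is the device configured" step, so that a half-finished setup is repaired on the next pass instead of being skipped.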

RaoFei (milo-frao) wrote :

I will submit a patch for this issue. If you are also interested in L3 DVR, please add your comments.

Changed in neutron:
assignee: nobody → RaoFei (milo-frao)

I will take a look at it.

Do you have the L3 agent logs from when this error occurs? If so, can you upload them to pastebin and paste the link here?

This may be a duplicate of 1566383
