keepalived fails to start after updating DVR-HA internal network MTU

Bug #2024381 reported by Anton Kurbatov
Affects: neutron
Status: In Progress
Importance: Medium
Assigned to: Anton Kurbatov
Milestone: (not set)

Bug Description

We hit an issue where keepalived stops running after the MTU is updated on an internal network of a DVR-HA router.
It turned out that, after the update, the keepalived config references an interface from the qrouter namespace (qrouter-ns), although the keepalived process itself runs in the SNAT namespace (snat-ns).

Here is a simple demo on the latest master branch:
$ openstack network create net1
$ openstack subnet create sub1 --network net1 --subnet-range 192.168.100.0/24
$ openstack router create r1 --distributed --ha
$ openstack router add subnet r1 sub1

The keepalived process is running and its config looks like this:

$ ps axf | grep -w pid.keepalived
...
 130250 ? S 0:00 \_ keepalived -P -f /opt/stack/data/neutron/ha_confs/f7df848f-f168-4305-8ba2-a31902bdbbfd/keepalived.conf -p /opt/stack/data/neutron/ha_confs/f7df848f-f168-4305-8ba2-a31902bdbbfd.pid.keepalived -r /opt/stack/data/neutron/ha_confs/f7df848f-f168-4305-8ba2-a31902bdbbfd.pid.keepalived-vrrp -D
$ cat /opt/stack/data/neutron/ha_confs/f7df848f-f168-4305-8ba2-a31902bdbbfd/keepalived.conf
global_defs {
    notification_email_from <email address hidden>
    router_id neutron
}
vrrp_instance VR_60 {
    state BACKUP
    interface ha-77ee55dc-5c
    virtual_router_id 60
    priority 50
    garp_master_delay 60
    nopreempt
    advert_int 2
    track_interface {
        ha-77ee55dc-5c
    }
    virtual_ipaddress {
        169.254.0.60/24 dev ha-77ee55dc-5c
    }
}

Now update the MTU of the internal network; keepalived is no longer running afterwards:

$ openstack network set net1 --mtu 1400
$ ps axf | grep -w pid.keepalived
 131097 pts/0 S+ 0:00 | \_ grep --color=auto -w pid.keepalived
$

$ ip netns exec snat-f7df848f-f168-4305-8ba2-a31902bdbbfd keepalived -t -f /opt/stack/data/neutron/ha_confs/f7df848f-f168-4305-8ba2-a31902bdbbfd/keepalived.conf
(/opt/stack/data/neutron/ha_confs/f7df848f-f168-4305-8ba2-a31902bdbbfd/keepalived.conf: Line 20) WARNING - interface qr-035f8095-76 for ip address 192.168.100.1/24 doesn't exist
(/opt/stack/data/neutron/ha_confs/f7df848f-f168-4305-8ba2-a31902bdbbfd/keepalived.conf: Line 21) WARNING - interface qr-035f8095-76 for ip address fe80::f816:3eff:fe88:e922/64 doesn't exist
Non-existent interface specified in configuration
$
$ cat /opt/stack/data/neutron/ha_confs/f7df848f-f168-4305-8ba2-a31902bdbbfd/keepalived.conf
global_defs {
    notification_email_from <email address hidden>
    router_id neutron
}
vrrp_instance VR_60 {
    state BACKUP
    interface ha-77ee55dc-5c
    virtual_router_id 60
    priority 50
    garp_master_delay 60
    nopreempt
    advert_int 2
    track_interface {
        ha-77ee55dc-5c
    }
    virtual_ipaddress {
        169.254.0.60/24 dev ha-77ee55dc-5c
    }
    virtual_ipaddress_excluded {
        192.168.100.1/24 dev qr-035f8095-76
        fe80::f816:3eff:fe88:e922/64 dev qr-035f8095-76 scope link
    }
}

$ ip netns exec snat-f7df848f-f168-4305-8ba2-a31902bdbbfd ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
10: ha-77ee55dc-5c: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether fa:16:3e:46:30:c4 brd ff:ff:ff:ff:ff:ff
$
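
The mismatch shown above (addresses bound to `dev qr-...` in the config, while only `lo` and `ha-...` exist in the snat namespace) can be checked mechanically. Below is a minimal sketch, not part of neutron: `missing_keepalived_devs` is a hypothetical helper name, and it only inspects `dev <name>` clauses, not the `interface` or `track_interface` entries.

```shell
#!/bin/sh
# Hypothetical helper (not a neutron or keepalived tool): print every
# interface named in a "dev <name>" clause of a keepalived config that is
# absent from the supplied list of existing interfaces.
missing_keepalived_devs() {
    conf=$1
    # Collapse any whitespace (spaces, newlines) into single spaces.
    existing=$(printf '%s' "$2" | tr -s '[:space:]' ' ')
    # Print the token that follows each "dev" keyword, de-duplicated.
    awk '{ for (i = 1; i < NF; i++) if ($i == "dev") print $(i + 1) }' "$conf" |
        sort -u |
        while read -r dev; do
            case " $existing " in
                *" $dev "*) ;;        # interface exists, nothing to report
                *) echo "$dev" ;;     # referenced in config but missing
            esac
        done
}

# Example use (needs root; namespace and path taken from the report above):
#   ns=snat-f7df848f-f168-4305-8ba2-a31902bdbbfd
#   ifaces=$(ip netns exec "$ns" ip -o link | awk -F': ' '{ print $2 }' | cut -d@ -f1)
#   missing_keepalived_devs /opt/stack/data/neutron/ha_confs/f7df848f-f168-4305-8ba2-a31902bdbbfd/keepalived.conf "$ifaces"
```

Run against the config and namespace from this report, such a check would be expected to flag qr-035f8095-76, the same interface that `keepalived -t` complains about.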

OpenStack Infra (hudson-openstack) wrote: Related fix proposed to neutron (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/886408

OpenStack Infra (hudson-openstack) wrote:

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/886409

Changed in neutron:
status: New → In Progress

OpenStack Infra (hudson-openstack) wrote: Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/886410

Changed in neutron:
importance: Undecided → Medium
assignee: nobody → Anton Kurbatov (akurbatov)
tags: added: l3-dvr-backlog

OpenStack Infra (hudson-openstack) wrote: Change abandoned on neutron (master)

Change abandoned by "Slawek Kaplonski <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/886410
Reason: This review is > 4 weeks without comment, and failed Zuul jobs the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.
