Upgrade router to L3 HA broke IPv6

Bug #1787919 reported by Tobias Urdin
This bug affects 4 people
Affects: neutron
Status: Fix Released
Importance: High
Assigned to: Brian Haley

Bug Description

When I disabled a router, changed it to L3 HA, and enabled it again, none of the logic implemented in [1] seemed to work.

Please see the thread on ML [2] for details.
The backup router had the net.ipv6.conf.qr-<interface>.accept_ra values for the qr interfaces (one for ipv4 and one for ipv6) set to 1.

On the active router the net.ipv6.conf.all.forwarding option was set to 0.

After removing the SLAAC addresses on the backup router, setting accept_ra to 0 there, and enabling IPv6 forwarding on the active router, it started working again.
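A minimal sketch of the manual workaround described above, written as a Python helper that only builds the command lines and does not run them. The `qrouter-<router id>` namespace name follows the command quoted later in this report; the function name, its arguments, and the choice to set accept_ra before deleting addresses are assumptions made for illustration.

```python
from typing import Dict, List


def workaround_commands(router_id: str, qr_devices: List[str],
                        slaac_addrs: Dict[str, List[str]],
                        role: str) -> List[List[str]]:
    """Build (not run) the commands for the manual workaround.

    role is "backup" or "active". slaac_addrs maps a qr device name to the
    SLAAC addresses (with prefix length) to remove from that device.
    """
    ns = ["ip", "netns", "exec", f"qrouter-{router_id}"]
    cmds: List[List[str]] = []
    if role == "backup":
        for dev in qr_devices:
            # Disable accept_ra first so the address is not learned again.
            cmds.append(ns + ["sysctl", "-w",
                              f"net.ipv6.conf.{dev}.accept_ra=0"])
            for addr in slaac_addrs.get(dev, []):
                cmds.append(ns + ["ip", "-6", "addr", "del", addr,
                                  "dev", dev])
    elif role == "active":
        cmds.append(ns + ["sysctl", "-w",
                          "net.ipv6.conf.all.forwarding=1"])
    else:
        raise ValueError(f"unknown role: {role!r}")
    return cmds
```

The returned lists could be passed to `subprocess.run` as root on the relevant network node once they have been reviewed.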

Please let me know if you need anything to troubleshoot this here or on IRC (tobias-urdin).

Best regards
Tobias

[1] https://review.openstack.org/#/q/topic:bug/1667756+(status:open+OR+status:merged)
[2] http://lists.openstack.org/pipermail/openstack-dev/2018-August/133499.html

Hongbin Lu (hongbin.lu) wrote :

It seems this is one of the known limitations [1]. I quoted the following from the official document:

  "Migrating a router from distributed only, HA only, or legacy to distributed HA is not supported at this time. The router must be created as distributed HA."

[1] https://docs.openstack.org/neutron/latest/admin/config-dvr-ha-snat.html#known-limitations

Hongbin Lu (hongbin.lu)
tags: added: l3-ha
tags: added: ipv6
Hongbin Lu (hongbin.lu) wrote :

Actually, please disregard my comment (#1), because that limitation applies to DVR and you don't seem to be using DVR.

Hongbin Lu (hongbin.lu) wrote :

Several questions:

* Why not simply delete and re-create the router with the same configuration, changing only the HA mode?
* When you said "When I disabled a router, changed it to L3 HA and enabled it again", could you provide the exact steps to reproduce? For example, what happened before you reconfigured the router, and what happened after?

Common information to confirm:

* Which version of OpenStack are you using?
* How did you install OpenStack (devstack? manual install?)
* What is your deployment topology?
* How did you configure L3 HA?
* Any relevant logs/config files to attach?

Tobias Urdin (tobias-urdin) wrote :

Thanks for your reply! I'll test this thoroughly when I have some time and will return with more details and proper steps to reproduce the issue.

Hongbin Lu (hongbin.lu)
Changed in neutron:
status: New → Confirmed
Hongbin Lu (hongbin.lu) wrote :

@Tobias Urdin,

Never mind. I was able to reproduce the error. A few observations:

* Create a normal router, migrate it to an HA router, with an ipv4 and ipv6 network (does not work)
* Create an HA router, with an ipv4 and ipv6 network (does not work)
* Create an HA router, with an ipv4-only network (works)
* Create a normal router, migrate it to an HA router, with an ipv4-only network (works)

Hongbin Lu (hongbin.lu) wrote :

This is how I reproduced it:

openstack network create selfservice2
openstack subnet create --subnet-range 198.51.100.0/24 \
  --network selfservice2 --dns-nameserver 8.8.4.4 selfservice2-v4
openstack subnet create --subnet-range fd00:198:51:100::/64 --ip-version 6 \
  --ipv6-ra-mode slaac --ipv6-address-mode slaac --network selfservice2 \
  --dns-nameserver 2001:4860:4860::8844 selfservice2-v6
openstack router create router2
openstack router add subnet router2 selfservice2-v4
openstack router add subnet router2 selfservice2-v6
openstack router set --external-gateway public router2

openstack server create --flavor m1.tiny --image cirros-0.3.5-x86_64-disk --nic net-id=selfservice2 selfservice-instance2

openstack floating ip create public
openstack server add floating ip selfservice-instance2 172.24.4.2 # change the floating ip address

# cannot ping the instance via the floating IP address

Hongbin Lu (hongbin.lu)
Changed in neutron:
importance: Undecided → High
Slawek Kaplonski (slaweq) wrote :

@Hongbin: The bug title says that IPv6 is broken after migration, but in your last comment I see that you can't ping the floating IP, which is IPv4.
Also, your last comment doesn't include a router migration at all. So I'm not sure whether you should report what you found as a separate bug :)

Tobias Urdin (tobias-urdin) wrote :

@Assaf Muller, sorry I did not reply on the ML; the email thread got lost in my mailbox. I reported this bug and @Hongbin Lu managed to reproduce the issue.

@Hongbin Lu Thanks for your help!

Miguel Lavalle (minsel)
tags: added: l3-dvr-backlog
Hongbin Lu (hongbin.lu) wrote :

@Slawek Kaplonski,

Yes, there are two cases here:

* Creating a fresh HA router doesn't work
* Migrating a router to HA doesn't work

This bug report is about the latter, and I can reproduce it. When I did further troubleshooting, I found that creating a fresh HA router didn't work either. I tracked down both issues, and it looks like the root cause is the same: the backup router sends RA packets, which interrupts the switch's learning. Disabling RA on the backup router and enabling IPv6 forwarding, as this bug report describes, resolved the issue.

Therefore, I think it is the same bug.
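As a rough illustration of the two symptoms reported in this bug, the sketch below scans `sysctl` output captured inside a router namespace and flags them. The function names and message strings are hypothetical; only the sysctl keys come from this report.

```python
from typing import Dict, List


def parse_sysctl(text: str) -> Dict[str, str]:
    """Parse 'key = value' lines as printed by sysctl into a dict."""
    values = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            values[key.strip()] = value.strip()
    return values


def find_problems(role: str, sysctl_text: str) -> List[str]:
    """Report the misconfigurations described in this bug for one router.

    role is "active" or "backup" (the keepalived state of that instance).
    """
    values = parse_sysctl(sysctl_text)
    problems = []
    if role == "active" and values.get("net.ipv6.conf.all.forwarding") == "0":
        problems.append("IPv6 forwarding is disabled on the active router")
    if role == "backup":
        for key, value in values.items():
            is_qr_accept_ra = (key.startswith("net.ipv6.conf.qr-")
                               and key.endswith(".accept_ra"))
            if is_qr_accept_ra and value != "0":
                problems.append(f"{key} is {value} on the backup router")
    return problems
```

The input would typically come from something like `ip netns exec qrouter-<router id> sysctl -a` on each node hosting the router.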

Miguel Lavalle (minsel)
Changed in neutron:
assignee: nobody → Miguel Lavalle (minsel)
Tobias Urdin (tobias-urdin) wrote :

I tried again, just to make sure I could reproduce it.

* Create a network, with one ipv4 and one ipv6 subnet
* Create a router with provider network outside (ipv4 and ipv6, upstream provides SLAAC) and two interfaces (one v4, one v6, SLAAC provided by openstack router) for the created network
* Spawned an instance on the network created in step 1, got one RFC1918 ipv4 and a public ipv6

To make it work:
* On active router: ip netns exec qrouter-<router id> sysctl -w net.ipv6.conf.all.forwarding=1

On the internal qr interface for the ipv6 subnet there is an IP conflict.

On master:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2310: ha-0046c005-09: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether fa:16:3e:40:89:64 brd ff:ff:ff:ff:ff:ff
    inet 169.254.192.3/18 brd 169.254.255.255 scope global ha-0046c005-09
       valid_lft forever preferred_lft forever
    inet 169.254.0.2/24 scope global ha-0046c005-09
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe40:8964/64 scope link
       valid_lft forever preferred_lft forever
2311: qg-081f5606-3a: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether fa:16:3e:40:5a:75 brd ff:ff:ff:ff:ff:ff
    inet 193.93.248.71/22 scope global qg-081f5606-3a
       valid_lft forever preferred_lft forever
    inet6 <hidden>:f816:3eff:fe40:5a75/64 scope global nodad
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe40:5a75/64 scope link nodad
       valid_lft forever preferred_lft forever
2312: qr-769a975e-77: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether fa:16:3e:c4:d1:1d brd ff:ff:ff:ff:ff:ff
    inet 192.168.123.1/24 scope global qr-769a975e-77
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fec4:d11d/64 scope link tentative dadfailed
       valid_lft forever preferred_lft forever
2313: qr-48b0176f-f8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether fa:16:3e:c4:af:32 brd ff:ff:ff:ff:ff:ff
    inet6 <hidden>:f816:3eff:fec4:af32/64 scope global mngtmpaddr dynamic
       valid_lft 85932sec preferred_lft 13932sec
    inet6 <hidden>::1/64 scope global nodad
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fec4:af32/64 scope link
       valid_lft forever preferred_lft forever

[1465004.929480] IPv6: qr-769a975e-77: IPv6 duplicate address fe80::f816:3eff:fec4:d11d detected!
[1465036.480584] IPv6: qr-769a975e-77: IPv6 duplicate address <hidden>:f816:3eff:fec4:d11d detected!
[1465052.948421] IPv6: qr-769a975e-77: IPv6 duplicate address <hidden>:f816:3eff:fec4:d11d detected!
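To spot the duplicate-address condition shown above without reading the listing by eye, a small parser along these lines could pick out addresses flagged `dadfailed` in `ip addr` output. This is a sketch that assumes the text layout shown in this comment; the function name is hypothetical.

```python
import re
from typing import List, Tuple


def dad_failures(ip_addr_output: str) -> List[Tuple[str, str]]:
    """Return (device, address) pairs whose inet6 line carries 'dadfailed'."""
    failures = []
    device = None
    for line in ip_addr_output.splitlines():
        # Interface header lines look like "2312: qr-769a975e-77: <...>".
        header = re.match(r"^\d+:\s+([^:@\s]+)", line)
        if header:
            device = header.group(1)
            continue
        fields = line.split()
        if fields and fields[0] == "inet6" and "dadfailed" in fields:
            failures.append((device, fields[1]))
    return failures
```

Fed the "On master" listing above, it would report the link-local address on qr-769a975e-77, which is the only line carrying the flag.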

net.ipv4.conf.all.forwarding = 1
net.ipv6.conf.all.forwarding = 1
net.ipv4.conf.qg-081f5606-3a.forwarding = 1
net.ipv4.conf.qr-48b0176f-f8.forwardi...


tags: removed: l3-dvr-backlog
Miguel Lavalle (minsel)
Changed in neutron:
assignee: Miguel Lavalle (minsel) → Brian Haley (brian-haley)
Changed in neutron:
status: Confirmed → In Progress
Brian Haley (brian-haley) wrote :

https://review.openstack.org/#/c/613396/ proposed, somehow didn't get linked here automatically.

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/613396
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=b847cd02c56dc8fe654f4731306dc2b5493a62eb
Submitter: Zuul
Branch: master

commit b847cd02c56dc8fe654f4731306dc2b5493a62eb
Author: Brian Haley <email address hidden>
Date: Thu Oct 25 14:41:19 2018 -0400

    Enable 'all' IPv6 forwarding knob correctly

    When the external gateway is plugged and we enable IPv6
    forwarding on it, make sure the 'all' sysctl knob is also
    enabled, else IPv6 packets will not be forwarded. This
    seems to only affect HA routers that default to disabling
    this 'all' knob on creation.

    Also, when we are removing all the IPv6 addresses from a
    HA router internal interface, set 'accept_ra' to zero so
    it doesn't accidentally auto-configure an address. Set
    it back to one when adding them back.

    Re-homed newly added _wait_until_ipv6_forwarding_has_state()
    accordingly.

    Closes-bug: #1787919

    Change-Id: Ia1f311ee31d1479089685367a97bf13cf170b342
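The behaviour this commit message describes can be modelled on an in-memory dictionary of sysctl values. This is an illustrative sketch only, not the neutron code: both function names and the dictionary model are assumptions, and the real change lives in the L3 agent.

```python
from typing import Dict


def enable_gateway_ipv6_forwarding(sysctls: Dict[str, int],
                                   gw_device: str) -> None:
    """Enable IPv6 forwarding on the gateway device and on 'all'."""
    sysctls[f"net.ipv6.conf.{gw_device}.forwarding"] = 1
    # Per the commit message, without the 'all' knob IPv6 packets
    # are not forwarded.
    sysctls["net.ipv6.conf.all.forwarding"] = 1


def update_internal_accept_ra(sysctls: Dict[str, int], device: str,
                              has_ipv6_addresses: bool) -> None:
    """Set accept_ra to 0 when all IPv6 addresses are removed, else 1."""
    key = f"net.ipv6.conf.{device}.accept_ra"
    sysctls[key] = 1 if has_ipv6_addresses else 0


# An HA router namespace that starts with the 'all' knob disabled.
state = {"net.ipv6.conf.all.forwarding": 0}
enable_gateway_ipv6_forwarding(state, "qg-081f5606-3a")
update_internal_accept_ra(state, "qr-48b0176f-f8", has_ipv6_addresses=False)
```

After these calls, `state` has forwarding enabled for both the gateway device and `all`, and `accept_ra` is 0 on the internal interface, matching the two behaviours the message lists.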

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.openstack.org/626934

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 14.0.0.0b1

This issue was fixed in the openstack/neutron 14.0.0.0b1 development milestone.

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/rocky)

Reviewed: https://review.openstack.org/626934
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=dfedafe5f6b46c8f6cdc407736e7c865a1984401
Submitter: Zuul
Branch: stable/rocky

commit dfedafe5f6b46c8f6cdc407736e7c865a1984401
Author: Brian Haley <email address hidden>
Date: Thu Oct 25 14:41:19 2018 -0400

    Enable 'all' IPv6 forwarding knob correctly

    When the external gateway is plugged and we enable IPv6
    forwarding on it, make sure the 'all' sysctl knob is also
    enabled, else IPv6 packets will not be forwarded. This
    seems to only affect HA routers that default to disabling
    this 'all' knob on creation.

    Also, when we are removing all the IPv6 addresses from a
    HA router internal interface, set 'accept_ra' to zero so
    it doesn't accidentally auto-configure an address. Set
    it back to one when adding them back.

    Re-homed newly added _wait_until_ipv6_forwarding_has_state()
    accordingly.

    Conflicts:
        neutron/tests/functional/agent/l3/test_ha_router.py

    Closes-bug: #1787919

    Change-Id: Ia1f311ee31d1479089685367a97bf13cf170b342
    (cherry picked from commit b847cd02c56dc8fe654f4731306dc2b5493a62eb)

tags: added: in-stable-rocky
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/631233

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/631234

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/queens)

Reviewed: https://review.openstack.org/631233
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=a5fe490e494df0beca647bddce93e6a0c3ab591c
Submitter: Zuul
Branch: stable/queens

commit a5fe490e494df0beca647bddce93e6a0c3ab591c
Author: Brian Haley <email address hidden>
Date: Thu Oct 25 14:41:19 2018 -0400

    Enable 'all' IPv6 forwarding knob correctly

    When the external gateway is plugged and we enable IPv6
    forwarding on it, make sure the 'all' sysctl knob is also
    enabled, else IPv6 packets will not be forwarded. This
    seems to only affect HA routers that default to disabling
    this 'all' knob on creation.

    Also, when we are removing all the IPv6 addresses from a
    HA router internal interface, set 'accept_ra' to zero so
    it doesn't accidentally auto-configure an address. Set
    it back to one when adding them back.

    Re-homed newly added _wait_until_ipv6_forwarding_has_state()
    accordingly.

    Conflicts:
        neutron/tests/functional/agent/l3/test_ha_router.py

    Closes-bug: #1787919

    Change-Id: Ia1f311ee31d1479089685367a97bf13cf170b342
    (cherry picked from commit b847cd02c56dc8fe654f4731306dc2b5493a62eb)
    (cherry picked from commit dfedafe5f6b46c8f6cdc407736e7c865a1984401)

tags: added: in-stable-queens
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/pike)

Reviewed: https://review.openstack.org/631234
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=4f5c5ab4338e6a78b3743ea98e83ca7a026a1397
Submitter: Zuul
Branch: stable/pike

commit 4f5c5ab4338e6a78b3743ea98e83ca7a026a1397
Author: Brian Haley <email address hidden>
Date: Thu Oct 25 14:41:19 2018 -0400

    Enable 'all' IPv6 forwarding knob correctly

    When the external gateway is plugged and we enable IPv6
    forwarding on it, make sure the 'all' sysctl knob is also
    enabled, else IPv6 packets will not be forwarded. This
    seems to only affect HA routers that default to disabling
    this 'all' knob on creation.

    Also, when we are removing all the IPv6 addresses from a
    HA router internal interface, set 'accept_ra' to zero so
    it doesn't accidentally auto-configure an address. Set
    it back to one when adding them back.

    Re-homed newly added _wait_until_ipv6_forwarding_has_state()
    accordingly.

    Conflicts:
        neutron/tests/functional/agent/l3/test_ha_router.py

    Closes-bug: #1787919

    Change-Id: Ia1f311ee31d1479089685367a97bf13cf170b342
    (cherry picked from commit b847cd02c56dc8fe654f4731306dc2b5493a62eb)
    (cherry picked from commit dfedafe5f6b46c8f6cdc407736e7c865a1984401)

tags: added: in-stable-pike
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 11.0.7

This issue was fixed in the openstack/neutron 11.0.7 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 12.0.6

This issue was fixed in the openstack/neutron 12.0.6 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 13.0.3

This issue was fixed in the openstack/neutron 13.0.3 release.
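For operators checking a deployment against the releases listed above, a rough helper like the following could tell whether an installed neutron version includes the fix. The function name is hypothetical, and the sketch assumes that every 14.x version from 14.0.0.0b1 onward carries the fix and that series older than 11 never received it. It compares only the first three numeric components.

```python
# First fixed release per stable series, as listed in this report.
FIXED_IN = {
    11: (11, 0, 7),   # stable/pike
    12: (12, 0, 6),   # stable/queens
    13: (13, 0, 3),   # stable/rocky
}


def includes_fix(version: str) -> bool:
    """Return True if the given neutron version should contain the fix."""
    parts = tuple(int(p) for p in version.split(".")[:3])
    major = parts[0]
    if major >= 14:
        # Fixed on master before the 14.0.0.0b1 milestone.
        return True
    if major not in FIXED_IN:
        return False
    return parts >= FIXED_IN[major]
```

For example, `includes_fix("13.0.2")` is false while `includes_fix("13.0.3")` is true.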
