Ussuri deployment returns RA responding to RS, but Wallaby/Xena doesn't

Bug #1965883 reported by Nobuto Murata
Affects: charm-ovn-chassis
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

We don't know yet where this difference comes from, so I'm filing this against the charm for the time being.

When a dual-stack tenant network is created with Neutron/OVN from the Wallaby/Xena cloud archive, no router advertisement (RA) is returned to a VM in response to its router solicitation (RS), so the VM has to wait up to about 10 minutes for a periodic RA sent at the default interval. With a Ussuri (focal) deployment, however, the RA follows right after the RS.

Not getting an IPv6 address for around 10 minutes is a significant penalty in terms of network connectivity, so this is considered a regression.

[focal-ussuri]

sudo radvdump &
sudo systemctl restart systemd-networkd.service
-> immediate output
#
# radvd configuration generated by radvdump 2.17
# based on Router Advertisement from fe80::f816:3eff:fe62:d3bf
# received by interface ens3
#

interface ens3
{
        AdvSendAdvert on;
        # Note: {Min,Max}RtrAdvInterval cannot be obtained with radvdump
        AdvManagedFlag off;
        AdvOtherConfigFlag off;
        AdvReachableTime 0;
        AdvRetransTimer 0;
        AdvCurHopLimit 255;
        AdvDefaultLifetime 65535;
        AdvHomeAgentFlag off;
        AdvDefaultPreference medium;
        AdvSourceLLAddress on;
        AdvLinkMTU 1442;

        prefix fc00:aa:aa:aa::/64
        {
                AdvValidLifetime infinity; # (0xffffffff)
                AdvPreferredLifetime infinity; # (0xffffffff)
                AdvOnLink on;
                AdvAutonomous on;
                AdvRouterAddr off;
        }; # End of prefix definition

}; # End of interface definition

[focal-xena]
sudo radvdump &
sudo systemctl restart systemd-networkd.service
-> no immediate output, but will get RA within 10 min ish

====

The test case is fairly standard, following:
https://docs.openstack.org/neutron/latest/admin/config-ipv6.html

openstack network create internal

openstack subnet create \
    --network internal \
    --subnet-range 10.5.5.0/24 \
    internal_subnet

openstack subnet create \
    --network internal \
    --ip-version 6 \
    --ipv6-ra-mode slaac \
    --ipv6-address-mode slaac \
    --subnet-range fc00:aa:aa:aa::/64 \
    internal_subnet_v6

openstack router create provider-router
openstack router set --external-gateway ext_net provider-router
openstack router add subnet provider-router internal_subnet
openstack router add subnet provider-router internal_subnet_v6

openstack server create \
    --wait \
    --flavor m1.custom \
    --image auto-sync/ubuntu-focal-20.04-amd64-server-20220321-disk1.img \
    --network internal \
    --key-name mykey \
    test-dual-stack

Nobuto Murata (nobuto) wrote :

Subscribing ~field-high.

Nobuto Murata (nobuto) wrote :

I was deploying multiple releases, but I will redeploy the testbed with focal-xena so that I can provide any information that is requested.

Nobuto Murata (nobuto) wrote :

This is what I see from a VM on top of Xena deployment - no RA until the periodic interval:

10:50:07.933169 fa:16:3e:d9:ca:78 > 33:33:00:00:00:02, ethertype IPv6 (0x86dd), length 70: (flowlabel 0x8f5ed, hlim 255, next-header ICMPv6 (58) payload length: 16) fe80::f816:3eff:fed9:ca78 > ff02::2: [icmp6 sum ok] ICMP6, router solicitation, length 16
          source link-address option (1), length 8 (1): fa:16:3e:d9:ca:78
10:50:12.282979 fa:16:3e:d9:ca:78 > 33:33:00:00:00:02, ethertype IPv6 (0x86dd), length 70: (flowlabel 0x8f5ed, hlim 255, next-header ICMPv6 (58) payload length: 16) fe80::f816:3eff:fed9:ca78 > ff02::2: [icmp6 sum ok] ICMP6, router solicitation, length 16
          source link-address option (1), length 8 (1): fa:16:3e:d9:ca:78
10:50:20.680418 fa:16:3e:d9:ca:78 > 33:33:00:00:00:02, ethertype IPv6 (0x86dd), length 70: (flowlabel 0x8f5ed, hlim 255, next-header ICMPv6 (58) payload length: 16) fe80::f816:3eff:fed9:ca78 > ff02::2: [icmp6 sum ok] ICMP6, router solicitation, length 16
          source link-address option (1), length 8 (1): fa:16:3e:d9:ca:78
10:50:36.651279 fa:16:3e:d9:ca:78 > 33:33:00:00:00:02, ethertype IPv6 (0x86dd), length 70: (flowlabel 0x8f5ed, hlim 255, next-header ICMPv6 (58) payload length: 16) fe80::f816:3eff:fed9:ca78 > ff02::2: [icmp6 sum ok] ICMP6, router solicitation, length 16
          source link-address option (1), length 8 (1): fa:16:3e:d9:ca:78
10:51:08.183176 fa:16:3e:d9:ca:78 > 33:33:00:00:00:02, ethertype IPv6 (0x86dd), length 70: (flowlabel 0x8f5ed, hlim 255, next-header ICMPv6 (58) payload length: 16) fe80::f816:3eff:fed9:ca78 > ff02::2: [icmp6 sum ok] ICMP6, router solicitation, length 16
          source link-address option (1), length 8 (1): fa:16:3e:d9:ca:78
10:52:08.250195 fa:16:3e:d9:ca:78 > 33:33:00:00:00:02, ethertype IPv6 (0x86dd), length 70: (flowlabel 0x8f5ed, hlim 255, next-header ICMPv6 (58) payload length: 16) fe80::f816:3eff:fed9:ca78 > ff02::2: [icmp6 sum ok] ICMP6, router solicitation, length 16
          source link-address option (1), length 8 (1): fa:16:3e:d9:ca:78
10:52:50.716674 fa:16:3e:bc:47:96 > 33:33:00:00:00:01, ethertype IPv6 (0x86dd), length 118: (hlim 255, next-header ICMPv6 (58) payload length: 64) fe80::f816:3eff:febc:4796 > ff02::1: [icmp6 sum ok] ICMP6, router advertisement, length 64
        hop limit 255, Flags [none], pref medium, router lifetime 65535s, reachable time 0ms, retrans timer 0ms
          source link-address option (1), length 8 (1): fa:16:3e:bc:47:96
          mtu option (5), length 8 (1): 1442
          prefix info option (3), length 32 (4): fc00:aa:aa:aa::/64, Flags [onlink, auto], valid time infinity, pref. time infinity
10:52:50.801190 fa:16:3e:d9:ca:78 > 33:33:ff:d9:ca:78, ethertype IPv6 (0x86dd), length 86: (hlim 255, next-header ICMPv6 (58) payload length: 32) :: > ff02::1:ffd9:ca78: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has fc00:aa:aa:aa:f816:3eff:fed9:ca78
          unknown option (14), length 8 (1):
          0x0000: 8243 99fc 248e
10:59:45.754605 fe:16:3e:d9:ca:78 > 33:33:00:00:00:02, ethertype IPv6 (0x86dd), length 70: (hlim 255, next-header ICMPv6 (58) payload length: 16) fe80::fc16:3eff:fed9:ca78 > ff02::2:...
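
The timing in the capture above can be summarized with a little arithmetic. The following is a minimal Python sketch that uses only the timestamps shown in the tcpdump output; the names `RS_TIMES`, `RA_TIME`, `parse` and `gaps_seconds` are illustrative and not part of any tool. It computes the gap between successive RS packets and the delay from the first RS to the first RA.

```python
from datetime import datetime

# Timestamps copied from the tcpdump output above (same day assumed).
RS_TIMES = [
    "10:50:07.933169",
    "10:50:12.282979",
    "10:50:20.680418",
    "10:50:36.651279",
    "10:51:08.183176",
    "10:52:08.250195",
]
RA_TIME = "10:52:50.716674"


def parse(ts):
    """Parse an HH:MM:SS.ffffff tcpdump timestamp."""
    return datetime.strptime(ts, "%H:%M:%S.%f")


def gaps_seconds(timestamps):
    """Return the gaps in seconds between consecutive timestamps."""
    times = [parse(ts) for ts in timestamps]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


rs_gaps = gaps_seconds(RS_TIMES)
first_rs_to_ra = (parse(RA_TIME) - parse(RS_TIMES[0])).total_seconds()

if __name__ == "__main__":
    print("RS gaps (s):", [round(g, 2) for g in rs_gaps])
    print("first RS to first RA (s):", round(first_rs_to_ra, 2))
```

For this capture the RS gaps come out at roughly 4.3, 8.4, 16.0, 31.5 and 60.1 seconds, and the RA arrives about 163 seconds after the first RS. The guest keeps retransmitting its solicitation with growing gaps, and none of the six solicitations is answered directly.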

Alexander Litvinov (alitvinov) wrote :

Just checked on focal/xena

openstack subnet create --network net-ipv6 \
  --ip-version 6 \
  --subnet-range 2620:128:e082:4008::/64 \
  --ipv6-address-mode slaac \
  --ipv6-ra-mode slaac \
  subnet1

and

openstack router add subnet router subnet1

I started tcpdump for router advertisements inside the VM (which already had an IPv4 address on the ens3 interface) this way:
sudo tcpdump -vvvv -ttt -i any icmp6 and 'ip6[40] = 134'

Then, after hot-plugging an interface on subnet1 into the VM and running

sudo ip link set ens8 up

I almost instantly get a router advertisement and an IP address from subnet1 on ens8:

00:01:34.253736 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 64) _gateway > test62: [icmp6 sum ok] ICMP6, router advertisement, length 64
        hop limit 255, Flags [none], pref medium, router lifetime 65535s, reachable time 0ms, retrans timer 0ms
          source link-address option (1), length 8 (1): fa:16:3e:1e:d5:f2
            0x0000: fa16 3e1e d5f2
          mtu option (5), length 8 (1): 8942
            0x0000: 0000 0000 22ee
          prefix info option (3), length 32 (4): 2620:128:e082:4008::/64, Flags [onlink, auto], valid time infinity, pref. time infinity
            0x0000: 40c0 ffff ffff ffff ffff 0000 0000 2620
            0x0010: 0128 e082 4008 0000 0000 0000 0000
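
For readers unfamiliar with the tcpdump filter used above: `ip6[40]` is the byte at offset 40 from the start of the IPv6 header, which is the first byte after the 40-byte fixed header. When no extension headers are present that byte is the ICMPv6 type, and 134 is a router advertisement (133 is a router solicitation). A minimal Python sketch of the same check follows; `icmpv6_type` and `fake_packet` are hypothetical helper names, and the packet is a hand-built stub rather than real traffic.

```python
IPV6_HEADER_LEN = 40      # fixed IPv6 header size in bytes
NEXT_HEADER_OFFSET = 6    # offset of the Next Header field
ICMPV6 = 58               # Next Header value for ICMPv6
ROUTER_SOLICITATION = 133
ROUTER_ADVERTISEMENT = 134


def icmpv6_type(packet):
    """Return the ICMPv6 type of a raw IPv6 packet, or None.

    Only handles packets without extension headers, which is the same
    simplification the 'ip6[40] = 134' filter makes.
    """
    if len(packet) <= IPV6_HEADER_LEN:
        return None
    if packet[0] >> 4 != 6:                      # IP version nibble
        return None
    if packet[NEXT_HEADER_OFFSET] != ICMPV6:
        return None
    return packet[IPV6_HEADER_LEN]


def fake_packet(icmp_type):
    """Build a stub IPv6 + ICMPv6 packet (addresses left as zeros)."""
    header = bytearray(IPV6_HEADER_LEN)
    header[0] = 0x60                             # version 6
    header[NEXT_HEADER_OFFSET] = ICMPV6
    header[7] = 255                              # hop limit
    return bytes(header) + bytes([icmp_type, 0, 0, 0])


def is_router_advertisement(packet):
    """Python equivalent of the filter: icmp6 and ip6[40] = 134."""
    return icmpv6_type(packet) == ROUTER_ADVERTISEMENT
```

Changing the filter value to 133 would show the solicitations instead, which is useful when checking whether an RS is answered at all.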

Alexander Litvinov (alitvinov) wrote :

Also when I do

sudo radvdump
sudo ip link set ens8 down
sudo ip link set ens8 up

I receive the advertisement within just 2 to 4 seconds.

The same happens with:
sudo radvdump
sudo ip link set ens8 down
sudo systemctl restart systemd-networkd.service

interface ens8
{
        AdvSendAdvert on;
        # Note: {Min,Max}RtrAdvInterval cannot be obtained with radvdump
        AdvManagedFlag off;
        AdvOtherConfigFlag off;
        AdvReachableTime 0;
        AdvRetransTimer 0;
        AdvCurHopLimit 255;
        AdvDefaultLifetime 65535;
        AdvHomeAgentFlag off;
        AdvDefaultPreference medium;
        AdvSourceLLAddress on;
        AdvLinkMTU 8942;

        prefix 2620:128:e082:4008::/64
        {
                AdvValidLifetime infinity; # (0xffffffff)
                AdvPreferredLifetime infinity; # (0xffffffff)
                AdvOnLink on;
                AdvAutonomous on;
                AdvRouterAddr off;
        }; # End of prefix definition

}; # End of interface definition
#

Nobuto Murata (nobuto) wrote :

Unsubscribing ~field-high for now, since the environment no longer shows the issue after tweaking other parts of the setup such as the network configs. The issue with the testbed is real, but it is not a field issue for now.

Dmitrii Shcherbakov (dmitriis) wrote :

Thanks for testing.

OVN should periodically send RAs based on the options specified in the northbound DB (send_periodic, max_interval, min_interval).

https://github.com/ovn-org/ovn/commit/c04c004efaf64b77460de5338092e642385e582d
https://github.com/ovn-org/ovn/commit/ec5bcc68b34e73b8541f3143dd1f69a8e884cbef

https://github.com/ovn-org/ovn/blob/ed9bb4d59f78d14300d27d68fd2c4ec4621f2256/controller/pinctrl.c#L3818-L3893
https://github.com/ovn-org/ovn/blob/ed9bb4d59f78d14300d27d68fd2c4ec4621f2256/controller/pinctrl.c#L3404

The Logical_Router_Port table in the ovn-nb may contain the following based on the docs:

https://man7.org/linux/man-pages/man5/ovn-nb.5.html#Logical_Router_Port_TABLE
         ipv6_ra_configs : send_periodic
                                     optional string
         ipv6_ra_configs : max_interval
                                     optional string
         ipv6_ra_configs : min_interval
                                     optional string

On an ovn-central unit the following should show LRPs:

ovn-nbctl list Logical_Router_Port

====

As far as Neutron is concerned, `send_periodic` is enabled only for internal networks, whereas the intervals are not explicitly set:

https://github.com/openstack/neutron/blob/dd55f1acd36dee4ab4ec806a694b5efcc8e53cb8/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_client.py#L1169-L1175
                ipv6_ra_configs['send_periodic'] = 'true'
                if is_gw_port and utils.is_provider_network(net):
                    ipv6_ra_configs['send_periodic'] = 'false'

https://github.com/openstack/neutron/blob/dd55f1acd36dee4ab4ec806a694b5efcc8e53cb8/neutron/common/ovn/utils.py#L526-L527
def is_provider_network(network):
    return network.get(external_net.EXTERNAL, False)
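
The two Neutron excerpts above can be combined into a small standalone sketch to make the decision explicit. This is not the Neutron code itself: `ra_send_periodic` is a hypothetical wrapper, the network is a plain dict, and the key `"router:external"` is an assumption about the value of `external_net.EXTERNAL`.

```python
# Assumed value of external_net.EXTERNAL; not taken from the excerpts above.
EXTERNAL = "router:external"


def is_provider_network(network):
    """Mirror of the quoted helper, operating on a plain dict."""
    return network.get(EXTERNAL, False)


def ra_send_periodic(is_gw_port, network):
    """Return the send_periodic value for a router port.

    Periodic RAs are on by default and turned off only for a gateway
    port that sits on a provider (external) network.
    """
    send_periodic = "true"
    if is_gw_port and is_provider_network(network):
        send_periodic = "false"
    return send_periodic


internal_net = {"name": "internal"}
external_net_dict = {"name": "ext_net", EXTERNAL: True}

if __name__ == "__main__":
    print(ra_send_periodic(False, internal_net))
    print(ra_send_periodic(True, external_net_dict))
```

Under these assumptions, the router port on the `internal` network from the test case keeps `send_periodic` set to `'true'`, so periodic RAs are expected there regardless of whether solicited RAs work.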

Intervals for RAs in OVN are both set per RFC 4861 section 6.2.1 (https://datatracker.ietf.org/doc/html/rfc4861#section-6.2.1):

https://github.com/ovn-org/ovn/blob/ed9bb4d59f78d14300d27d68fd2c4ec4621f2256/controller/pinctrl.c#L3588-L3591
    config->max_interval = smap_get_int(&pb->options, "ipv6_ra_max_interval",
            ND_RA_MAX_INTERVAL_DEFAULT);
    config->min_interval = smap_get_int(&pb->options, "ipv6_ra_min_interval",
            nd_ra_min_interval_default(config->max_interval));

https://github.com/openvswitch/ovs/blob/31b467a751892df2dd938338d91d39e995a6a18c/lib/packets.h#L1083
#define ND_RA_MAX_INTERVAL_DEFAULT 600

https://github.com/openvswitch/ovs/blob/31b467a751892df2dd938338d91d39e995a6a18c/lib/packets.h#L1085-L1089
static inline int
nd_ra_min_interval_default(int max)
{
    return max >= 9 ? max / 3 : max * 3 / 4;
}

So, with the defaults, min_interval should be 600 / 3 = 200 seconds.
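
The default calculation can be checked with a direct Python port of the quoted C helper. This is only a sketch for verifying the arithmetic; it assumes non-negative inputs, for which Python's `//` matches C integer division.

```python
ND_RA_MAX_INTERVAL_DEFAULT = 600  # seconds, from the quoted packets.h


def nd_ra_min_interval_default(max_interval):
    """Python port of the quoted C helper (non-negative input assumed)."""
    if max_interval >= 9:
        return max_interval // 3
    return max_interval * 3 // 4


default_min = nd_ra_min_interval_default(ND_RA_MAX_INTERVAL_DEFAULT)

if __name__ == "__main__":
    print("max_interval:", ND_RA_MAX_INTERVAL_DEFAULT)
    print("min_interval:", default_min)
```

With neither interval set in ipv6_ra_configs, this gives a min_interval of 200 seconds and a max_interval of 600 seconds, which is consistent with a VM waiting up to about 10 minutes when it depends on periodic RAs only.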

Nobuto Murata (nobuto) wrote :

To be clear, I'm not talking about the periodic RA packets per se. The problem is that there is no RA in response to an RS, or that the exchange is dropped somewhere in between.
