radvd seems to crash when ipv4 addresses are supplied as nameservers to ipv6 subnets

Bug #2036877 reported by Sven Kieske
This bug affects 2 people
Affects: neutron
Status: Fix Released
Importance: Medium
Assigned to: Brian Haley
Milestone: (none)

Bug Description

I'm copying from the report linked below; please note that I'm NOT the original reporter:

https://bugs.launchpad.net/kolla-ansible/+bug/2033980/comments/8

Before cleaning the PID file, I did take a look at the config of radvd:

```
$ cat /var/lib/neutron/ra/aee91f41-1945-40b4-b72f-8be2eb369b44.radvd.conf
interface qr-caa16d7e-26
{
   AdvSendAdvert on;
   MinRtrAdvInterval 30;
   MaxRtrAdvInterval 100;
   AdvLinkMTU 1450;

   RDNSS 2a02:74a0:x:0::53 10.40.3.53 2a02:74a0:x:0::54 {};

   prefix 2a02:74a0:x:y::/64
   {
        AdvOnLink on;
        AdvAutonomous on;
   };

   route fe80::a9fe:a9fe/128 {
   };
};
```

We've been configuring the router with Terraform, assigning the IPv4 resolvers to the IPv4 subnet and the IPv6 resolvers to the IPv6 subnet.

After deleting the router, adjusting the subnets (no resolvers on IPv4, only IPv6 resolvers on IPv6), and recreating the router, radvd is now active and everything's fine.

It seems that due to misconfiguration (and incomplete template parsing), IPv4 nameservers ended up in the config of radvd, which failed to start. Neutron was then unable to clean up the pidfile, thus failing to start radvd again.
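
A quick way to spot this condition on an affected node is to scan the generated file for RDNSS entries that are not IPv6 addresses. The following is a minimal Python sketch, not part of Neutron; the function name `non_ipv6_rdnss` and the sample addresses from the 2001:db8::/32 documentation prefix are assumptions for illustration. Redacted addresses such as `2a02:74a0:x:0::53` would also be flagged, since they do not parse as IPv6.

```python
import ipaddress
import re


def non_ipv6_rdnss(conf_text):
    """Return RDNSS entries in radvd.conf text that are not valid IPv6 addresses."""
    bad = []
    # Capture everything between the RDNSS keyword and its opening brace.
    for match in re.finditer(r"^\s*RDNSS\s+([^{]*)\{", conf_text, re.MULTILINE):
        for token in match.group(1).split():
            try:
                ipaddress.IPv6Address(token)
            except ValueError:
                bad.append(token)
    return bad


# Hypothetical sample modelled on the config above.
sample = """
interface qr-example
{
   RDNSS 2001:db8::53 10.40.3.53 2001:db8::54 {};
};
"""
print(non_ipv6_rdnss(sample))  # ['10.40.3.53']
```

An empty result means every RDNSS entry parsed as IPv6; it says nothing about whether radvd accepts the rest of the file.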

tags: added: ipv6
Changed in neutron:
status: New → Confirmed
importance: Undecided → Medium
Changed in neutron:
assignee: nobody → Brian Haley (brian-haley)
Brian Haley (brian-haley) wrote :

So I just tried to recreate this, but can't add an IPv4 DNS nameserver to an IPv6 subnet (which I suspected):

$ openstack subnet show ipv6-private-subnet
+----------------------+--------------------------------------------------------+
| Field                | Value                                                  |
+----------------------+--------------------------------------------------------+
| allocation_pools     | fd38:1b17:5050::2-fd38:1b17:5050:0:ffff:ffff:ffff:ffff |
| cidr                 | fd38:1b17:5050::/64                                    |
| created_at           | 2023-10-23T22:04:40Z                                   |
| description          |                                                        |
| dns_nameservers      |                                                        |
| dns_publish_fixed_ip | None                                                   |
| enable_dhcp          | True                                                   |
| gateway_ip           | fd38:1b17:5050::1                                      |
| host_routes          |                                                        |
| id                   | e28a657c-f9a7-4b12-962a-7a0e85b8d737                   |
| ip_version           | 6                                                      |
| ipv6_address_mode    | slaac                                                  |
| ipv6_ra_mode         | slaac                                                  |
| name                 | ipv6-private-subnet                                    |
| network_id           | 9a5cd74f-7b79-4143-ae46-dbdf27dfe6d2                   |
| project_id           | d2461f0a315948f5a5c0751690462fba                       |
| revision_number      | 0                                                      |
| segment_id           | None                                                   |
| service_types        |                                                        |
| subnetpool_id        | 2bc9f4d7-6d9a-474c-b067-301d721a77c1                   |
| tags                 |                                                        |
| updated_at           | 2023-10-23T22:04:40Z                                   |
+----------------------+--------------------------------------------------------+

$ openstack subnet set --dns-nameserver 8.8.8.8 ipv6-private-subnet
BadRequestException: 400: Client Error for url: http://172.16.0.145:9696/networking/v2.0/subnets/e28a657c-f9a7-4b12-962a-7a0e85b8d737, Invalid input for operation: dns_nameserver '8.8.8.8' does not match the ip_version '6'.

And since the code that generates the radvd.conf file bases its values only on SLAAC subnets, I'm not sure how this bug could happen unless the file was manually edited.

Anton - can you give any info on how this happened in your config? Thanks.

Anton Dollmaier (antondollmaier) wrote :

Hi Brian,

Apologies for the delay.

I've managed to create a Terraform manifest with a minimal example:

```
(...)
variable "dns_ip" {
  type = list(string)
  default = ["10.40.3.53", "10.40.3.54", "10.40.3.55", "2a02:74a0:x::53", "2a02:74a0:x::54", "2a02:74a0:x::55"]
}
resource "openstack_networking_subnet_v2" "radvd_generic6" {
  name = "radvd_generic6"
  network_id = openstack_networking_network_v2.radvd_generic.id
  subnetpool_id = data.openstack_networking_subnetpool_v2.public6.id
  prefix_length = 64
  ip_version = 6
  ipv6_address_mode = "slaac"
  ipv6_ra_mode = "slaac"
  dns_nameservers = var.dns_ip
}
```

Please see the attached manifest.tf.txt for the complete code.
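
Since the manifest passes one mixed list to the IPv6 subnet, one way to avoid the mismatch is to split the list by address family before handing each subnet its own resolvers. Below is a minimal Python sketch of that split, assuming a hypothetical helper `split_by_family`; in Terraform itself the same filtering would have to be expressed in HCL. The IPv6 addresses use the 2001:db8::/32 documentation prefix because the ones in the report are redacted.

```python
import ipaddress


def split_by_family(addresses):
    """Split a mixed list of IP address strings into (ipv4, ipv6) lists."""
    v4, v6 = [], []
    for addr in addresses:
        # ip_address() raises ValueError for anything that is not a valid IP.
        if ipaddress.ip_address(addr).version == 4:
            v4.append(addr)
        else:
            v6.append(addr)
    return v4, v6


dns_ip = ["10.40.3.53", "10.40.3.54", "2001:db8::53", "2001:db8::54"]
ipv4_resolvers, ipv6_resolvers = split_by_family(dns_ip)
print(ipv4_resolvers)  # ['10.40.3.53', '10.40.3.54']
print(ipv6_resolvers)  # ['2001:db8::53', '2001:db8::54']
```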

This results in the following router being created:

```
openstack_networking_router_v2.radvd_generic6: Creation complete after 6s [id=78898eaa-64fd-4318-8b68-0861e0f66910]
```

Checking in the neutron_l3_agent container gives me the faulty config:

```
(neutron-l3-agent)[neutron@ryan-davaz /]$ cat /var/lib/neutron/ra/78898eaa-64fd-4318-8b68-0861e0f66910.radvd.conf
interface qr-2b340d1a-1d
{
   AdvSendAdvert on;
   MinRtrAdvInterval 30;
   MaxRtrAdvInterval 100;
   AdvLinkMTU 1450;

   RDNSS 10.40.3.53 10.40.3.54 10.40.3.55 {};

   prefix 2a02:74a0:a008:200f::/64
   {
        AdvOnLink on;
        AdvAutonomous on;
   };

   route fe80::a9fe:a9fe/128 {
   };
};
```

As such, the radvd daemon is also not running.

Now to the exciting part:

If I edit the IPv6 subnet via UI, I get the first complaint: six nameservers are too many.

After removing the last, I get the expected response that the nameserver "10.40.3.53" does not match the ip_version '6'.

This also happens when adjusting the variable in the terraform manifest and re-applying:

```
│ Error: Error updating OpenStack Neutron openstack_networking_subnet_v2 d9fc6837-9341-4e83-b4ed-64f8cfd9943d: Bad request with: [PUT http://stack:9696/v2.0/subnets/d9fc6837-9341-4e83-b4ed-64f8cfd9943d], error message: {"NeutronError": {"type": "InvalidInput", "message": "Invalid input for operation: dns_nameserver '2a02:74a0:x::53' does not match the ip_version '4'.", "detail": ""}}

│ with openstack_networking_subnet_v2.radvd_generic,
│ on 020-network.tf line 49, in resource "openstack_networking_subnet_v2" "radvd_generic":
│ 49: resource "openstack_networking_subnet_v2" "radvd_generic" {



│ Error: Error updating OpenStack Neutron openstack_networking_subnet_v2 1900297d-ca47-40ba-aba8-caeaff129c61: Bad request with: [PUT http://stack:9696/v2.0/subnets/1900297d-ca47-40ba-aba8-caeaff129c61], error message: {"NeutronError": {"type": "InvalidInput", "message": "Invalid input for operation: dns_nameserver '10.40.3.53' does not match the ip_version '6'.", "detail": ""}}

│ with openstack_networking_subnet_v2.radvd_generic6,
│ on 020-network.tf line 58, in resource "openstack_networking_subnet_v2" "radvd_generic6":
│ 58: resource "openstack_networking_subnet_v2" "radvd_generic6" {


```

It seems that, upon creation, the input isn't validated as thoroughly as it is for an update request.

If the API correctly rejects the faulty input, radvd won't receive the ...

Brian Haley (brian-haley) wrote :

Anton - thanks for the additional info, it helped me find where the issue is. When a subnetpool is used, not all the other arguments are verified, so a subnet could wind up in this state. I'll push a change soon.
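
The check that was being skipped can be sketched roughly as follows. This is not Neutron's actual validation code, only an illustration of the rule implied by the error messages above; the function name `check_dns_nameservers` and the default limit of five nameservers are assumptions (the report only shows that six were rejected).

```python
import ipaddress


def check_dns_nameservers(nameservers, ip_version, max_count=5):
    """Raise ValueError if the nameserver list is too long or mixes IP versions.

    max_count is an assumed limit, not a value taken from the report.
    """
    if len(nameservers) > max_count:
        raise ValueError(
            f"{len(nameservers)} nameservers given, at most {max_count} allowed"
        )
    for ns in nameservers:
        # ip_address() itself raises ValueError for malformed input.
        if ipaddress.ip_address(ns).version != ip_version:
            raise ValueError(
                f"dns_nameserver '{ns}' does not match the ip_version '{ip_version}'"
            )


# Usage: an IPv4 resolver on an IPv6 subnet is rejected.
try:
    check_dns_nameservers(["10.40.3.53", "2001:db8::53"], ip_version=6)
except ValueError as exc:
    print(exc)
```

Running the same check on both the create and the update path, regardless of whether the prefix comes from a CIDR or a subnetpool, is the behaviour the thread is asking for.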

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/900240

Changed in neutron:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/900240
Committed: https://opendev.org/openstack/neutron/commit/2f91d330dae0127be1adb98f3d6d1fd49745f25d
Submitter: "Zuul (22348)"
Branch: master

commit 2f91d330dae0127be1adb98f3d6d1fd49745f25d
Author: Brian Haley <email address hidden>
Date: Mon Nov 6 15:03:50 2023 -0500

    Correctly validate subnet arguments when using a subnetpool

    When creating a subnet using a subnetpool, we were
    failing to validate all the passed API arguments in
    the dictionary, leading to a case where you could
    specify an invalid DNS nameserver. For example,
    using an IPv4 nameserver on an IPv6 subnet. This
    could cause daemons the l3-agent starts, like radvd,
    to fail to start correctly, leading to a loss of
    connectivity.

    Specifying a subnet by cidr without a subnetpool
    did already correctly fail with an IP version
    mismatch error, this is just an edge case that
    was never tested.

    Since _validate_subnet() was called in so many places
    it was moved to a common location and is only not
    called for IPv6 prefix-delegation subnets.

    Closes-bug: #2036877
    Change-Id: I6302e9a373cf93e706cec10f87c3beaf632a0391

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/2023.2)

Fix proposed to branch: stable/2023.2
Review: https://review.opendev.org/c/openstack/neutron/+/903141

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/2023.1)

Fix proposed to branch: stable/2023.1
Review: https://review.opendev.org/c/openstack/neutron/+/903147

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/zed)

Fix proposed to branch: stable/zed
Review: https://review.opendev.org/c/openstack/neutron/+/903148

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/2023.2)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/903141
Committed: https://opendev.org/openstack/neutron/commit/cd9366b9ba100b507d81810ceaac32255e5ea94d
Submitter: "Zuul (22348)"
Branch: stable/2023.2

commit cd9366b9ba100b507d81810ceaac32255e5ea94d
Author: Brian Haley <email address hidden>
Date: Mon Nov 6 15:03:50 2023 -0500

    (cherry picked from commit 2f91d330dae0127be1adb98f3d6d1fd49745f25d)

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/2023.1)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/903147
Committed: https://opendev.org/openstack/neutron/commit/f2ec3a6cec77a0d77b403394f1e8a5757579a0db
Submitter: "Zuul (22348)"
Branch: stable/2023.1

commit f2ec3a6cec77a0d77b403394f1e8a5757579a0db
Author: Brian Haley <email address hidden>
Date: Mon Nov 6 15:03:50 2023 -0500

    (cherry picked from commit 2f91d330dae0127be1adb98f3d6d1fd49745f25d)

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/zed)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/903148
Committed: https://opendev.org/openstack/neutron/commit/74a3262a8a96a870213ef4b52667ed2179246f9e
Submitter: "Zuul (22348)"
Branch: stable/zed

commit 74a3262a8a96a870213ef4b52667ed2179246f9e
Author: Brian Haley <email address hidden>
Date: Mon Nov 6 15:03:50 2023 -0500

    (cherry picked from commit 2f91d330dae0127be1adb98f3d6d1fd49745f25d)

tags: added: in-stable-zed
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 24.0.0.0b1

This issue was fixed in the openstack/neutron 24.0.0.0b1 development milestone.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/yoga)

Fix proposed to branch: stable/yoga
Review: https://review.opendev.org/c/openstack/neutron/+/905680

OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (stable/yoga)

Change abandoned by "Rodolfo Alonso <email address hidden>" on branch: stable/yoga
Review: https://review.opendev.org/c/openstack/neutron/+/905680

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 21.2.1

This issue was fixed in the openstack/neutron 21.2.1 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 22.2.0

This issue was fixed in the openstack/neutron 22.2.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 23.2.0

This issue was fixed in the openstack/neutron 23.2.0 release.
