Neutron fails to respawn radvd due to corrupt pid file
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
kolla-ansible | Invalid | Undecided | Unassigned | |
neutron | Fix Released | Medium | Brian Haley | |
Bug Description
**Bug Report**
What happened:
Periodically, radvd seems to die and neutron is not able to respawn it. I don't know why it dies.
In my neutron-
```
2023-09-03 14:37:07.514 16 ERROR neutron.
2023-09-03 14:37:07.514 16 ERROR neutron.
2023-09-03 14:37:07.514 16 WARNING neutron.
2023-09-03 14:37:07.514 16 ERROR neutron.
2023-09-03 14:37:07.762 16 ERROR neutron.
```
Inspecting the pid file, it appears to have 2 pids, one on each line:
```
$ docker exec -it neutron_l3_agent cat /var/lib/
853
1161
```
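One plausible reason a two-line pid file blocks the respawn is that the reader expects the file to hold a single integer. The sketch below only illustrates that failure mode. It assumes the whole file content is converted with `int()`, which the truncated log lines above do not confirm, and `read_pid` is a hypothetical helper, not a neutron function.

```python
from typing import Optional


def read_pid(content: str) -> Optional[int]:
    """Hypothetical helper: parse pid file content as one integer."""
    try:
        return int(content.strip())
    except ValueError:
        # Content such as "853\n1161\n" is not a single integer,
        # so no usable pid can be recovered from it.
        return None


print(read_pid("853\n"))        # a healthy, single-line pid file
print(read_pid("853\n1161\n"))  # the two-line file from this report
```

Under that assumption the second call yields no pid, which would be consistent with the respawn only succeeding after the file is deleted.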
Deleting the file then properly respawns radvd:
```
2023-09-03 14:38:07.515 16 ERROR neutron.
2023-09-03 14:38:07.516 16 WARNING neutron.
```
What you expected to happen:
Radvd is respawned without needing manual intervention. Presumably the new pid is meant to overwrite the contents of the pid file, whereas it appears to be appended instead. I'm not sure if this is a kolla issue or a neutron issue.
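The difference between appending to and overwriting a pid file can be sketched as follows. This is a minimal illustration using the two pids seen in this report. The function names and file names are hypothetical, and the sketch does not establish whether neutron, radvd, or something else actually writes the file.

```python
import os
import tempfile


def append_pid(path: str, pid: int) -> None:
    """Suspected behavior: every spawn adds another line."""
    with open(path, "a") as f:
        f.write(f"{pid}\n")


def overwrite_pid(path: str, pid: int) -> None:
    """Expected behavior: the file holds only the latest pid."""
    with open(path, "w") as f:
        f.write(f"{pid}\n")


def demo() -> tuple:
    """Write the same two pids both ways and return the file contents."""
    with tempfile.TemporaryDirectory() as d:
        appended = os.path.join(d, "appended.pid")
        overwritten = os.path.join(d, "overwritten.pid")
        for pid in (853, 1161):
            append_pid(appended, pid)
            overwrite_pid(overwritten, pid)
        with open(appended) as f:
            appended_content = f.read()
        with open(overwritten) as f:
            overwritten_content = f.read()
    return appended_content, overwritten_content


appended_content, overwritten_content = demo()
print(repr(appended_content))
print(repr(overwritten_content))
```

Only the append mode reproduces the two-line layout shown in the pid file above.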
How to reproduce it (minimal and precise): Unsure, since I don't know how radvd ends up dying in the first place. You could likely reproduce this by deploying kolla-ansible and then manually killing radvd.
**Environment**:
* OS (e.g. from /etc/os-release):
NAME="Rocky Linux"
VERSION="9.2 (Blue Onyx)"
ID="rocky"
ID_LIKE="rhel centos fedora"
VERSION_ID="9.2"
PLATFORM_
PRETTY_NAME="Rocky Linux 9.2 (Blue Onyx)"
ANSI_COLOR="0;32"
LOGO="fedora-
CPE_NAME=
HOME_URL="https:/
BUG_REPORT_URL="https:/
SUPPORT_
ROCKY_SUPPORT_
ROCKY_SUPPORT_
REDHAT_
REDHAT_
* Kernel (e.g. `uname -a`):
Linux lon1 5.14.0-
* Docker version if applicable (e.g. `docker version`):
Client: Docker Engine - Community
Version: 24.0.5
API version: 1.43
Go version: go1.20.6
Git commit: ced0996
Built: Fri Jul 21 20:36:54 2023
OS/Arch: linux/amd64
Context: default
Server: Docker Engine - Community
Engine:
Version: 24.0.5
API version: 1.43 (minimum version 1.12)
Go version: go1.20.6
Git commit: a61e2b4
Built: Fri Jul 21 20:35:17 2023
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.6.22
GitCommit: 8165feabfdfe38c
runc:
Version: 1.1.8
GitCommit: v1.1.8-0-g82f18fe
docker-init:
Version: 0.19.0
GitCommit: de40ad0
* Kolla-Ansible version (e.g. `git head or tag or stable branch` or pip package version if using release):
16.1.0 (stable/2023.1)
* Docker image Install type (source/binary): Default installed by kolla-ansible
* Docker image distribution: rocky
* Are you using official images from Docker Hub or self built? official
* If self built - Kolla version and environment used to build: not applicable
* Share your inventory file, globals.yml and other configuration files if relevant: Likely not relevant.
Changed in neutron:
importance: Undecided → Medium
tags: added: ipv6
tags: added: low-hanging-fruit
Changed in neutron:
assignee: nobody → Brian Haley (brian-haley)
An operator just reported this to us. It seems to happen if you provide 5 DNS servers to a neutron IPv6 network.
We are still figuring out whether this is really a kolla-ansible bug or should be reported to neutron instead.