devstack-networking-ext fails often because of OOM kills

Bug #2117288 reported by Alexey Stupnikov
This bug affects 1 person
Affects: openstacksdk
Status: Fix Released
Importance: Undecided
Assigned to: Alexey Stupnikov

Bug Description

openstacksdk-functional-devstack-networking-ext is quite unstable: almost half of the runs fail. At first glance the logs suggest that at some point keystone starts returning HTTP 500 for various reasons. On closer inspection, however, OOM kills appear to be to blame.

https://zuul.opendev.org/t/openstack/build/83578f702aed4ff48460bed81454a5bb/artifacts:

- the first failed test [1]
- mysql was restarted [2]
- the OOM killer killed a few processes [3]

The failures look fairly consistent, but the OOM killer may kill different processes each time, so the errors in the test output can differ from run to run.

[1]
https://de339b72b688803b5a43-80f76e47f83bff14c764cd8b2b7f1a08.ssl.cf2.rackcdn.com/openstack/83578f702aed4ff48460bed81454a5bb/job-output.txt
2025-07-18 11:08:43.078479 | controller | {0} openstack.tests.functional.network.v2.test_floating_ip.TestFloatingIP.test_get_tags [12.887117s] ... FAILED
2025-07-18 11:08:43.527669 | controller | keystoneauth1.exceptions.http.InternalServerError: Internal Server Error (HTTP 500)

[2]
https://de339b72b688803b5a43-80f76e47f83bff14c764cd8b2b7f1a08.ssl.cf2.rackcdn.com/openstack/83578f702aed4ff48460bed81454a5bb/controller/logs/mysql/error_log.txt
2025-07-18T10:21:49.192932Z 0 [System] [MY-010931] [Server] /usr/sbin/mysqld: ready for connections. Version: '8.0.42-0ubuntu0.24.04.2' socket: '/var/run/mysqld/mysqld.sock' port: 3306 (Ubuntu).
2025-07-18T11:08:44.436067Z 0 [System] [MY-010116] [Server] /usr/sbin/mysqld (mysqld 8.0.42-0ubuntu0.24.04.2) starting as process 127538

[3]
https://de339b72b688803b5a43-80f76e47f83bff14c764cd8b2b7f1a08.ssl.cf2.rackcdn.com/openstack/83578f702aed4ff48460bed81454a5bb/controller/logs/syslog.txt
Jul 18 11:08:41 npc27e2a2bfa514 kernel: Out of memory: Killed process 688 (systemd) total-vm:20156kB, anon-rss:768kB, file-rss:4608kB, shmem-rss:0kB, UID:1000 pgtables:84kB oom_score_adj:100
Jul 18 11:08:41 npc27e2a2bfa514 kernel: Out of memory: Killed process 689 ((sd-pam)) total-vm:20988kB, anon-rss:0kB, file-rss:1152kB, shmem-rss:0kB, UID:1000 pgtables:76kB oom_score_adj:100
Jul 18 11:08:41 npc27e2a2bfa514 kernel: Out of memory: Killed process 47491 (mysqld) total-vm:2851636kB, anon-rss:222960kB, file-rss:7808kB, shmem-rss:0kB, UID:112 pgtables:1912kB oom_score_adj:0
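
The evidence in [2] and [3] can be cross-checked with a short script. The following is only a minimal sketch, assuming the log lines have exactly the shape quoted above; the function names (oom_killed, mysqld_start_pids) are hypothetical helpers, not part of any OpenStack tooling.

```python
import re

# Kernel OOM-kill line, in the shape quoted in [3].
OOM_RE = re.compile(
    r"Out of memory: Killed process (?P<pid>\d+) \((?P<name>.+)\) total-vm:"
)

# mysqld startup line (code MY-010116), in the shape quoted in [2].
MYSQL_START_RE = re.compile(
    r"\[MY-010116\] .* starting as process (?P<pid>\d+)"
)


def oom_killed(syslog_lines):
    """Return (pid, process name) for every OOM kill line in syslog."""
    kills = []
    for line in syslog_lines:
        match = OOM_RE.search(line)
        if match:
            kills.append((int(match["pid"]), match["name"]))
    return kills


def mysqld_start_pids(error_log_lines):
    """Return the PID from every mysqld startup line in the error log."""
    pids = []
    for line in error_log_lines:
        match = MYSQL_START_RE.search(line)
        if match:
            pids.append(int(match["pid"]))
    return pids
```

If a mysqld PID shows up in the oom_killed() result and a different PID appears in a later startup line, mysqld was killed and then restarted, which matches what the logs above show (47491 killed, 127538 started).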

Tags: ci test
Alexey Stupnikov (astupnikov) wrote :

It looks like etcd leaks some memory: it is the top RAM user... Generally speaking, giving the worker nodes more RAM looks like a solution.

description: updated
affects: python-openstackclient → openstacksdk
OpenStack Infra (hudson-openstack) wrote : Fix proposed to openstacksdk (master)
Changed in openstacksdk:
status: New → In Progress
Changed in openstacksdk:
assignee: nobody → Alexey Stupnikov (astupnikov)
OpenStack Infra (hudson-openstack) wrote : Fix merged to openstacksdk (master)

Reviewed: https://review.opendev.org/c/openstack/openstacksdk/+/955399
Committed: https://opendev.org/openstack/openstacksdk/commit/c2465fee2d0d7814f03927b20babe051bcef26e0
Submitter: "Zuul (22348)"
Branch: master

commit c2465fee2d0d7814f03927b20babe051bcef26e0
Author: Alexey Stupnikov <email address hidden>
Date: Fri Jul 18 21:12:25 2025 +0200

    Increase swap allocation for devstack-networking

    devstack-networking tests are failing often because OOM kills
    different processes (mysqld mostly). This behavior is consistently
    reproducible.

    Closes-bug: #2117288
    Change-Id: Id04456762a5a82ecc451f382e8f671f2ae63af3f
    Signed-off-by: Alexey Stupnikov <email address hidden>

Changed in openstacksdk:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/openstacksdk 4.7.0

This issue was fixed in the openstack/openstacksdk 4.7.0 Flamingo release.
