Activity log for bug #2024258

Date Who What changed Old value New value Message
2023-06-16 18:31:23 melanie witt bug added bug
2023-06-16 18:33:36 melanie witt description
Old value:
Observed downstream in a large scale cluster with constant create/delete server activity and hundreds of thousands of deleted instances rows.
Currently, we archive deleted rows in batches of max_rows parents + their child rows in a single database transaction. Doing it that way limits how high a value of max_rows can be specified by the caller because of the size of the database transaction it could generate.
For example, in a large scale deployment with hundreds of thousands of deleted rows and constant server creation and deletion activity, a value of max_rows=1000 might exceed the database's configured maximum packet size or timeout due to a database deadlock, forcing the operator to use a much lower max_rows value like 100 or 50.
And when the operator has e.g. 500,000 deleted instances rows (and millions of deleted rows total) they are trying to archive, being forced to use a max_rows value several orders of magnitude lower than the number of rows they need to archive is a poor user experience and makes it unclear if archive progress is actually being made.
New value:
(identical, except the final sentence now reads "... is a poor user experience and also makes it unclear if archive progress is actually being made.")
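For context, the batch-per-transaction behavior the description refers to can be sketched roughly as follows. This is an illustrative Python sketch only, not nova's actual code; select_deleted_parent_ids and archive_one_tree are hypothetical helpers standing in for nova's SQLAlchemy-based DB API.

def archive_batch_single_transaction(connection, max_rows):
    # Pre-fix shape of the operation: up to max_rows parent rows plus ALL
    # of their child rows are moved to the shadow tables inside a single
    # database transaction, so the transaction's size grows with max_rows
    # and with the number of child rows hanging off each parent.
    with connection.begin():  # one transaction for the entire batch
        for parent_id in select_deleted_parent_ids(connection, max_rows):
            archive_one_tree(connection, parent_id)

A transaction built this way is what can blow past the configured maximum packet size or stall on a deadlock, which is why operators were forced down to max_rows values like 100 or 50.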
2023-06-16 18:34:22 melanie witt nominated for series nova/xena
2023-06-16 18:34:22 melanie witt bug task added nova/xena
2023-06-16 18:34:22 melanie witt nominated for series nova/antelope
2023-06-16 18:34:22 melanie witt bug task added nova/antelope
2023-06-16 18:34:22 melanie witt nominated for series nova/zed
2023-06-16 18:34:22 melanie witt bug task added nova/zed
2023-06-16 18:34:22 melanie witt nominated for series nova/wallaby
2023-06-16 18:34:22 melanie witt bug task added nova/wallaby
2023-06-16 18:34:22 melanie witt nominated for series nova/yoga
2023-06-16 18:34:22 melanie witt bug task added nova/yoga
2023-06-16 19:08:04 OpenStack Infra nova: status New In Progress
2023-08-21 13:43:46 Christian Rohmann bug added subscriber Christian Rohmann
2023-10-24 23:05:33 melanie witt nova: status In Progress Fix Released
2023-10-24 23:06:38 melanie witt nova/antelope: status New In Progress
2023-10-24 23:07:12 melanie witt nova/zed: status New In Progress
2023-10-24 23:07:41 melanie witt nova/yoga: status New In Progress
2023-10-24 23:08:10 melanie witt nova/xena: status New In Progress
2023-10-24 23:08:38 melanie witt nova/wallaby: status New In Progress
2024-05-28 09:32:30 Chengen Du bug task added nova (Ubuntu)
2024-05-28 09:32:52 Chengen Du nominated for series Ubuntu Focal
2024-05-28 09:32:52 Chengen Du bug task added nova (Ubuntu Focal)
2024-05-28 09:32:52 Chengen Du nominated for series Ubuntu Jammy
2024-05-28 09:32:52 Chengen Du bug task added nova (Ubuntu Jammy)
2024-05-28 09:47:00 Chengen Du description
Old value:
(the original bug description, as added on 2023-06-16 above)
New value:
[Impact]
Previously, Nova archived deleted rows in batches consisting of a maximum number of parent rows (max_rows) plus their child rows, all within a single database transaction. This approach limits the maximum value of max_rows that can be specified by the caller due to the potential size of the database transaction it could generate.
Additionally, this behavior can cause the cleanup process to frequently encounter the following error:
oslo_db.exception.DBError: (pymysql.err.InternalError) (3100, "Error on observer while running replication hook 'before_commit'.")
The error arises when the transaction exceeds the group replication transaction size limit, a safeguard implemented to prevent potential MySQL crashes [1]. The default value for this limit is approximately 143MB.
[Fix]
An upstream commit changed the logic to archive one parent row and its related child rows in a single database transaction. This change allows operators to choose more predictable values for max_rows and to make more progress with each invocation of archive_deleted_rows. It also reduces the chance that the transaction size exceeds the group replication transaction size limit.
commit 697fa3c000696da559e52b664c04cbd8d261c037
Author: melanie witt <melwittt@gmail.com>
CommitDate: Tue Jun 20 20:04:46 2023 +0000

    database: Archive parent and child rows "trees" one at a time
[Test Plan]
1. Create an instance and delete it in OpenStack.
2. Log in to the Nova database and confirm that there is an entry with a deleted_at value that is not NULL:
select display_name, deleted_at from instances where deleted_at <> 0;
3. Execute the following command, ensuring that the timestamp specified in --before is later than the deleted_at value:
nova-manage db archive_deleted_rows --before "XXX-XX-XX XX:XX:XX" --verbose --until-complete
4. Log in to the Nova database again and confirm that the entry has been archived and removed:
select display_name, deleted_at from instances where deleted_at <> 0;
[Where problems could occur]
The commit changes the logic for archiving deleted entries to reduce the size of the transactions generated during the operation. If the patch contains errors, it will only impact the archiving of deleted entries and will not affect other functionality.
[1] https://bugs.mysql.com/bug.php?id=84785
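A note on the "approximately 143MB" figure in the [Impact] section: it corresponds to MySQL 8.0's default for the group_replication_transaction_size_limit system variable, 150000000 bytes (150000000 / 2**20 ≈ 143 MiB). A minimal Python sketch for checking the limit on a given deployment, assuming pymysql is installed and using illustrative connection parameters:

import pymysql

# Hypothetical host/credentials; adjust for the deployment under test.
conn = pymysql.connect(host="localhost", user="nova", password="secret")
try:
    with conn.cursor() as cur:
        cur.execute(
            "SHOW VARIABLES LIKE 'group_replication_transaction_size_limit'")
        row = cur.fetchone()
        if row:  # empty result if group replication is not available
            name, value = row
            # Default is 150000000 bytes, i.e. the ~143MB mentioned above.
            print(f"{name} = {value} bytes (~{int(value) / 2**20:.1f} MiB)")
finally:
    conn.close()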
2024-05-28 09:49:49 Chengen Du attachment added lp2024258-nova-focal.debdiff https://bugs.launchpad.net/ubuntu/+source/nova/+bug/2024258/+attachment/5783537/+files/lp2024258-nova-focal.debdiff
2024-05-28 09:50:49 Chengen Du attachment added lp2024258-nova-jammy.debdiff https://bugs.launchpad.net/ubuntu/+source/nova/+bug/2024258/+attachment/5783538/+files/lp2024258-nova-jammy.debdiff
2024-05-28 09:52:15 Chengen Du nova (Ubuntu Focal): status New In Progress
2024-05-28 09:52:20 Chengen Du nova (Ubuntu Jammy): status New In Progress
2024-05-28 09:52:26 Chengen Du nova (Ubuntu Focal): assignee Chengen Du (chengendu)
2024-05-28 09:52:31 Chengen Du nova (Ubuntu Jammy): assignee Chengen Du (chengendu)
2024-05-28 09:55:03 Chengen Du bug added subscriber Support Engineering Sponsors
2024-05-28 12:21:53 Ubuntu Foundations Team Bug Bot tags "db performance" "db patch performance"
2024-05-28 12:22:00 Ubuntu Foundations Team Bug Bot bug added subscriber Ubuntu Sponsors
2024-07-01 16:36:09 Mauricio Faria de Oliveira description
Old value:
(the [Impact] / [Fix] / [Test Plan] / [Where problems could occur] description added on 2024-05-28 above, unchanged)
New value:
(the same description, with the original bug report appended at the end under a new [Original Bug Description] heading; that appended text is identical to the Old value of the 2023-06-16 description change shown above)
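The per-tree transaction scoping that the [Fix] section describes can be sketched as below. Again, this is a minimal illustration rather than the upstream patch itself; select_deleted_parent_ids and archive_one_tree are the same hypothetical helpers as in the earlier sketch.

def archive_deleted_rows_per_tree(connection, max_rows):
    archived = 0
    for parent_id in select_deleted_parent_ids(connection, max_rows):
        # One short transaction per parent "tree": each commit stays small
        # and roughly constant in size regardless of max_rows, keeping it
        # well under limits such as group_replication_transaction_size_limit.
        with connection.begin():
            archive_one_tree(connection, parent_id)
        archived += 1
    return archived

Trading one large commit for many small ones gives up some per-row throughput, but it turns max_rows into a count of archived trees rather than a bound on transaction size, which is the predictability the [Fix] section emphasizes.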
2024-07-10 16:42:06 Steve Langasek nova (Ubuntu): status New Incomplete
2024-07-12 02:48:28 Chengen Du nova (Ubuntu): status Incomplete Won't Fix
2024-07-16 13:47:30 Lukas Märdian removed subscriber Ubuntu Sponsors