Activity log for bug #1852077

Date Who What changed Old value New value Message
2019-11-11 13:15:03 Aleksei bug added bug
2019-11-11 13:30:06 Ubuntu Kernel Bot linux (Ubuntu): status New Incomplete
2019-11-12 10:41:53 Po-Hsu Lin nominated for series Ubuntu Disco
2019-11-12 10:41:53 Po-Hsu Lin bug task added linux (Ubuntu Disco)
2019-11-12 10:41:53 Po-Hsu Lin nominated for series Ubuntu Bionic
2019-11-12 10:41:53 Po-Hsu Lin bug task added linux (Ubuntu Bionic)
2019-11-12 10:42:19 Po-Hsu Lin tags bionic disco
2019-11-12 11:06:03 Po-Hsu Lin nominated for series Ubuntu Eoan
2019-11-12 11:06:03 Po-Hsu Lin bug task added linux (Ubuntu Eoan)
2019-11-12 11:08:55 Po-Hsu Lin nominated for series Ubuntu Focal
2019-11-12 11:08:55 Po-Hsu Lin bug task added linux (Ubuntu Focal)
2019-11-12 11:09:03 Po-Hsu Lin tags bionic disco bionic disco eoan focal
2019-11-12 11:19:38 Po-Hsu Lin linux (Ubuntu Bionic): status New In Progress
2019-11-12 11:19:40 Po-Hsu Lin linux (Ubuntu Disco): status New In Progress
2019-11-12 11:19:43 Po-Hsu Lin linux (Ubuntu Eoan): status New In Progress
2019-11-12 11:19:45 Po-Hsu Lin linux (Ubuntu Focal): status Incomplete In Progress
2019-11-12 11:19:49 Po-Hsu Lin linux (Ubuntu Bionic): assignee Po-Hsu Lin (cypressyew)
2019-11-12 11:19:51 Po-Hsu Lin linux (Ubuntu Disco): assignee Po-Hsu Lin (cypressyew)
2019-11-12 11:19:53 Po-Hsu Lin linux (Ubuntu Eoan): assignee Po-Hsu Lin (cypressyew)
2019-11-12 11:19:54 Po-Hsu Lin linux (Ubuntu Focal): assignee Po-Hsu Lin (cypressyew)
2019-11-12 14:16:14 Po-Hsu Lin bug added subscriber Po-Hsu Lin
2019-11-13 07:31:20 Po-Hsu Lin description There's an issue with bonding driver in the current ubuntu kernels. Sometimes one link stuck in a weird state. It was fixed with patch https://www.spinics.net/lists/netdev/msg609506.html in upstream. Commit 1899bb325149e481de31a4f32b59ea6f24e176ea. We see this bug with linux 4.15 (ubuntu xenial, hwe kernel), but it should be reproducible with other current kernel versions. == Justification == From the well explained commit message: Since de77ecd4ef02 ("bonding: improve link-status update in mii-monitoring"), the bonding driver has utilized two separate variables to indicate the next link state a particular slave should transition to. Each is used to communicate to a different portion of the link state change commit logic; one to the bond_miimon_commit function itself, and another to the state transition logic. Unfortunately, the two variables can become unsynchronized, resulting in incorrect link state transitions within bonding. This can cause slaves to become stuck in an incorrect link state until a subsequent carrier state transition. The issue occurs when a special case in bond_slave_netdev_event sets slave->link directly to BOND_LINK_FAIL. On the next pass through bond_miimon_inspect after the slave goes carrier up, the BOND_LINK_FAIL case will set the proposed next state (link_new_state) to BOND_LINK_UP, but the new_link to BOND_LINK_DOWN. The setting of the final link state from new_link comes after that from link_new_state, and so the slave will end up incorrectly in _DOWN state. Resolve this by combining the two variables into one. == Fixes == * 1899bb32 (bonding: fix state transition issue in link monitoring) This patch can be cherry-picked into E/F For older releases like B/D, it will needs to be backported as they are missing the slave_err() printk marco added in 5237ff79 (bonding: add slave_foo printk macros) as well as the commit to replace netdev_err() with slave_err() in e2a7420d (bonding/main: convert to using slave printk macros) For Xenial, the commit that causes this issue, de77ecd4, does not exist. == Test == Test kernels can be found here: https://people.canonical.com/~phlin/kernel/lp-1852077-bonding/ The X-hwe and Disco kernel were tested by the bug reporter, Aleksei, the patched kernel works as expected. == Regression Potential == Low. This patch just unifiy the variable used in link state change commit logic to prevent the occurance of an incorrect state. And the changes are limited to the bonding driver itself. (Although the include/net/bonding.h will be used in other drivers, but the changes to that file is only affecting this bond_main.c driver) == Original Bug Report == There's an issue with bonding driver in the current ubuntu kernels. Sometimes one link stuck in a weird state. It was fixed with patch https://www.spinics.net/lists/netdev/msg609506.html in upstream. Commit 1899bb325149e481de31a4f32b59ea6f24e176ea. We see this bug with linux 4.15 (ubuntu xenial, hwe kernel), but it should be reproducible with other current kernel versions.
2019-11-13 07:32:22 Po-Hsu Lin description == Justification == From the well explained commit message: Since de77ecd4ef02 ("bonding: improve link-status update in mii-monitoring"), the bonding driver has utilized two separate variables to indicate the next link state a particular slave should transition to. Each is used to communicate to a different portion of the link state change commit logic; one to the bond_miimon_commit function itself, and another to the state transition logic. Unfortunately, the two variables can become unsynchronized, resulting in incorrect link state transitions within bonding. This can cause slaves to become stuck in an incorrect link state until a subsequent carrier state transition. The issue occurs when a special case in bond_slave_netdev_event sets slave->link directly to BOND_LINK_FAIL. On the next pass through bond_miimon_inspect after the slave goes carrier up, the BOND_LINK_FAIL case will set the proposed next state (link_new_state) to BOND_LINK_UP, but the new_link to BOND_LINK_DOWN. The setting of the final link state from new_link comes after that from link_new_state, and so the slave will end up incorrectly in _DOWN state. Resolve this by combining the two variables into one. == Fixes == * 1899bb32 (bonding: fix state transition issue in link monitoring) This patch can be cherry-picked into E/F For older releases like B/D, it will needs to be backported as they are missing the slave_err() printk marco added in 5237ff79 (bonding: add slave_foo printk macros) as well as the commit to replace netdev_err() with slave_err() in e2a7420d (bonding/main: convert to using slave printk macros) For Xenial, the commit that causes this issue, de77ecd4, does not exist. == Test == Test kernels can be found here: https://people.canonical.com/~phlin/kernel/lp-1852077-bonding/ The X-hwe and Disco kernel were tested by the bug reporter, Aleksei, the patched kernel works as expected. == Regression Potential == Low. This patch just unifiy the variable used in link state change commit logic to prevent the occurance of an incorrect state. And the changes are limited to the bonding driver itself. (Although the include/net/bonding.h will be used in other drivers, but the changes to that file is only affecting this bond_main.c driver) == Original Bug Report == There's an issue with bonding driver in the current ubuntu kernels. Sometimes one link stuck in a weird state. It was fixed with patch https://www.spinics.net/lists/netdev/msg609506.html in upstream. Commit 1899bb325149e481de31a4f32b59ea6f24e176ea. We see this bug with linux 4.15 (ubuntu xenial, hwe kernel), but it should be reproducible with other current kernel versions. == Justification == From the well explained commit message: Since de77ecd4ef02 ("bonding: improve link-status update in mii-monitoring"), the bonding driver has utilized two separate variables to indicate the next link state a particular slave should transition to. Each is used to communicate to a different portion of the link state change commit logic; one to the bond_miimon_commit function itself, and another to the state transition logic.  Unfortunately, the two variables can become unsynchronized, resulting in incorrect link state transitions within bonding. This can cause slaves to become stuck in an incorrect link state until a subsequent carrier state transition.  The issue occurs when a special case in bond_slave_netdev_event sets slave->link directly to BOND_LINK_FAIL. On the next pass through bond_miimon_inspect after the slave goes carrier up, the BOND_LINK_FAIL case will set the proposed next state (link_new_state) to BOND_LINK_UP, but the new_link to BOND_LINK_DOWN. The setting of the final link state from new_link comes after that from link_new_state, and so the slave will end up incorrectly in _DOWN state.  Resolve this by combining the two variables into one. == Fixes == * 1899bb32 (bonding: fix state transition issue in link monitoring) This patch can be cherry-picked into E/F For older releases like B/D, it will needs to be backported as they are missing the slave_err() printk marco added in 5237ff79 (bonding: add slave_foo printk macros) as well as the commit to replace netdev_err() with slave_err() in e2a7420d (bonding/main: convert to using slave printk macros) For Xenial, the commit that causes this issue, de77ecd4, does not exist. == Test == Test kernels can be found here: https://people.canonical.com/~phlin/kernel/lp-1852077-bonding/ The X-hwe and Disco kernel were tested by the bug reporter, Aleksei, the patched kernel works as expected. == Regression Potential == Low. This patch just unify the variable used in link state change commit logic to prevent the occurrence of an incorrect state. And the changes are limited to the bonding driver itself. (Although the include/net/bonding.h will be used in other drivers, but the changes to that file is only affecting this bond_main.c driver) == Original Bug Report == There's an issue with bonding driver in the current ubuntu kernels. Sometimes one link stuck in a weird state. It was fixed with patch https://www.spinics.net/lists/netdev/msg609506.html in upstream. Commit 1899bb325149e481de31a4f32b59ea6f24e176ea. We see this bug with linux 4.15 (ubuntu xenial, hwe kernel), but it should be reproducible with other current kernel versions.
2019-11-21 19:02:14 Nivedita Singhvi tags bionic disco eoan focal bionic disco eoan focal sts
2019-11-28 10:59:45 Stefan Bader linux (Ubuntu Bionic): importance Undecided Medium
2019-11-28 10:59:47 Stefan Bader linux (Ubuntu Disco): importance Undecided Medium
2019-11-28 10:59:50 Stefan Bader linux (Ubuntu Eoan): importance Undecided Medium
2019-11-28 10:59:54 Stefan Bader linux (Ubuntu Focal): importance Undecided Medium
2019-11-28 11:18:49 Stefan Bader linux (Ubuntu Disco): status In Progress Fix Committed
2019-12-02 11:19:58 Kleber Sacilotto de Souza linux (Ubuntu Eoan): status In Progress Fix Committed
2019-12-02 11:20:02 Kleber Sacilotto de Souza linux (Ubuntu Bionic): status In Progress Fix Committed
2020-01-06 14:28:05 Fabio Augusto Miranda Martins bug added subscriber Fabio Augusto Miranda Martins
2020-01-07 04:31:47 William Grant bug added subscriber William Grant
2020-01-13 10:58:00 Po-Hsu Lin linux (Ubuntu Bionic): status Fix Committed Fix Released
2020-01-13 10:58:03 Po-Hsu Lin linux (Ubuntu Disco): status Fix Committed Fix Released
2020-01-13 10:58:04 Po-Hsu Lin linux (Ubuntu Eoan): status Fix Committed Fix Released
2020-01-13 10:58:06 Po-Hsu Lin linux (Ubuntu Focal): status In Progress Fix Released