Activity log for bug #2042568

Date Who What changed Old value New value Message
2023-11-02 14:45:19 Ioanna Alifieraki bug added bug
2023-11-02 14:45:28 Ioanna Alifieraki linux (Ubuntu): status New Confirmed
2023-11-02 14:45:31 Ioanna Alifieraki linux (Ubuntu): importance Undecided Medium
2023-11-02 14:45:46 Ioanna Alifieraki nominated for series Ubuntu Lunar
2023-11-02 14:45:46 Ioanna Alifieraki bug task added linux (Ubuntu Lunar)
2023-11-02 14:45:46 Ioanna Alifieraki nominated for series Ubuntu Jammy
2023-11-02 14:45:46 Ioanna Alifieraki bug task added linux (Ubuntu Jammy)
2023-11-02 14:45:54 Ioanna Alifieraki linux (Ubuntu Jammy): assignee Ioanna Alifieraki (joalif)
2023-11-02 14:46:00 Ioanna Alifieraki linux (Ubuntu Lunar): assignee Ioanna Alifieraki (joalif)
2023-11-02 14:46:02 Ioanna Alifieraki linux (Ubuntu Jammy): importance Undecided Medium
2023-11-02 14:46:05 Ioanna Alifieraki linux (Ubuntu Lunar): importance Undecided Medium
2023-11-02 14:48:35 Ioanna Alifieraki affects linux (Ubuntu) linux-azure (Ubuntu)
2023-11-02 14:48:47 Ioanna Alifieraki linux-azure (Ubuntu): status Confirmed New
2023-11-02 15:05:06 Ioanna Alifieraki description
Old value:
[Description]
[Test Case]
[Where things could go wrong]
[Other]
New value:
[Description] On an Azure VM with a Tesla GPU, it was noticed that removing the GPU from the PCI bus crashes the VM. If the NVIDIA drivers are loaded, the machine does not crash; instead, the removal process hangs and the machine crashes on reboot. This is related to bug [1]. The bug reported in [1] concerns another driver, but the root cause is the same. It is still being investigated whether this is a bug in PCI or a bug in how various drivers use PCI. In this case, we have identified that removing commit [2] prevents the kernel crashes. Azure has requested that this commit be reverted, at least for the time being. This commit is not upstream, so it only needs to be reverted in the Ubuntu kernels.
[Test Case] On an Azure VM with a GPU:
# echo '1' > /sys/bus/pci/devices/0001:00:00.0/remove
where '0001:00:00.0' is the PCI address of the GPU. The VM will crash.
[Where things could go wrong]
[Other]
[1] https://bugzilla.kernel.org/show_bug.cgi?id=215515
[2] https://git.launchpad.net/~canonical-kernel/ubuntu/+source/linux-azure/+git/jammy/commit/?h=Ubuntu-azure-5.15.0-1043.50&id=75af0c10b3703400890d314d1d91d25294234a81
2023-11-02 15:26:45 Ioanna Alifieraki description
Old value:
[Description] On an Azure VM with a Tesla GPU, it was noticed that removing the GPU from the PCI bus crashes the VM. If the NVIDIA drivers are loaded, the machine does not crash; instead, the removal process hangs and the machine crashes on reboot. This is related to bug [1]. The bug reported in [1] concerns another driver, but the root cause is the same. It is still being investigated whether this is a bug in PCI or a bug in how various drivers use PCI. In this case, we have identified that removing commit [2] prevents the kernel crashes. Azure has requested that this commit be reverted, at least for the time being. This commit is not upstream, so it only needs to be reverted in the Ubuntu kernels.
[Test Case] On an Azure VM with a GPU:
# echo '1' > /sys/bus/pci/devices/0001:00:00.0/remove
where '0001:00:00.0' is the PCI address of the GPU. The VM will crash.
[Where things could go wrong]
[Other]
[1] https://bugzilla.kernel.org/show_bug.cgi?id=215515
[2] https://git.launchpad.net/~canonical-kernel/ubuntu/+source/linux-azure/+git/jammy/commit/?h=Ubuntu-azure-5.15.0-1043.50&id=75af0c10b3703400890d314d1d91d25294234a81
New value:
[Description] On an Azure VM with a Tesla GPU, it was noticed that removing the GPU from the PCI bus crashes the VM. If the NVIDIA drivers are loaded, the machine does not crash; instead, the removal process hangs and the machine crashes on reboot. This is related to bug [1]. The bug reported in [1] concerns another driver, but the root cause is the same. It is still being investigated whether this is a bug in PCI or a bug in how various drivers use PCI. In this case, we have identified that removing commit [2] prevents the kernel crashes. Azure has requested that this commit be reverted, at least for the time being. This commit is not upstream, so it only needs to be reverted in the Ubuntu kernels.
[Test Case] On an Azure VM with a GPU:
# echo '1' > /sys/bus/pci/devices/0001:00:00.0/remove
where '0001:00:00.0' is the PCI address of the GPU. The VM will crash.
[Where things could go wrong] The commit to be reverted was included in a patchset to address LP bugs https://bugs.launchpad.net/bugs/2023071 and https://bugs.launchpad.net/bugs/2023594. However, this commit only reduces boot time, and removing it should not introduce any regressions. The side effect will be an increase in boot time.
[Other] Only Ubuntu Azure kernels are affected:
- Jammy 5.15
- Lunar 6.2
Focal is also affected since it uses the 5.15 kernel. This commit does not appear in the Mantic 6.5 kernel.
[1] https://bugzilla.kernel.org/show_bug.cgi?id=215515
[2] https://git.launchpad.net/~canonical-kernel/ubuntu/+source/linux-azure/+git/jammy/commit/?h=Ubuntu-azure-5.15.0-1043.50&id=75af0c10b3703400890d314d1d91d25294234a81
2023-11-02 16:40:00 Ioanna Alifieraki linux-azure (Ubuntu Lunar): status New Invalid
2023-11-02 16:40:51 Ioanna Alifieraki description
Old value:
[Description] On an Azure VM with a Tesla GPU, it was noticed that removing the GPU from the PCI bus crashes the VM. If the NVIDIA drivers are loaded, the machine does not crash; instead, the removal process hangs and the machine crashes on reboot. This is related to bug [1]. The bug reported in [1] concerns another driver, but the root cause is the same. It is still being investigated whether this is a bug in PCI or a bug in how various drivers use PCI. In this case, we have identified that removing commit [2] prevents the kernel crashes. Azure has requested that this commit be reverted, at least for the time being. This commit is not upstream, so it only needs to be reverted in the Ubuntu kernels.
[Test Case] On an Azure VM with a GPU:
# echo '1' > /sys/bus/pci/devices/0001:00:00.0/remove
where '0001:00:00.0' is the PCI address of the GPU. The VM will crash.
[Where things could go wrong] The commit to be reverted was included in a patchset to address LP bugs https://bugs.launchpad.net/bugs/2023071 and https://bugs.launchpad.net/bugs/2023594. However, this commit only reduces boot time, and removing it should not introduce any regressions. The side effect will be an increase in boot time.
[Other] Only Ubuntu Azure kernels are affected:
- Jammy 5.15
- Lunar 6.2
Focal is also affected since it uses the 5.15 kernel. This commit does not appear in the Mantic 6.5 kernel.
[1] https://bugzilla.kernel.org/show_bug.cgi?id=215515
[2] https://git.launchpad.net/~canonical-kernel/ubuntu/+source/linux-azure/+git/jammy/commit/?h=Ubuntu-azure-5.15.0-1043.50&id=75af0c10b3703400890d314d1d91d25294234a81
New value:
[Description] On an Azure VM with a Tesla GPU, it was noticed that removing the GPU from the PCI bus crashes the VM. If the NVIDIA drivers are loaded, the machine does not crash; instead, the removal process hangs and the machine crashes on reboot. This is related to bug [1]. The bug reported in [1] concerns another driver, but the root cause is the same. It is still being investigated whether this is a bug in PCI or a bug in how various drivers use PCI. In this case, we have identified that removing commit [2] prevents the kernel crashes. Azure has requested that this commit be reverted, at least for the time being. This commit is not upstream, so it only needs to be reverted in the Ubuntu kernels.
[Test Case] On an Azure VM with a GPU:
# echo '1' > /sys/bus/pci/devices/0001:00:00.0/remove
where '0001:00:00.0' is the PCI address of the GPU. The VM will crash.
[Where things could go wrong] The commit to be reverted was included in a patchset to address LP bugs https://bugs.launchpad.net/bugs/2023071 and https://bugs.launchpad.net/bugs/2023594. However, this commit only reduces boot time, and removing it should not introduce any regressions. The side effect will be an increase in boot time.
[Other] Only Ubuntu Azure kernels are affected:
- Jammy 5.15
Focal is also affected since it uses the 5.15 kernel. This commit does not appear in the Mantic 6.5 kernel.
[1] https://bugzilla.kernel.org/show_bug.cgi?id=215515
[2] https://git.launchpad.net/~canonical-kernel/ubuntu/+source/linux-azure/+git/jammy/commit/?h=Ubuntu-azure-5.15.0-1043.50&id=75af0c10b3703400890d314d1d91d25294234a81
2023-11-17 12:14:35 Ioanna Alifieraki linux-azure (Ubuntu Lunar): status Invalid Confirmed
2023-11-17 12:14:39 Ioanna Alifieraki linux-azure (Ubuntu Jammy): status New Confirmed
2023-11-17 12:14:59 Ioanna Alifieraki description
Old value:
[Description] On an Azure VM with a Tesla GPU, it was noticed that removing the GPU from the PCI bus crashes the VM. If the NVIDIA drivers are loaded, the machine does not crash; instead, the removal process hangs and the machine crashes on reboot. This is related to bug [1]. The bug reported in [1] concerns another driver, but the root cause is the same. It is still being investigated whether this is a bug in PCI or a bug in how various drivers use PCI. In this case, we have identified that removing commit [2] prevents the kernel crashes. Azure has requested that this commit be reverted, at least for the time being. This commit is not upstream, so it only needs to be reverted in the Ubuntu kernels.
[Test Case] On an Azure VM with a GPU:
# echo '1' > /sys/bus/pci/devices/0001:00:00.0/remove
where '0001:00:00.0' is the PCI address of the GPU. The VM will crash.
[Where things could go wrong] The commit to be reverted was included in a patchset to address LP bugs https://bugs.launchpad.net/bugs/2023071 and https://bugs.launchpad.net/bugs/2023594. However, this commit only reduces boot time, and removing it should not introduce any regressions. The side effect will be an increase in boot time.
[Other] Only Ubuntu Azure kernels are affected:
- Jammy 5.15
Focal is also affected since it uses the 5.15 kernel. This commit does not appear in the Mantic 6.5 kernel.
[1] https://bugzilla.kernel.org/show_bug.cgi?id=215515
[2] https://git.launchpad.net/~canonical-kernel/ubuntu/+source/linux-azure/+git/jammy/commit/?h=Ubuntu-azure-5.15.0-1043.50&id=75af0c10b3703400890d314d1d91d25294234a81
New value:
[Description] On an Azure VM with a Tesla GPU, it was noticed that removing the GPU from the PCI bus crashes the VM. If the NVIDIA drivers are loaded, the machine does not crash; instead, the removal process hangs and the machine crashes on reboot. This is related to bug [1]. The bug reported in [1] concerns another driver, but the root cause is the same. It is still being investigated whether this is a bug in PCI or a bug in how various drivers use PCI. In this case, we have identified that removing commit [2] prevents the kernel crashes. Azure has requested that this commit be reverted, at least for the time being. This commit is not upstream, so it only needs to be reverted in the Ubuntu kernels.
[Test Case] On an Azure VM with a GPU:
# echo '1' > /sys/bus/pci/devices/0001:00:00.0/remove
where '0001:00:00.0' is the PCI address of the GPU. The VM will crash.
[Where things could go wrong] The commit to be reverted was included in a patchset to address LP bugs https://bugs.launchpad.net/bugs/2023071 and https://bugs.launchpad.net/bugs/2023594. However, this commit only reduces boot time, and removing it should not introduce any regressions. The side effect will be an increase in boot time.
[Other] Only Ubuntu Azure kernels are affected:
- Jammy 5.15
- Lunar 6.2
Focal is also affected since it uses the 5.15 kernel. This commit does not appear in the Mantic 6.5 kernel.
[1] https://bugzilla.kernel.org/show_bug.cgi?id=215515
[2] https://git.launchpad.net/~canonical-kernel/ubuntu/+source/linux-azure/+git/jammy/commit/?h=Ubuntu-azure-5.15.0-1043.50&id=75af0c10b3703400890d314d1d91d25294234a81
2023-12-07 19:37:20 Tim Gardner linux-azure (Ubuntu Jammy): status Confirmed Fix Committed
2023-12-07 19:37:24 Tim Gardner linux-azure (Ubuntu Lunar): status Confirmed Fix Committed
2023-12-07 19:37:36 Tim Gardner linux-azure (Ubuntu): status New Invalid
2024-01-18 02:39:12 Ubuntu Kernel Bot tags kernel-spammed-jammy-linux-azure-v2 verification-needed-jammy-linux-azure
2024-01-24 15:33:30 Ioanna Alifieraki tags kernel-spammed-jammy-linux-azure-v2 verification-needed-jammy-linux-azure kernel-spammed-jammy-linux-azure-v2 verification-done-jammy-linux-azure
2024-02-15 18:01:26 Launchpad Janitor linux-azure (Ubuntu Jammy): status Fix Committed Fix Released
2024-02-15 18:01:26 Launchpad Janitor cve linked 2023-32252
2024-02-15 18:01:26 Launchpad Janitor cve linked 2023-46813
2024-02-15 18:01:26 Launchpad Janitor cve linked 2023-6039
2024-02-15 18:01:26 Launchpad Janitor cve linked 2023-6040
2024-02-15 18:01:26 Launchpad Janitor cve linked 2023-6111
2024-02-15 18:01:26 Launchpad Janitor cve linked 2023-6176
2024-02-15 18:01:26 Launchpad Janitor cve linked 2023-6606
2024-02-15 18:01:26 Launchpad Janitor cve linked 2023-6622
2024-02-15 18:01:26 Launchpad Janitor cve linked 2023-6817
2024-02-15 18:01:26 Launchpad Janitor cve linked 2023-6931
2024-02-15 18:01:26 Launchpad Janitor cve linked 2023-6932
2024-02-15 18:01:26 Launchpad Janitor cve linked 2024-0193
2024-02-29 00:42:06 Ubuntu Kernel Bot tags kernel-spammed-jammy-linux-azure-v2 verification-done-jammy-linux-azure kernel-spammed-jammy-linux-azure-fips-v2 kernel-spammed-jammy-linux-azure-v2 verification-done-jammy-linux-azure verification-needed-jammy-linux-azure-fips
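
The [Test Case] step recorded in the description entries of this log writes '1' to a device's sysfs remove node. A minimal shell sketch of that step follows. It is an illustration under stated assumptions, not part of the bug report: the helper name pci_remove_path, the PCI_ADDR environment variable, and the dry-run behaviour are all hypothetical, and the only address taken from the log is the example '0001:00:00.0'.

```shell
#!/bin/sh
# Hypothetical helper (not from the bug report): build the sysfs "remove"
# path for a PCI address written as domain:bus:device.function.
pci_remove_path() {
    addr=$1
    # Check the shape only: at least four hex digits for the domain, two for
    # the bus, two for the device, and one digit 0-7 for the function.
    if ! printf '%s\n' "$addr" |
        grep -Eq '^[0-9a-fA-F]{4,8}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$'; then
        echo "invalid PCI address: $addr" >&2
        return 1
    fi
    printf '/sys/bus/pci/devices/%s/remove\n' "$addr"
}

# Dry run: print the command instead of executing it, because on an affected
# kernel the write is reported to crash the VM. PCI_ADDR is an assumed
# variable name; the default is the example address from the test case.
path=$(pci_remove_path "${PCI_ADDR:-0001:00:00.0}") || exit 1
echo "would run: echo 1 > $path"
```

The sketch deliberately stops short of performing the write. On a disposable test VM, the printed command can be run as root to reproduce the test case; `lspci -D` is one way to list PCI addresses with their domain prefix.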