Activity log for bug #2035361

Date Who What changed Old value New value Message
2023-09-13 13:19:28 Philip Cox bug added bug
2023-09-13 13:19:37 Philip Cox nominated for series Ubuntu Jammy
2023-09-13 13:19:37 Philip Cox bug task added linux-intel-iotg (Ubuntu Jammy)
2023-09-13 13:19:48 Philip Cox linux-intel-iotg (Ubuntu Jammy): assignee Philip Cox (philcox)
2023-09-13 13:19:54 Philip Cox linux-intel-iotg (Ubuntu Jammy): status New Confirmed
2023-09-13 13:19:57 Philip Cox linux-intel-iotg (Ubuntu): status New Confirmed
2023-09-13 13:20:01 Philip Cox linux-intel-iotg (Ubuntu Jammy): status Confirmed Triaged
2023-09-13 13:20:06 Philip Cox linux-intel-iotg (Ubuntu): status Confirmed Triaged
2023-09-13 13:20:09 Philip Cox linux-intel-iotg (Ubuntu Jammy): status Triaged In Progress
2023-09-13 13:20:14 Philip Cox linux-intel-iotg (Ubuntu): status Triaged In Progress
2023-09-13 13:22:48 Philip Cox description

Old value:

This is a public version of https://bugs.launchpad.net/bugs/2030480

[Summary]
The command `phc2sys` is to sync the system clock and PTP hardware clock on one machine. The output from `phc2sys` command indicate some information including the `delay`. The `delay` should be 0 if the device is using `hardware cross-timestamping`. In realtime kernel, the `delay` is never 0. But it is 0 in generic kernel (which is correct). This issue also happened on EHL and TGL, which means it may be an all-platform issue caused by realtime kernel.

output from realtime kernel
---------------------------
root@u-Alder-Lake-Client-Platform:/home/u# phc2sys -s "$interface" -O 0 -c CLOCK_REALTIME --step_threshold=1 --transportSpecific=1 -w -m --first_step_threshold=0.0
phc2sys[1021.560]: CLOCK_REALTIME phc offset 3251119 s0 freq -85311 delay 3107
phc2sys[1022.560]: CLOCK_REALTIME phc offset 3255144 s2 freq -81287 delay 2936
phc2sys[1023.560]: CLOCK_REALTIME phc offset 3255167 s2 freq +3173880 delay 2907
phc2sys[1024.560]: CLOCK_REALTIME phc offset -10828 s2 freq +884435 delay 2977
phc2sys[1025.561]: CLOCK_REALTIME phc offset -977556 s2 freq -85541 delay 3385
phc2sys[1026.561]: CLOCK_REALTIME phc offset -973361 s2 freq -374613 delay 2921
phc2sys[1027.561]: CLOCK_REALTIME phc offset -680002 s2 freq -373262 delay 2988
phc2sys[1028.561]: CLOCK_REALTIME phc offset -387940 s2 freq -285201 delay 3392
phc2sys[1029.562]: CLOCK_REALTIME phc offset -183974 s2 freq -197617 delay 3361
phc2sys[1030.562]: CLOCK_REALTIME phc offset -67676 s2 freq -136511 delay 2933
phc2sys[1031.562]: CLOCK_REALTIME phc offset -12389 s2 freq -101527 delay 2976
phc2sys[1032.562]: CLOCK_REALTIME phc offset 7914 s2 freq -84940 delay 2973
phc2sys[1033.563]: CLOCK_REALTIME phc offset 11594 s2 freq -78886 delay 2975
phc2sys[1034.563]: CLOCK_REALTIME phc offset 9221 s2 freq -77781 delay 2931
phc2sys[1035.563]: CLOCK_REALTIME phc offset 5824 s2 freq -78412 delay 2996
phc2sys[1036.563]: CLOCK_REALTIME phc offset 2980 s2 freq -79508 delay 3184
phc2sys[1037.564]: CLOCK_REALTIME phc offset 1216 s2 freq -80378 delay 2926
phc2sys[1038.564]: CLOCK_REALTIME phc offset 400 s2 freq -80830 delay 3126
phc2sys[1039.564]: CLOCK_REALTIME phc offset -29 s2 freq -81139 delay 2985
phc2sys[1040.565]: CLOCK_REALTIME phc offset -133 s2 freq -81251 delay 2991
phc2sys[1041.565]: CLOCK_REALTIME phc offset -135 s2 freq -81293 delay 2941
phc2sys[1042.565]: CLOCK_REALTIME phc offset -40 s2 freq -81239 delay 2952
phc2sys[1043.565]: CLOCK_REALTIME phc offset -43 s2 freq -81254 delay 2975
phc2sys[1044.566]: CLOCK_REALTIME phc offset 43 s2 freq -81181 delay 3306
phc2sys[1045.566]: CLOCK_REALTIME phc offset -105 s2 freq -81316 delay 2970
phc2sys[1046.566]: CLOCK_REALTIME phc offset -4 s2 freq -81246 delay 2944
phc2sys[1047.567]: CLOCK_REALTIME phc offset -14 s2 freq -81257 delay 3098
phc2sys[1048.567]: CLOCK_REALTIME phc offset 28 s2 freq -81220 delay 2944
phc2sys[1049.567]: CLOCK_REALTIME phc offset 64 s2 freq -81175 delay 3181
phc2sys[1050.568]: CLOCK_REALTIME phc offset -27 s2 freq -81247 delay 3027
phc2sys[1051.568]: CLOCK_REALTIME phc offset -19 s2 freq -81247 delay 2930

output from generic kernel (5.15.0-1036-intel-iotg)
---------------------------------------------------
root@u-Alder-Lake-Client-Platform:/home/u# phc2sys -s "$interface" -O 0 -c CLOCK_REALTIME --step_threshold=1 --transportSpecific=1 -w -m --first_step_threshold=0.0
phc2sys[378.423]: CLOCK_REALTIME phc offset 266 s0 freq -96482 delay 0
phc2sys[379.423]: CLOCK_REALTIME phc offset 403 s2 freq -96345 delay 0
phc2sys[380.423]: CLOCK_REALTIME phc offset 380 s2 freq -95965 delay 0
phc2sys[381.424]: CLOCK_REALTIME phc offset -3 s2 freq -96234 delay 0
phc2sys[382.424]: CLOCK_REALTIME phc offset -108 s2 freq -96340 delay 0
phc2sys[383.424]: CLOCK_REALTIME phc offset -131 s2 freq -96395 delay 0
phc2sys[384.424]: CLOCK_REALTIME phc offset -73 s2 freq -96377 delay 0
phc2sys[385.424]: CLOCK_REALTIME phc offset -46 s2 freq -96372 delay 0
phc2sys[386.425]: CLOCK_REALTIME phc offset -27 s2 freq -96366 delay 0
phc2sys[387.425]: CLOCK_REALTIME phc offset 1 s2 freq -96346 delay 0
phc2sys[388.425]: CLOCK_REALTIME phc offset -10 s2 freq -96357 delay 0
phc2sys[389.425]: CLOCK_REALTIME phc offset 7 s2 freq -96343 delay 0
phc2sys[390.425]: CLOCK_REALTIME phc offset -3 s2 freq -96351 delay 0
phc2sys[391.425]: CLOCK_REALTIME phc offset 12 s2 freq -96337 delay 0
phc2sys[392.426]: CLOCK_REALTIME phc offset 6 s2 freq -96339 delay 0
phc2sys[393.426]: CLOCK_REALTIME phc offset -9 s2 freq -96353 delay 0
phc2sys[394.426]: CLOCK_REALTIME phc offset -22 s2 freq -96368 delay 0
phc2sys[395.426]: CLOCK_REALTIME phc offset 11 s2 freq -96342 delay 0
phc2sys[396.426]: CLOCK_REALTIME phc offset 1 s2 freq -96349 delay 0

[Steps to reproduce]
The below test steps is based on `616446_Linux_Ethernet_TSN_GSG_2.5` on Intel RDC (you can find it here[1]), this test is on page 28. And we do the TSN tests on `i225` ethernet controller.
1. Locate the ethernet interface is being used by `i225` via command `lshw -businfo -class network`
2. phc2sys -s "$interface" -O 0 -c CLOCK_REALTIME --step_threshold=1 --transportSpecific=1 -m --first_step_threshold=0.0

[Documents]
[1] https://drive.google.com/file/d/1HsksCfLoUTz_baZU98omYuGBq-LDZKGe/view?usp=sharing

[Expected result]
`delay` equal to 0.

[Actual result]
`delay` is around 3000.

[Failure rate]
10/10

[Additional information]
CID: 202305-31591
SKU: ADL-N
system-manufacturer: Intel Corporation
system-product-name: Alder Lake Client Platform
bios-version: RPLISFI1.R00.4081.A05.2305241419
CPU: 12th Gen Intel(R) Core(TM) i9-12900E (16x)
GPU: 00:02.0 VGA compatible controller [0300]: Intel Corporation Device [8086:4680] (rev 0c)
kernel-version: 5.15.0-1035-intel-iot-realtime

[Stage]
Issue reported and logs collected right after it happened

See original description
-------------------------------------------------------------
After investigation, it was determined that the root cause was from patch that needs to be reverted.

The patch needing to be reverted is in [jammy:linux-intel-iot-realtime]

igc: Disable PTM sequences when interface goes down

BugLink: https://bugs.launchpad.net/bugs/2019222

Kernel hangs or reboots reported in some boards with a combination of interface up/down or reset. It turns out that this occurs due to Foxville bus master disabling when PTM sequences remain enabled. We do not need to always enable PTM in the reset sequence as igc_ptp_reset is also called during interface down. This caused PTM sequences be enabled but Foxville tries to disable bus mastering before going through controller reset. This patch disables PCIe PTM when interface goes down.

Signed-off-by: Aravindhan Gunasekaran <aravindhan.gunasekaran@intel.com>
(back-ported from https://github.com/intel/linux-intel-quilt/tree/mainline-tracking-v5.19-linux-221019T120731Z/patches/0001-igc-Disable-PTM-sequences-when-interface-goes-down.tsn [context changes])
Signed-off-by: Philip Cox <philip.cox@canonical.com>
Acked-by: Jian Hui Lee <jianhui.lee@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>

New value:

This is a public version of https://bugs.launchpad.net/bugs/2030480

[Summary]
The command `phc2sys` is to sync the system clock and PTP hardware clock on one machine. The output from `phc2sys` command indicate some information including the `delay`. The `delay` should be 0 if the device is using `hardware cross-timestamping`. In realtime kernel, the `delay` is never 0. But it is 0 in generic kernel (which is correct). This issue also happened on EHL and TGL, which means it may be an all-platform issue caused by realtime kernel.

output from realtime kernel
---------------------------
root@u-Alder-Lake-Client-Platform:/home/u# phc2sys -s "$interface" -O 0 -c CLOCK_REALTIME --step_threshold=1 --transportSpecific=1 -w -m --first_step_threshold=0.0
phc2sys[1021.560]: CLOCK_REALTIME phc offset 3251119 s0 freq -85311 delay 3107
phc2sys[1022.560]: CLOCK_REALTIME phc offset 3255144 s2 freq -81287 delay 2936
phc2sys[1023.560]: CLOCK_REALTIME phc offset 3255167 s2 freq +3173880 delay 2907
phc2sys[1024.560]: CLOCK_REALTIME phc offset -10828 s2 freq +884435 delay 2977
phc2sys[1025.561]: CLOCK_REALTIME phc offset -977556 s2 freq -85541 delay 3385
phc2sys[1026.561]: CLOCK_REALTIME phc offset -973361 s2 freq -374613 delay 2921
phc2sys[1027.561]: CLOCK_REALTIME phc offset -680002 s2 freq -373262 delay 2988
phc2sys[1028.561]: CLOCK_REALTIME phc offset -387940 s2 freq -285201 delay 3392
phc2sys[1029.562]: CLOCK_REALTIME phc offset -183974 s2 freq -197617 delay 3361
phc2sys[1030.562]: CLOCK_REALTIME phc offset -67676 s2 freq -136511 delay 2933
phc2sys[1031.562]: CLOCK_REALTIME phc offset -12389 s2 freq -101527 delay 2976
phc2sys[1032.562]: CLOCK_REALTIME phc offset 7914 s2 freq -84940 delay 2973
phc2sys[1033.563]: CLOCK_REALTIME phc offset 11594 s2 freq -78886 delay 2975
phc2sys[1034.563]: CLOCK_REALTIME phc offset 9221 s2 freq -77781 delay 2931
phc2sys[1035.563]: CLOCK_REALTIME phc offset 5824 s2 freq -78412 delay 2996
phc2sys[1036.563]: CLOCK_REALTIME phc offset 2980 s2 freq -79508 delay 3184
phc2sys[1037.564]: CLOCK_REALTIME phc offset 1216 s2 freq -80378 delay 2926
phc2sys[1038.564]: CLOCK_REALTIME phc offset 400 s2 freq -80830 delay 3126
phc2sys[1039.564]: CLOCK_REALTIME phc offset -29 s2 freq -81139 delay 2985
phc2sys[1040.565]: CLOCK_REALTIME phc offset -133 s2 freq -81251 delay 2991
phc2sys[1041.565]: CLOCK_REALTIME phc offset -135 s2 freq -81293 delay 2941
phc2sys[1042.565]: CLOCK_REALTIME phc offset -40 s2 freq -81239 delay 2952
phc2sys[1043.565]: CLOCK_REALTIME phc offset -43 s2 freq -81254 delay 2975
phc2sys[1044.566]: CLOCK_REALTIME phc offset 43 s2 freq -81181 delay 3306
phc2sys[1045.566]: CLOCK_REALTIME phc offset -105 s2 freq -81316 delay 2970
phc2sys[1046.566]: CLOCK_REALTIME phc offset -4 s2 freq -81246 delay 2944
phc2sys[1047.567]: CLOCK_REALTIME phc offset -14 s2 freq -81257 delay 3098
phc2sys[1048.567]: CLOCK_REALTIME phc offset 28 s2 freq -81220 delay 2944
phc2sys[1049.567]: CLOCK_REALTIME phc offset 64 s2 freq -81175 delay 3181
phc2sys[1050.568]: CLOCK_REALTIME phc offset -27 s2 freq -81247 delay 3027
phc2sys[1051.568]: CLOCK_REALTIME phc offset -19 s2 freq -81247 delay 2930

output from generic kernel (5.15.0-1036-intel-iotg)
---------------------------------------------------
root@u-Alder-Lake-Client-Platform:/home/u# phc2sys -s "$interface" -O 0 -c CLOCK_REALTIME --step_threshold=1 --transportSpecific=1 -w -m --first_step_threshold=0.0
phc2sys[378.423]: CLOCK_REALTIME phc offset 266 s0 freq -96482 delay 0
phc2sys[379.423]: CLOCK_REALTIME phc offset 403 s2 freq -96345 delay 0
phc2sys[380.423]: CLOCK_REALTIME phc offset 380 s2 freq -95965 delay 0
phc2sys[381.424]: CLOCK_REALTIME phc offset -3 s2 freq -96234 delay 0
phc2sys[382.424]: CLOCK_REALTIME phc offset -108 s2 freq -96340 delay 0
phc2sys[383.424]: CLOCK_REALTIME phc offset -131 s2 freq -96395 delay 0
phc2sys[384.424]: CLOCK_REALTIME phc offset -73 s2 freq -96377 delay 0
phc2sys[385.424]: CLOCK_REALTIME phc offset -46 s2 freq -96372 delay 0
phc2sys[386.425]: CLOCK_REALTIME phc offset -27 s2 freq -96366 delay 0
phc2sys[387.425]: CLOCK_REALTIME phc offset 1 s2 freq -96346 delay 0
phc2sys[388.425]: CLOCK_REALTIME phc offset -10 s2 freq -96357 delay 0
phc2sys[389.425]: CLOCK_REALTIME phc offset 7 s2 freq -96343 delay 0
phc2sys[390.425]: CLOCK_REALTIME phc offset -3 s2 freq -96351 delay 0
phc2sys[391.425]: CLOCK_REALTIME phc offset 12 s2 freq -96337 delay 0
phc2sys[392.426]: CLOCK_REALTIME phc offset 6 s2 freq -96339 delay 0
phc2sys[393.426]: CLOCK_REALTIME phc offset -9 s2 freq -96353 delay 0
phc2sys[394.426]: CLOCK_REALTIME phc offset -22 s2 freq -96368 delay 0
phc2sys[395.426]: CLOCK_REALTIME phc offset 11 s2 freq -96342 delay 0
phc2sys[396.426]: CLOCK_REALTIME phc offset 1 s2 freq -96349 delay 0

[Steps to reproduce]
The below test steps is based on `616446_Linux_Ethernet_TSN_GSG_2.5` on Intel RDC (you can find it here[1]), this test is on page 28. And we do the TSN tests on `i225` ethernet controller.
1. Locate the ethernet interface is being used by `i225` via command `lshw -businfo -class network`
2. phc2sys -s "$interface" -O 0 -c CLOCK_REALTIME --step_threshold=1 --transportSpecific=1 -m --first_step_threshold=0.0

[Expected result]
`delay` equal to 0.

[Actual result]
`delay` is around 3000.

[Failure rate]
10/10

[Additional information]
CID: 202305-31591
SKU: ADL-N
system-manufacturer: Intel Corporation
system-product-name: Alder Lake Client Platform
bios-version: RPLISFI1.R00.4081.A05.2305241419
CPU: 12th Gen Intel(R) Core(TM) i9-12900E (16x)
GPU: 00:02.0 VGA compatible controller [0300]: Intel Corporation Device [8086:4680] (rev 0c)
kernel-version: 5.15.0-1035-intel-iot-realtime

[Stage]
Issue reported and logs collected right after it happened

See original description
-------------------------------------------------------------
After investigation, it was determined that the root cause was from patch that needs to be reverted.

The patch needing to be reverted is in [jammy:linux-intel-iot-realtime]

igc: Disable PTM sequences when interface goes down

BugLink: https://bugs.launchpad.net/bugs/2019222

Kernel hangs or reboots reported in some boards with a combination of interface up/down or reset. It turns out that this occurs due to Foxville bus master disabling when PTM sequences remain enabled. We do not need to always enable PTM in the reset sequence as igc_ptp_reset is also called during interface down. This caused PTM sequences be enabled but Foxville tries to disable bus mastering before going through controller reset. This patch disables PCIe PTM when interface goes down.

Signed-off-by: Aravindhan Gunasekaran <aravindhan.gunasekaran@intel.com>
(back-ported from https://github.com/intel/linux-intel-quilt/tree/mainline-tracking-v5.19-linux-221019T120731Z/patches/0001-igc-Disable-PTM-sequences-when-interface-goes-down.tsn [context changes])
Signed-off-by: Philip Cox <philip.cox@canonical.com>
Acked-by: Jian Hui Lee <jianhui.lee@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
2023-09-18 19:54:39 Philip Cox linux-intel-iotg (Ubuntu Jammy): status In Progress Fix Committed
2023-09-18 19:54:42 Philip Cox linux-intel-iotg (Ubuntu): status In Progress Fix Committed
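
The description entry in this log judges the two kernels by the `delay` field of the `phc2sys` output: per the report, it should be 0 when hardware cross-timestamping is in use. As a rough illustration, that check could be scripted along the following lines. This is only a sketch: the helper names `parse_delays` and `all_delays_zero` are hypothetical, the regular expression assumes the line layout seen in the quoted output, and the sample strings are copied from the report rather than captured fresh.

```python
import re

# Assumed line layout, taken from the output quoted in the bug description:
#   phc2sys[1021.560]: CLOCK_REALTIME phc offset 3251119 s0 freq -85311 delay 3107
LINE_RE = re.compile(
    r"phc2sys\[\d+\.\d+\]:\s+\S+\s+phc\s+offset\s+-?\d+\s+"
    r"s\d\s+freq\s+[+-]?\d+\s+delay\s+(?P<delay>-?\d+)"
)


def parse_delays(text):
    """Return every `delay` value found in phc2sys output, in order."""
    return [int(m.group("delay")) for m in LINE_RE.finditer(text)]


def all_delays_zero(text):
    """True only if at least one sample was parsed and every delay is 0."""
    delays = parse_delays(text)
    return bool(delays) and all(d == 0 for d in delays)


# Two lines from each run quoted in the report.
realtime_sample = """\
phc2sys[1021.560]: CLOCK_REALTIME phc offset 3251119 s0 freq -85311 delay 3107
phc2sys[1022.560]: CLOCK_REALTIME phc offset 3255144 s2 freq -81287 delay 2936
"""
generic_sample = """\
phc2sys[378.423]: CLOCK_REALTIME phc offset 266 s0 freq -96482 delay 0
phc2sys[379.423]: CLOCK_REALTIME phc offset 403 s2 freq -96345 delay 0
"""

if __name__ == "__main__":
    print("realtime delays:", parse_delays(realtime_sample))
    print("realtime all zero:", all_delays_zero(realtime_sample))
    print("generic all zero:", all_delays_zero(generic_sample))
```

Because the pattern is applied with `finditer`, it also works on output that has been flattened onto a single line. An empty or unparseable capture is deliberately reported as a failure rather than a pass.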