When a preempt-rt system is stressed by running pods on many isolated
cores (using stress-ng, for instance [1]), the telegraf pod hangs after
a long period of time and becomes unresponsive.
A crash dump analysis indicates that telegraf goroutines stall
waiting for a kernel response to MSR read requests:
PID: 55638 TASK: ff466bf3a7ffde00 CPU: 0 COMMAND: "telegraf"
#0 [ff85787a26767c70] __schedule at ffffffffa30ae0c6
#1 [ff85787a26767d00] schedule at ffffffffa30ae7f7
#2 [ff85787a26767d18] schedule_timeout at ffffffffa30b15a4
#3 [ff85787a26767d70] wait_for_completion at ffffffffa30afbc4
#4 [ff85787a26767db8] rdmsr_safe_on_cpu at ffffffffa2afbda8
#5 [ff85787a26767e78] msr_read at ffffffffa2640e55
#6 [ff85787a26767ec8] vfs_read at ffffffffa2903208
#7 [ff85787a26767f00] __x64_sys_pread64 at ffffffffa2904ea1
#8 [ff85787a26767f40] do_syscall_64 at ffffffffa30a6b60
#9 [ff85787a26767f50] entry_SYSCALL_64_after_hwframe at ffffffffa3200099
This change adds a timeout (default value of 100 ms) to MSR reads,
which prevents telegraf from becoming unresponsive.
TEST PLAN (preempt-rt ISO):
PASS: Build custom telegraf image with this change
PASS: Override and apply power-metrics app with custom telegraf image
PASS: Launch stress pods and confirm the telegraf pod is still stable
after a long period of time.
Reviewed: https://review.opendev.org/c/starlingx/app-power-metrics/+/897839
Committed: https://opendev.org/starlingx/app-power-metrics/commit/1078ecbb7b6f8d7318e3e7c740d36dfdc5182985
Submitter: "Zuul (22348)"
Branch: master
commit 1078ecbb7b6f8d7318e3e7c740d36dfdc5182985
Author: Alyson Deives Pereira <email address hidden>
Date: Tue Oct 10 11:07:27 2023 -0300
telegraf: Add MSR read timeout
When a preempt-rt system is stressed by running pods on many isolated
cores (using stress-ng, for instance [1]), the telegraf pod hangs after
a long period of time and becomes unresponsive.
A crash dump analysis indicates that telegraf goroutines stall
waiting for a kernel response to MSR read requests:
PID: 55638 TASK: ff466bf3a7ffde00 CPU: 0 COMMAND: "telegraf"
#0 [ff85787a26767c70] __schedule at ffffffffa30ae0c6
#1 [ff85787a26767d00] schedule at ffffffffa30ae7f7
#2 [ff85787a26767d18] schedule_timeout at ffffffffa30b15a4
#3 [ff85787a26767d70] wait_for_completion at ffffffffa30afbc4
#4 [ff85787a26767db8] rdmsr_safe_on_cpu at ffffffffa2afbda8
#5 [ff85787a26767e78] msr_read at ffffffffa2640e55
#6 [ff85787a26767ec8] vfs_read at ffffffffa2903208
#7 [ff85787a26767f00] __x64_sys_pread64 at ffffffffa2904ea1
#8 [ff85787a26767f40] do_syscall_64 at ffffffffa30a6b60
#9 [ff85787a26767f50] entry_SYSCALL_64_after_hwframe at ffffffffa3200099
This change adds a timeout (default value of 100 ms) to MSR reads,
which prevents telegraf from becoming unresponsive.
[1] https://github.com/ColinIanKing/stress-ng
NOTE: This issue was reported upstream:
https://github.com/influxdata/telegraf/issues/14088
Closes-Bug: 2038927
TEST PLAN (preempt-rt ISO):
PASS: Build custom telegraf image with this change
PASS: Override and apply power-metrics app with custom telegraf image
PASS: Launch stress pods and confirm the telegraf pod is still stable
after a long period of time.
Change-Id: I145da09f5a967e219d0aa2e588d4323e8a2eb1e0
Signed-off-by: Alyson Deives Pereira <email address hidden>