eventstat output suddenly accounts kernel threads as userspace processes after a prolonged amount of time
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
eventstat (Ubuntu) | Fix Released | High | Colin Ian King | | |
Bug Description
I ran eventstat for about ten minutes (568 samples at one sample per second) on the Meizu MX4 Ubuntu phone (arale) and plotted a kernel/userspace distribution graph from the "Total events" summary lines. On the graph, the total number of events per second kept repeating roughly the same pattern, but after 221 samples there was a huge spike, and after that the distribution suddenly "flipped": a reduction in events attributed to the kernel was exactly matched by an increase in events attributed to userspace.
I found an explanation in the attached logfile. Let's first look at line 4748:
20.00 17594 [kworker/0:2] OSTimerWorkQueu
Then line 4765 below:
4440.00 17594 kworker/0:2 OSTimerWorkQueu
Notice that the event count is about 200 times the actual value (this timer runs at a fixed rate of 20 events/s) and that the "Task" field no longer includes brackets, indicating that the entry is no longer counted as a kernel thread. The change in the "Total events" summary lines matches this observation.
All lines after the spike then look like this:
20.00 17594 kworker/0:2 OSTimerWorkQueu
I suspect that the internal data structures get corrupted somehow after a prolonged amount of time.
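The flip described above can also be located mechanically in a saved log. The following is only a rough sketch: it assumes each per-task line starts with the event rate, then the PID, then a task name without spaces, as in the three lines quoted above. The function name `find_bracket_flips` is made up for illustration and is not part of eventstat.

```python
import re

# Assumed line layout: events/s, PID, task name, then the remaining columns.
LINE_RE = re.compile(r"^\s*(\d+\.\d+)\s+(\d+)\s+(\S+)")

def find_bracket_flips(lines):
    """Return (line_number, pid, old_task, new_task) for every PID whose
    task name switches between bracketed and unbracketed form."""
    last_seen = {}  # pid -> (task name, shown as kernel thread?)
    flips = []
    for lineno, line in enumerate(lines, start=1):
        m = LINE_RE.match(line)
        if not m:
            continue  # header, summary, or blank line
        pid, task = int(m.group(2)), m.group(3)
        is_kernel = task.startswith("[") and task.endswith("]")
        prev = last_seen.get(pid)
        if prev is not None and prev[1] != is_kernel:
            flips.append((lineno, pid, prev[0], task))
        last_seen[pid] = (task, is_kernel)
    return flips
```

Called on the lines of the logfile, this would report the first sample in which PID 17594 loses its brackets, which is where the "Total events" distribution changes.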
Changed in eventstat (Ubuntu):
assignee: nobody → Colin Ian King (colin-king)
Changed in eventstat (Ubuntu):
status: In Progress → Fix Committed
I think the internal cache is stale. I need to figure out how to detect this case when PIDs are re-used.
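One possible way to detect a re-used PID, sketched here purely as an illustration and not as the fix that was actually committed, is to store the process start time alongside each cached entry and rebuild the entry when the start time changes. On Linux the start time is field 22 of `/proc/<pid>/stat`. The names `TaskCache`, `parse_starttime`, `read_proc_stat` and `build_info` are hypothetical.

```python
def parse_starttime(stat_line):
    """Extract starttime (field 22) from a /proc/<pid>/stat line.

    The comm field (field 2) may contain spaces or parentheses, so split
    after the last ')' instead of splitting the whole line on whitespace.
    """
    rest = stat_line.rpartition(")")[2].split()
    return int(rest[19])  # rest[0] is field 3, so field 22 is rest[19]

def read_proc_stat(pid):
    """Return the raw stat line for pid, or None if the process is gone."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            return f.read()
    except OSError:
        return None

class TaskCache:
    """Cache per-PID info, invalidated when the PID's start time changes."""

    def __init__(self, read_stat=read_proc_stat):
        self._read_stat = read_stat
        self._entries = {}  # pid -> (starttime, info)

    def lookup(self, pid, build_info):
        stat = self._read_stat(pid)
        start = parse_starttime(stat) if stat else None
        cached = self._entries.get(pid)
        if cached is not None and start is not None and cached[0] == start:
            return cached[1]  # same process as when the entry was cached
        info = build_info(pid)  # new or re-used PID: rebuild the entry
        self._entries[pid] = (start, info)
        return info
```

With a check like this, an entry cached for a kernel thread would not be handed back for an unrelated process that later receives the same PID. The cost is one extra read of `/proc/<pid>/stat` per lookup.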