lshw 100% CPU usage every minute
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| Mirantis OpenStack | | High | Stanislav Kolenkin | |
| 8.0.x | | High | Stanislav Kolenkin | |
| 9.x | | High | Stanislav Kolenkin | |
Bug Description
I have found that the lshw process is started every minute and uses 100% of a CPU on every cloud-related node.
date && ps axuwww |grep lshw
Wed Aug 17 09:54:07 AST 2016
root 25790 111 0.0 61924 41372 ? R 09:54 0:01 /usr/bin/lshw -json
root 25794 0.0 0.0 10432 928 pts/8 S+ 09:54 0:00 grep --color=auto lshw
date && ps axuwww |grep lshw
Wed Aug 17 09:54:26 AST 2016
root 25819 0.0 0.0 10432 924 pts/8 S+ 09:54 0:00 grep --color=auto lshw
date && ps axuwww |grep lshw
Wed Aug 17 09:55:27 AST 2016
root 26283 32.0 0.0 23084 8792 ? R 09:55 0:00 /usr/bin/lshw -json
root 26287 0.0 0.0 10432 928 pts/8 S+ 09:55 0:00 grep --color=auto lshw
lshw --version
Hardware Lister (lshw) - B.02.16
Screenshot in the attachment.
Stanislav Kolenkin (skolenkin) wrote : | #1 |
description: | updated |
Stanislav Kolenkin (skolenkin) wrote : | #5 |
The described issue can affect the response time of services because it causes CPU usage spikes.
Oleksandr Savatieiev (osavatieiev) wrote : | #6 |
The lshw process runs for 10-15 seconds on one (1) CPU and can affect response times when a service running on the same host is under high load. For example, a networking agent or Contrail's Cassandra DB would definitely be affected.
I agree this is not critical. Still, on heavily loaded clouds it will become serious.
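As a rough check of the figures in this comment (10-15 seconds on one core, once per minute), the average share of a single core taken by lshw can be estimated as follows. This is only a sketch: `core_share_percent` is a hypothetical helper name, and shell integer arithmetic truncates the result.

```shell
# Hypothetical helper: average share of one CPU core, in percent (truncated),
# used by a job that is busy for BUSY seconds out of every PERIOD seconds.
core_share_percent() {
    busy=$1
    period=$2
    echo $(( busy * 100 / period ))
}

core_share_percent 10 60   # lower bound: 10 s busy per 60 s
core_share_percent 15 60   # upper bound: 15 s busy per 60 s
```

With these inputs the estimate is roughly 16% to 25% of one core on average, concentrated in a short burst at the start of each minute.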
description: | updated |
tags: | added: customer-found |
tags: | added: support |
Maksym Shalamov (mshalamov) wrote : | #7 |
Hello,
The described issue affects only one CPU core, so it can affect the response time of services only on heavily loaded clusters. There is probably a simple workaround. Could somebody help resolve this issue?
Maksym Shalamov (mshalamov) wrote : | #8 |
Please note that this issue carries the customer-found tag.
Timur Nurlygayanov (tnurlygayanov) wrote : | #9 |
Why is the issue in Incomplete status?
Hi MOS Linux team, could you please take a look at the issue?
Thank you!
Ivan Suzdal (isuzdal) wrote : | #11 |
Dear all
High CPU usage is absolutely normal behavior for lshw.
AFAIU, lshw is called from nailgun-agent. So there are two options:
1) Gather the necessary information using pure Ruby code instead of calling lshw.
2) Set the nailgun-agent priority with the nice/ionice commands in the cron task.
Yet another option: execute nailgun-agent on a dedicated CPU (taskset can help).
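The nice/ionice and taskset ideas above could be combined in a small wrapper along these lines. This is a minimal sketch, not the actual nailgun-agent cron entry: `run_low_priority` is a hypothetical name, pinning to CPU core 0 is an arbitrary assumption, and ionice and taskset are applied only if they are installed and usable on the host.

```shell
# Hypothetical wrapper: run a command at the lowest CPU priority, and, where
# the tools are available, with idle I/O priority and pinned to one CPU core.
run_low_priority() {
    # Pin to CPU core 0 (assumed core number) if taskset works on this host.
    if command -v taskset >/dev/null 2>&1 && taskset -c 0 true 2>/dev/null; then
        set -- taskset -c 0 "$@"
    fi
    # Use the idle I/O scheduling class (class 3) if ionice works on this host.
    if command -v ionice >/dev/null 2>&1 && ionice -c 3 true 2>/dev/null; then
        set -- ionice -c 3 "$@"
    fi
    # Niceness 19 is the lowest CPU scheduling priority.
    nice -n 19 "$@"
}

# Example (not executed here): run_low_priority /usr/bin/lshw -json
```

Lowering the priority does not shorten the scan. It only lets other services on the host win the CPU when they compete with it.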
I don't think it is normal for production to spend 100% of one CPU for nothing. Why do we need to scan for hardware changes all the time? This information is needed for monitoring use cases only.
You could ask the deploy engineers to run nailgun-agent before each deployment change to get up-to-date data for the nodes; there is no need to run this scan every minute.
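One way to approximate "not every minute" without changing the caller is to throttle the scan with a timestamp file. The sketch below is illustrative only: `run_if_stale` and the stamp file path are hypothetical, it relies on `date +%s` (available in GNU and BSD date), and it is not how nailgun-agent is actually scheduled.

```shell
# Hypothetical throttle: run a command only if the last successful run,
# recorded as an epoch timestamp in STAMP, is at least MAX_AGE seconds old.
run_if_stale() {
    stamp=$1
    max_age=$2
    shift 2
    now=$(date +%s)
    last=0
    if [ -r "$stamp" ]; then
        last=$(cat "$stamp")
    fi
    if [ $(( now - last )) -lt "$max_age" ]; then
        return 0    # recent enough, skip the run
    fi
    # Record the time only when the command succeeds.
    "$@" && echo "$now" > "$stamp"
}

# Example (not executed here; the stamp path is an assumption):
# run_if_stale /var/run/hw-scan.stamp 3600 /usr/bin/lshw -json
```

Called from a per-minute cron job, this would let the expensive scan run at most once per `MAX_AGE` seconds while the cron schedule itself stays unchanged.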
Stanislav, please add more details on the topic. What is the impact? How does it affect workloads? Are there any failures?