Stress-ng

Comment 37 for bug 1640547

Colin Ian King (colin-king) wrote on 2016-12-07:

I've made some modifications to the script (see attached). The changes include:

1. Kill with ALRM first, then kill with KILL if that has not worked after a small grace period. Also report any unkillable stressors.
2. Bump up the async I/O threshold for machines with lots of CPUs.
3. Force hdd to do sync writes, so that we don't build up a backlog of gazillions of pending I/Os on machines with a lot of memory and many CPUs.
4. Limit the readahead file size so that this stressor does not spend most of its time generating a test file before it can start testing readaheads.
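
The two-stage kill in item 1 could look roughly like the sketch below. This is an illustration in Python, not code from the attached script: the function name `stop_stressor`, the default grace period, and the returned labels are all hypothetical.

```python
import signal
import subprocess
import sys


def stop_stressor(proc, grace=5.0):
    """Stop a child process: SIGALRM first, SIGKILL after a grace period.

    Hypothetical helper for illustration only. Returns one of:
    "exited", "alarm", "killed", "unkillable".
    """
    # Nothing to do if the process has already finished.
    if proc.poll() is not None:
        return "exited"

    # Stage 1: ask politely with SIGALRM and give it time to clean up.
    proc.send_signal(signal.SIGALRM)
    try:
        proc.wait(timeout=grace)
        return "alarm"
    except subprocess.TimeoutExpired:
        pass

    # Stage 2: the process ignored SIGALRM, so force it with SIGKILL.
    proc.kill()
    try:
        proc.wait(timeout=grace)
        return "killed"
    except subprocess.TimeoutExpired:
        # Still present after SIGKILL (for example, stuck in
        # uninterruptible I/O); report it rather than wait forever.
        return "unkillable"
```

A caller would collect the "unkillable" results and report them at the end of the run, which is the reporting step mentioned in item 1.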

I've run this through several times with the latest stress-ng and it runs through to completion.

So I think the problem was that loads of pending I/Os from the stressors, plus bad cleanup of nuked stressors, were causing massive I/O backlogs, which made the system clog up.