ss_get_by_ssh.php does not timeout commands that hang

Bug #1160611 reported by Mike Benshoof
This bug affects 1 person
Affects: Percona Monitoring Plugins
Status: Fix Released
Importance: High
Assigned to: Unassigned
Milestone: 1.0.3

Bug Description

When fetching data by running a remote command via ssh, there is a chance that the remote command hangs (df was the culprit in this case).

This in turn causes the poller to back up, prevents other data sources from being populated, and fills the cacti logs with warnings such as:

03/26/2013 03:00:01 PM - POLLER: Poller[0] WARNING: There are '1' detected as overrunning a polling process, please investigate

It would be helpful to be able to time out the actual command as well as the base ssh call (with ConnectTimeout).

Tags: cacti
Mike Benshoof (mbenshoof) wrote :
Changed in percona-monitoring-plugins:
importance: Undecided → High
milestone: none → 1.0.3
status: New → Incomplete
status: Incomplete → In Progress
Roman Vynar (roman-vynar) wrote :

Mike, thank you for the patch!

tags: added: cacti
summary: - ss_get_by_ssh.php doesn't timeout remote commands that hang
+ ss_get_by_ssh.php does not timeout commands that hang
Roman Vynar (roman-vynar) wrote :

Implementing two timeouts:
- $cmd_tout=10 - the timeout for the ssh command itself, or for the local command when use_ssh is off;
- an option to enable a timeout for the remote command by prepending 'timeout $cmd_tout' to the actual command, i.e. `ssh HOST timeout 10 CMD`. Disabled by default because the timeout command may be missing on EL5 boxes (where it is not part of coreutils).
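
To illustrate how the two options combine, here is a minimal Python sketch of the command construction. It is not the plugin's PHP code: build_command, connect_timeout and remote_timeout are hypothetical names chosen for this example, and only cmd_tout, use_ssh and the `ssh HOST timeout 10 CMD` shape come from the comment above.

```python
import shlex


def build_command(cmd, host=None, use_ssh=True, cmd_tout=10,
                  connect_timeout=5, remote_timeout=False):
    """Return the argv list for running `cmd` locally or over ssh.

    All parameter names except cmd_tout and use_ssh are assumptions
    made for this sketch.
    """
    if not use_ssh:
        # Local mode: run the command directly, no ssh wrapper.
        return shlex.split(cmd)
    # Optionally prepend `timeout N` so the remote side kills a hung command.
    remote_cmd = f"timeout {cmd_tout} {cmd}" if remote_timeout else cmd
    return ["ssh", "-o", f"ConnectTimeout={connect_timeout}", host, remote_cmd]


print(build_command("df -h", host="192.168.8.1", remote_timeout=True))
# ['ssh', '-o', 'ConnectTimeout=5', '192.168.8.1', 'timeout 10 df -h']
```

This only builds the command line. Enforcing $cmd_tout on the ssh process itself has to happen in whatever code launches it.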

Why the second one is important: when you run
ssh 192.168.8.1 sleep 9999
and then interrupt it, the ssh command itself stops, but sleep 9999 keeps running on the target node.
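
A local timeout of the first kind could look roughly like the following Python sketch. It is not the plugin's PHP code, and run_with_timeout is a hypothetical helper. It relies on subprocess.run, which kills the child process when the timeout expires. As described above, killing a local ssh process this way would not stop a command already started on the remote host, which is what the optional remote `timeout` prefix is for.

```python
import subprocess
import sys


def run_with_timeout(argv, cmd_tout=10):
    """Run argv and return its stdout, or None if it exceeds cmd_tout seconds."""
    try:
        result = subprocess.run(argv, capture_output=True, text=True,
                                timeout=cmd_tout)
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed the local child at this point.
        return None
    return result.stdout


# A stand-in for a hung command: sleeps longer than the allowed time.
hung = [sys.executable, "-c", "import time; time.sleep(5)"]
print(run_with_timeout(hung, cmd_tout=0.5))  # None
```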

Changed in percona-monitoring-plugins:
status: In Progress → Fix Committed
Changed in percona-monitoring-plugins:
status: Fix Committed → Fix Released