When cinder.conf contains a wrong credential for the storwize_svc Cinder driver, cinder-volume keeps starting a child process again and again. It only records the exception in the log; no meaningful error is reported to the user.
------ ps result (the cinder-volume child PID changes between runs: 11320, then 11324):
# ps -ef | grep cinder-
cinder 6873 1 0 02:30 ? 00:00:00 /usr/bin/python /usr/bin/cinder-api --config-file /etc/cinder/cinder.conf --logfile /var/log/cinder/api.log
cinder 6888 1 0 02:30 ? 00:00:08 /usr/bin/python /usr/bin/cinder-scheduler --config-file /etc/cinder/cinder.conf --logfile /var/log/cinder/scheduler.log
cinder 9216 1 4 03:43 ? 00:03:43 /usr/bin/python /usr/bin/cinder-volume --config-file /etc/cinder/cinder.conf --logfile /var/log/cinder/volume.log
cinder 11320 9216 7 05:02 ? 00:00:00 /usr/bin/python /usr/bin/cinder-volume --config-file /etc/cinder/cinder.conf --logfile /var/log/cinder/volume.log
root 11322 3296 0 05:02 pts/2 00:00:00 grep cinder-
# ps -ef | grep cinder-
cinder 6873 1 0 02:30 ? 00:00:00 /usr/bin/python /usr/bin/cinder-api --config-file /etc/cinder/cinder.conf --logfile /var/log/cinder/api.log
cinder 6888 1 0 02:30 ? 00:00:08 /usr/bin/python /usr/bin/cinder-scheduler --config-file /etc/cinder/cinder.conf --logfile /var/log/cinder/scheduler.log
cinder 9216 1 4 03:43 ? 00:03:43 /usr/bin/python /usr/bin/cinder-volume --config-file /etc/cinder/cinder.conf --logfile /var/log/cinder/volume.log
cinder 11324 9216 4 05:03 ? 00:00:00 /usr/bin/python /usr/bin/cinder-volume --config-file /etc/cinder/cinder.conf --logfile /var/log/cinder/volume.log
root 11326 3296 0 05:03 pts/2 00:00:00 grep cinder-
#
------ log showing the child process being started again and again:
2013-04-11 03:18:56 INFO [cinder.service] Starting 1 workers
2013-04-11 03:18:56 INFO [cinder.service] Started child 8510
2013-04-11 03:18:56 AUDIT [cinder.service] Starting cinder-volume node (version 2013.1)
2013-04-11 03:18:56 DEBUG [cinder.volume.drivers.storwize_svc] enter: do_setup
2013-04-11 03:18:56 DEBUG [paramiko.transport] starting thread (client mode): 0x34fdfc10L
2013-04-11 03:18:56 INFO [paramiko.transport] Connected (version 2.0, client OpenSSH_5.1)
2013-04-11 03:18:56 DEBUG [paramiko.transport] kex algos:['diffie-hellman-group-exchange-sha256', 'diffie-hellman-group-exchange-sha1', 'diffie-hellman-group14-sha1', 'diffie-hellman-group1-sha1'] server key:['ssh-rsa', 'ssh-dss'] client encrypt:['aes128-cbc', '3des-cbc', 'blowfish-cbc', 'cast128-cbc', 'arcfour128', 'arcfour256', 'arcfour', 'aes192-cbc', 'aes256-cbc', '<email address hidden>', 'aes128-ctr', 'aes192-ctr', 'aes256-ctr'] server encrypt:['aes128-cbc', '3des-cbc', 'blowfish-cbc', 'cast128-cbc', 'arcfour128', 'arcfour256', 'arcfour', 'aes192-cbc', 'aes256-cbc', '<email address hidden>', 'aes128-ctr', 'aes192-ctr', 'aes256-ctr'] client mac:['hmac-md5', 'hmac-sha1', '<email address hidden>', 'hmac-ripemd160', '<email address hidden>', 'hmac-sha1-96', 'hmac-md5-96'] server mac:['hmac-md5', 'hmac-sha1', '<email address hidden>', 'hmac-ripemd160', '<email address hidden>', 'hmac-sha1-96', 'hmac-md5-96'] client compress:['none', '<email address hidden>'] server compress:['none', '<email address hidden>'] client lang:[''] server lang:[''] kex follows?False
2013-04-11 03:18:56 DEBUG [paramiko.transport] Ciphers agreed: local=aes128-ctr, remote=aes128-ctr
2013-04-11 03:18:56 DEBUG [paramiko.transport] using kex diffie-hellman-group1-sha1; server key type ssh-rsa; cipher: local aes128-ctr, remote aes128-ctr; mac: local hmac-sha1, remote hmac-sha1; compression: local none, remote none
2013-04-11 03:18:57 DEBUG [paramiko.transport] Switch to new keys ...
2013-04-11 03:18:57 DEBUG [paramiko.transport] Adding ssh-rsa host key for olyv7000.rch.stglabs.ibm.com: b3dcf0402edb424ba041164ac8874352
2013-04-11 03:18:57 DEBUG [paramiko.transport] Trying SSH key c8788e03d27d9ab7523d8f8608e43881
2013-04-11 03:18:57 DEBUG [paramiko.transport] userauth is OK
2013-04-11 03:18:57 DEBUG [paramiko.transport] Debug msg: Adding to environment: SSH_LABEL_ID=5
2013-04-11 03:18:58 DEBUG [paramiko.transport] Debug msg: Adding to environment: SSH_EPOCH=3
2013-04-11 03:18:58 INFO [paramiko.transport] Authentication (publickey) successful!
2013-04-11 03:18:58 DEBUG [cinder.utils] Running cmd (SSH): lsmdiskgrp -delim ! -nohdr
2013-04-11 03:18:58 DEBUG [paramiko.transport] [chan 1] Max packet in: 34816 bytes
2013-04-11 03:18:58 DEBUG [paramiko.transport] [chan 1] Max packet out: 32768 bytes
2013-04-11 03:18:58 INFO [paramiko.transport] Secsh channel 1 opened.
2013-04-11 03:18:58 DEBUG [paramiko.transport] [chan 1] Sesch channel 1 request ok
2013-04-11 03:18:58 DEBUG [paramiko.transport] [chan 1] EOF received (1)
2013-04-11 03:18:58 DEBUG [paramiko.transport] [chan 1] Unhandled channel request "<email address hidden>"
2013-04-11 03:18:58 DEBUG [paramiko.transport] [chan 1] EOF sent (1)
2013-04-11 03:18:58 DEBUG [cinder.utils] Result was 1
2013-04-11 03:18:58 ERROR [cinder.volume.drivers.san.san] Unexpected error while running command.
Command: lsmdiskgrp -delim ! -nohdr
Exit code: 1
Stdout: 'CMMVC7016E Authorization has failed because the private key is not valid for the user name that you have specified.\n\n'
Stderr: ''
2013-04-11 03:18:59 ERROR [cinder.volume.drivers.san.san] Error running SSH command: lsmdiskgrp -delim ! -nohdr
2013-04-11 03:18:59 ERROR [cinder.service] Unhandled exception
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/cinder/service.py", line 227, in _start_child
    self._child_process(wrap.server)
  File "/usr/lib/python2.6/site-packages/cinder/service.py", line 204, in _child_process
    launcher.run_server(server)
  File "/usr/lib/python2.6/site-packages/cinder/service.py", line 98, in run_server
    server.start()
  File "/usr/lib/python2.6/site-packages/cinder/service.py", line 345, in start
    self.manager.init_host()
  File "/usr/lib/python2.6/site-packages/cinder/volume/manager.py", line 145, in init_host
    self.driver.do_setup(ctxt)
  File "/usr/lib/python2.6/site-packages/cinder/volume/drivers/storwize_svc.py", line 190, in do_setup
    out, err = self._run_ssh(ssh_cmd)
  File "/usr/lib/python2.6/site-packages/cinder/volume/drivers/san/san.py", line 152, in _run_ssh
    raise e
ProcessExecutionError: Unexpected error while running command.
Command: lsmdiskgrp -delim ! -nohdr
Exit code: 1
Stdout: 'CMMVC7016E Authorization has failed because the private key is not valid for the user name that you have specified.\n\n'
Stderr: ''
2013-04-11 03:18:59 INFO [cinder.service] Child 8510 exited with status 2
2013-04-11 03:18:59 INFO [cinder.service] Started child 8512
2013-04-11 03:18:59 AUDIT [cinder.service] Starting cinder-volume node (version 2013.1)
2013-04-11 03:18:59 DEBUG [cinder.volume.drivers.storwize_svc] enter: do_setup
2013-04-11 03:18:59 DEBUG [paramiko.transport] starting thread (client mode): 0x34fdfd10L
2013-04-11 03:19:00 INFO [paramiko.transport] Connected (version 2.0, client OpenSSH_5.1)
2013-04-11 03:19:00 DEBUG [paramiko.transport] kex algos:['diffie-hellman-group-exchange-sha256', 'diffie-hellman-group-exchange-sha1', 'diffie-hellman-group14-sha1', 'diffie-hellman-group1-sha1'] server key:['ssh-rsa', 'ssh-dss'] client encrypt:['aes128-cbc', '3des-cbc', 'blowfish-cbc', 'cast128-cbc', 'arcfour128', 'arcfour256', 'arcfour', 'aes192-cbc', 'aes256-cbc', '<email address hidden>', 'aes128-ctr', 'aes192-ctr', 'aes256-ctr'] server encrypt:['aes128-cbc', '3des-cbc', 'blowfish-cbc', 'cast128-cbc', 'arcfour128', 'arcfour256', 'arcfour', 'aes192-cbc', 'aes256-cbc', '<email address hidden>', 'aes128-ctr', 'aes192-ctr', 'aes256-ctr'] client mac:['hmac-md5', 'hmac-sha1', '<email address hidden>', 'hmac-ripemd160', '<email address hidden>', 'hmac-sha1-96', 'hmac-md5-96'] server mac:['hmac-md5', 'hmac-sha1', '<email address hidden>', 'hmac-ripemd160', '<email address hidden>', 'hmac-sha1-96', 'hmac-md5-96'] client compress:['none', '<email address hidden>'] server compress:['none', '<email address hidden>'] client lang:[''] server lang:[''] kex follows?False
2013-04-11 03:19:00 DEBUG [paramiko.transport] Ciphers agreed: local=aes128-ctr, remote=aes128-ctr
2013-04-11 03:19:00 DEBUG [paramiko.transport] using kex diffie-hellman-group1-sha1; server key type ssh-rsa; cipher: local aes128-ctr, remote aes128-ctr; mac: local hmac-sha1, remote hmac-sha1; compression: local none, remote none
2013-04-11 03:19:00 DEBUG [paramiko.transport] Switch to new keys ...
2013-04-11 03:19:00 DEBUG [paramiko.transport] Adding ssh-rsa host key for olyv7000.rch.stglabs.ibm.com: b3dcf0402edb424ba041164ac8874352
2013-04-11 03:19:00 DEBUG [paramiko.transport] Trying SSH key c8788e03d27d9ab7523d8f8608e43881
2013-04-11 03:19:01 DEBUG [paramiko.transport] userauth is OK
2013-04-11 03:19:01 DEBUG [paramiko.transport] Debug msg: Adding to environment: SSH_LABEL_ID=5
2013-04-11 03:19:01 DEBUG [paramiko.transport] Debug msg: Adding to environment: SSH_EPOCH=3
2013-04-11 03:19:01 INFO [paramiko.transport] Authentication (publickey) successful!
2013-04-11 03:19:01 DEBUG [cinder.utils] Running cmd (SSH): lsmdiskgrp -delim ! -nohdr
2013-04-11 03:19:01 DEBUG [paramiko.transport] [chan 1] Max packet in: 34816 bytes
2013-04-11 03:19:02 DEBUG [paramiko.transport] [chan 1] Max packet out: 32768 bytes
2013-04-11 03:19:02 INFO [paramiko.transport] Secsh channel 1 opened.
2013-04-11 03:19:02 DEBUG [paramiko.transport] [chan 1] Sesch channel 1 request ok
2013-04-11 03:19:02 DEBUG [paramiko.transport] [chan 1] EOF received (1)
2013-04-11 03:19:02 DEBUG [paramiko.transport] [chan 1] Unhandled channel request "<email address hidden>"
2013-04-11 03:19:02 DEBUG [paramiko.transport] [chan 1] EOF sent (1)
2013-04-11 03:19:02 DEBUG [cinder.utils] Result was 1
2013-04-11 03:19:02 ERROR [cinder.volume.drivers.san.san] Unexpected error while running command.
Command: lsmdiskgrp -delim ! -nohdr
Exit code: 1
Stdout: 'CMMVC7016E Authorization has failed because the private key is not valid for the user name that you have specified.\n\n'
Stderr: ''
2013-04-11 03:19:03 ERROR [cinder.volume.drivers.san.san] Error running SSH command: lsmdiskgrp -delim ! -nohdr
2013-04-11 03:19:03 ERROR [cinder.service] Unhandled exception
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/cinder/service.py", line 227, in _start_child
    self._child_process(wrap.server)
  File "/usr/lib/python2.6/site-packages/cinder/service.py", line 204, in _child_process
    launcher.run_server(server)
  File "/usr/lib/python2.6/site-packages/cinder/service.py", line 98, in run_server
    server.start()
  File "/usr/lib/python2.6/site-packages/cinder/service.py", line 345, in start
    self.manager.init_host()
  File "/usr/lib/python2.6/site-packages/cinder/volume/manager.py", line 145, in init_host
    self.driver.do_setup(ctxt)
  File "/usr/lib/python2.6/site-packages/cinder/volume/drivers/storwize_svc.py", line 190, in do_setup
    out, err = self._run_ssh(ssh_cmd)
  File "/usr/lib/python2.6/site-packages/cinder/volume/drivers/san/san.py", line 152, in _run_ssh
    raise e
ProcessExecutionError: Unexpected error while running command.
Command: lsmdiskgrp -delim ! -nohdr
Exit code: 1
Stdout: 'CMMVC7016E Authorization has failed because the private key is not valid for the user name that you have specified.\n\n'
Stderr: ''
2013-04-11 03:19:03 INFO [cinder.service] Child 8512 exited with status 2
2013-04-11 03:19:03 INFO [cinder.service] Started child 8513
The wait() function calls _start_child again and again whenever "self.running and len(wrap.children) < wrap.workers" is true,
and it seems that this condition always becomes true again here, since every child exits with status 2 as soon as do_setup fails.
cinder\cinder\service.py
    def wait(self):
        """Loop waiting on children to die and respawning as necessary."""
        while self.running:
            wrap = self._wait_child()
            if not wrap:
                # Yield to other threads if no children have exited
                # Sleep for a short time to avoid excessive CPU usage
                # (see bug #1095346)
                eventlet.greenthread.sleep(.01)
                continue

            while self.running and len(wrap.children) < wrap.workers:
                self._start_child(wrap)

        if self.sigcaught:
            signame = {signal.SIGTERM: 'SIGTERM',
                       signal.SIGINT: 'SIGINT'}[self.sigcaught]
            LOG.info(_('Caught %s, stopping children'), signame)

        for pid in self.children:
            try:
                os.kill(pid, signal.SIGTERM)
            except OSError as exc:
                if exc.errno != errno.ESRCH:
                    raise

        # Wait for children to die
        if self.children:
            LOG.info(_('Waiting on %d children to exit'), len(self.children))
            while self.children:
                self._wait_child()
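
One possible way to stop the endless respawn seen in the log above is to cap how many children may be started within a time window. The following is only a minimal sketch of that idea, not the actual cinder code and not necessarily how this should be fixed upstream. RespawnLimiter, supervise, max_starts, window and clock are hypothetical names made up for illustration, and supervise() is a simplified stand-in for the respawn loop in wait(): start_child() is assumed to return True if the child stays up and False if it dies during startup (as it does here when do_setup fails).

```python
import collections
import time


class RespawnLimiter(object):
    """Hypothetical helper: allow at most max_starts child starts per window."""

    def __init__(self, max_starts=5, window=60.0, clock=time.time):
        self.max_starts = max_starts
        self.window = window      # window length in seconds
        self.clock = clock        # injectable so the logic can be tested
        self.starts = collections.deque()

    def allow(self):
        """Return True and record a start if the limit is not yet reached."""
        now = self.clock()
        # Forget starts that are older than the window.
        while self.starts and now - self.starts[0] > self.window:
            self.starts.popleft()
        if len(self.starts) >= self.max_starts:
            return False
        self.starts.append(now)
        return True


def supervise(start_child, workers=1, limiter=None):
    """Simplified stand-in for the respawn loop in wait().

    Keeps starting children until `workers` of them stay up, or until the
    limiter refuses another start. Returns the number of starts performed.
    """
    if limiter is None:
        limiter = RespawnLimiter()
    starts = 0
    running = 0
    while running < workers:
        if not limiter.allow():
            # Too many failed starts in a short time: give up instead of
            # looping forever, so the caller can report a clear error.
            break
        starts += 1
        if start_child():
            running += 1
    return starts


if __name__ == '__main__':
    # A child that always dies during startup, like the one in this report.
    count = supervise(lambda: False, limiter=RespawnLimiter(max_starts=3))
    print('gave up after %d starts' % count)
```

With a guard like this, the parent could log one clear error (for example, that the driver failed to initialize because of the credential) and stop, instead of repeating the same traceback every few seconds.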