Activity log for bug #1705543

Date Who What changed Old value New value Message
2017-07-20 17:35:09 Adam Spiers bug added bug
2017-07-20 17:44:28 OpenStack Infra barbican: status New In Progress
2017-07-20 17:52:23 Adam Spiers description If I change the queue.asynchronous_workers config option from 1 to 2, then if I start barbican-worker via systemd and stop it again, it hangs on shutdown: 2017-07-20 16:35:22.158 8435 INFO barbican.queue.server [-] Halting the TaskServer 2017-07-20 16:35:22.159 8436 INFO barbican.queue.server [-] Halting the TaskServer 2017-07-20 16:35:22.168 8256 INFO oslo_service.service [-] Caught SIGTERM, stopping children 2017-07-20 16:35:22.169 8256 DEBUG oslo_concurrency.lockutils [-] Acquired semaphore "singleton_lock" lock /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:212 2017-07-20 16:35:22.169 8256 DEBUG oslo_concurrency.lockutils [-] Releasing semaphore "singleton_lock" lock /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:225 2017-07-20 16:35:22.170 8256 DEBUG oslo_service.service [-] Stop services. stop /usr/lib/python2.7/site-packages/oslo_service/service.py:611 2017-07-20 16:35:22.170 8256 INFO barbican.queue.server [-] Halting the TaskServer 2017-07-20 16:35:26.659 8436 DEBUG oslo_concurrency.lockutils [-] Acquired semaphore "singleton_lock" lock /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:212 2017-07-20 16:35:26.660 8436 DEBUG oslo_concurrency.lockutils [-] Releasing semaphore "singleton_lock" lock /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:225 2017-07-20 16:35:26.671 8435 DEBUG oslo_concurrency.lockutils [-] Acquired semaphore "singleton_lock" lock /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:212 2017-07-20 16:35:26.672 8435 DEBUG oslo_concurrency.lockutils [-] Releasing semaphore "singleton_lock" lock /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:225 2017-07-20 16:35:52.171 8256 WARNING oslo_messaging.server [-] Possible hang: stop is waiting for start to complete 2017-07-20 16:35:52.173 8256 DEBUG oslo_messaging.server [-] File "/usr/bin/barbican-worker", line 10, in <module> sys.exit(main()) File "/usr/lib/python2.7/site-packages/barbican/cmd/worker.py", line 68, in main workers=CONF.queue.asynchronous_workers File "/usr/lib/python2.7/site-packages/oslo_service/service.py", line 605, in wait self.stop() File "/usr/lib/python2.7/site-packages/oslo_service/service.py", line 614, in stop service.stop() File "/usr/lib/python2.7/site-packages/barbican/queue/server.py", line 290, in stop self._server.stop() File "/usr/lib/python2.7/site-packages/oslo_messaging/server.py", line 264, in wrapper log_after, timeout_timer) File "/usr/lib/python2.7/site-packages/oslo_messaging/server.py", line 163, in wait_for_completion msg, log_after, timeout_timer) File "/usr/lib/python2.7/site-packages/oslo_messaging/server.py", line 128, in _wait LOG.debug(''.join(traceback.format_stack())) _wait /usr/lib/python2.7/site-packages/oslo_messaging/server.py:128 I'm very far from being an oslo.messaging expert, but this *appears* to be the same issue which Sahara had, namely that the RPC server needs to be started before you can safely call wait() on it: https://bugs.launchpad.net/sahara/+bug/1546119 I've ported the fix over from Sahara and it seems to fix the issue so I'll submit to gerrit shortly.
If I change the queue.asynchronous_workers config option from 1 to 2, then if I start barbican-worker via systemd and stop it again, it hangs on shutdown:

    2017-07-20 16:35:22.158 8435 INFO barbican.queue.server [-] Halting the TaskServer
    2017-07-20 16:35:22.159 8436 INFO barbican.queue.server [-] Halting the TaskServer
    2017-07-20 16:35:22.168 8256 INFO oslo_service.service [-] Caught SIGTERM, stopping children
    2017-07-20 16:35:22.169 8256 DEBUG oslo_concurrency.lockutils [-] Acquired semaphore "singleton_lock" lock /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:212
    2017-07-20 16:35:22.169 8256 DEBUG oslo_concurrency.lockutils [-] Releasing semaphore "singleton_lock" lock /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:225
    2017-07-20 16:35:22.170 8256 DEBUG oslo_service.service [-] Stop services. stop /usr/lib/python2.7/site-packages/oslo_service/service.py:611
    2017-07-20 16:35:22.170 8256 INFO barbican.queue.server [-] Halting the TaskServer
    2017-07-20 16:35:26.659 8436 DEBUG oslo_concurrency.lockutils [-] Acquired semaphore "singleton_lock" lock /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:212
    2017-07-20 16:35:26.660 8436 DEBUG oslo_concurrency.lockutils [-] Releasing semaphore "singleton_lock" lock /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:225
    2017-07-20 16:35:26.671 8435 DEBUG oslo_concurrency.lockutils [-] Acquired semaphore "singleton_lock" lock /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:212
    2017-07-20 16:35:26.672 8435 DEBUG oslo_concurrency.lockutils [-] Releasing semaphore "singleton_lock" lock /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:225
    2017-07-20 16:35:52.171 8256 WARNING oslo_messaging.server [-] Possible hang: stop is waiting for start to complete
    2017-07-20 16:35:52.173 8256 DEBUG oslo_messaging.server [-] File "/usr/bin/barbican-worker", line 10, in <module>
        sys.exit(main())
      File "/usr/lib/python2.7/site-packages/barbican/cmd/worker.py", line 68, in main
        workers=CONF.queue.asynchronous_workers
      File "/usr/lib/python2.7/site-packages/oslo_service/service.py", line 605, in wait
        self.stop()
      File "/usr/lib/python2.7/site-packages/oslo_service/service.py", line 614, in stop
        service.stop()
      File "/usr/lib/python2.7/site-packages/barbican/queue/server.py", line 290, in stop
        self._server.stop()
      File "/usr/lib/python2.7/site-packages/oslo_messaging/server.py", line 264, in wrapper
        log_after, timeout_timer)
      File "/usr/lib/python2.7/site-packages/oslo_messaging/server.py", line 163, in wait_for_completion
        msg, log_after, timeout_timer)
      File "/usr/lib/python2.7/site-packages/oslo_messaging/server.py", line 128, in _wait
        LOG.debug(''.join(traceback.format_stack()))
      _wait /usr/lib/python2.7/site-packages/oslo_messaging/server.py:128

I'm very far from being an oslo.messaging expert, but this *appears* to be the same issue which Sahara had, namely that the RPC server needs to be started before you can safely call wait() on it:

    https://bugs.launchpad.net/sahara/+bug/1546119

I've ported the fix over from Sahara and it seems to fix the issue so I'll submit to gerrit shortly.

One weird thing I couldn't explain is that the bug occurs with asynchronous_workers = 2 regardless of whether queue.enabled is True or False ...
2017-07-20 17:55:40 Adam Spiers bug added subscriber Abel Navarro
2017-09-11 17:59:59 Dave McCowan barbican: importance Undecided Medium
2023-04-25 11:01:37 Grzegorz Grasza barbican: status In Progress Won't Fix
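
The hypothesis in the description (a stop() call waiting on a start() that never ran) can be illustrated with a minimal Python sketch. This is not the patch ported from Sahara and it does not use oslo.messaging: StubRPCServer and GuardedTaskService are hypothetical names, and the stub only imitates the "stop is waiting for start to complete" behaviour seen in the log, under the assumption that the parent process stops a server it never started.

```python
import threading


class StubRPCServer:
    """Hypothetical stand-in for an RPC server whose stop() waits for start()."""

    def __init__(self):
        self._started = threading.Event()
        self.stopped = False

    def start(self):
        self._started.set()

    def stop(self, timeout=0.1):
        # Imitates the behaviour in the log: block until start() has run,
        # giving up after `timeout` seconds instead of hanging forever.
        if not self._started.wait(timeout):
            raise TimeoutError(
                "possible hang: stop is waiting for start to complete")
        self.stopped = True


class GuardedTaskService:
    """Hypothetical wrapper that only stops a server it actually started."""

    def __init__(self, server):
        self._server = server
        self._server_started = False

    def start(self):
        self._server.start()
        self._server_started = True

    def stop(self):
        if not self._server_started:
            # Nothing to stop: skip the call that would wait on start().
            return False
        self._server.stop()
        return True
```

Calling stop() on a GuardedTaskService that was never started returns immediately, whereas calling stop() directly on an unstarted StubRPCServer times out. Whether this matches the real cause of the barbican-worker hang is not confirmed by the log above.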