tripleo

Failed openstack-zaqar service after undercloud upgrade

Bug #1661227 reported by Marius Cornea on 2017-02-02

This bug affects 3 people

Affects		Status	Importance	Assigned to	Milestone
	tripleo	Fix Released	High	Unassigned	tripleo pike-rc1

Bug Description

After an undercloud upgrade from Newton to Ocata, the Zaqar service shows as failed:

[root@undercloud-0 stack]# systemctl status openstack-zaqar.service
● openstack-zaqar.service - OpenStack Message Queuing Service (code-named Zaqar) Server
Loaded: loaded (/usr/lib/systemd/system/openstack-zaqar.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Thu 2017-02-02 11:21:48 UTC; 50min ago
Main PID: 23023 (code=exited, status=1/FAILURE)

Feb 02 11:03:05 undercloud-0.redhat.local systemd[1]: Started OpenStack Message Queuing Service (code-named Zaqar) Server.
Feb 02 11:03:05 undercloud-0.redhat.local systemd[1]: Starting OpenStack Message Queuing Service (code-named Zaqar) Server...
Feb 02 11:21:48 undercloud-0.redhat.local systemd[1]: openstack-zaqar.service: main process exited, code=exited, status=1/FAILURE
Feb 02 11:21:48 undercloud-0.redhat.local systemd[1]: Unit openstack-zaqar.service entered failed state.
Feb 02 11:21:48 undercloud-0.redhat.local systemd[1]: openstack-zaqar.service failed.

zaqar.log:

2017-02-02 11:21:48.137 23023 WARNING keystonemiddleware.auth_token [-] Using the in-process token cache is deprecated as of the 4.2.0 release and may be removed in the 5.0.0 release or the 'O' development cycle. The in-process cache causes inconsistent results and high memory usage. When the feature is removed the auth_token middleware will not cache tokens by default which may result in performance issues. It is recommended to use memcache for the auth_token token cache by setting the memcached_servers option.
2017-02-02 11:21:48.610 23023 CRITICAL zaqar [(None,) - - - - -] [project_id:48f0d1803e48484087e4a5868c279e6e] IOError: [Errno 32] Broken pipe
2017-02-02 11:21:48.610 23023 ERROR zaqar Traceback (most recent call last):
2017-02-02 11:21:48.610 23023 ERROR zaqar File "/usr/bin/zaqar-server", line 10, in <module>
2017-02-02 11:21:48.610 23023 ERROR zaqar sys.exit(run())
2017-02-02 11:21:48.610 23023 ERROR zaqar File "/usr/lib/python2.7/site-packages/zaqar/common/cli.py", line 58, in _wrapper
2017-02-02 11:21:48.610 23023 ERROR zaqar _fail(1, ex)
2017-02-02 11:21:48.610 23023 ERROR zaqar File "/usr/lib/python2.7/site-packages/zaqar/common/cli.py", line 36, in _fail
2017-02-02 11:21:48.610 23023 ERROR zaqar print(ex, file=sys.stderr)
2017-02-02 11:21:48.610 23023 ERROR zaqar IOError: [Errno 32] Broken pipe
2017-02-02 11:21:48.610 23023 ERROR zaqar

Workaround: systemctl restart openstack-zaqar.service
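
The manual workaround could be scripted so that the restart only happens when the unit is actually in the failed state. The following is a minimal sketch, not part of the original report: `restart_if_failed` and its injectable `run` parameter are hypothetical names, and it assumes `systemctl is-failed` exits 0 for a failed unit, as documented for systemd.

```python
import subprocess

UNIT = "openstack-zaqar.service"


def restart_if_failed(unit=UNIT, run=subprocess.call):
    """Restart `unit` only when systemd reports it as failed.

    `run` takes an argv list and returns the exit code. It is a
    hypothetical injection point so the logic can be exercised
    without touching systemd.
    """
    # `systemctl is-failed` exits 0 when the unit is in the failed state.
    if run(["systemctl", "is-failed", "--quiet", unit]) != 0:
        return False  # unit is not failed, nothing to do
    # Same command as the manual workaround above.
    return run(["systemctl", "restart", unit]) == 0
```

Called as `restart_if_failed()` by root on the undercloud, it returns True only if the unit was failed and the restart command succeeded.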

Tags:

Emilien Macchi (emilienm) wrote on 2017-02-02:

It's weird that we don't see it in TripleO CI (we have an undercloud upgrade job and it works).

Changed in tripleo:
status:	New → Triaged
importance:	Undecided → High
milestone:	none → ocata-rc1
tags:	added: upgrade

Thomas Herve (therve) wrote on 2017-02-07:

OK, I tracked it down and I think I'm getting somewhere: I'm able to reproduce it when journald is restarted. I thought of that because of the stderr error, and because it seems to happen mostly on upgrades.

https://bugs.freedesktop.org/show_bug.cgi?id=84923 ought to be the culprit, though I wasn't able to verify if the environments have the fix or not.

Ideally, I would track down stdout/stderr usage in Zaqar to remove that issue. It's possible that it doesn't use loggers correctly, and so hits that issue more often than other services.

In the meantime, I proposed https://review.rdoproject.org/r/#/c/4941/ which ought to work around the issue by restarting zaqar when it fails.
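
To illustrate the failure mode described in this comment, here is a minimal sketch under stated assumptions. It is not Zaqar's code: `write_to_closed_pipe` stands in for a service writing to stderr after the reader (journald, in the scenario above) has gone away, and `report_fatal` is a hypothetical guarded reporter that does not let a dead stderr mask the original exception.

```python
import errno
import os
import sys


def write_to_closed_pipe():
    """Write to a pipe whose reader is gone and return the resulting errno."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # stand-in for the reader of stderr disappearing
    try:
        os.write(write_fd, b"error message\n")
    except OSError as exc:  # BrokenPipeError on Python 3
        return exc.errno
    finally:
        os.close(write_fd)
    return None


def report_fatal(ex, stream=None):
    """Hypothetical helper: print `ex`, tolerating a broken stream.

    Returns True if the message was written, False if the stream failed.
    """
    stream = stream if stream is not None else sys.stderr
    try:
        print(ex, file=stream)
        stream.flush()
        return True
    except (IOError, OSError):
        # The stream is unusable; the caller can still log `ex` elsewhere.
        return False
```

On a POSIX system, `write_to_closed_pipe()` returns `errno.EPIPE`, the same Errno 32 seen in the traceback in the bug description.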

Thomas Herve (therve) wrote on 2017-02-07:

I closed bug #1640600 as a duplicate.

Emilien Macchi (emilienm) on 2017-02-16

Changed in tripleo:
milestone:	ocata-rc1 → ocata-rc2

Emilien Macchi (emilienm) on 2017-03-06

Changed in tripleo:
milestone:	ocata-rc2 → pike-1

Emilien Macchi (emilienm) on 2017-04-11

Changed in tripleo:
milestone:	pike-1 → pike-2

Emilien Macchi (emilienm) on 2017-06-08

Changed in tripleo:
milestone:	pike-2 → pike-3

Emilien Macchi (emilienm) on 2017-07-30

Changed in tripleo:
milestone:	pike-3 → pike-rc1

Ben Nemec (bnemec) wrote on 2017-08-09:

I'm going to close this since from a TripleO perspective https://review.rdoproject.org/r/#/c/4941/ fixes the failure.

Changed in tripleo:
status:	Triaged → Fix Released

Duplicates of this bug

Bug #1640600

Remote bug watches

freedesktop-bugs #84923
[RESOLVED FIXED]