I don't think it's a token provider configuration issue, since we only have 2 keys by default. And the deployment seems to start and go forward for quite a while: https://logs.rdoproject.org/56/542556/100/openstack-check/gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master/Z64d11a27268e46db803351bb52f7cc25/undercloud/home/jenkins/overcloud_deploy.log.txt.gz
From there I can see it goes up to step 5, and in the end it fails with this exception: "No JSON object could be decoded", which I guess comes from the mistral client.
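For reference, the two keystone settings relevant to this theory live in keystone.conf. A sketch of the section (the option names are real keystone options, but the values shown here are assumptions about this deployment, not verified from its config):

```ini
[token]
# Token lifetime in seconds; 3600 would match the one-hour expiry we configure.
expiration = 3600

[fernet_tokens]
# Number of active fernet keys; the "2 keys by default" mentioned above.
max_active_keys = 2
```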
At some point in the zaqar logs I can see that it fails with authorization failed: https://logs.rdoproject.org/56/542556/100/openstack-check/gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master/Z64d11a27268e46db803351bb52f7cc25/undercloud/var/log/containers/zaqar/zaqar.log.txt.gz#_2018-04-04_01_56_10_669
which gets reflected in the mistral executor logs here: https://logs.rdoproject.org/56/542556/100/openstack-check/gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master/Z64d11a27268e46db803351bb52f7cc25/undercloud/var/log/containers/mistral/executor.log.txt.gz#_2018-04-04_01_56_12_627
which is what Emilien reported.
I think that the issue is that mistral (server) is not refreshing the token that zaqar is using. The token works for a while and expires after an hour (which is what we configure). The log timings back this theory up:
The deploy starts at 0:55: https://logs.rdoproject.org/56/542556/100/openstack-check/gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master/Z64d11a27268e46db803351bb52f7cc25/undercloud/home/jenkins/overcloud_deploy.log.txt.gz#_2018-04-04_00_55_54
And we see the error at 1:55: https://logs.rdoproject.org/56/542556/100/openstack-check/gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master/Z64d11a27268e46db803351bb52f7cc25/undercloud/home/jenkins/overcloud_deploy.log.txt.gz#_2018-04-04_01_55_56
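Checking the arithmetic on those two log anchors (timestamps taken from the anchors above; the one-hour figure is the configured token lifetime):

```python
from datetime import datetime

# Timestamps from the log anchors, in the logs' YYYY-MM-DD_HH_MM_SS format.
deploy_start = datetime.strptime("2018-04-04_00_55_54", "%Y-%m-%d_%H_%M_%S")
first_error = datetime.strptime("2018-04-04_01_55_56", "%Y-%m-%d_%H_%M_%S")

elapsed = (first_error - deploy_start).total_seconds()
print(elapsed)  # 3602.0 -- two seconds past the 3600s (one-hour) token lifetime
```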
So ultimately it seems to me that it's an issue with how mistral creates the client (in a way that doesn't refresh the keystone tokens). This should have been handled already, though, and as far as I can tell mistral is using sessions correctly. Are we using an old mistral container?
This would usually have been handled by the session object from keystoneauth1, which I thought was being used in zaqar.
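A minimal, self-contained sketch of the difference (plain Python, not keystoneauth1's actual API; all names here are illustrative): a client pinned to a raw token starts failing with authorization errors once the lifetime elapses, while a session that holds credentials can re-authenticate transparently, which is the behavior we'd expect from a keystoneauth1 session.

```python
TOKEN_LIFETIME = 3600  # seconds, matching the configured keystone expiry


class FixedTokenClient:
    """Mimics a client built once around a raw token: it can never re-authenticate."""

    def __init__(self, token, issued_at):
        self.token = token
        self.issued_at = issued_at

    def get_token(self, now):
        if now - self.issued_at > TOKEN_LIFETIME:
            raise RuntimeError("authorization failed: token expired")
        return self.token


class RefreshingSession:
    """Mimics what a keystoneauth1-style session does: re-fetch the token on expiry."""

    def __init__(self, auth_fn):
        self.auth_fn = auth_fn  # holds credentials, not a token
        self.token = None
        self.issued_at = None

    def get_token(self, now):
        if self.token is None or now - self.issued_at > TOKEN_LIFETIME:
            self.token = self.auth_fn()  # re-authenticate against keystone
            self.issued_at = now
        return self.token


counter = iter(range(100))
session = RefreshingSession(lambda: f"token-{next(counter)}")
print(session.get_token(0))     # token-0
print(session.get_token(3599))  # token-0 (still valid, reused)
print(session.get_token(7200))  # token-1 (expired, refreshed)
```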