CS8/9 OVB FS001/FS035 master/wallaby/train failing with DB Connection errors
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
tripleo | Fix Released | Critical | Unassigned |
Bug Description
periodic-
FATAL | Check Keystone service status | undercloud | item=swift | error={
....
Failed to create service swift: Server Error for url: https:/
https:/
From Keystone logs on controller-1 we have the following error:
[Tue Jul 12 03:51:12.972306 2022] [wsgi:error] [pid 28] [remote 172.17.0.129:60904] mod_wsgi (pid=28): Exception occurred processing WSGI script '/var/www/
[Tue Jul 12 03:51:12.979653 2022] [wsgi:error] [pid 28] [remote 172.17.0.129:60904] Traceback (most recent call last):
[Tue Jul 12 03:51:12.979701 2022] [wsgi:error] [pid 28] [remote 172.17.0.129:60904] File "/usr/lib64/
[Tue Jul 12 03:51:12.979705 2022] [wsgi:error] [pid 28] [remote 172.17.0.129:60904] self.engine.
[Tue Jul 12 03:51:12.979710 2022] [wsgi:error] [pid 28] [remote 172.17.0.129:60904] File "/usr/lib64/
[Tue Jul 12 03:51:12.979713 2022] [wsgi:error] [pid 28] [remote 172.17.0.129:60904] dbapi_connectio
[Tue Jul 12 03:51:12.979718 2022] [wsgi:error] [pid 28] [remote 172.17.0.129:60904] File "/usr/lib/
[Tue Jul 12 03:51:12.979721 2022] [wsgi:error] [pid 28] [remote 172.17.0.129:60904] self._read_
[Tue Jul 12 03:51:12.979725 2022] [wsgi:error] [pid 28] [remote 172.17.0.129:60904] File "/usr/lib/
[Tue Jul 12 03:51:12.979728 2022] [wsgi:error] [pid 28] [remote 172.17.0.129:60904] pkt = self._read_packet()
[Tue Jul 12 03:51:12.979732 2022] [wsgi:error] [pid 28] [remote 172.17.0.129:60904] File "/usr/lib/
[Tue Jul 12 03:51:12.979735 2022] [wsgi:error] [pid 28] [remote 172.17.0.129:60904] packet.
[Tue Jul 12 03:51:12.979739 2022] [wsgi:error] [pid 28] [remote 172.17.0.129:60904] File "/usr/lib/
[Tue Jul 12 03:51:12.979741 2022] [wsgi:error] [pid 28] [remote 172.17.0.129:60904] err.raise_
[Tue Jul 12 03:51:12.979746 2022] [wsgi:error] [pid 28] [remote 172.17.0.129:60904] File "/usr/lib/
[Tue Jul 12 03:51:12.979748 2022] [wsgi:error] [pid 28] [remote 172.17.0.129:60904] raise errorclass(errno, errval)
[Tue Jul 12 03:51:12.979762 2022] [wsgi:error] [pid 28] [remote 172.17.0.129:60904] pymysql.
Log from another run:
https:/
summary:
- CentOS-8 fs001 train failing on overcloud deploy: Keystone 500 Internal Server Error
+ CS8/9 FS001/FS035 master/wallaby/train failing with DB Connection errors
summary:
- CS8/9 FS001/FS035 master/wallaby/train failing with DB Connection errors
+ CS8/9 OVB FS001/FS035 master/wallaby/train failing with DB Connection errors
Looking at the haproxy logs, we can see the mysql backend servers oscillating between DOWN and UP:
https://logserver.rdoproject.org/32/40932/9/check/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-train/52ef39f/logs/overcloud-controller-1/var/log/containers/haproxy/haproxy.log.txt.gz
Jul 12 03:53:09 overcloud-controller-1 haproxy[12]: Backup Server mysql/overcloud-controller-1.internalapi.localdomain is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 15ms. 0 active and 2 backup servers left. Running on backup. 0 sessions active, 0 requeued, 0 remaining in queue.
controller-1 haproxy[12]: Backup Server mysql/overcloud-controller-1.internalapi.localdomain is UP, reason: Layer7 check passed, code: 200, info: "OK", check duration: 15ms. 0 active and 3 backup servers online. Running on backup. 0 sessions requeued, 0 total in queue.
controller-1 haproxy[12]: Backup Server mysql/overcloud-controller-1.internalapi.localdomain is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 17ms. 0 active and 2 backup servers left. Running on backup. 0 sessions active, 0 requeued, 0 remaining in queue.
controller-1 haproxy[12]: Backup Server mysql/overcloud-controller-1.internalapi.localdomain is UP, reason: Layer7 check passed, code: 200, info: "OK", check duration: 16ms. 0 active and 3 backup servers online. Running on backup. 0 sessions requeued, 0 total in queue.
controller-1 haproxy[12]: Backup Server mysql/overcloud-controller-1.internalapi.localdomain is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 16ms. 0 active and 2 backup servers left. Running on backup. 0 sessions active, 0 requeued, 0 remaining in queue.
controller-1 haproxy[12]: Backup Server mysql/overcloud-controller-1.internalapi.localdomain is UP, reason: Layer7 check passed, code: 200, info: "OK", check duration: 14ms. 0 active and 3 backup servers online. Running on backup. 0 sessions requeued, 0 total in queue.
controller-1 haproxy[12]: Backup Server mysql/overcloud-controller-2.internalapi.localdomain is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 14ms. 0 active and 2 backup servers left. Running on backup. 0 sessions active, 0 requeued, 0 remaining in queue.
controller-1 haproxy[12]: Backup Server mysql/overcloud-controller-2.internalapi.localdomain is UP, reason: Layer7 check passed, code: 200, info: "OK", check duration: 14ms. 0 active and 3 backup servers online. Running on backup. 0 sessions requeued, 0 total in queue.
controller-1 haproxy[12]: Backup Server mysql/overcloud-controller-2.internalapi.localdomain is DOWN, reason: Layer4 timeout, check duration: 1000ms. 0 active and 2 backup servers left. Running on backup. 0 sessions active, 0 requeued, 0 remaining in queue.
controller-1 haproxy[12]: Backup Server mysql/overcloud-controller-2.internalapi.localdomain is UP, reason: Layer7 check passed, code: 20...
Jul 12 03:53:29 overcloud-
Jul 12 03:54:10 overcloud-
Jul 12 03:54:12 overcloud-
Jul 12 03:54:39 overcloud-
Jul 12 03:54:42 overcloud-
Jul 12 03:55:41 overcloud-
Jul 12 03:55:43 overcloud-
Jul 12 03:57:36 overcloud-
Jul 12 03:57:37 overcloud-
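To quantify the oscillation visible in the haproxy log lines above, a small script can count how often each backend server changes state. This is only a rough sketch: the function name `count_flaps` and the regular expression are illustrative assumptions based on the message format quoted in this report ("Server mysql/<host> is UP|DOWN"), not part of any TripleO or haproxy tooling.

```python
import re
from collections import defaultdict

# Assumed message shape, taken from the log excerpt in this report:
#   "... Backup Server mysql/<host> is DOWN, reason: ..."
TRANSITION = re.compile(
    r"Server (?P<backend>[^/\s]+)/(?P<server>\S+) is (?P<state>UP|DOWN)\b"
)


def count_flaps(lines):
    """Return {(backend, server): number of UP<->DOWN state changes}."""
    last_state = {}
    flaps = defaultdict(int)
    for line in lines:
        # finditer copes with several log entries fused onto one line.
        for match in TRANSITION.finditer(line):
            key = (match.group("backend"), match.group("server"))
            state = match.group("state")
            if key in last_state and last_state[key] != state:
                flaps[key] += 1
            last_state[key] = state
    return dict(flaps)


if __name__ == "__main__":
    # Shortened sample lines modelled on the excerpt above.
    sample = [
        "haproxy[12]: Backup Server mysql/overcloud-controller-1.internalapi.localdomain is DOWN, reason: Layer7 wrong status, code: 503",
        "haproxy[12]: Backup Server mysql/overcloud-controller-1.internalapi.localdomain is UP, reason: Layer7 check passed, code: 200",
        "haproxy[12]: Backup Server mysql/overcloud-controller-1.internalapi.localdomain is DOWN, reason: Layer7 wrong status, code: 503",
        "haproxy[12]: Backup Server mysql/overcloud-controller-2.internalapi.localdomain is DOWN, reason: Layer4 timeout",
    ]
    for (backend, server), changes in count_flaps(sample).items():
        print(f"{backend}/{server}: {changes} state change(s)")
```

For the compressed log linked above, the lines could be read with `gzip.open(path, "rt")` after downloading the file and passed straight to `count_flaps`. A server with many state changes inside a few minutes would be consistent with the DB connection errors Keystone reports.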