Create router failed with The server didn't respond in time.: 504 Gateway Time-out
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
tripleo | Fix Released | High | Unassigned |
Bug Description
In FS02, the tempest pre-resource-creation tasks are failing at the "Create router" step [1]:
```
2021-07-21 00:26:30.827471 | primary | TASK [os_tempest : Create router] *******
2021-07-21 00:26:30.827706 | primary | Wednesday 21 July 2021 00:26:30 +0000 (0:02:13.700) 1:59:31.652 ********
2021-07-21 00:28:33.108393 | primary | FAILED - RETRYING: Create router (5 retries left).
2021-07-21 00:30:45.819402 | primary | FAILED - RETRYING: Create router (4 retries left).
2021-07-21 00:32:58.284105 | primary | FAILED - RETRYING: Create router (3 retries left).
2021-07-21 00:35:10.658736 | primary | FAILED - RETRYING: Create router (2 retries left).
2021-07-21 00:37:23.397917 | primary | FAILED - RETRYING: Create router (1 retries left).
2021-07-21 00:39:35.447171 | primary | fatal: [undercloud -> undercloud]: FAILED! => {
2021-07-21 00:39:35.447471 | primary | "attempts": 5,
2021-07-21 00:39:35.447492 | primary | "changed": false,
2021-07-21 00:39:35.447502 | primary | "extra_data": {
2021-07-21 00:39:35.447512 | primary | "data": null,
2021-07-21 00:39:35.447521 | primary | "details": "The server didn't respond in time.: 504 Gateway Time-out",
2021-07-21 00:39:35.447531 | primary | "response": "<html>
2021-07-21 00:39:35.447541 | primary | }
2021-07-21 00:39:35.447551 | primary | }
2021-07-21 00:39:35.447598 | primary |
2021-07-21 00:39:35.447607 | primary | MSG:
2021-07-21 00:39:35.447616 | primary |
2021-07-21 00:39:35.452745 | primary | Error updating router 422d9b28-
```
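As the timestamps above show, the role retries the create five times with roughly a two-minute delay between attempts before giving up. A minimal sketch of that retry pattern, with a hypothetical `create_router()` callable standing in for the real os_tempest/openstacksdk call:

```python
import time

def retry(task, attempts=5, delay=120):
    """Retry `task` up to `attempts` times, sleeping `delay` seconds
    between failures -- mirroring the task's retries/delay behavior."""
    for attempt in range(1, attempts + 1):
        try:
            return task()
        except Exception:  # e.g. the 504 from the Neutron API
            if attempt == attempts:
                raise
            print(f"FAILED - RETRYING: ({attempts - attempt} retries left).")
            time.sleep(delay)

# Hypothetical stand-in for the real router-create call:
# retry(lambda: conn.network.create_router(name="tempest-router"))
```

Since every attempt hits the same wedged backend, all five fail the same way and the final 504 is surfaced as the fatal task result.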
Looking at the neutron server logs, we see two errors:
```
ERROR oslo.messaging.
```
and
```
2021-07-21 00:16:27.712 30 DEBUG futurist.periodics [req-144966c1-
2021-07-21 00:16:27.718 30 ERROR oslo_db.
[SQL: SELECT 1]
(Background on this error at: http://
2021-07-21 00:16:27.718 30 ERROR oslo_db.
```
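The `SELECT 1` statement in the traceback is the liveness ping that SQLAlchemy (used by oslo.db) issues before reusing a pooled database connection; its failure means the database backend stopped answering in time, not that a real query was malformed. A minimal sketch of the same ping mechanism with plain SQLAlchemy, using an in-memory SQLite database as a stand-in for the overcloud's MySQL:

```python
import sqlalchemy

# pool_pre_ping=True makes the connection pool emit "SELECT 1" before
# handing out a connection; a stale connection is then discarded and
# replaced instead of failing mid-request.
engine = sqlalchemy.create_engine("sqlite://", pool_pre_ping=True)

with engine.connect() as conn:
    alive = conn.execute(sqlalchemy.text("SELECT 1")).scalar()
```

When even the ping itself times out, as in the log above, the periodic task has no healthy connection to fall back on and the error propagates.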
It might be related to RabbitMQ. Filing this bug to track it further.
Changed in tripleo:
milestone: xena-2 → xena-3

Changed in tripleo:
status: Incomplete → In Progress
Just the usual podman timing out on all operations:
```
michele@holtby /tmp/z grep -ir -e 'exceeded timeout' -e 'timed out' journal.txt |wc -l
90
```
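The same count can be reproduced without grep; a small sketch using the patterns from the command above (the `journal.txt` path is from that shell session):

```python
import re

def count_timeouts(lines):
    """Count journal lines matching the same case-insensitive patterns
    as the grep invocation above."""
    pattern = re.compile(r"exceeded timeout|timed out", re.IGNORECASE)
    return sum(1 for line in lines if pattern.search(line))

# Usage against the captured journal:
# with open("journal.txt") as f:
#     print(count_timeouts(f))
```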
```
Jul 21 00:12:14 overcloud-controller-0 pacemaker-controld[26220]: notice: High CPU load detected: 8.280000
Jul 21 00:12:44 overcloud-controller-0 pacemaker-controld[26220]: notice: High CPU load detected: 8.270000
Jul 21 00:13:14 overcloud-controller-0 pacemaker-controld[26220]: notice: High CPU load detected: 7.850000
Jul 21 00:13:48 overcloud-controller-0 pacemaker-execd[26217]: warning: rabbitmq-bundle-podman-0_monitor_60000 process (PID 173756) timed out
Jul 21 00:13:48 overcloud-controller-0 pacemaker-execd[26217]: warning: rabbitmq-bundle-podman-0_monitor_60000[173756] timed out after 120000ms
Jul 21 00:13:48 overcloud-controller-0 pacemaker-controld[26220]: error: Result of monitor operation for rabbitmq-bundle-podman-0 on overcloud-controller-0: Timed Out
Jul 21 00:13:48 overcloud-controller-0 pacemaker-controld[26220]: notice: Transition 56 action 10 (rabbitmq-bundle-podman-0_monitor_60000 on overcloud-controller-0): expected 'ok' but got 'error'
Jul 21 00:13:48 overcloud-controller-0 pacemaker-controld[26220]: notice: State transition S_IDLE -> S_POLICY_ENGINE
Jul 21 00:13:48 overcloud-controller-0 pacemaker-attrd[26218]: notice: Setting fail-count-rabbitmq-bundle-podman-0#monitor_60000[overcloud-controller-0]: (unset) -> 1
Jul 21 00:13:48 overcloud-controller-0 pacemaker-attrd[26218]: notice: Setting last-failure-rabbitmq-bundle-podman-0#monitor_60000[overcloud-controller-0]: (unset) -> 1626826428
```
```
Jul 21 00:13:49 overcloud-controller-0 pacemaker-attrd[26218]: notice: Removing all rabbitmq-bundle-0 attributes for peer overcloud-controller-0
Jul 21 00:13:49 overcloud-controller-0 pacemaker-controld[26220]: notice: Result of stop operation for rabbitmq-bundle-0 on overcloud-controller-0: ok
Jul 21 00:13:49 overcloud-controller-0 pacemaker-fenced[26216]: notice: Node rabbitmq-bundle-0 state is now lost
Jul 21 00:13:49 overcloud-controller-0 pacemaker-controld[26220]: notice: Initiating stop operation rabbitmq-bundle-podman-0_stop_0 locally on overcloud-controller-0
Jul 21 00:13:49 overcloud-controller-0 pacemaker-controld[26220]: notice: Requesting local execution of stop operation for rabbitmq-bundle-podman-0 on overcloud-controller-0
Jul 21 00:13:50 overcloud-controller-0 pacemaker-execd[26217]: warning: redis-bundle-podman-0_monitor_60000 process (PID 173952) timed out
Jul 21 00:13:50 overcloud-controller-0 pacemaker-execd[26217]: warning: redis-bundle-podman-0_monitor_60000[173952] timed out after 120000ms
Jul 21 00:13:50 overcloud-controller-0 pacemaker-controld[26220]: error: Result of monitor operation for redis-bundle-podman-0 on overcloud-controller-0: Timed Out
Jul 21 00:13:50 overcloud-controller-0 pacemaker-controld[26220]: notice: Transition 75 aborted by operation redis-bundle-podman-0_monitor_60000 'create' on overcloud-controller-0: Change in recurring result
Jul 21 00:13:50 overcloud-controller-0 pacemaker-controld[26220]: notice: Transition 56 action 12 (redis-bundle-podman-0_monitor_60000 on overcloud-controller-0): expected 'ok' but got 'error'
Jul ...
```