centos8 standalone-upgrade-ussuri fails tempest ping router IP
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
tripleo | Fix Released | Critical | Alex Schultz |
Bug Description
At [1] the tripleo-ci-centos-8-standalone-upgrade-ussuri job fails tempest with:
2020-09-15 15:02:07.057448 | primary | TASK [os_tempest : Ping router ip address] *******
2020-09-15 15:02:07.057502 | primary | Tuesday 15 September 2020 15:02:07 +0000 (0:00:00.065) 1:10:33.351 *****
2020-09-15 15:02:10.745010 | primary | FAILED - RETRYING: Ping router ip address (5 retries left).
2020-09-15 15:02:24.365896 | primary | FAILED - RETRYING: Ping router ip address (4 retries left).
2020-09-15 15:02:38.005903 | primary | FAILED - RETRYING: Ping router ip address (3 retries left).
2020-09-15 15:02:51.638488 | primary | FAILED - RETRYING: Ping router ip address (2 retries left).
2020-09-15 15:03:05.266932 | primary | FAILED - RETRYING: Ping router ip address (1 retries left).
2020-09-15 15:03:18.902046 | primary | fatal: [undercloud]: FAILED! => {
2020-09-15 15:03:18.902122 | primary | "attempts": 5,
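The task above retries the ping a fixed number of times with a delay between attempts before giving up. A minimal sketch of that retry pattern (the retry count of 5 matches the log above; the delay value, function name, and injectable `probe` parameter are assumptions for illustration, not taken from the os_tempest role):

```python
import subprocess
import time

def ping_with_retries(ip, retries=5, delay=10, probe=None):
    """Retry an ICMP probe up to `retries` times, sleeping `delay` seconds
    between failed attempts. Returns (succeeded, attempts_used).

    `probe` defaults to a real `ping` invocation; tests can inject a fake.
    """
    if probe is None:
        probe = lambda addr: subprocess.call(
            ["ping", "-c", "1", "-W", "2", addr],
            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL) == 0
    attempts = 0
    for attempt in range(retries):
        attempts += 1
        if probe(ip):
            return True, attempts
        if attempt < retries - 1:
            time.sleep(delay)  # wait before the next "FAILED - RETRYING" round
    return False, attempts
```

With an unreachable router IP this exhausts all five attempts, matching the "5 retries left" through "1 retries left" sequence in the log.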
Poking a little further I see a few errors related to networking and mysql, but I am not sure which is the original/root cause.
At [4] (neutron) and [5] (keystone) container logs we see the following many times:
2020-09-11 07:41:16.044 139 ERROR oslo_db.
At [6] ovn_controller.log there is the following repeated many times:
At [7] container-
At [8] pacemaker log we have the following a few times:
Sep 15 14:12:32 standalone.
Sep 15 14:12:32 standalone.
[1] https:/
[2] https:/
[3] https:/
[4] https:/
[5] https:/
[6] https:/
[7] https:/
[8] https:/
tags: added: promotion-blocker
Changed in tripleo:
assignee: nobody → Sergii Golovatiuk (sgolovatiuk)
Changed in tripleo:
assignee: Sergii Golovatiuk (sgolovatiuk) → Alex Schultz (alex-schultz)
Changed in tripleo:
assignee: Alex Schultz (alex-schultz) → nobody
Changed in tripleo:
status: Fix Released → Triaged
tags: added: train-backport-potential ussuri-backport-potential
Changed in tripleo:
milestone: victoria-3 → wallaby-1
Changed in tripleo:
milestone: wallaby-1 → wallaby-2
Changed in tripleo:
milestone: wallaby-2 → wallaby-3
Changed in tripleo:
milestone: wallaby-3 → wallaby-rc1
Changed in tripleo:
milestone: wallaby-rc1 → xena-1
Changed in tripleo:
milestone: xena-1 → xena-2
Changed in tripleo:
milestone: xena-2 → xena-3
Spent some more time digging at logs. I am not clear yet whether this is an issue with HA/mysql or with OVS/OVN networking; I am leaning towards networking at the moment.
I'll reach out to the network and pidone squads to check here. Adding pointers to some error messages in the logs I came across just now:
* https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_ee1/739457/27/check/tripleo-ci-centos-8-standalone-upgrade-ussuri/ee18aa6/logs/undercloud/var/log/containers/mysql/mysqld.log
* 2020-09-15 14:32:08 0 [Note] InnoDB: Starting shutdown...
* 2020-09-15 14:32:09 0 [Note] /usr/libexec/mysqld: Shutdown complete
* 2020-09-15 14:32:30 0 [Note] WSREP: Found saved state: cebd6089-f754-11ea-ac23-9b5df17a204a:8702, safe_to_bootstrap: 1
* 2020-09-15 14:32:30 0 [Note] /usr/libexec/mysqld: ready for connections.
  Version: '10.3.17-MariaDB' socket: '/var/lib/mysql/mysql.sock' port: 3306 MariaDB Server
* 2020-09-15 14:32:31 0 [Note] InnoDB: Buffer pool(s) load completed at 200915 14:32:31
* 2020-09-15 14:38:29 259 [Warning] Aborted connection 259 to db: 'nova_api' user: 'nova_api' host: '192.168.24.1' (Got an error reading communication packets)
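The WSREP "Found saved state" line is Galera reading back its on-disk `grastate.dat` (cluster UUID, last committed seqno, and whether this node may safely bootstrap a new cluster). A small illustrative parser for that file format (the sample values are copied from the log line above; the helper itself is hypothetical, not TripleO or Galera code):

```python
def parse_grastate(text):
    """Parse key/value fields from a Galera grastate.dat file (illustrative)."""
    state = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition(":")
        state[key.strip()] = value.strip()
    return state

# Sample contents matching the saved state reported in the mysqld log above.
sample = """\
# GALERA saved state
version: 2.1
uuid:    cebd6089-f754-11ea-ac23-9b5df17a204a
seqno:   8702
safe_to_bootstrap: 1
"""
state = parse_grastate(sample)
# safe_to_bootstrap == "1" means this node is allowed to bootstrap the cluster,
# which is consistent with the clean "Shutdown complete" seen just before.
```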
* https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_ee1/739457/27/check/tripleo-ci-centos-8-standalone-upgrade-ussuri/ee18aa6/logs/undercloud/var/log/containers/openvswitch/ovn-controller.log
* 2020-09-15T14:41:08.575Z|00051|lflow|WARN|Dropped 19 log messages in last 1622 seconds (most recently, 1607 seconds ago) due to excessive rate
* 2020-09-15T14:50:41.820Z|00004|fatal_signal(ovn_pinctrl0)|WARN|terminating with signal 15 (Terminated)
* https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_ee1/739457/27/check/tripleo-ci-centos-8-standalone-upgrade-ussuri/ee18aa6/logs/undercloud/var/log/containers/openvswitch/ovsdb-server-sb.log
* 2020-09-15T13:16:17.971Z|00002|ovsdb_server|INFO|ovsdb-server (Open vSwitch) 2.12.0
* 2020-09-15T13:16:19.947Z|00005|reconnect|WARN|unix#6: connection dropped (Connection reset by peer)
* 2020-09-15T14:12:32.939Z|00005|reconnect|WARN|unix#0: connection dropped (Broken pipe)
* 2020-09-15T14:59:56.235Z|00002|daemon_unix(monitor)|INFO|pid 152 died, exit status 0, exiting
* https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_ee1/739457/27/check/tripleo-ci-centos-8-standalone-upgrade-ussuri/ee18aa6/logs/undercloud/var/log/containers/stdouts/ovn-dbs-bundle.log
* 2020-09-15T13:16:18.332804574+00:00 stderr F (operation_finished) notice: ovndb_servers_start_0:48:stderr [ ovn-nbctl: transaction error: {"details":"insert operation not allowed when database server is in read only mode","error":"not allowed"} ]
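That last `ovn-nbctl` failure is the OVN northbound database rejecting a write because that ovsdb-server replica is in read-only (non-leader) mode; the error payload is JSON embedded in stderr. A hypothetical helper that classifies such lines, e.g. to decide whether a retry against the active server is worthwhile (the function name and retry idea are assumptions; only the error text comes from the log above):

```python
import json

def is_read_only_error(stderr_line):
    """Return True if an ovn-nbctl stderr line is an OVSDB read-only
    transaction rejection (illustrative helper, not resource-agent code)."""
    marker = "transaction error: "
    idx = stderr_line.find(marker)
    if idx == -1:
        return False
    # Extract the JSON object, dropping the trailing " ]" from the log framing.
    payload = stderr_line[idx + len(marker):].strip(" []")
    try:
        err = json.loads(payload)
    except json.JSONDecodeError:
        return False
    return err.get("error") == "not allowed" and "read only" in err.get("details", "")
```

Under this classification, the ovndb_servers_start_0 line above would be flagged as a read-only rejection rather than a hard schema or connectivity error.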