Comment 2 for bug 1955478

Slawek Kaplonski (slaweq) wrote :

From the logstash query, it seems that this issue isn't as common as it looked at first glance. Most of the failures there are in one patch and seem to be related to that change.
I investigated the logs from the failed job https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_141/818844/2/check/neutron-tempest-plugin-scenario-openvswitch/1415d28/testr_results.html but I didn't find anything really wrong there.

In the qrouter namespace, the proper iptables rules were configured:

2021-12-21 04:47:56,864 112502 DEBUG [neutron_tempest_plugin.common.shell] Command 'sudo ip netns exec qrouter-57c936f4-06e9-4e93-968f-dbef81872b9e iptables-save' succeeded:
stderr:

stdout:
# Generated by iptables-save v1.8.4 on Tue Dec 21 04:47:56 2021
*raw
:PREROUTING ACCEPT [1242:55018]
:OUTPUT ACCEPT [986:48740]
:neutron-l3-agent-OUTPUT - [0:0]
:neutron-l3-agent-PREROUTING - [0:0]
-A PREROUTING -j neutron-l3-agent-PREROUTING
-A OUTPUT -j neutron-l3-agent-OUTPUT
COMMIT
# Completed on Tue Dec 21 04:47:56 2021
# Generated by iptables-save v1.8.4 on Tue Dec 21 04:47:56 2021
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [67:4020]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [31:1860]
:neutron-l3-agent-OUTPUT - [0:0]
:neutron-l3-agent-POSTROUTING - [0:0]
:neutron-l3-agent-PREROUTING - [0:0]
:neutron-l3-agent-float-snat - [0:0]
:neutron-l3-agent-snat - [0:0]
:neutron-postrouting-bottom - [0:0]
-A PREROUTING -j neutron-l3-agent-PREROUTING
-A OUTPUT -j neutron-l3-agent-OUTPUT
-A POSTROUTING -j neutron-l3-agent-POSTROUTING
-A POSTROUTING -j neutron-postrouting-bottom
-A neutron-l3-agent-OUTPUT -d 172.24.5.182/32 -j DNAT --to-destination 10.10.210.29
-A neutron-l3-agent-POSTROUTING ! -o qg-cb0064f3-83 -m conntrack ! --ctstate DNAT -j ACCEPT
-A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697
-A neutron-l3-agent-PREROUTING -d 172.24.5.182/32 -j DNAT --to-destination 10.10.210.29
-A neutron-l3-agent-float-snat -s 10.10.210.29/32 -j SNAT --to-source 172.24.5.182 --random-fully
-A neutron-l3-agent-snat -j neutron-l3-agent-float-snat
-A neutron-l3-agent-snat -o qg-cb0064f3-83 -j SNAT --to-source 172.24.5.118 --random-fully
-A neutron-l3-agent-snat -m mark ! --mark 0x2/0xffff -m conntrack --ctstate DNAT -j SNAT --to-source 172.24.5.118 --random-fully
-A neutron-postrouting-bottom -m comment --comment "Perform source NAT on outgoing traffic." -j neutron-l3-agent-snat
COMMIT
# Completed on Tue Dec 21 04:47:56 2021
# Generated by iptables-save v1.8.4 on Tue Dec 21 04:47:56 2021
*mangle
:PREROUTING ACCEPT [1217:54018]
:INPUT ACCEPT [1050:44138]
:FORWARD ACCEPT [160:9600]
:OUTPUT ACCEPT [967:47980]
:POSTROUTING ACCEPT [1127:57580]
:neutron-l3-agent-FORWARD - [0:0]
:neutron-l3-agent-INPUT - [0:0]
:neutron-l3-agent-OUTPUT - [0:0]
:neutron-l3-agent-POSTROUTING - [0:0]
:neutron-l3-agent-PREROUTING - [0:0]
:neutron-l3-agent-float-snat - [0:0]
:neutron-l3-agent-floatingip - [0:0]
:neutron-l3-agent-mark - [0:0]
:neutron-l3-agent-scope - [0:0]
-A PREROUTING -j neutron-l3-agent-PREROUTING
-A INPUT -j neutron-l3-agent-INPUT
-A FORWARD -j neutron-l3-agent-FORWARD
-A OUTPUT -j neutron-l3-agent-OUTPUT
-A POSTROUTING -j neutron-l3-agent-POSTROUTING
-A neutron-l3-agent-POSTROUTING -o qg-cb0064f3-83 -m connmark --mark 0x0/0xffff0000 -j CONNMARK --save-mark --nfmask 0xffff0000 --ctmask 0xffff0000
-A neutron-l3-agent-PREROUTING -j neutron-l3-agent-mark
-A neutron-l3-agent-PREROUTING -j neutron-l3-agent-scope
-A neutron-l3-agent-PREROUTING -m connmark ! --mark 0x0/0xffff0000 -j CONNMARK --restore-mark --nfmask 0xffff0000 --ctmask 0xffff0000
-A neutron-l3-agent-PREROUTING -j neutron-l3-agent-floatingip
-A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j MARK --set-xmark 0x1/0xffff
-A neutron-l3-agent-float-snat -m connmark --mark 0x0/0xffff0000 -j CONNMARK --save-mark --nfmask 0xffff0000 --ctmask 0xffff0000
-A neutron-l3-agent-mark -i qg-cb0064f3-83 -j MARK --set-xmark 0x2/0xffff
-A neutron-l3-agent-scope -i qr-bf6710d4-5d -j MARK --set-xmark 0x4000000/0xffff0000
-A neutron-l3-agent-scope -i qr-e14b83e8-68 -j MARK --set-xmark 0x4000000/0xffff0000
-A neutron-l3-agent-scope -i qg-cb0064f3-83 -j MARK --set-xmark 0x4000000/0xffff0000
COMMIT
# Completed on Tue Dec 21 04:47:56 2021
# Generated by iptables-save v1.8.4 on Tue Dec 21 04:47:56 2021
*filter
:INPUT ACCEPT [972:39458]
:FORWARD ACCEPT [160:9600]
:OUTPUT ACCEPT [967:47980]
:neutron-filter-top - [0:0]
:neutron-l3-agent-FORWARD - [0:0]
:neutron-l3-agent-INPUT - [0:0]
:neutron-l3-agent-OUTPUT - [0:0]
:neutron-l3-agent-local - [0:0]
:neutron-l3-agent-scope - [0:0]
-A INPUT -j neutron-l3-agent-INPUT
-A FORWARD -j neutron-filter-top
-A FORWARD -j neutron-l3-agent-FORWARD
-A OUTPUT -j neutron-filter-top
-A OUTPUT -j neutron-l3-agent-OUTPUT
-A neutron-filter-top -j neutron-l3-agent-local
-A neutron-l3-agent-FORWARD -j neutron-l3-agent-scope
-A neutron-l3-agent-INPUT -m mark --mark 0x1/0xffff -j ACCEPT
-A neutron-l3-agent-INPUT -p tcp -m tcp --dport 9697 -j DROP
-A neutron-l3-agent-scope -o qr-bf6710d4-5d -m mark ! --mark 0x4000000/0xffff0000 -j DROP
-A neutron-l3-agent-scope -o qr-e14b83e8-68 -m mark ! --mark 0x4000000/0xffff0000 -j DROP
COMMIT
# Completed on Tue Dec 21 04:47:56 2021
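
For anyone repeating this check on other failed runs, here is a minimal sketch that pulls the metadata REDIRECT port out of a saved iptables-save dump. The function names and the SAMPLE excerpt are only illustrative (the rule lines are copied from the dump above); this is not part of neutron or the tempest plugin.

```python
import re

METADATA_IP = "169.254.169.254/32"


def split_tables(dump):
    """Group the '-A' rule lines of an iptables-save dump by table name."""
    tables = {}
    current = None
    for line in dump.splitlines():
        line = line.strip()
        if line.startswith("*"):
            # A line such as '*nat' opens a new table section.
            current = line[1:]
            tables[current] = []
        elif line == "COMMIT":
            current = None
        elif current is not None and line.startswith("-A "):
            tables[current].append(line)
    return tables


def metadata_redirect_ports(dump):
    """Return the ports that nat REDIRECT rules send metadata traffic to."""
    ports = []
    for rule in split_tables(dump).get("nat", []):
        if f"-d {METADATA_IP}" in rule and "-j REDIRECT" in rule:
            match = re.search(r"--to-ports (\d+)", rule)
            if match:
                ports.append(int(match.group(1)))
    return ports


# Shortened excerpt of the dump above, kept only for illustration.
SAMPLE = """\
*nat
:PREROUTING ACCEPT [0:0]
-A PREROUTING -j neutron-l3-agent-PREROUTING
-A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697
-A neutron-l3-agent-PREROUTING -d 172.24.5.182/32 -j DNAT --to-destination 10.10.210.29
COMMIT
*mangle
-A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j MARK --set-xmark 0x1/0xffff
COMMIT
"""

if __name__ == "__main__":
    print(metadata_redirect_ports(SAMPLE))
```

An empty result would mean the redirect rule is missing from the nat table, which was not the case in this job.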

haproxy was started and listening on port 9697, as it should be:

2021-12-21 04:47:56,905 112502 DEBUG [neutron_tempest_plugin.common.shell] Command 'sudo ip netns exec qrouter-57c936f4-06e9-4e93-968f-dbef81872b9e netstat -nlp' succeeded:
stderr:

stdout:
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp6 0 0 :::9697 :::* LISTEN 130841/haproxy
raw 0 0 0.0.0.0:112 0.0.0.0:* 7 130587/keepalived
raw 0 0 0.0.0.0:112 0.0.0.0:* 7 130587/keepalived
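
The same listener check can be scripted against captured `netstat -nlp` output. This is a rough sketch that assumes the whitespace-separated column layout shown above; `listeners_on_port` is a hypothetical helper, not an existing neutron or tempest function.

```python
def listeners_on_port(netstat_output, port):
    """Return program names with a TCP socket in LISTEN state on the port."""
    programs = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: proto, recv-q, send-q, local, foreign, state, pid/program
        if len(fields) < 7 or not fields[0].startswith("tcp"):
            continue
        local_addr, state, pid_program = fields[3], fields[5], fields[6]
        if state != "LISTEN":
            continue
        # The port follows the last colon, for both IPv4 and IPv6 addresses.
        if local_addr.rsplit(":", 1)[-1] == str(port):
            programs.append(pid_program.split("/", 1)[-1])
    return programs


# Lines copied from the netstat output above.
SAMPLE = """\
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp6 0 0 :::9697 :::* LISTEN 130841/haproxy
raw 0 0 0.0.0.0:112 0.0.0.0:* 7 130587/keepalived
"""

if __name__ == "__main__":
    print(listeners_on_port(SAMPLE, 9697))
```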

But in the neutron-metadata-agent logs I don't see any requests to the metadata server from the VM's IP 10.10.210.29. So those packets were dropped somewhere, but I really don't know where :/
And I doubt it will be possible to understand the issue based only on the job logs.