keepalived takes VIP and therefore the whole active L3 HA router down on SIGHUP for configuration reload

Bug #1818682 reported by Marcus Klein on 2019-03-05
Affects: neutron
Importance: Undecided
Assigned to: Unassigned

Bug Description

It sometimes happens, when we spawn VMs, that the requests from cloud-init inside the VM to the metadata agent are refused. This seems to be a timing problem, as it happens more often with fast-booting images than with slowly booting ones. The error message for the request is "Connection refused". A few seconds later, the exact same request works without any problems.

Our deployment was just upgraded from Ocata to Pike, and neutron-ns-metadata-proxy was replaced with haproxy. The problem has occurred since this change. Our setup uses Open vSwitch and self-service networks; the network nodes running the L3 agent, metadata agent and DHCP agent are separate from the compute and controller nodes. We install from the Ubuntu Cloud Archive repositories on Ubuntu 16.04 LTS.

15:57:12.780152 IP (tos 0x0, ttl 64, id 7253, offset 0, flags [DF], proto TCP (6), length 60)
    192.168.5.3.59378 > 169.254.169.254.http: Flags [S], cksum 0xebec (correct), seq 4230673254, win 29200, options [mss 1460,sackOK,TS val 2933213616 ecr 0,nop,wscale 7], length 0
15:57:12.780208 IP (tos 0x0, ttl 64, id 6932, offset 0, flags [DF], proto TCP (6), length 40)
    169.254.169.254.http > 192.168.5.3.59378: Flags [R.], cksum 0xbe52 (correct), seq 0, ack 4230673255, win 0, length 0

The TCP SYN packet to the metadata IP is answered with a TCP RST,ACK packet. The instance does not retry the connection to the metadata agent and is unusable from then on because the public SSH key is never injected.
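
For reference, this is the kind of capture that can be taken inside the router's network namespace on the network node; a minimal sketch, where the qrouter name is a placeholder for the actual router UUID:

# capture metadata traffic inside the router namespace (replace the router UUID)
ip netns exec qrouter-<router-uuid> tcpdump -nnvi any 'host 169.254.169.254 and tcp port 80'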

Brian Haley (brian-haley) wrote :

Does https://review.openstack.org/#/c/524406/ or its Pike version fix the issue? It would cause the iptables rules to be created earlier, so that if a packet arrived before haproxy was running it would just be dropped instead of an RST being sent.
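
For anyone checking this locally, a rough way to confirm that the metadata iptables rules are in place inside the router namespace; a sketch only, where the router UUID is a placeholder and the 9697 redirect port is the usual default but may differ per deployment:

# show the NAT redirect for the metadata IP and any related filter rules
ip netns exec qrouter-<router-uuid> iptables -t nat -S | grep 169.254.169.254
ip netns exec qrouter-<router-uuid> iptables -S | grep 9697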

Changed in neutron:
status: New → Incomplete
Marcus Klein (kleini76) wrote :

15:36:04.972762 IP (tos 0x0, ttl 64, id 19630, offset 0, flags [DF], proto TCP (6), length 60)
    192.168.5.10.42702 > 169.254.169.254.http: Flags [S], cksum 0x16b0 (correct), seq 2272764745, win 29200, options [mss 1460,sackOK,TS val 726116457 ecr 0,nop,wscale 7], length 0
15:36:04.972806 IP (tos 0x0, ttl 64, id 49307, offset 0, flags [DF], proto TCP (6), length 40)
    169.254.169.254.http > 192.168.5.10.42702: Flags [R.], cksum 0xc440 (correct), seq 0, ack 2272764746, win 0, length 0

I just reproduced the issue again.

The deployed version is 2:11.0.6-0ubuntu1~cloud0, and the code already includes this patch:
https://review.openstack.org/#/c/598980/

Marcus Klein (kleini76) wrote :

What causes this new connection to 169.254.169.254 to be refused? I don't see any REJECT rule in the iptables setup, only DROP. Is it haproxy that refuses the new connection?

If I manually change the IP address of the VM, requests to 169.254.169.254 are just dropped (their packets are dropped), but they are not rejected.
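
As far as I know, no REJECT rule is needed for a RST to be sent: if the SYN is redirected to a local port where nothing is listening, the kernel's TCP stack itself answers with a RST. A quick way to check for a listener inside the router namespace; a sketch, assuming the usual redirect to local port 9697 and a placeholder router UUID:

# is haproxy bound to the metadata proxy port inside the namespace?
ip netns exec qrouter-<router-uuid> ss -tlnp | grep -E '9697|haproxy'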

Marcus Klein (kleini76) wrote :

All requested information is provided. Setting the status back to New.

Changed in neutron:
status: Incomplete → New
Brian Haley (brian-haley) wrote :

So if I kill the haproxy for a network, instances will get:

$ wget 169.254.169.254
Connecting to 169.254.169.254 (169.254.169.254:80)
wget: can't connect to remote host (169.254.169.254): Connection refused

And that's with the iptables rules in place. I guess I'd expect that, since we've said to let the packet in but there is no one listening. Is this the problem you're seeing?

Marcus Klein (kleini76) wrote :

There are already some VMs running in that network, connected to the corresponding L3 router, and I see haproxy happily running. After launching a new VM, haproxy has a different PID. The same happens when a VM is destroyed.

root@network2:~# while (true); do ps afx | grep 48d5 | grep haproxy | wc -l; sleep 0.5; done
1
1
0
0
0
0
0
0
0
1
1

That means haproxy is not listening for at least 3.5 seconds. Nova tells me that launching a VM sometimes takes only 5 seconds. So it can happen that the VM is already running cloud-init while haproxy is not running within the L3 router's network namespace. Eeks!

Are there any possible solutions for that?

Marcus Klein (kleini76) wrote :

I used the same command when destroying the VM:

root@network2:~# while (true); do ps afx | grep 48d5 | grep haproxy | wc -l; sleep 0.5; done
1
1
0
0
0
0
0
0
0
3
1

There are 3 haproxy instances running for a short moment?

Brian Haley (brian-haley) wrote :

I don't see the issue you're describing: there is always only one haproxy running, and it's always the same one. Perhaps there is a fix in another area that needs a backport? I was testing on 18.04 with the master branch of neutron.

Marcus Klein (kleini76) wrote :

I reproduced it again, this time with more information about timing and the running processes.

The refused request to the metadata agent:

11:10:33.174494 IP (tos 0x0, ttl 64, id 9015, offset 0, flags [DF], proto TCP (6), length 60)
    192.168.5.3.38706 > 169.254.169.254.http: Flags [S], cksum 0x167a (correct), seq 2032783790, win 29200, options [mss 1460,sackOK,TS val 166508934 ecr 0,nop,wscale 7], length 0
11:10:33.174549 IP (tos 0x0, ttl 64, id 25829, offset 0, flags [DF], proto TCP (6), length 40)
    169.254.169.254.http > 192.168.5.3.38706: Flags [R.], cksum 0xb3cc (correct), seq 0, ack 2032783791, win 0, length 0

And my monitoring of running haproxy instances in the corresponding L3 router namespace:

root@network2:~# while (true); do date +%H:%M:%S.%N; ps afx | grep haproxy | grep 48d5a991-1327-49c9-88d9-e28b6f48c550; sleep 0.5; done
11:10:13.250165525
21719 ? Ss 0:00 haproxy -f /var/lib/neutron/ns-metadata-proxy/48d5a991-1327-49c9-88d9-e28b6f48c550.conf
11:10:13.806491785
21719 ? Ss 0:00 haproxy -f /var/lib/neutron/ns-metadata-proxy/48d5a991-1327-49c9-88d9-e28b6f48c550.conf
11:10:14.364991544
11:10:14.921073081
11:10:15.477189579
11:10:16.033293688
11:10:16.589044946
11:10:17.144677609
11:10:17.700802400
11:10:18.257267466
11:10:18.815026800
11:10:19.371615067
11:10:19.928728339
11:10:20.484549005
11:10:21.043642432
11:10:21.600246727
11:10:22.157465640
11:10:22.712933864
11:10:23.269805384
11:10:23.826738954
11:10:24.383053951
11:10:24.940974289
11:10:25.499760746
11:10:26.057371940
11:10:26.615289561
11:10:27.171769464
11:10:27.728147598
11:10:28.284904266
11:10:28.841412946
11:10:29.398557841
11:10:29.955469233
11:10:30.511146389
11:10:31.067153906
11:10:31.623029038
11:10:32.178155674
11:10:32.733818364
11:10:33.288982881
11:10:33.844176215
11:10:34.399436000
11:10:34.955462896
11:10:35.513774777
11:10:36.069739028
11:10:36.624940323
11:10:37.179853955
11:10:37.735817089
11:10:38.292526172
11:10:38.847372016
11:10:39.402413814
11:10:39.957459099
11:10:40.513074480
11:10:41.068534980
11:10:41.624116351
11:10:42.179488577
11:10:42.734747693
11:10:43.289827243
11:10:43.845511714
11:10:44.400984233
11:10:44.955548219
11:10:45.511576855
11:10:46.068453602
23804 ? S 0:00 \_ sudo /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf ip netns exec qrouter-48d5a991-1327-49c9-88d9-e28b6f48c550 haproxy -f /var/lib/neutron/ns-metadata-proxy/48d5a991-1327-49c9-88d9-e28b6f48c550.conf
23805 ? R 0:00 \_ /usr/bin/python /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf ip netns exec qrouter-48d5a991-1327-49c9-88d9-e28b6f48c550 haproxy -f /var/lib/neutron/ns-metadata-proxy/48d5a991-1327-49c9-88d9-e28b6f48c550.conf
11:10:46.622401965
23813 ? Ss 0:00 haproxy -f /var/lib/neutron/ns-metadata-proxy/48d5a991-1327-49c9-88d9-e28b6f48c550.conf
11:10:47.179850306
23813 ? Ss 0:00 haproxy -f /var/lib/neutron/ns-metadata-proxy/48d5a991-1327-49c9-88d9-e28b6f48c550.conf

And this also clarifies why the count in my previous comment can be higher than 1: these are the maintenance commands for stopping and starting haproxy.

And extracts from the Nova log:
2019-03-08 11:09:58.134 11477 INFO nova....


Marcus Klein (kleini76) wrote :

What can cause this large time frame between deleting the haproxy instance and recreating it again?

Marcus Klein (kleini76) wrote :

Or why is haproxy stopped and restarted at all?

Marcus Klein (kleini76) wrote :

Every VM stop and start causes haproxy to be restarted. Most of the time haproxy is not running for less than 3 seconds, but sometimes it is up to 30 seconds, as documented above.

I don't see any difference in the configuration of haproxy, so it does not make any sense to restart it. Why is it restarted? And what makes this restart sometimes take up to 30 seconds? I don't see any significant load or iowait on the network or controller nodes during that time.

We will continue our rolling upgrade to Queens and Rocky and pray that the problem goes away.

Brian Haley (brian-haley) wrote :

As I mentioned earlier, I do not see haproxy getting restarted when I create/delete instances. I also haven't found (yet?) any change that fixed such a condition. I don't know how else I can help until you can reproduce it on something later, like Rocky.

Changed in neutron:
status: New → Incomplete
Marcus Klein (kleini76) wrote :

I started debugging this issue in our setup once again. Some more insights came up when looking into the logs again. When a VM is spawned, the following two messages appear in neutron-l3-agent.log:

2019-04-10 14:22:31.503 30752 INFO neutron.agent.l3.ha [-] Router 48d5a991-1327-49c9-88d9-e28b6f48c550 transitioned to backup
2019-04-10 14:22:37.492 30752 INFO neutron.agent.l3.ha [-] Router 48d5a991-1327-49c9-88d9-e28b6f48c550 transitioned to master

This is visible through neutron API, too:

Every 0.1s: neutron l3-agent-list-hosting-router 48d5a991-1327-49c9-88d9-e28b6f48c550 marcus: Wed Apr 10 14:22:35 2019

neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
+--------------------------------------+----------+----------------+-------+----------+
| id | host | admin_state_up | alive | ha_state |
+--------------------------------------+----------+----------------+-------+----------+
| 14143f12-1732-4596-8423-42c11d5f73eb | network2 | True | :-) | standby |
| 1ba861cb-b049-47c1-9470-9050975ec5d0 | network1 | True | :-) | standby |
+--------------------------------------+----------+----------------+-------+----------+

The L3 router was previously active on network2 and then appears as active on network2 again.

My next step will now be to investigate why this transition of the L3 router into the backup state happens.
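
A rough way to pull all such transitions for this router out of the agent log; a sketch, assuming the default Ubuntu log location:

# list recorded HA state transitions for this router
grep "Router 48d5a991-1327-49c9-88d9-e28b6f48c550 transitioned" /var/log/neutron/neutron-l3-agent.log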

Marcus Klein (kleini76) on 2019-04-11
summary: - HAproxy for metadata refuses connection from VM cloud-init
+ keepalived takes VIP and therefore the whole active L3 HA router down on
+ SIGHUP for configuration reload
Marcus Klein (kleini76) wrote :

Further debugging finally puts the cause of the whole issue on the table:

This issue is caused by associating or disassociating floating IPs. That operation extends the VIP exclude list in the keepalived.conf of the L3 HA router with the (dis)associated floating IP, and keepalived then receives a SIGHUP to reload its configuration. On the SIGHUP, keepalived immediately drops the monitoring VIP from the HA interface of the currently active L3 router.
neutron-keepalived-state-change notices the dropped VIP and immediately takes down everything active for the L3 router, including the haproxy for the metadata agent.
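
For context, the keepalived.conf that the L3 agent generates (typically under /var/lib/neutron/ha_confs/<router-id>/keepalived.conf) looks roughly like the sketch below; interface names and addresses are placeholders, not taken from this deployment. The floating IP ends up in the virtual_ipaddress_excluded block, so every association or disassociation rewrites the file and triggers the SIGHUP reload:

vrrp_instance VR_1 {
    state BACKUP
    interface ha-1234abcd-56
    virtual_router_id 1
    priority 50
    nopreempt
    advert_int 2
    virtual_ipaddress {
        # the monitoring VIP on the HA interface
        169.254.0.1/24 dev ha-1234abcd-56
    }
    virtual_ipaddress_excluded {
        # the router's internal gateway IP and, when associated, the floating IP
        10.0.0.1/24 dev qr-abcd1234-ef
        203.0.113.10/32 dev qg-9876fedc-ba
    }
}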

Changed in neutron:
status: Incomplete → New
Marcus Klein (kleini76) wrote :

Ubuntu xenial LTS comes with keepalived 1.2.19
xenial-updates contains keepalived 1.2.24

The problem is reproducible with keepalived 1.2.24. It is not reproducible with keepalived 1.2.19.
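
A quick way to confirm which keepalived build the network nodes are actually running; a sketch using the standard Ubuntu packaging tools:

# installed and candidate package versions, plus the running binary's own report
dpkg -l keepalived | tail -n 1
apt-cache policy keepalived
keepalived --version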

Slawek Kaplonski (slaweq) wrote :

I remember we had some issues with version 1.2.24 of keepalived in the past. See https://bugs.launchpad.net/neutron/+bug/1788185

So, according to your last comment, I will mark this bug as Invalid in neutron, as it looks like an issue in one specific keepalived version.

Changed in neutron:
status: New → Invalid
Marcus Klein (kleini76) wrote :

Where can I find information about compatible versions of keepalived and all the other tools used by Neutron?

Slawek Kaplonski (slaweq) wrote :

@Marcus: I don't think we have any exact list of such software.
