Adding a random lease_time value for dhcp-agent in large scale environment
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
neutron | Won't Fix | Wishlist | siyingchun | | |
Bug Description
In our large-scale environment, we sometimes found that DHCP service cannot be guaranteed when a large number of new instances boot at the same time. The leases of all these instances also expire simultaneously, which causes a temporary burst of DHCP broadcast traffic on the network.
According to dhcp-agent, it's simply dealt with by the key word "dhcp_lease_
Our team addressed this by adding another option, dhcp_lease_random, which introduces a random component. dnsmasq adds this component to the base value when the DHCP server gives a client its actual lease time.
To compute it, we apply the modulo (%) operator to part of the network_id, with dhcp_lease_random as the modulus.
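For readers who want to see the arithmetic, here is a minimal sketch of the idea rather than the actual patch. The function names, the choice of the first 8 hex digits of the network_id as "part of the network_id", and the base lease value of 86400 seconds are all assumptions made for illustration.

```python
def lease_offset(network_id: str, dhcp_lease_random: int) -> int:
    """Return a per-network offset in the range [0, dhcp_lease_random)."""
    if dhcp_lease_random <= 0:
        # Treat a non-positive modulus as "feature disabled" (assumption).
        return 0
    # Assumption: "part of the network_id" means its first 8 hex digits.
    part = int(network_id.replace("-", "")[:8], 16)
    return part % dhcp_lease_random


def effective_lease_time(base_lease: int, network_id: str,
                         dhcp_lease_random: int) -> int:
    """Base lease time plus the per-network offset, in seconds."""
    return base_lease + lease_offset(network_id, dhcp_lease_random)


if __name__ == "__main__":
    net = "12345678-1234-5678-1234-567812345678"  # hypothetical network UUID
    print(effective_lease_time(86400, net, 600))
```

With this approach the offset is deterministic for a given network, so it stays the same across agent restarts, but it also means every port on the same network receives the same offset; the spread is between networks, not between instances.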
* conditions:
To reproduce this, you need a large-scale environment hosting roughly 300 or more VMs, and you need to create or delete them at the same time. Alternatively, create them and observe the traffic once the lease time has elapsed.
* Version:
OpenStack Newton, deployed with Fuel 10.0
Ubuntu 16.04.1 LTS, running kernel 4.4.0-57-generic
Neutron version 5.1.0
Dnsmasq version 2.75
Changed in neutron: | |
assignee: | nobody → siyingchun (wintersi) |
tags: | added: loadimpact |
Changed in neutron: | |
status: | New → Confirmed |
Changed in neutron: | |
status: | Confirmed → In Progress |
I fear that this is a bit of an edge case. In reality, leases won't get renewed at exactly the same time because of the distributed nature of the boot process for every single VM. Adding a splay can increase the randomness, but I wonder if that comes at the expense of complicating troubleshooting.