Comment 18 for bug 1872118

Jorge Niedbalski (niedbalski) wrote on 2020-08-03: Re: DHCP Cluster crashes after a few hours

Hello,

I am trying to set up a reproducer for the reported issue. I have two
machines acting as failover peers, both with the following versions:

# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.1 LTS
Release: 20.04
Codename: focal

# dpkg -l |grep -i isc-dh
ii isc-dhcp-client 4.4.1-2.1ubuntu5 amd64 DHCP client for automatically obtaining an IP address
ii isc-dhcp-common 4.4.1-2.1ubuntu5 amd64 common manpages relevant to all of the isc-dhcp packages
ii isc-dhcp-server 4.4.1-2.1ubuntu5 amd64 ISC DHCP server for automatic IP address assignment

=====

Primary configuration: https://pastebin.ubuntu.com/p/XYj648MghK/
Secondary configuration: https://pastebin.ubuntu.com/p/PYkcshZCWK/

Started with:

# dhcpd -f -d -4 -cf /etc/dhcp/dhcpd.conf --no-pid ens4

---> Sent some DHCP requests to these servers.

balanced pool 560b8c263f40 12 total 221 free 111 backup 110 lts 0 max-misbal 33
Sending updates to failover-partner.
failover peer failover-partner: peer moves from recover-done to normal
failover peer failover-partner: Both servers normal
DHCPDISCOVER from 52:54:00:2d:53:93 via ens4
DHCPOFFER on 10.19.101.120 to 52:54:00:2d:53:93 (glistening-elephant) via ens4
DHCPREQUEST for 10.19.101.120 (10.19.101.236) from 52:54:00:2d:53:93 (glistening-elephant) via ens4
DHCPACK on 10.19.101.120 to 52:54:00:2d:53:93 (glistening-elephant) via ens4

DHCPREQUEST for 10.19.101.120 from 52:54:00:2d:53:93 (glistening-elephant) via ens4
DHCPACK on 10.19.101.120 to 52:54:00:2d:53:93 (glistening-elephant) via ens4
DHCPREQUEST for 10.19.101.121 from 52:54:00:53:a3:d8 (valiant-motmot) via ens4
DHCPACK on 10.19.101.121 to 52:54:00:53:a3:d8 (valiant-motmot) via ens4

---

failover peer failover-partner: Both servers normal
balancing pool 5606b2c95f10 12 total 221 free 221 backup 0 lts -110 max-own (+/-)22
balanced pool 5606b2c95f10 12 total 221 free 221 backup 0 lts -110 max-misbal 33
balancing pool 5606b2c95f10 12 total 221 free 111 backup 110 lts 0 max-own (+/-)22
balanced pool 5606b2c95f10 12 total 221 free 111 backup 110 lts 0 max-misbal 33
DHCPDISCOVER from 52:54:00:2d:53:93 via ens4: load balance to peer failover-partner
DHCPREQUEST for 10.19.101.120 (10.19.101.236) from 52:54:00:2d:53:93 via ens4: lease owned by peer
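
To compare how the pool is split between the two peers over a longer run, the "balancing pool" / "balanced pool" lines above could be extracted from the logs automatically. This is only a rough sketch, assuming the log lines keep the format shown above; the names POOL_RE, PoolStat and parse_pool_line are made up for illustration, and the token after the pool address ("12" here) is skipped because its meaning is not stated in this comment.

```python
import re
from typing import NamedTuple, Optional

# Matches the pool balance lines as they appear in the logs above.
# The token after the pool address is skipped unparsed.
POOL_RE = re.compile(
    r"(?P<state>balanced|balancing) pool (?P<pool>\S+)\s+\S+\s+"
    r"total (?P<total>\d+)\s+free (?P<free>\d+)\s+"
    r"backup (?P<backup>\d+)\s+lts (?P<lts>-?\d+)"
)


class PoolStat(NamedTuple):
    state: str   # "balancing" or "balanced"
    pool: str    # pool identifier as printed by dhcpd
    total: int
    free: int
    backup: int
    lts: int


def parse_pool_line(line: str) -> Optional[PoolStat]:
    """Return the counters from a pool balance line, or None for other lines."""
    m = POOL_RE.search(line)
    if m is None:
        return None
    return PoolStat(
        m["state"], m["pool"],
        int(m["total"]), int(m["free"]), int(m["backup"]), int(m["lts"]),
    )


sample = ("balancing pool 5606b2c95f10 12 total 221 free 221 "
          "backup 0 lts -110 max-own (+/-)22")
print(parse_pool_line(sample))
```

Running this over the full log of each peer would show whether the free/backup split drifts before a crash.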

So far (after 1.5 hours) neither server has crashed.

Questions:

1) Is anything missing from the configuration above compared to the affected setup?
2) Is this load or concurrency related? That is, does a specific number of leases need to be allocated before the crash happens?
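
For question 2, one way to put a number on the load seen before a crash would be to count the distinct addresses acknowledged in the server log. A minimal sketch, assuming DHCPACK lines in the format shown above; ACK_RE and acked_leases are hypothetical names, not part of any ISC tooling.

```python
import re
from typing import Dict, Iterable

# "DHCPACK on <address> to <mac> ..." as printed in the logs above.
ACK_RE = re.compile(r"DHCPACK on (?P<addr>\S+) to (?P<mac>\S+)")


def acked_leases(lines: Iterable[str]) -> Dict[str, str]:
    """Map each acknowledged address to the last MAC it was ACKed to."""
    leases: Dict[str, str] = {}
    for line in lines:
        m = ACK_RE.search(line)
        if m:
            leases[m["addr"]] = m["mac"]
    return leases


log = [
    "DHCPACK on 10.19.101.120 to 52:54:00:2d:53:93 (glistening-elephant) via ens4",
    "DHCPREQUEST for 10.19.101.120 from 52:54:00:2d:53:93 (glistening-elephant) via ens4",
    "DHCPACK on 10.19.101.120 to 52:54:00:2d:53:93 (glistening-elephant) via ens4",
    "DHCPACK on 10.19.101.121 to 52:54:00:53:a3:d8 (valiant-motmot) via ens4",
]
print(len(acked_leases(log)), "distinct addresses acknowledged")
```

Comparing this count between the reproducer and an affected system would indicate how far the test load is from the load at which the crash was seen.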

I will take a look at an existing crash report/core dump.
