From the 'Other Info' section. Detailed analysis of the issue and more.

[Other Info]

Full log: https://paste.ubuntu.com/p/8yBfw2KR5h/
Log of a working run: https://paste.ubuntu.com/p/N3ZgqrxyQD/

When you tcpdump dhclient, we see all DHCPDISCOVER packets being replied to with DHCPOFFER packets, but the got_one() callback is never called: dhclient does not read these DHCPOFFER packets, and continues sending DHCPDISCOVER packets. Once it has sent 25 DHCPDISCOVER packets, it gives up.

This behaviour led several bug reporters to believe it was a kernel issue, with the kernel not pushing DHCPOFFER packets to dhclient. This is not the case. The actual problem is a thread concurrency race condition in dhclient: when the race occurs, the read socket is closed prematurely, and dhclient never reads any of the DHCPOFFER replies.

tcpdump: 'Screenshot of Wireshark' attached.

...

I was reading around the upstream issue trackers, and found the following two bug reports:

https://gitlab.isc.org/isc-projects/dhcp/-/issues/264
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=996356

The ISC upstream report is actually quite detailed, and it describes the same symptoms we are experiencing. Let's have a look at the root cause. The code I am using is isc-dhcp 4.4.1-2.1ubuntu5.20.04.4 from Focal.

common/discover.c

 567 void
 568 discover_interfaces(int state) {
...
1002            case AF_INET:
1003            default:
1004                    status = omapi_register_io_object((omapi_object_t *)tmp,
1005                                                      if_readsocket,
1006                                                      0, got_one, 0, 0);
1007                    break;
1008            }
...

In discover.c, we call discover_interfaces() to iterate over the interfaces and attempt to register a raw socket against each one. We do this by calling omapi_register_io_object(), which is used for reading data, and which calls the elusive got_one() callback, the one you instrumented your code for, to see whether it was being called or not.

omapip/dispatch.c

 196 /* Register an I/O handle so that we can do asynchronous I/O on it. */
 197
 198 isc_result_t omapi_register_io_object (omapi_object_t *h,
 199                                        int (*readfd) (omapi_object_t *),
 200                                        int (*writefd) (omapi_object_t *),
 201                                        isc_result_t (*reader)
 202                                                (omapi_object_t *),
 203                                        isc_result_t (*writer)
 204                                                (omapi_object_t *),
 205                                        isc_result_t (*reaper)
 206                                                (omapi_object_t *))
 207 {
...
 241         /*
 242          * Attach the I/O object to the isc socket library via the
 243          * fdwatch function. This allows the socket library to watch
 244          * over a socket that we built. If there are both a read and
 245          * a write socket we asssume they are the same socket.
 246          */
 247
 248         if (readfd) {
 249                 fd_flags |= ISC_SOCKFDWATCH_READ;
 250                 fd = readfd(h);
 251         }
...
 257
 258         if (fd_flags != 0) {
 259                 status = isc_socket_fdwatchcreate(dhcp_gbl_ctx.socketmgr,
 260                                                   fd, fd_flags,
 261                                                   omapi_iscsock_cb,
 262                                                   obj,
 263                                                   dhcp_gbl_ctx.task,
 264                                                   &obj->fd);
...
 275         }
 276
 277
 278         /* Find the last I/O state, if there are any. */
 279         for (p = omapi_io_states.next;
 280              p && p -> next; p = p -> next)
 281                 ;
 282         if (p)
 283                 omapi_io_reference (&p -> next, obj, MDL);
 284         else
 285                 omapi_io_reference (&omapi_io_states.next, obj, MDL);
...

omapi_register_io_object() is called for each socket created, in this case the if_readsocket from discover_interfaces(). The file descriptor is assigned ISC_SOCKFDWATCH_READ, and we enter the if statement. The if statement calls isc_socket_fdwatchcreate(), which registers the socket with the ISC socket manager and sets up the callback omapi_iscsock_cb() to be called.
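Boiled down, omapi_register_io_object() does two things in a fixed order. The following is a condensed sketch of my own (not the verbatim source; allocation of obj, error handling and the write-side path are all elided) just to make that ordering explicit:

/*
 * Condensed sketch of omapi_register_io_object(), for illustration only.
 * The names isc_socket_fdwatchcreate, omapi_iscsock_cb, omapi_io_states
 * and omapi_io_reference are the real ones; the rest is simplified.
 */
isc_result_t
omapi_register_io_object_sketch(omapi_object_t *h,
                                int (*readfd)(omapi_object_t *))
{
        omapi_io_object_t *obj;         /* wraps h; allocation elided */
        omapi_io_object_t *p;
        int fd = readfd(h);

        /* Step 1: hand the fd to the ISC socket manager. From this point
         * on, the socket manager may invoke omapi_iscsock_cb() on one of
         * its worker threads whenever the fd becomes readable. */
        isc_socket_fdwatchcreate(dhcp_gbl_ctx.socketmgr, fd,
                                 ISC_SOCKFDWATCH_READ,
                                 omapi_iscsock_cb, obj,
                                 dhcp_gbl_ctx.task, &obj->fd);

        /* Step 2: only now append obj to the global omapi_io_states
         * list, which is the very list omapi_iscsock_cb() searches. */
        for (p = omapi_io_states.next; p && p->next; p = p->next)
                ;
        if (p)
                omapi_io_reference(&p->next, obj, MDL);
        else
                omapi_io_reference(&omapi_io_states.next, obj, MDL);

        return ISC_R_SUCCESS;
}

Keep this ordering in mind: the socket manager learns about the fd in step 1, but the bookkeeping the callback relies on only exists after step 2 completes.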
Once that has been done, we iterate over the omapi_io_states linked list, which is a global list of registered sockets. We walk to the end of the list (or stop at the beginning, if the list is empty) and add the socket to the list.

Now, the bug happens between calling isc_socket_fdwatchcreate() to register the socket with the socket manager, and adding it to the global linked list. Sometimes, the callback omapi_iscsock_cb() is called in between.

omapip/dispatch.c

 101 /*
 102  * Callback routine to connect the omapi I/O object and socket with
 103  * the isc socket code. The isc socket code will call this routine
 104  * which will then call the correct local routine to process the bytes.
 105  *
 106  * Currently we are always willing to read more data, this should be modified
 107  * so that on connections we don't read more if we already have enough.
 108  *
 109  * If we have more bytes to write we ask the library to call us when
 110  * we can write more. If we indicate we don't have more to write we need
 111  * to poke the library via isc_socket_fdwatchpoke.
 112  */
...
 118 #define SOCKDELETE 1
 119 int
 120 omapi_iscsock_cb(isc_task_t *task,
 121                  isc_socket_t *socket,
 122                  void *cbarg,
 123                  int flags)
 124 {
...
 132 #if SOCKDELETE
 133         /*
 134          * walk through the io states list, if our object is on there
 135          * service it. if not ignore it.
 136          */
 137         for (obj = omapi_io_states.next; obj != NULL; obj = obj->next) {
 138                 if (obj == cbarg)
 139                         break;
 140         }
 141
 142         if (obj == NULL) {
 143                 return(0);
 144         }
 145 #else
...
 164         if ((flags == ISC_SOCKFDWATCH_READ) &&
 165             (obj->reader != NULL) &&
 166             (obj->inner != NULL)) {
 167                 status = obj->reader(obj->inner);
 168                 /*
 169                  * If we are shutting down (basically tried to
 170                  * read and got no bytes) we don't need to try
 171                  * again.
 172                  */
 173                 if (status == ISC_R_SHUTTINGDOWN)
 174                         return (0);
 175                 /* Otherwise We always ask for more when reading */
 176                 return (1);
...
 188         /*
 189          * We get here if we either had an error (inconsistent
 190          * structures etc) or no more to write, tell the socket
 191          * lib we don't have more to do right now.
 192          */
 193         return (0);
 194 }

When omapi_iscsock_cb() is called, we iterate over that same omapi_io_states global linked list. Since the raw socket we are about to add is the very first item going to be placed on the list, omapi_io_states.next is NULL, so obj is set to NULL. Well, what happens when obj is NULL? We return 0.

 142         if (obj == NULL) {
 143                 return(0);
 144         }

If you look a little further down the code, we see that when we are finished with the socket, such as when shutting down or when an error happens, we return 0 to indicate that we are done with this socket and won't be using it again. If we return 1, we ask for more data, and omapi_iscsock_cb() will be called again in the future.

So, by returning 0, we will no longer read any packets from this socket, and thus we simply ignore, and will never read, any DHCPOFFER packets. We have two sockets open, one for reading and one for writing, since one is bound to port 67 and the other to port 68, so we keep sending out DHCPDISCOVER packets, not knowing we are ignoring all DHCPOFFERs because the read socket is closed and omapi_iscsock_cb() is never called for it again.
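The failure mode can be reproduced outside of dhclient entirely. Below is a small standalone pthread program, entirely my own illustration and not dhclient code, that models the same publish-after-watch ordering: the watcher thread plays the socket manager worker, the global list plays omapi_io_states, and a return value of 0 from the callback means "never call me again".

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>     /* for sleep(), if you uncomment it below */

struct io_state {
        void *obj;
        struct io_state *next;
};

static struct io_state io_states;       /* models omapi_io_states */
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Models omapi_iscsock_cb(): 1 = keep watching, 0 = stop for good. */
static int callback(void *cbarg)
{
        struct io_state *p;

        pthread_mutex_lock(&list_lock);
        for (p = io_states.next; p != NULL; p = p->next)
                if (p->obj == cbarg)
                        break;
        pthread_mutex_unlock(&list_lock);

        if (p == NULL) {
                printf("callback: object not on the list, return 0, "
                       "never serviced again\n");
                return 0;
        }
        printf("callback: object found, return 1, keep watching\n");
        return 1;
}

/* Models the socket manager worker thread: data is already waiting, so
 * the callback fires as soon as the watch is registered. */
static void *watcher(void *arg)
{
        callback(arg);
        return NULL;
}

int main(void)
{
        static int dummy_socket;        /* stand-in for the omapi object */
        static struct io_state node;
        pthread_t tid;

        /* Step 1: register the watch (models isc_socket_fdwatchcreate). */
        pthread_create(&tid, NULL, watcher, &dummy_socket);

        /* Race window. Uncomment to force the losing schedule:
         * sleep(1); */

        /* Step 2: publish the object (models omapi_io_reference). */
        pthread_mutex_lock(&list_lock);
        node.obj = &dummy_socket;
        node.next = io_states.next;
        io_states.next = &node;
        pthread_mutex_unlock(&list_lock);

        pthread_join(tid, NULL);
        return 0;
}

Built with gcc -pthread, it prints one line or the other depending on which thread wins. dhclient's version of the losing schedule additionally needs a packet already sitting in the receive buffer, which is exactly the condition explored next.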
So, as explained in the upstream bug, what is needed for this race condition to occur?

1) There needs to be data that can be read immediately from the socket. This is how omapi_iscsock_cb() gets called: when there is data waiting. If there is no data to read, the race can't happen.

2) Thread scheduling must occur in a particular order. There are three threads:

   a. The main thread, which is registering the socket
   b. The socket manager coordinator thread
   c. A socket manager worker thread

For the race to occur, we need to switch from a) to b) to schedule the callback, then from b) to c) to call the callback and return 0, all before we return from c) to a) to add the socket to the omapi_io_states linked list.

To verify this, I added some print statements to omapi_iscsock_cb() and omapi_register_io_object() to see what is happening:

https://paste.ubuntu.com/p/hGsssrVyG4/

I instrumented entry to the callback, and whether obj is NULL, as well as before and after the socket is registered with the socket manager, and before and after the socket is added to the global linked list.

I built the new dhclient and ran test-parallel.sh, since it is by far the best at reproducing the issue. I concatenated all 20 /tmp/dhclient-* logs to read:

https://paste.ubuntu.com/p/VddddSmdMV/

If we look at one that works and gets the DHCPOFFER, we see:

https://paste.ubuntu.com/p/dfdvNM3RDd/

We have two sockets, one that reads and one that writes. For each socket, one after the other, we register with the socket manager and then add the socket to the linked list; after that the DHCPDISCOVER is sent, and the callbacks are called.

Now, when the issue reproduces:

https://paste.ubuntu.com/p/77BNNhDwdh/

We see that we register the read socket with the socket manager and immediately get a callback to omapi_iscsock_cb(), so there is likely a packet to read already. obj is NULL, since the socket is not on the linked list yet, so we return 0; strangely, this happens twice even though we only entered omapi_iscsock_cb() once. We then add the socket to the linked list, showing we have hit the race condition. The write socket is registered with the socket manager but does not race, as it is added to the linked list before its callback happens. From then on, the callback only seems to be called for the write socket, and we ignore all DHCPOFFER packets, only sending DHCPDISCOVER packets.

I ran the test-parallel.sh script a few times, and each and every time the issue reproduced, we had a NULL linked list and returned 0 from omapi_iscsock_cb(). So I am confident we have found the root cause.

Now, I had the idea of adding a sleep between registering with the socket manager and adding the socket to the linked list, hoping the callback would be called first, to try to make every single agent fail 100% of the time. The change is along the lines of the below (see also the reconstruction at the end of this section):

https://paste.ubuntu.com/p/8zJ2FQK99X/

But after I built it and ran it a few times, I couldn't reproduce the issue at all. A typical run is below:

https://paste.ubuntu.com/p/k6wRhx4RCM/

We can see that everything happens in the correct order, and the issue does not reproduce. I think this is because each dhclient now has to wait 2 seconds before sending a DHCPDISCOVER, so the network is quiet; there is no packet to read immediately, since neighboring agents are also quiet, so the callback is not called, and the socket is added to the linked list before the network gets busy.

The code that adds the socket to the linked list was added in the below commit, 22 years ago:

commit 61b844bfd7641a62d681a1f70d3b6dbc485ce4b6
From: Ted Lemon
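Coming back to the sleep experiment referenced above: the actual change is in the paste, but reconstructed from the description it amounts to wedging a delay into omapi_register_io_object() between the fdwatch registration and the list append (the sleep(2) value and the exact placement here are my assumptions):

        if (fd_flags != 0) {
                status = isc_socket_fdwatchcreate(dhcp_gbl_ctx.socketmgr,
                                                  fd, fd_flags,
                                                  omapi_iscsock_cb,
                                                  obj,
                                                  dhcp_gbl_ctx.task,
                                                  &obj->fd);
                ...
        }

        sleep(2);       /* experimental delay: widen the race window so the
                           worker thread has every chance to fire
                           omapi_iscsock_cb() before the list append below */

        /* Find the last I/O state, if there are any. */
        for (p = omapi_io_states.next;
             p && p -> next; p = p -> next)
                ;

As the runs above show, the delay suppressed the race instead of forcing it: with every agent stalled for the same two seconds, the wire is silent at the moment the watch is created, so the "data ready to read" precondition is never met.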