Comment 2 for bug 2003851

Walter (wdoekes) wrote :

Thanks for the reply, Julian.

Let's assume that the problem is indeed latency/dropped packets/whatever on our side. IMO, an (occasionally) broken network should not cause apt-get to hang indefinitely. Do you think it should?

Also, that doesn't address the fact that the behaviour seems recent. We have not observed anything like this with Focal or Bionic.

In the meantime, I've added notification code so that we can at least track when this occurs. I'll fill you in if I get more details (like which versions are (not) affected).
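For illustration only (this is not the actual notification code): a check along these lines could flag candidates from `netstat -apn` output. The function name is hypothetical, and a socket in CLOSE_WAIT owned by an http method is treated as a hint, not as proof of a hang.

```python
def stuck_http_sockets(netstat_output: str) -> list[str]:
    """Return netstat lines for http method sockets in CLOSE_WAIT.

    Hypothetical helper; assumes the column layout of `netstat -apn`:
    proto recv-q send-q local-address remote-address state pid/program
    """
    hits = []
    for line in netstat_output.splitlines():
        fields = line.split()
        if (
            len(fields) >= 7
            and fields[0].startswith("tcp")
            and fields[5] == "CLOSE_WAIT"
            and fields[6].endswith("/http")
        ):
            hits.append(line)
    return hits
```

A cron job or monitoring hook could pipe `netstat -apn` into this and alert when the same PID shows up on consecutive runs.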

----

Additional debug info:

I called recv() on fd 3, an http socket with a non-zero Recv-Q. It returned 0 (no error, EOF).
```
(gdb) call (ssize_t)recv(3, "abcdef", 1, 0)
$1 = 0
```
```
# netstat -apn | grep -E '154026|154029|154030|154031|154033'
tcp 1 0 10.91.52.91:60868 217.21.205.139:80 CLOSE_WAIT 154030/http
tcp 1 0 10.91.52.91:40756 178.128.6.101:80 CLOSE_WAIT 154029/http
tcp 0 0 10.91.52.91:56818 185.37.124.14:80 CLOSE_WAIT 154031/http
```
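For reference, a return value of 0 from recv() is what a socket reports once the peer has closed its end, which is consistent with the CLOSE_WAIT state. A minimal Python sketch on loopback (the function name and the local stand-in server are assumptions for illustration, not part of apt):

```python
import socket


def recv_after_peer_close() -> bytes:
    """Return what recv() yields once the remote side has closed."""
    # Hypothetical loopback listener standing in for the remote mirror.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
        listener.listen(1)
        with socket.create_connection(listener.getsockname()) as client:
            server_side, _ = listener.accept()
            server_side.close()  # peer sends FIN; the client side ends up in CLOSE_WAIT
            client.settimeout(5)  # safety net so the sketch itself cannot hang
            # An empty bytes object is Python's equivalent of recv() returning 0 (EOF).
            return client.recv(1)


if __name__ == "__main__":
    print(recv_after_peer_close())
```

So the sockets themselves report EOF cleanly; the open question is why the process never gets around to reading it.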

Next, I called close(0) on the stdin socket in one of the processes, and this woke up the whole list of tasks:
```
(gdb) call (int)close(0)
$1 = 0
```
This yielded the following error message:
```
E: Method http has died unexpectedly!
```
And the apt-get process got unstuck / was reaped successfully.