Snapd stuck after a request timeout error

Bug #1891618 reported by Sebastien Bacher
This bug affects 2 people
Affects: snapd
Importance: Critical
Assigned to: Unassigned

Bug Description

Today snapd is stuck refreshing a snap on my focal system:

$ snap version
snap 2.45.1+git1648.gf492c71
snapd 2.45.1+git1648.gf492c71
series 16
ubuntu 20.04
kernel 5.6.0-1007-oem

$ snap changes
...
315 Doing today at 08:59 CEST - Auto-refresh snaps "openstackclients", "ubuntu-bug-triage", "snapcraft"

$ snap watch 315
Download snap "ubuntu-bug-triage" (205) from channel "latest/stable" 56% 4.46MB/s 615ms

But it has been stuck at 56% for over an hour now (for a 19M snap).

Pressing Ctrl-C doesn't exit `snap watch`; it just prints a new line in the terminal and keeps updating the status.

$ systemctl status snapd
...
snapd[1336]: 2020/08/14 09:00:03 Unsolicited response received on idle HTTP channel starting with "HTTP/1.0 408 Request Time-out\r\nCache-Control: no-cache\r\nConnection: close\r\nContent-Type: text/html\r\n\r\n<html><body><h1>408 Request Time-out</h1>\nYour browser didn't send a complete request in time.\n</body></html>\n"; err=<nil>

Changed in snapd:
status: New → Confirmed
importance: Undecided → High
Sebastien Bacher (seb128) wrote :
Changed in snapd:
importance: High → Critical
Zygmunt Krynicki (zyga) wrote :

This looks like an HTTP / Go bug:

 snapd[1336]: 2020/08/14 09:00:03 Unsolicited response received on idle HTTP channel starting with "HTTP/1.0 408 Request Time-out\r\nCache-Control: no-cache\r\nConnection: close\r\nContent-Type: text/html\r\n\r\n<html><body><h1>408 Request Time-out</h1>\nYour browser didn't send a complete request in time.\n</body></html>\n"; err=<nil>

It seems that we make a request, some part of the stack decides that the request has timed out and forgets about it, and then a response arrives that the stack treats as unsolicited.

summary: - Snapd stucked after a request timeout error
+ Snapd stuck after a request timeout error
a59ff5 (a59ff5a59ff5)
Changed in snapd:
status: Confirmed → Invalid
status: Invalid → Fix Released
Changed in snapd:
status: Fix Released → Confirmed
Paweł Stołowski (stolowski) wrote :

We have received a few reports of this problem over the last couple of months, but so far we haven't been able to reproduce it (I tried several times in a VM with network traffic shaping, simulating a slow network with packet loss) or pinpoint the cause. It is under investigation but very elusive, so any extra information may help.

If anyone hits this issue, then the following information may help:
- `snap changes` and `snap change <X>` output
- `journalctl -u snapd` output (ideally with snapd running with the SNAPD_DEBUG=1 environment variable set when the problem occurred)
- `systemctl status snapd` output
- information on the quality of the network in use (stability, whether it drops frequently, etc.)
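
For anyone who wants to gather the command output listed above in one go, here is a minimal sketch. The helper names `run_to_file` and `collect_snapd_diagnostics` and the output file names are made up for illustration and are not part of snapd; the sketch only wraps the commands already mentioned in this report, and it does not cover the network-quality notes, which have to be written by hand.

```shell
#!/bin/sh
# Sketch only: run_to_file and collect_snapd_diagnostics are hypothetical
# helper names, not snapd tooling.

# run_to_file FILE CMD [ARGS...]
# Save the command's stdout and stderr to FILE. If the command is missing,
# write a short note instead; if it fails, append its exit status.
run_to_file() {
    out_file=$1
    shift
    if command -v "$1" >/dev/null 2>&1; then
        "$@" >"$out_file" 2>&1 || echo "(exit status $?)" >>"$out_file"
    else
        echo "$1: command not available" >"$out_file"
    fi
}

# collect_snapd_diagnostics OUT_DIR [CHANGE_ID]
# Collect the outputs requested in the comment above into OUT_DIR.
collect_snapd_diagnostics() {
    out_dir=$1      # directory that receives the collected files
    change_id=$2    # optional: ID of the stuck change, e.g. 315

    mkdir -p "$out_dir" || return 1
    run_to_file "$out_dir/snap-version.txt" snap version
    run_to_file "$out_dir/snap-changes.txt" snap changes
    if [ -n "$change_id" ]; then
        run_to_file "$out_dir/snap-change-$change_id.txt" snap change "$change_id"
    fi
    run_to_file "$out_dir/journal-snapd.txt" journalctl -u snapd --no-pager
    run_to_file "$out_dir/systemctl-status-snapd.txt" systemctl status snapd --no-pager
}
```

After sourcing the file, it could be called as `collect_snapd_diagnostics /tmp/snapd-report 315`, where 315 is the ID shown by `snap changes`. The journal is only as detailed as snapd's logging was at the time; how SNAPD_DEBUG=1 gets into snapd's environment depends on the setup (a systemd drop-in with `Environment=SNAPD_DEBUG=1` is one common approach, stated here as an assumption rather than the recommended procedure).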
