refreshes on pi sometimes fail with certificate errors because time is set to Mar 17

Bug #1931724 reported by Ian Johnson
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
snapd
Triaged
High
Unassigned

Bug Description

When building an arm64 UC20 image for the pi with this set of snaps:

snapd 10709
pi 89
pi-kernel 231
core20 907

using this model:

http://cdimage.ubuntu.com/ubuntu-core/20/dangerous-stable/pending/ubuntu-core-20-arm64+raspi.model-assertion

and overriding the channel for the image to stable with --channel argument to ubuntu-image, the resulting image when booted often fails to download specifically the pi gadget snap for initial refresh like so:

```
Mar 17 21:36:22 ubuntu snapd[610]: store_download.go:169: Cannot download or apply deltas for pi: Get https://api.snapcraft.io/api/v1/snaps/download/YbGa9O3dAXl88YLI6Y1bGG74pwBxZyKg_89_97_xdelta3.delta: x509: certificate has expired or is not yet valid
```

Note that the date says March 17th, this is before the certificate date for the store is valid:

```
$ curl -v https://api.snapcraft.io/v2/snaps/refresh 2>&1 | grep -A 3 "Server certificate"
* Server certificate:
* subject: CN=api.snapcraft.io
* start date: May 3 13:51:43 2021 GMT
* expire date: Aug 1 13:51:43 2021 GMT
```

hence why we see the issue.

The underlying problem here is that for some reason, when the device pivots from the initramfs to userspace, the time is moved forward to March 17th, and then timesyncd takes too long to start/run, and snapd tries to download things first, but is still stuck in the past. If one waits until after timesyncd runs, then re-tries the refresh, it succeeds because time has moved forward all the way to the real present.

My best guess here is that the time is moved forward to March 17th because that's when the new core20 snap's systemd was built, and systemd does this automatically to move time forward to when systemd was built, and the basic flow is this:

1. device boots into run mode the first time with the snaps from the image,
2. timesyncd runs early enough to move time forward before snapd tries to download anything (snapd is also stuck on this boot trying to process seeding, which takes a while)
3. snapd finishes seeding, and then checks for refreshes
4. snapd finds that there are newer revisions available of the core20, snapd, and pi snaps, so snapd tries to download them in that order
5. the download for core20 and snapd works correctly since the time was moved forward by timesyncd
6. the device gets core20 ready, and reboots since core20 is the base snap
7. now on reboot, the device sets it's time to the build time of the core20 snap revision that it refreshed to, which is likely March 17th
8. timesyncd takes its sweet time and doesn't run for whatever reason
9. snapd however starts back up and since it doesn't need to do any seeding, is able to progress quickly through processing the refreshes and moves on to trying to download the pi gadget snap refresh, which fails since timesyncd has not moved the time forward yet

Some journal output from the boot starting in step 7: https://pastebin.canonical.com/p/Bgm4TBsRJx/

I'm trying to reproduce this with full logs starting with the boot in step 1, but it's tricky since we want to use the default gadget snap from the store which does not set journal to be persistent by default

Changed in snapd:
status: New → Triaged
importance: Undecided → High
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.