apt-daily timer runs at random hours of the day

Bug #1615482 reported by John Morton on 2016-08-22
This bug affects 18 people
Affects          Importance  Assigned to
apt (Ubuntu)     High        Canonical Foundations Team
Xenial           High        Unassigned
Yakkety          High        Unassigned
Zesty            High        Unassigned

Bug Description

apt, from 1.2.10 onwards (i.e. every version in Xenial and later), uses a systemd timer instead of a cron.daily job. This is a good thing: it decouples the apt daily runs from the rest of cron and ensures that other cron.daily jobs are not blocked for up to half an hour by the default settings of unattended-upgrades.

However, the policy chosen is to have the apt daily script run at a random hour of the day, in a wrong-headed attempt to reduce server load. This has the side effect of running unattended-upgrades at random hours of the day (such as business hours) rather than being confined to between 6:25am and 6:55am, as it was under the cron.daily defaults.

A better policy would be to have the script activate at 6:00am plus a random delay of up to 20 minutes, at one-second granularity. That reduces the impact of timezone population spikes while still allowing unattended-upgrades to run within a predictable interval, before 7am.
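
As a rough illustration of the policy proposed above, the sketch below prints a drop-in for apt-daily.timer. The helper name render_apt_timer_override is hypothetical, and the values (06:00, 20m, 1s) are one reading of the proposal, not something apt ships. OnCalendar=, RandomizedDelaySec= and AccuracySec= are standard systemd.timer directives.

```shell
#!/bin/bash
# Hypothetical helper: print a systemd drop-in for apt-daily.timer.
# Arguments (illustrative only): start time, maximum random delay, accuracy.
render_apt_timer_override() {
    local start="$1" delay="$2" accuracy="$3"
    cat <<EOF
[Timer]
# An empty assignment clears the schedule shipped by the package;
# without it the new OnCalendar= line is added to the existing one.
OnCalendar=
OnCalendar=*-*-* ${start}
RandomizedDelaySec=${delay}
AccuracySec=${accuracy}
EOF
}

# The proposal above: 06:00 plus up to 20 minutes, one-second granularity.
render_apt_timer_override "06:00" "20m" "1s"
```

The output would typically be saved as /etc/systemd/system/apt-daily.timer.d/override.conf, followed by "systemctl daemon-reload" and a restart of the timer.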

At the very least, some sort of note in the NEWS file detailing the new behaviour would be welcome.

Julian Andres Klode (juliank) wrote :

Right, I understand this argument. We should think about this a bit more.

Changed in apt (Ubuntu):
importance: Undecided → Medium
status: New → Triaged

Marius Gedminas (mgedmin) wrote :

Fun story: unattended-upgrades upgraded docker-engine in the middle of a day, breaking gitlab CI builds until a sysadmin figured out what happened and rebooted the machine manually.

unattended-upgrades has an option to postpone unattended reboots to some safer time (like 2AM). Perhaps a similar option to postpone unattended-upgrades themselves would help with this? It's probably fine to run other apt-daily activities (like running apt update or apt upgrade --download-only) during the day.

tags: added: xenial
tags: added: rls-z-incoming

John Morton (john-morton-z) wrote :

Separating out u-u from the rest of the apt maintenance script could be useful, but I prefer running the update, package fetch, upgrade and cache clean-up all in one hit. You *can* delay your reboot time, but the shutdown command, at least pre-systemd, would keep running until the shutdown happens, periodically issuing a message to all terminal users before stopping further logins and finally changing the runlevel to reboot the machine. That was less than ideal in the old cron.daily days, as those jobs run sequentially and in lexical order. And a is for apt.

In the mixed jessie, trusty, xenial environment we run, we've moved to taking the apt job out of cron.daily entirely, running the reboot 'now', and using the cron/timer start time to effectively schedule the reboot.

My grievance is mainly that in making the transition to systemd the upstream developers chose to:

1) Significantly change the job start time behaviour
2) Not bother to mention this behaviour change in the changelog

It would be good to see Ubuntu at least set the timer defaults in such a way that enabling unattended upgrades doesn't result in a nasty reboot during the day as the default behaviour. Which, given the half hour random start time default, should be achievable without putting undue load on the servers.

Marius Gedminas (mgedmin) wrote :

My attempt to restrict unattended reboots to times between 0 and 6 AM wasn't successful. What I did is I created a /etc/systemd/system/apt-daily.timer.d/override.conf with the following contents:

    [Timer]
    OnCalendar=*-*-* 00:00
    RandomizedDelaySec=6h

unattended-upgrade ran and rebooted my server at 1:27 AM, at which point journalctl shows

    Nov 10 01:27:36 fridge systemd[1]: apt-daily.timer: Adding 4h 13min 49.738746s random time.
    Nov 10 01:27:36 fridge systemd[1]: Started Daily apt activities.

which would lead me to expect another run at 6:40 AM (which is already outside my desired hours, but okay, not too bad). Instead, what I got was

    Nov 10 01:29:02 fridge systemd[1]: apt-daily.timer: Adding 3h 12min 35.192149s random time.

and then

    Nov 10 09:13:06 fridge systemd[1]: Starting Daily apt activities...

which rudely interrupted my morning Mutt session with another reboot.

Bug in systemd's timers?

I came to the conclusion that, to control unattended upgrades manually, the currently "easiest" (sarcasm tag on) way is to let the timer only update your package list and to run unattended-upgrades yourself via cron at your desired time.

To do so:

# apt-get install unattended-upgrades update-notifier-common

# rm /etc/apt/apt.conf.d/20auto-upgrades /etc/apt/apt.conf.d/10periodic
# rm /var/log/unattended-upgrades/*

# vi /etc/apt/apt.conf.d/20auto-upgrades

APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "0";

# vi /etc/apt/apt.conf.d/local

Dpkg::Options {
   "--force-confdef";
   "--force-confold";
};

# vi /etc/apt/apt.conf.d/50unattended-upgrades

(Thanks to ansible-role at https://github.com/jnv/ansible-role-unattended-upgrades)

Ubuntu:
#######
// Unattended-Upgrade::Origins-Pattern controls which packages are
// upgraded.
Unattended-Upgrade::Origins-Pattern {
      "origin=Ubuntu,archive=${distro_codename}-security";
      //"o=Ubuntu,a=${distro_codename}";
      //"o=Ubuntu,a=${distro_codename}-updates";
      //"o=Ubuntu,a=${distro_codename}-proposed-updates";
  };

// List of packages to not update (regexp are supported)
Unattended-Upgrade::Package-Blacklist {
};

// Do automatic removal of new unused dependencies after the upgrade
// (equivalent to apt-get autoremove)
Unattended-Upgrade::Remove-Unused-Dependencies "true";

// Automatically reboot *WITHOUT CONFIRMATION* if
// the file /var/run/reboot-required is found after the upgrade
//Unattended-Upgrade::Automatic-Reboot "true";

// Use apt bandwidth limit feature, this example limits the download
// speed to 70kb/sec
//Acquire::http::Dl-Limit "70";
Acquire::http::Dl-Limit "350";

Debian:
#######

// Unattended-Upgrade::Origins-Pattern controls which packages are
// upgraded.
Unattended-Upgrade::Origins-Pattern {
      "origin=Debian,codename=${distro_codename},label=Debian-Security";
      //"o=Debian,codename=${distro_codename},label=Debian";
      //"o=Debian,codename=${distro_codename},a=proposed-updates";
  };

// List of packages to not update (regexp are supported)
Unattended-Upgrade::Package-Blacklist {
};

// Do automatic removal of new unused dependencies after the upgrade
// (equivalent to apt-get autoremove)
Unattended-Upgrade::Remove-Unused-Dependencies "true";

// Automatically reboot *WITHOUT CONFIRMATION* if
// the file /var/run/reboot-required is found after the upgrade
//Unattended-Upgrade::Automatic-Reboot "true";

// Use apt bandwidth limit feature, this example limits the download
// speed to 70kb/sec
//Acquire::http::Dl-Limit "70";
Acquire::http::Dl-Limit "350";

# vi /opt/unattended-upgrade-manual.sh

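#!/bin/bash
# Sleep for a random 0-1799 seconds to spread out the mirror load.
# (A plain % is correct inside a script; \% is only needed in a crontab line.)
sleep $((RANDOM % 1800))
apt-get update
# -d enables debug output
unattended-upgrade -d
apt-get -y clean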

# chmod +x /opt/unattended-upgrade-manual.sh

# vi /etc/cron.d/unattended-upgrade

SHELL=/bin/bash
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
30 03 * * * root /opt/unattended-upgrade-manual.sh

Fuck the systemd-timers, fuck cron.daily - I'm in charge... :P

Improvements are welcome.

Best regards
Florian

Steve Langasek (vorlon) wrote :

Marking this 'high' and a regression. This had been deliberately engineered in Ubuntu to run once per day outside of normal work hours, with a random offset <1h to spread out the mirror load. The current behavior is not consistent with that design, and can be disruptive to the regular use of the system. (I experienced this today with my desktop going into a tailspin because my browser and apt were fighting for memory.)

Changed in apt (Ubuntu):
importance: Medium → High

Achim Spangler (achim-spangler) wrote :

According to https://ubuntuforums.org/showthread.php?t=2330407&page=2 I am not the only user whose package list is never updated because of the default timer setting in /etc/systemd/system/timers.target.wants/apt-daily.timer. As a result, the taskbar GUI app never displays any available package updates unless you run "sudo apt-get update" manually.

This problem is caused by using the computer only for short durations (e.g. notebook), so that the calculated random wait time is longer than the average usage time per day (i.e. "apt-get update" is never called).

My personal solution was to use a fixed wait time of about 5 minutes after system boot, and to use "After=network.target" so that the network is ready for a successful call of "apt-get update" (see attachment).

Maybe the system could get enhanced to derive the average uptime per session / day, so that the apt-daily timer gets restricted to this average time interval.

Or the system could detect, whether the computer is used as a continually running system, or whether the computer is each time only running for some hours. In the latter case, at least the package list should get updated a short time after system start / boot.

My system is running Kubuntu 16.04 LTS (Xenial).

Tony Garcia (tonyskapunk-rax) wrote :

I was able to override it using /etc/systemd/system/apt-daily.timer.d/override.conf with this config:

###

[Timer]
OnCalendar=
OnCalendar=*-*-* 02:00
RandomizedDelaySec=4h
AccuracySec=1m
Persistent=true

###

Which means:

        OnCalendar | Any day (*-*-*) at 02:00. NOTE: the first, empty OnCalendar= is needed; without it the time defined here is added to the previous schedule instead of replacing it. (https://github.com/systemd/systemd/issues/3233)
RandomizedDelaySec | Add a random delay of up to 4h [02:00-06:00]
       AccuracySec | Start the process at any time within 1m of the calculated time (OnCalendar + RandomizedDelaySec); this results in [02:00-06:01]

---

Default is:

###

[Timer]
OnCalendar=*-*-* 6,18:00
RandomizedDelaySec=12h
AccuracySec=1h

###

Which means:
        OnCalendar | Any day (*-*-*) at 06:00 and at 18:00
RandomizedDelaySec | Add a random delay of up to 12h [06:00-18:00, 18:00-06:00] (NOTE: at this point it could be any time!)
       AccuracySec | Start the process at any time within 1h of the calculated time (OnCalendar + RandomizedDelaySec); this results in [06:00-19:00, 18:00-07:00]
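
The window arithmetic above can be reproduced with a few lines of shell. This is only a sketch of the reading given in this comment (earliest start is the OnCalendar time, latest start is OnCalendar plus the maximum random delay plus the accuracy slack). The function names fmt_hhmm and timer_window are made up for the example, and all arguments are in minutes.

```shell
#!/bin/bash
# Convert minutes since midnight to HH:MM, wrapping past 24:00.
fmt_hhmm() {
    local m=$(( $1 % 1440 ))
    printf '%02d:%02d' $(( m / 60 )) $(( m % 60 ))
}

# timer_window START_MIN DELAY_MIN ACCURACY_MIN
# Prints "earliest-latest" for one OnCalendar slot.
timer_window() {
    local start="$1" delay="$2" accuracy="$3"
    printf '%s-%s\n' "$(fmt_hhmm "$start")" \
        "$(fmt_hhmm $(( start + delay + accuracy )))"
}

timer_window $(( 2 * 60 ))  $(( 4 * 60 ))  1    # override: 02:00-06:01
timer_window $(( 6 * 60 ))  $(( 12 * 60 )) 60   # default, 06:00 slot: 06:00-19:00
timer_window $(( 18 * 60 )) $(( 12 * 60 )) 60   # default, 18:00 slot: 18:00-07:00
```

Taken together, the two default slots cover the whole day, which is why the job could start at any hour.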

---

Here I tested the overridden configuration:

# systemctl list-timers | grep -P "NEXT|apt"
NEXT LEFT LAST PASSED UNIT ACTIVATES
Fri 2017-04-21 05:46:11 CDT 17h left Thu 2017-04-20 11:42:55 CDT 3min 29s ago apt-daily.timer apt-daily.service
# systemctl restart apt-daily.timer && systemctl list-timers | grep -P "NEXT|apt"
NEXT LEFT LAST PASSED UNIT ACTIVATES
Fri 2017-04-21 02:58:14 CDT 15h left Thu 2017-04-20 11:42:55 CDT 3min 44s ago apt-daily.timer apt-daily.service
# systemctl restart apt-daily.timer && systemctl list-timers | grep -P "NEXT|apt"
NEXT LEFT LAST PASSED UNIT ACTIVATES
Fri 2017-04-21 04:05:12 CDT 16h left Thu 2017-04-20 11:42:55 CDT 3min 53s ago apt-daily.timer apt-daily.service
# systemctl restart apt-daily.timer && systemctl list-timers | grep -P "NEXT|apt"
NEXT LEFT LAST PASSED UNIT ACTIVATES
Fri 2017-04-21 05:44:41 CDT 17h left Thu 2017-04-20 11:42:55 CDT 4min 0s ago apt-daily.timer apt-daily.service
# systemctl restart apt-daily.timer && systemctl list-timers | grep -P "NEXT|apt"
NEXT LEFT LAST PASSED UNIT ACTIVATES
Fri 2017-04-21 03:05:18 CDT 15h left Thu 2017-04-20 11:42:55 CDT 4min 1s ago apt-daily.timer apt-daily.service
# systemctl restart apt-daily.timer && systemctl list-timers | grep -P "NEXT|apt"
NEXT LEFT LAST PASSED UNIT ACTIVATES
Fri 2017-04-21 03:48:18 CDT 16h left Thu 2017-04-20 11:42:55 CDT 4min 4s ago apt-daily.timer apt-daily.service
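
A quick way to sanity-check the samples above is to confirm that every NEXT time falls inside the expected 02:00-06:01 window. The sketch below does that for the six times copied from the output; the helper name to_seconds is made up for the example, and six samples are of course no proof that the timer can never fire outside the window.

```shell
#!/bin/bash
# NEXT times copied from the systemctl list-timers output above.
samples="05:46:11 02:58:14 04:05:12 05:44:41 03:05:18 03:48:18"

# to_seconds HH:MM:SS -> seconds since midnight.
# ${x#0} strips one leading zero so that 08 and 09 are not read as octal.
to_seconds() {
    local h m s
    IFS=: read -r h m s <<EOF
$1
EOF
    echo $(( ${h#0} * 3600 + ${m#0} * 60 + ${s#0} ))
}

lo=$(to_seconds 02:00:00)
hi=$(to_seconds 06:01:00)
outside=0
for t in $samples; do
    secs=$(to_seconds "$t")
    if [ "$secs" -lt "$lo" ] || [ "$secs" -gt "$hi" ]; then
        echo "outside window: $t"
        outside=$(( outside + 1 ))
    fi
done
echo "samples outside 02:00-06:01: $outside"
```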

Steve Langasek (vorlon) on 2017-04-21
Changed in apt (Ubuntu):
milestone: none → ubuntu-17.10
assignee: nobody → Canonical Foundations Team (canonical-foundations)

Julian Andres Klode (juliank) wrote :

I guess we should go to something like

OnCalendar=*-*-* 6,18:00
RandomizedDelaySec=1h
AccuracySec=1h

or tie down random and accuracy even more. We could do like 15 min or so - the cron job used 30 min sleep maximum.

The boot argument with short runtime should be ignored. We can't help everyone. People are already complaining that the job runs during boot and want us to delay it during boot. systemd provides no infrastructure for us to depend on network being available reliably (or well, at all), so that's not optimal either.

Julian Andres Klode (juliank) wrote :

We can of course add After=network.target but this gives no real guarantee that internet is available. For waiting with catch up runs during boot, that's tracked in systemd's bug tracker at
https://github.com/systemd/systemd/issues/5659.

Julian Andres Klode (juliank) wrote :

While we're at it, we could also push the times 2 hours more outside, to 4 and 20. So, something like this would probably be an improvement:

[Unit]
Description=Daily apt activities
After=network.target

[Timer]
OnCalendar=*-*-* 4,20:00
RandomizedDelaySec=30m
AccuracySec=5m
Persistent=true

[Install]
WantedBy=timers.target

On Fri, Apr 21, 2017 at 06:55:45PM -0000, Julian Andres Klode wrote:
> While we're at it, we could also push the times 2 hours more outside, to
> 4 and 20. So, something like this would probably be an improvement:

> [Unit]
> Description=Daily apt activities
> After=network.target
>
> [Timer]
> OnCalendar=*-*-* 4,20:00
> RandomizedDelaySec=30m
> AccuracySec=5m
> Persistent=true

I don't see why this is an "improvement". The designed experience for this
on Ubuntu is for these jobs to run between 6am and 7am local time, with a
single run per day and a random delay within the 1-hour window.

On Fri, Apr 21, 2017 at 06:48:59PM -0000, Julian Andres Klode wrote:

> We can of course add After=network.target but this gives no real guarantee
> that internet is available. For waiting with catch up runs during boot,
> that's tracked in systemd's bug tracker at
> https://github.com/systemd/systemd/issues/5659.

As discussed on IRC, this should be After=network-online.target /
Wants=network-online.target. If we're not online, there's no point in doing
an apt update AFAICS. (Ok, in an extreme case you might have a
sneakernet-connected apt repository which you rotate by hand... but I'm not
sure anyone who uses apt that way cares about the daily timer.)

> The boot argument with short runtime should be ignored. We can't help
> everyone. People are already complaining that the job runs during boot
> and want us to delay it during boot.

I think that should clearly be regarded as 'wontfix'. The default
experience should ensure that security updates are applied in a timely
manner to every instance of Ubuntu, regardless of its power on/off cycle.
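
For reference, the After=network-online.target / Wants=network-online.target suggestion above could be expressed as a drop-in along these lines. This is only a sketch: the helper name is hypothetical, and the thread does not say which unit (apt-daily.timer or apt-daily.service) the ordering should be attached to, so the target directory is left to the reader.

```shell
#!/bin/bash
# Hypothetical helper: print a drop-in that orders a unit after
# network-online.target and pulls that target in.
render_network_online_dropin() {
    cat <<'EOF'
[Unit]
After=network-online.target
Wants=network-online.target
EOF
}

render_network_online_dropin
```

Note that network-online.target is only as reliable as the wait-online service behind it, so this narrows the problem discussed above rather than removing it.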

Dimitri John Ledkov (xnox) wrote :

I'm not sure I like the idea of 6am..7am local time.

In case of desktops/laptops they might not be on during that time - we also do update on boot right?

Many servers and clouds do not set a local timezone, or set it to UTC explicitly. Ideally we would want to run the 6am..7am window based on the local or cloud-region timezone, because 6am..7am UTC is bad timing for Middle East / Central Asia / Far East / Oceania.

Also, I thought caribou was working on applying updates on shutdown, or something like that. Or am I confusing things?

Julian Andres Klode (juliank) wrote :

6 to 7 is the standard time the cron job used to run, so it's basically reverting to the pre 16.04 (?) status quo.

Changed in apt (Ubuntu):
status: Triaged → Fix Committed

Haw Loeung (hloeung) wrote :

The original request to have it randomised over a larger window is as per LP bug #1554848. As Dimitri points out, many servers and clouds do not set local timezones so the archive mirrors we run are hit around the same time saturating the upstream links we have (with users then reporting issues with updates).

Haw Loeung (hloeung) wrote :

Also, I want to point out that when the main archive mirrors are under heavy load, it doesn't just affect updates but booting up instances, mostly public clouds, as well since various tools will perform updates on boot.

The attachment "Call apt 5 minutes after system boot for short running computer" seems to be a patch. If it isn't, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of the ~ubuntu-reviewers, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issues please contact him.]

tags: added: patch

Julian Andres Klode (juliank) wrote :

Of course the load will be higher and peakish again, but the period is twice as long as it was before (it's now 1 hour instead of 30 minutes), so that should help.

I don't really get the cloud argument, though. APT will run updates on boot if the instance was running more than 24 hours ago. And really, are you saying people are starting cloud instances en masse between 6 and 7?

Haw Loeung (hloeung) wrote :

There are usually three main windows where the archive servers are under heavy load: EU, UK/UTC, and US east. The UTC 0630 - 0700 window is usually the biggest hit.

For the cloud argument, people run CI jobs spinning up new instances in the cloud. Tools such as cloud-init and juju will normally run apt-get dist-upgrade on boot to ensure instances are up-to-date.

One such example is LP bug #1561173.

Haw Loeung (hloeung) wrote :

I mean, 1hr is better than the 30min window previously. But since you're there poking around that bit of code, think we can increase that further to 2hrs (0500 - 0700)? :D

Also, think we can get the window increase in unattended-upgrades for older distros (Trusty etc.)?

Thanks for your time!

Julian Andres Klode (juliank) wrote :

We can't increase the delay for older distros, as it delays all daily cron jobs there. We could increase to 2 hours, but I don't feel like doing daily APT releases changing time outs, and 1.4.1 with the 1 hour time out is out already.

I mean, I change it to 2 hours now, and tomorrow someone else turns up and either says 2 hours is too much or 2 hours is too low. Or not "5 to 7, I'd rather have 6 to 8." I think at some point we have to stay with a decision.

If 1 hour turns out to be problematic, then we should revisit this when we see an issue IMO.

Harry (harry33) wrote :

OK,
I have now tested this new apt (1.4.1) against the old version (1.4).
My setup is using solely systemd (with no upstart nor cgmanager installed).
I have fast SSD's, so my normal booting time is about 4 seconds.

Now after upgrading to apt 1.4.1 the booting time to GDM increased by about 5 to 8 seconds.
During boot the computer stops just before GDM starts.
I think systemd now waits until the network is OK.

As a workaround, downgrading back to apt 1.4 solves the issue and booting time to GDM is 4 seconds.

This should be fixed, and until then this update should be kept in proposed.

tags: added: block-proposed

Harry (harry33) wrote :

Added the tag: block-proposed.

Julian Andres Klode (juliank) wrote :

Dropping the tag. Boot time can change randomly. Desktop start does *not* depend on apt timer start, and timer now depends on network-online in order to actually work. So yes, it actually does something now, so it is of course slower.

tags: removed: block-proposed

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package apt - 1.4.1

---------------
apt (1.4.1) unstable; urgency=medium

  [ Julian Andres Klode ]
  * systemd: Rework timing and add After=network-online (LP: #1615482)
  * debian/rules: Actually invoke dh_clean in override_dh_clean

  [ Unit 193 ]
  * apt-ftparchive: Support '.ddeb' dbgsym packages

 -- Julian Andres Klode <email address hidden> Mon, 24 Apr 2017 18:47:55 +0200

Changed in apt (Ubuntu):
status: Fix Committed → Fix Released

Haw Loeung (hloeung) wrote :

On Tue, Apr 25, 2017 at 08:16:37AM -0000, Julian Andres Klode wrote:
> We can't increase the delay for older distros, as it delays all daily
> cron jobs there. We could increase to 2 hours, but I don't feel like
> doing daily APT releases changing time outs, and 1.4.1 with the 1 hour
> time out is out already.
>

For older distros where apt daily still happens from cron (and not
systemd timers), we could rename /etc/cron.daily/apt to something like
zzapt so it happens at the very end so it doesn't delay all daily cron
jobs.

Then ship out /etc/apt/apt.conf.d/50unattended-upgrades overriding
APT::Periodic::RandomSleep (3600) so it's easier for users to override
if needed.

> I mean, I change it to 2 hours now, and tomorrow someone else turns up
> and either says 2 hours is too much or 2 hours is too low. Or not "5 to
> 7, I'd rather have 6 to 8." I think at some point we have to stay with a
> decision.
>
> If 1 hour turns out to be problematic, then we should revisit this when
> we see an issue IMO.

Okay.

Thanks,

Haw

Changed in apt (Ubuntu Xenial):
importance: Undecided → High
Changed in apt (Ubuntu Yakkety):
importance: Undecided → High
Changed in apt (Ubuntu Zesty):
importance: Undecided → High
Changed in apt (Ubuntu Xenial):
status: New → In Progress
Changed in apt (Ubuntu Yakkety):
status: New → In Progress
Changed in apt (Ubuntu Zesty):
status: New → In Progress

Hello John, or anyone else affected,

Accepted apt into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/apt/1.2.21 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in apt (Ubuntu Xenial):
status: In Progress → Fix Committed
tags: added: verification-needed
Changed in apt (Ubuntu Yakkety):
status: In Progress → Fix Committed

Chris J Arges (arges) wrote :

Hello John, or anyone else affected,

Accepted apt into yakkety-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/apt/1.3.6 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Dimitri John Ledkov (xnox) wrote :

Please note this update will be soon superseded by https://launchpad.net/bugs/1686470

tags: added: regression-proposed
tags: added: verification-failed
tags: removed: verification-needed

Steve Langasek (vorlon) wrote :

I have removed the apt SRU from {xenial,yakkety}-proposed per the preceding comment.

Changed in apt (Ubuntu Xenial):
status: Fix Committed → Won't Fix
Changed in apt (Ubuntu Yakkety):
status: Fix Committed → Won't Fix

Julian Andres Klode (juliank) wrote :

@vorlon Also drop the zesty one from the unapproved queue?

Steve Langasek (vorlon) wrote :

done, thanks!

Changed in apt (Ubuntu Zesty):
status: In Progress → Won't Fix

Łukasz Zemczak (sil2100) wrote :

I see an apt upload in the zesty queue mentioning that it fixes this bug - should it be dropped from the queue? Or is it some proper fix this time?

Julian Andres Klode (juliank) wrote :

The mention is a historic leftover from replacing this one with bug 1686470 - I guess we could just mark this bug as a duplicate of 1686470

Brian Murray (brian-murray) wrote :

Marking it as a duplicate won't actually stop sru-review from processing the bug and the bug will show up with a line through it on the SRU report, so I'll just let the tools modify the bug and it can be set to v-done when bug 1686470 is verified.
