[2.5, enhancement] cloud-init reports failures if local mirror is behind what is included in a MAAS image

Bug #1791185 reported by Mike Pontillo
This bug affects 2 people
Affects  Status   Importance  Assigned to  Milestone
MAAS     Invalid  Medium      Unassigned

Bug Description

Since MAAS images update so frequently, I often hit intermittent failures due to hash mismatches (because the apt list cached in the image is more recent than the apt list on my local mirror).

I have found two ways to work around this:

(1) Temporarily switch back to the official Ubuntu archive

(2) Add the following snippet to the cloud-config:

bootcmd:
  - [ 'sh', '-c', 'rm -rf /var/lib/apt/lists/*' ]

It might be nice if a workaround could be incorporated into MAAS, or cloud-init itself.

This problem also affects LXD images when used with a cloud-config that specifies additional apt packages to be installed.
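
A minimal sketch of workaround (2) combined with a package install, which is the case described above. The package name is only a placeholder and does not come from this report. Note that bootcmd runs on every boot, so this clears the lists each time the machine starts.

```yaml
#cloud-config
# Clear the apt lists shipped in the image before any package work happens.
bootcmd:
  - [ 'sh', '-c', 'rm -rf /var/lib/apt/lists/*' ]
# Refresh the lists from the configured mirror, then install.
package_update: true
packages:
  - htop  # placeholder package, for illustration only
```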

description: updated
summary: - [2.5] cloud-init reports failures if local mirror is behind MAAS images
+ [2.5] cloud-init reports failures if local mirror is behind what is
+ included in a MAAS image
description: updated
summary: - [2.5] cloud-init reports failures if local mirror is behind what is
- included in a MAAS image
+ [2.5, enhancement] cloud-init reports failures if local mirror is behind
+ what is included in a MAAS image
Revision history for this message
Lee Trager (ltrager) wrote :

images.maas.io gets its images from cloud-images.ubuntu.com/daily, so this affects all cloud-image users, not just MAAS. I think we should have a larger discussion with foundations as to whether the image should include the apt list cache.

Also, instead of removing the current apt cache, we could force apt-get update to be run on first boot [1]. We need to be careful not to do this in the ephemeral environment, since in some situations the ephemeral environment can currently be used without using apt at all.

[1] https://cloudinit.readthedocs.io/en/latest/topics/examples.html?highlight=apt#update-apt-database-on-first-boot
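
For reference, the cloud-config that [1] describes comes down to a single key. This is a minimal sketch; whether it is sufficient on its own when the mirror is older than the lists in the image is not established by it.

```yaml
#cloud-config
# Run 'apt-get update' on first boot.
package_update: true
```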

Andres Rodriguez (andreserl) wrote : Re: [Bug 1791185] [NEW] [2.5, enhancement] cloud-init reports failures if local mirror is behind what is included in a MAAS image

I think that if users are using a local mirror and, at the same time, are
updating their images constantly, they should ensure the local mirror is
updated as frequently as the images.

That said:

http://www.chiark.greenend.org.uk/~cjwatson/blog/no-more-hash-sum-mismatch-errors.html

https://bugs.launchpad.net/ubuntu/+source/apt/+bug/948461

To me, the fact that APT complains about this is actually accurate, because
it’s highlighting an issue between archive and images.

On Thu, Sep 6, 2018 at 6:50 PM Mike Pontillo <email address hidden>
wrote:

> Public bug reported:
>
> Since MAAS images update so frequently, I often hit intermittent
> failures due to hash mismatches (because the apt list cached in the
> image is more recent than the apt list on my local mirror).
>
> I have found two ways to work around this:
>
> (1) Temporarily switch back to the official Ubuntu archive
>
> (2) Add the following snippet to the cloud-config:
>
> bootcmd:
> - [ 'sh', '-c', 'rm -rf /var/lib/apt/lists/*' ]
>
> It might be nice if a workaround could be incorporated into MAAS, or
> cloud-init itself.
>
> This problem also affects LXD images when used with a cloud-config that
> specifies additional apt packages to be installed.
>
> ** Affects: maas
> Importance: Medium
> Status: Triaged

Mike Pontillo (mpontillo) wrote :

I actually tested forcing an apt-get update with cloud-init, but it doesn't solve the issue (presumably because apt doesn't overwrite the more current lists on disk with the less recent files on the mirror).

Mike Pontillo (mpontillo) wrote :

I agree with the concept that a mirror should not be more than a few days out-of-date, and if so, it should be a red flag. I disagree with the concept that it's okay for that condition to cause intermittent errors elsewhere in the stack, such as in MAAS or LXD. (In other words, this should be a warning, not an error.)

If MAAS could monitor the configured mirror's -updates pocket and warn if it's out-of-date compared to the official archive, that would be a great feature in my mind. (But that's a separate discussion.)

I personally have a cron job that updates both my MAAS image mirror and my Ubuntu mirror every hour. However, I still see this issue from time to time despite that. Reasons for this include:

 - I sometimes mirror from another mirror (geographically closer to me), since it's faster than using the main archive. That mirror might itself be behind the main archive.

 - Since rsyncing the entire Ubuntu archive can take a long time, the cron job for updating the archive might take much longer than the cron job for updating the MAAS images. If network conditions are such that the delay is significant, that's a race condition that can also cause this issue.

 - If I want to test with images on the official maas.io streams instead of my locally mirrored images, I might hit this problem - even if I'm diligent about keeping my image mirror in sync with my Ubuntu archive mirror.

Scott Moser (smoser) wrote :

can you run 'cloud-init collect-logs' and attach output?

cloud-init has a feature where it renames files in /var/lib/apt/lists from "archive.ubuntu.com" to "selected-mirror.tld". That could be the source of your pain, in that it is essentially making apt think that the data it got from archive.ubuntu.com is on selected-mirror.tld.

it does that because if you were delivered these 50+M of data in the image, then it's probably better for everyone if we re-use it rather than just wasting it. But if selected-mirror.tld is older than archive.ubuntu.com it could have caused the problem.

Now, if you're using dns to fake a mirror of archive.ubuntu.com, then you're somewhat getting what you deserve.

That said... I believe that we have started making images without anything in /var/lib/apt/lists. I've argued against that, as for a stable release pocket files (the largest set of them) are static for the life of the release.... so deleting them from the image means you're just going to cause someone to get them on first 'apt-get install'.

Mike Pontillo (mpontillo) wrote :

@smoser, sure, I'll try to gather that the next time I hit this. Though as I said, it's an intermittent problem in my environment.

On DNS poisoning to fake the IP address of 'archive.ubuntu.com': that used to be my strategy, actually. ;-) (That strategy can be really convenient because it makes everything work by default out-of-the-box, with the exception of issues like this one.) But I recently stopped doing that; I had been using cloud-init to specify the IP address of my mirror. So I think your theory about the renamed lists causing the issue is correct.
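
For context, pointing cloud-init at a mirror by address rather than through DNS can look roughly like the following. This is a sketch only: the address is a placeholder from the documentation range, not the mirror discussed in this thread.

```yaml
#cloud-config
apt:
  primary:
    - arches: [default]
      # Placeholder address (192.0.2.0/24 is reserved for documentation).
      uri: http://192.0.2.10/ubuntu
```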

On your last point, isn't the archive updated frequently enough that "apt install" might fail without updating first, regardless? So it's difficult to actually avoid that initial "apt-get update", though I admit that having the cached files in the image might save a little bit of bandwidth in the case where they're the same.

Adam Collard (adam-collard) wrote :

This bug has not seen any activity in the last 6 months, so it is being automatically closed.

If you are still experiencing this issue, please feel free to re-open.

MAAS Team

Changed in maas:
status: Triaged → Invalid