AWS m6g.* instances fail with latest juju

Bug #1872670 reported by Simon Fels
This bug affects 3 people
Affects: Canonical Juju
Status: Fix Released
Importance: High
Assigned to: Ian Booth

Bug Description

The latest version of the juju snap from edge (2.8-beta1+develop-fe4e6f2) fails to allocate m6g.* instances on AWS.

$ juju add-machine --constraints 'instance-type=m6g.4xlarge'
$ juju status
Model Controller Cloud/Region Version SLA Timestamp
default aws-controller aws/us-east-1 2.8-beta1 unsupported 10:58:16+02:00

Machine State DNS Inst id Series AZ Message
0 pending pending bionic failed to start machine 0 (chosen architecture arm64 not present in [amd64]), retrying in 10s (10 more attempts)

That said, the m6g instances are not GA yet and are only available to whitelisted preview customers. However, we should make sure Juju supports them once they are GA.

Simon Fels (morphis) wrote :

Also tried this with an a1.metal:

$ juju add-machine --constraints 'instance-type=a1.metal'
created machine 1
$ juju status
Model Controller Cloud/Region Version SLA Timestamp
default aws-controller aws/us-east-1 2.8-beta1 unsupported 11:32:27+02:00

Machine State DNS Inst id Series AZ Message
1 down pending bionic chosen architecture arm64 not present in [amd64]

Tim Penhey (thumper) wrote :

The edge snap only has jujud binaries for amd64. You'll have to wait for
the official beta1 release for arm64 binaries to be available.

Either that or build a local copy of the juju binary on an arm64 machine,
and bootstrap using that to an arm64 machine.
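The failure reported above can be illustrated with a minimal sketch. It assumes, going by this explanation, that the machine's chosen architecture has to be among the architectures for which agent binaries are available. The function name `architecture_error` is hypothetical, and this is Python for illustration only, not Juju's own code.

```python
def architecture_error(chosen_arch, agent_arches):
    """Return an error message if no agent binary matches the chosen arch.

    `agent_arches` is the set of architectures that agent binaries exist for.
    Returns None when the chosen architecture is covered.
    """
    if chosen_arch in agent_arches:
        return None
    listed = " ".join(sorted(agent_arches))
    return f"chosen architecture {chosen_arch} not present in [{listed}]"


# With only amd64 agent binaries published, an arm64 instance type is rejected.
print(architecture_error("arm64", {"amd64"}))
```

With an edge snap that only ships amd64 agent binaries, any arm64 instance type (m6g.*, a1.*) would hit this check regardless of whether the instance type itself is known.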


Ian Booth (wallyworld) wrote :

Targeting to 2.8.1 so we keep track of the bug against a released Juju. Not sure when these new instances will hit GA, but I'm guessing it will be after 2.8 goes out, which is soon.

Changed in juju:
milestone: none → 2.8.1
importance: Undecided → High
status: New → Triaged
Harry Pidcock (hpidcock) wrote :

I just tested the 2.8/beta snap and we can bootstrap a1 instance types. I don't have an AWS account with m6g instances, since it's still in preview. But I guess that should work, since 2.8/edge was able to create the m6g machine and only failed to copy the agent binaries because they didn't yet exist in simplestreams.

Simon Fels (morphis) wrote :

Thanks Harry!

I have access to the m6g instances and will give this a try with 2.8/edge

Gary.Wang (gary-wzl77) wrote :

With juju 2.7.6 installed from the latest/stable channel of the snap store, I am able to provision a machine on the AWS m6g instance type via juju without any problem.

Tim Penhey (thumper) wrote :

@Simon, if you want arm64 images, you need a "released" version, so 2.8/candidate, or 2.8/stable when it gets there will be fine, but 2.8/edge won't, as we don't publish the arm64 agent binaries for edge builds.

Tim Penhey (thumper) wrote :

Is this actually a problem?

Changed in juju:
status: Triaged → Incomplete
Gary.Wang (gary-wzl77) wrote :

I just gave this a quick try in one of the ap regions that supports the m6g instance type (e.g. ap-northeast-1) with the following two juju versions:
  1. latest/stable: 2.8.0
  2. latest/edge 2.8.1+2.8-686e573

This problem persists.
(It works fine in the eu/us regions, though.)
This issue is blocking one of our customers from deploying our product to the AWS ap region.
It would be great if this issue could be prioritised.

Thanks

Ian Booth (wallyworld)
Changed in juju:
status: Incomplete → Triaged
status: Triaged → Confirmed
Ian Booth (wallyworld) wrote :

We generate the allowed instance types for each region by parsing the data found at

https://pricing.us-east-1.amazonaws.com/offers/v1.0/aws/AmazonEC2/current/index.json

The data is filtered so that instance types which are not listed with operating system type "Linux" are excluded. It turns out that for the ap-northeast-1 region, the "m6g.4xlarge" instance type is labelled as "SUSE" and hence filtered out. More investigation needed....

Ian Booth (wallyworld) wrote :

We also skip instances where the tenancy metadata is not "Shared" (i.e. we skip "Dedicated" and "Host"). The m6g.4xlarge instance in ap-northeast-1 is labelled as "Dedicated". It is interesting that the same instance type is set up differently in different regions, and hence is available to juju in some and not others.
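To make the two filters described above concrete, here is a minimal sketch in Python (not Juju's own code). It assumes the offer file keeps its records under a top-level "products" map whose entries carry an "attributes" object with "instanceType", "location", "operatingSystem" and "tenancy" keys. The function name and the sample records are made up for illustration.

```python
def allowed_instance_types(products, location):
    """Return instance types in `location` that have a Linux, shared-tenancy record."""
    allowed = set()
    for product in products.values():
        attrs = product.get("attributes", {})
        if attrs.get("location") != location:
            continue
        if attrs.get("operatingSystem") != "Linux":
            continue  # a record labelled "SUSE" is dropped here
        if attrs.get("tenancy") != "Shared":
            continue  # "Dedicated" and "Host" records are dropped here
        instance_type = attrs.get("instanceType")
        if instance_type:
            allowed.add(instance_type)
    return sorted(allowed)


# Made-up sample records (hypothetical SKUs) mimicking the situation in this bug.
sample_products = {
    "SKU-A": {"attributes": {"instanceType": "m6g.4xlarge",
                             "location": "US East (N. Virginia)",
                             "operatingSystem": "Linux",
                             "tenancy": "Shared"}},
    "SKU-B": {"attributes": {"instanceType": "m6g.4xlarge",
                             "location": "Asia Pacific (Tokyo)",
                             "operatingSystem": "SUSE",
                             "tenancy": "Dedicated"}},
    "SKU-C": {"attributes": {"instanceType": "m5.large",
                             "location": "Asia Pacific (Tokyo)",
                             "operatingSystem": "Linux",
                             "tenancy": "Shared"}},
}

tokyo_types = allowed_instance_types(sample_products, "Asia Pacific (Tokyo)")
```

In this sample, m6g.4xlarge survives the filter for the us-east-1 location but not for Tokyo, which matches the behaviour seen in ap-northeast-1.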

Ian Booth (wallyworld)
Changed in juju:
assignee: nobody → Ian Booth (wallyworld)
status: Confirmed → In Progress
Ian Booth (wallyworld) wrote :

I rewrote from scratch how we get EC2 instance types, using APIs that were unavailable 7 or 8 years ago. This fixes a number of issues and should mean that any new instance types available to the account associated with the user's credentials automatically become available without juju changes.

https://github.com/juju/juju/pull/11724
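The comment above does not name the APIs involved. As a rough sketch only, assuming the EC2 DescribeInstanceTypes call is one of them, querying instance types at runtime could look like the following. It is written in Python against a boto3-style client rather than in Juju's Go code; `list_instance_types` and the `FakeEC2` stand-in are hypothetical names, and the stand-in exists only so the example runs without AWS credentials.

```python
def list_instance_types(ec2_client):
    """Map each instance type name to its architectures, vCPUs and memory."""
    paginator = ec2_client.get_paginator("describe_instance_types")
    result = {}
    for page in paginator.paginate():
        for info in page.get("InstanceTypes", []):
            result[info["InstanceType"]] = {
                "arches": info.get("ProcessorInfo", {}).get("SupportedArchitectures", []),
                "vcpus": info.get("VCpuInfo", {}).get("DefaultVCpus"),
                "mem_mib": info.get("MemoryInfo", {}).get("SizeInMiB"),
            }
    return result


class _FakePaginator:
    """Stand-in that yields one canned response page."""

    def paginate(self):
        yield {"InstanceTypes": [{
            "InstanceType": "m6g.4xlarge",
            "ProcessorInfo": {"SupportedArchitectures": ["arm64"]},
            "VCpuInfo": {"DefaultVCpus": 16},
            "MemoryInfo": {"SizeInMiB": 65536},
        }]}


class FakeEC2:
    """Stand-in for an EC2 client, used here instead of a real AWS connection."""

    def get_paginator(self, operation_name):
        assert operation_name == "describe_instance_types"
        return _FakePaginator()


types = list_instance_types(FakeEC2())
```

With real credentials, a client created with `boto3.client("ec2", region_name="ap-northeast-1")` could be passed in place of `FakeEC2()`. The appeal of this approach is that the result reflects what the account can actually use in that region, instead of a list derived from pricing metadata.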

Ian Booth (wallyworld)
Changed in juju:
status: In Progress → Fix Committed
Gary.Wang (gary-wzl77) wrote :

I refreshed the juju snap from the 2.8/edge track and now have the following version installed:
```
installed: 2.8.1+2.8-5620ba4 (12806) 72MB classic
```

After deploying and running `juju status`, the following error occurs.
```
2 pending pending bionic chosen architecture arm64 for image "ami-0ff538b6c69311f5e" not present in [amd64]
```
Machine 2 was supposed to be the m6g instance-type based VM.

Meanwhile, I switched the agent-stream to `proposal` or `devel`, but that doesn't help.
Is anything missing from this bug verification?
Thanks.

Ian Booth (wallyworld) wrote :

We don't build packaged agents for the edge snap, so there are as yet no arm agent binaries that can be used. We hope to have a 2.8.1 candidate out next week, when there will be arm agents packaged and available using agent-stream=proposed.

Gary.Wang (gary-wzl77) wrote :

Hi Ian
 Is there an available juju version + arm agent binaries that we can use to deploy things to AWS m6g instance type?

Ian Booth (wallyworld) wrote :

Sorry about the delay. We're still working to get the 2.8.1 candidate snap out. It's close but it has dragged on a bit as we have worked through the backlog of issues needing to be addressed. What we can look to do is perhaps create a beta snap across all architectures. We'd need to do some enablement work on our Jenkins setup for that to be possible.

Gary.Wang (gary-wzl77) wrote :

Thanks, understood.
If you can drop me a message when it's available, that'd be great.

Ian Booth (wallyworld) wrote :

Actually, we do build edge snaps for all arches. And juju will use the agent binary from within the snap if no packaged binaries are found in simplestreams. So, as long as you just want all arm64 machines, you could do this:

(on an arm host)

$ juju bootstrap aws test --constraints 'instance-type=m6g.4xlarge'
Creating Juju controller "test" on aws/us-east-1
Looking for packaged Juju agent version 2.8.1 for arm64
No packaged binary found, preparing local Juju agent binary
Launching controller instance(s) on aws/us-east-1...
 - i-07a3d32183195562a (arch=arm64 mem=64G cores=16)
...
...

$ juju add-machine --constraints 'instance-type=m6g.4xlarge'

etc.

I think that will work. The "preparing local Juju agent binary" message above means it's uploading the jujud from the snap.
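The fallback described here can be summarised in a small sketch. It assumes the local jujud from the snap is only usable when the client host's architecture matches the target machine's, which is why the commands above are run on an arm host. The function name and return values are hypothetical, and this is Python for illustration rather than Juju's own code.

```python
def choose_agent_source(version, target_arch, packaged, local_arch):
    """Pick where the agent binary for a new machine would come from.

    `packaged` is a set of (version, arch) pairs found in simplestreams.
    Returns "packaged", "local", or None if no usable binary exists.
    """
    if (version, target_arch) in packaged:
        return "packaged"
    if local_arch == target_arch:
        # Mirrors "No packaged binary found, preparing local Juju agent binary".
        return "local"
    return None


# An edge build with nothing published for arm64, run from an arm64 host.
source = choose_agent_source("2.8.1", "arm64", set(), "arm64")
```

Run from an amd64 host with the same inputs, the sketch returns None, which corresponds to the "not present in [amd64]" failures earlier in this report.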

Changed in juju:
status: Fix Committed → Fix Released