Bootstrap on Openstack fails if there is an IPv6 subnet

Bug #1761706 reported by thomas
This bug affects 1 person
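+----------------+--------------+------------+-------------------+-----------+
| Affects        | Status       | Importance | Assigned to       | Milestone |
+----------------+--------------+------------+-------------------+-----------+
| Canonical Juju | Fix Released | High       | Eric Claude Jones |           |
| 2.3            | Fix Released | High       | Eric Claude Jones |           |
+----------------+--------------+------------+-------------------+-----------+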

Bug Description

Hi, I am hitting this problem in the agent installation script when trying to deploy a Juju controller on an OpenStack cloud provider (OVH), on an Ubuntu Xenial instance.

The install command is:
juju bootstrap ovh-public-cloud ovh-openstack-sbg1 --config image-metadata-url=https://storage.sbg1.cloud.ovh.net/v1/AUTH_f0c04bb34430403982c05c26a9e934b3/simplestreams/images/ --bootstrap-series xenial --show-log --debug

thanks
thomas

Revision history for this message
thomas (toms130) wrote :
description: updated
Revision history for this message
Anastasia (anastasia-macmood) wrote :

This looks like a panic coming from github.com/juju/juju/network.CalculateOverlaySegment, according to the log provided.

I am triaging this as Critical for 2.3.6 (as according to the log, this was a 2.3.5 bootstrap).

At this stage, I am not sure that the panic exists in develop (heading into 2.4-b1).

@thomas (toms130),

Considering that this seems to be coming from networking code, is there something special about your networking setup? I am not sure how to reproduce this at the moment, since I am pretty sure we test a lot of different bootstrap scenarios.

no longer affects: juju-core
Changed in juju:
status: New → Incomplete
Revision history for this message
thomas (toms130) wrote :

I don't think so. I have the default Ext-Net, which I use for other instances.

openstack network list

+--------------------------------------+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ID | Name | Subnets |
+--------------------------------------+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| 3f4e3b19-4a46-4672-aade-5654d1fc0704 | Ext-Net | 1b0dae3a-4146-4b81-b38a-17d4e5b30f2c, bdc559c9-8f89-4e56-a895-133f60b0262f, d6f02615-65e9-4921-8943-8aa72a744a16, e9e8eec1-5c91-40ec-b471-11a96a151b76 |
+--------------------------------------+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------+

Revision history for this message
Nicholas Skaggs (nskaggs) wrote :

Is it an IPv6-only network?

Revision history for this message
thomas (toms130) wrote :

No, this is a network with both IPv4 and IPv6.
See below the openstack server list output for the generated instance, taken before the rollback:

os server list
+--------------------------------------+--------------------------+--------+-----------------------------------------------------+-----------------+-----------+
| ID | Name | Status | Networks | Image | Flavor |
+--------------------------------------+--------------------------+--------+-----------------------------------------------------+-----------------+-----------+
| 72aaee9c-de21-4a8c-a78b-7d20721506bc | juju-4758ec-controller-0 | ACTIVE | Ext-Net=2001:41d0:401:2000::e:80df, 167.114.243.80 | Ubuntu 16.04 | vps-ssd-2 |

Revision history for this message
Tim Penhey (thumper) wrote :

Step 1: stop the panic, and log the expectations.

runtime error: index out of range
goroutine 1 [running]:
main.Main.func1()
 /workspace/src/github.com/juju/juju/cmd/jujud/main.go:203 +0xbc
panic(0x290d860, 0x493a120)
 /snap/go/1473/src/runtime/panic.go:505 +0x229
github.com/juju/juju/network.CalculateOverlaySegment(0xc42077d360, 0x17, 0xc420317e60, 0xc420317e90, 0x49e61e0, 0xc420237f00, 0x42)
 /workspace/src/github.com/juju/juju/network/fan.go:74 +0x3a5
github.com/juju/juju/state.(*State).SaveSubnetsFromProvider(0xc420166480, 0xc42058c600, 0x7, 0x8, 0x0, 0x0, 0xc42058c600, 0x7)
 /workspace/src/github.com/juju/juju/state/spacesdiscovery.go:124 +0x70d
github.com/juju/juju/state.(*State).ReloadSpaces(0xc420166480, 0x3224180, 0xc420d1e780, 0x0, 0x0)
 /workspace/src/github.com/juju/juju/state/spacesdiscovery.go:55 +0x20e
github.com/juju/juju/agent/agentbootstrap.InitializeState(0x2ebbe81, 0x5, 0x0, 0x0, 0x7f365e531bc0, 0xc4200fa580, 0xc420320f10, 0x0, 0xc420762c00, 0x10, ...)
 /workspace/src/github.com/juju/juju/agent/agentbootstrap/bootstrap.go:226 +0x14f9
main.(*BootstrapCommand).Run.func2(0x7f365e531bc0, 0xc4200fa580, 0xc4200fa580, 0x7f365e531bc0)
 /workspace/src/github.com/juju/juju/cmd/jujud/bootstrap.go:266 +0x42f
github.com/juju/juju/cmd/jujud/agent.(*agentConf).ChangeConfig(0xc4205f1200, 0xc4207d4b00, 0x0, 0x0)
 /workspace/src/github.com/juju/juju/cmd/jujud/agent/agent.go:103 +0xb0
main.(*BootstrapCommand).Run(0xc4205f1230, 0xc4204a2a00, 0x0, 0x0)
 /workspace/src/github.com/juju/juju/cmd/jujud/bootstrap.go:250 +0xbc0
github.com/juju/cmd.(*SuperCommand).Run(0xc4204fc480, 0xc4204a2a00, 0xc4204a2a00, 0x0)
 /workspace/src/github.com/juju/cmd/supercommand.go:456 +0x2c0
github.com/juju/cmd.Main(0x31f6ac0, 0xc4204fc480, 0xc4204a2a00, 0xc42004c090, 0x7, 0x7, 0x0)
 /workspace/src/github.com/juju/cmd/cmd.go:317 +0x266
main.jujuDMain(0xc42004c080, 0x8, 0x8, 0xc4204a2a00, 0x0, 0x0, 0x0)
 /workspace/src/github.com/juju/juju/cmd/jujud/main.go:186 +0x894
main.Main(0xc42004c080, 0x8, 0x8, 0x0)
 /workspace/src/github.com/juju/juju/cmd/jujud/main.go:219 +0x1d9
main.MainWrapper(0xc42004c080, 0x8, 0x8)
 /workspace/src/github.com/juju/juju/cmd/jujud/main.go:194 +0x3f
main.main()
 /workspace/src/github.com/juju/juju/cmd/jujud/main_nix.go:22 +0x45

Revision history for this message
Tim Penhey (thumper) wrote :

Step 2: work around the expectations, either make better decisions or leave fan unconfigured.

Changed in juju:
status: Incomplete → Triaged
importance: Undecided → High
Revision history for this message
John A Meinel (jameinel) wrote : Re: [Bug 1761706] Re: agent installation fails

newFanIP := underlayNet.IP.To4()

^- On that line, newFanIP == nil because the underlying IP is an IPv6 address
and thus doesn't have a v4 representation.
We should do:
if newFanIP == nil {
  continue
}

I do think the user can work around this by specifying
container-networking-method=provider or some other setting that disables fan
(e.g. --model-default container-networking-method=local).
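
The nil check proposed earlier in this comment can be illustrated with a small standalone Go sketch. This is not Juju's code: `overlayCandidates` is a hypothetical helper, and the two CIDRs are assumed prefixes picked to match the addresses shown in the server list (the real subnet sizes are not given in this report). It only shows that `net.IP.To4()` returns nil for an IPv6 subnet, and that skipping such subnets avoids indexing a nil slice.

```go
package main

import (
	"fmt"
	"net"
)

// overlayCandidates is a hypothetical helper, not Juju's implementation.
// It returns the 4-byte form of each subnet's base address and skips any
// subnet that has no IPv4 representation.
func overlayCandidates(cidrs []string) ([]net.IP, error) {
	var out []net.IP
	for _, cidr := range cidrs {
		_, underlayNet, err := net.ParseCIDR(cidr)
		if err != nil {
			return nil, err
		}
		newFanIP := underlayNet.IP.To4()
		if newFanIP == nil {
			// IPv6 subnet: To4 returns nil, so reading newFanIP[0] here
			// would panic with "index out of range".
			continue
		}
		out = append(out, newFanIP)
	}
	return out, nil
}

func main() {
	// Assumed example prefixes; the real subnet CIDRs are not in the report.
	ips, err := overlayCandidates([]string{
		"2001:41d0:401:2000::/56",
		"167.114.243.0/24",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(ips)
}
```

Only the IPv4 subnet survives the loop, which is the behaviour the suggested `continue` is meant to give.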

Revision history for this message
Nicholas Skaggs (nskaggs) wrote : Re: agent installation fails

thomas, it would be useful if you can confirm the workaround John mentioned in comment #8 does indeed unblock you. Try

--model-default container-networking-method=local

as he mentions.

Revision history for this message
thomas (toms130) wrote :

Hi, I tried with model-default config, but it still fails with the same error...

juju bootstrap ovh-public-cloud ovh-openstack-sbg1 --config image-metadata-url=https://storage.sbg1.cloud.ovh.net/v1/AUTH_f0c04bb34430403982c05c26a9e934b3/simplestreams/images/ --bootstrap-series xenial --model-default container-networking-method=local --show-log --debug

...

2018-04-11 14:41:26 DEBUG juju.state spacesdiscovery.go:50 environ does not support space discovery, falling back to subnet discovery
2018-04-11 14:41:28 DEBUG juju.worker runner.go:223 killing runner 0xc42067cb60
2018-04-11 14:41:28 INFO juju.worker runner.go:313 runner is dying
2018-04-11 14:41:28 DEBUG juju.worker runner.go:456 killing "presence"
2018-04-11 14:41:28 DEBUG juju.worker runner.go:456 killing "pingbatcher"
2018-04-11 14:41:28 DEBUG juju.worker runner.go:456 killing "leadership"
2018-04-11 14:41:28 DEBUG juju.worker runner.go:456 killing "singular"
2018-04-11 14:41:28 DEBUG juju.worker runner.go:456 killing "txnlog"
2018-04-11 14:41:28 INFO juju.worker runner.go:483 stopped "txnlog", err: <nil>
2018-04-11 14:41:28 DEBUG juju.worker runner.go:332 "txnlog" done: <nil>
2018-04-11 14:41:28 DEBUG juju.worker runner.go:395 no restart, removing "txnlog" from known workers
2018-04-11 14:41:28 INFO juju.worker runner.go:483 stopped "presence", err: <nil>
2018-04-11 14:41:28 DEBUG juju.worker runner.go:332 "presence" done: <nil>
2018-04-11 14:41:28 DEBUG juju.worker runner.go:395 no restart, removing "presence" from known workers
2018-04-11 14:41:28 INFO juju.worker runner.go:483 stopped "leadership", err: <nil>
2018-04-11 14:41:28 DEBUG juju.worker runner.go:332 "leadership" done: <nil>
2018-04-11 14:41:28 DEBUG juju.worker runner.go:395 no restart, removing "leadership" from known workers
2018-04-11 14:41:28 INFO juju.worker runner.go:483 stopped "pingbatcher", err: <nil>
2018-04-11 14:41:28 DEBUG juju.worker runner.go:332 "pingbatcher" done: <nil>
2018-04-11 14:41:28 DEBUG juju.worker runner.go:395 no restart, removing "pingbatcher" from known workers
2018-04-11 14:41:28 INFO juju.worker runner.go:483 stopped "singular", err: <nil>
2018-04-11 14:41:28 DEBUG juju.worker runner.go:332 "singular" done: <nil>
2018-04-11 14:41:28 DEBUG juju.worker runner.go:395 no restart, removing "singular" from known workers
2018-04-11 14:41:28 DEBUG juju.state open.go:306 closed state without error
2018-04-11 14:41:28 DEBUG juju.cmd.jujud asm_amd64.s:574 jujud complete, code 0, err <nil>
2018-04-11 14:41:28 CRITICAL juju.cmd.jujud main.go:204 Unhandled panic:
runtime error: index out of range
goroutine 1 [running]:
main.Main.func1()
 /workspace/src/github.com/juju/juju/cmd/jujud/main.go:203 +0xbc
panic(0x290d860, 0x493a120)
 /snap/go/1473/src/runtime/panic.go:505 +0x229
github.com/juju/juju/network.CalculateOverlaySegment(0xc42061f480, 0x17, 0xc420582a80, 0xc420582ab0, 0x49e61e0, 0xc4204f0300, 0x42)
 /workspace/src/github.com/juju/juju/network/fan.go:74 +0x3a5
github.com/juju/juju/state.(*State).SaveSubnetsFromProvider(0xc420489d40, 0xc420465200, 0x4, 0x4, 0x0, 0x0, 0xc420465200, 0x4)
 /workspace/src/github.com/juju/juju/state/spacesdiscovery.go:124 +0x70d
github.com/juju/juju/state.(*State).Reload...

Revision history for this message
John A Meinel (jameinel) wrote : Re: [Bug 1761706] Re: agent installation fails

So the code has:
    fans, err := cfg.FanConfig()
    if err != nil {
        return errors.Trace(err)
    }
    if len(fans) == 0 {
        return nil
    }

So it should only be trying to CalculateOverlaySegment if FanConfig is not
empty.

I know we do some amount of autodetection of whether we *could* run the
fan. Can you make sure that fan-config is set to "" ?

What I also don't understand is that the CalculateOverlaySegment is also
doing:

if underlaySize <= subnetSize && fan.Underlay.Contains(underlayNet.IP) {
...
  newFanIP := underlayNet.IP.To4()

I don't quite see how fan.Underlay would end up saying that its CIDR
"Contains" an IPv6 address.

I know we do some amount of "autodetect what a possible fan config could
be", but I haven't found that code yet. It's possible we implemented
something on Openstack that somehow automatically generates fan config that
includes IPv6 addresses, when we know that it never should.
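
The point about `Contains` can be checked in isolation with the standard library. The sketch below is an illustration only: `underlayContains` is a hypothetical helper, and `167.114.0.0/16` is an assumed underlay chosen to cover the IPv4 address from the server list, not a value taken from any real fan configuration. It shows that an IPv4 `net.IPNet` reports a genuine IPv6 address as not contained.

```go
package main

import (
	"fmt"
	"net"
)

// underlayContains is a hypothetical helper: it reports whether the network
// described by underlayCIDR contains the address addr.
func underlayContains(underlayCIDR, addr string) (bool, error) {
	_, underlay, err := net.ParseCIDR(underlayCIDR)
	if err != nil {
		return false, err
	}
	ip := net.ParseIP(addr)
	if ip == nil {
		return false, fmt.Errorf("invalid IP address %q", addr)
	}
	return underlay.Contains(ip), nil
}

func main() {
	// Assumed underlay; the addresses are the ones from the server list.
	const underlay = "167.114.0.0/16"
	for _, addr := range []string{"167.114.243.80", "2001:41d0:401:2000::e:80df"} {
		ok, err := underlayContains(underlay, addr)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s contains %s: %v\n", underlay, addr, ok)
	}
}
```

This is consistent with the question raised above: if the check really ran against an IPv4 underlay, the IPv6 address should not have passed it, so the IPv6 value must be reaching `To4()` by some other route.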

Revision history for this message
thomas (toms130) wrote : Re: agent installation fails

Hi John, I tried adding --model-default fan-config="" to be sure, but I get a similar result.

John A Meinel (jameinel)
Changed in juju:
assignee: nobody → Eric Claude Jones (ecjones)
Revision history for this message
Eric Claude Jones (ecjones) wrote :
Changed in juju:
milestone: none → 2.4-beta1
status: Triaged → Fix Committed
John A Meinel (jameinel)
summary: - agent installation fails
+ Bootstrap on Openstack fails if there is an IPv6 subnet
Changed in juju:
status: Fix Committed → Fix Released