network-get fails to find configs on 2.3.0 that worked on 2.2.6

Bug #1737058 reported by William Grant on 2017-12-08
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone
juju | | Critical | Eric Claude Jones |
2.3 | | Critical | Eric Claude Jones |

Bug Description

After upgrading one of my controllers from Juju 2.2.6 to 2.3.0 this morning (model still 2.2.6):

  $ juju run --unit=telegraf/0 'network-get --primary-address prometheus-client'
  ERROR no network config found for binding "prometheus-client"

On the production equivalent, which is still fully Juju 2.2.6:

  $ juju run --unit=telegraf/0 'network-get --primary-address prometheus-client'
  10.50.92.21

Ian Booth (wallyworld) on 2017-12-08
Changed in juju:
milestone: none → 2.3.1
status: New → Triaged
importance: Undecided → Critical
assignee: nobody → Witold Krecicki (wpk)
Ian Booth (wallyworld) wrote :

wgrant did some digging on their staging server:
[13:59:11] <wgrant> Hmmh mmm
[13:59:19] <wgrant> Are subnets new in 2.3?
[13:59:53] <wgrant> Trying to debug this 2.3.0 upgrade network-get thing. New models are fine.
[14:00:08] <wgrant> But the only subnets in the DB are for the new model
[14:06:36] <wgrant> Looking at the code, it does not seem implausible that this is the problem.
[14:06:51] <wgrant> getOneNetworkConfig requires a corresponding subnet to exist to match the IP to the space.
[14:10:22] <wgrant> Oh interesting.
[14:11:26] <wgrant> I have a second model on the prod controller (now 2.2.6) that was created in 2.2.0, and it has subnets, while the controller model and main model (created in 2.0 or 2.1) don't have subnets.
[14:11:48] <wgrant> So I guess subnets were new in 2.2, and 2.3 breaks if they're not there, but no upgrade added them from 2.1 and earlier?
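
To make the hypothesis in the log above concrete, here is a minimal sketch of matching an IP address to a space through a subnet table. It is not Juju's code: the helper names (ip_to_int, ip_in_cidr, space_for_ip), the 10.50.92.0/24 subnet, and the space name are all assumptions made for illustration. It only shows why a lookup of this shape finds nothing when the model has no subnet records.

```shell
# Hypothetical helpers, not part of Juju.

# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Succeed if address $1 falls inside CIDR block $2 (e.g. 10.50.92.0/24).
ip_in_cidr() {
    net=${2%/*}
    bits=${2#*/}
    mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
    [ $(( $(ip_to_int "$1") & $mask )) -eq $(( $(ip_to_int "$net") & $mask )) ]
}

# Read "CIDR SPACE" lines on stdin and print the space of the first
# subnet containing address $1. Fails if no subnet matches, which is
# also what happens when the subnet table is empty.
space_for_ip() {
    while read -r cidr space; do
        if ip_in_cidr "$1" "$cidr"; then
            echo "$space"
            return 0
        fi
    done
    return 1
}
```

With an assumed subnet record, `printf '10.50.92.0/24 example-space\n' | space_for_ip 10.50.92.21` prints the space name; with no records on stdin the function fails, which is the situation wgrant suspected for models created before 2.2.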

Ian Booth (wallyworld) wrote :

I'm not sure the above is entirely relevant here: from what I can see, getOneNetworkConfig() is on v4 of the uniter, which I think was made obsolete by v5 or later in 2.2.6.

Ian Booth (wallyworld) wrote :

The 2.2 branch uses uniter v6, which uses NetworkInfo instead of NetworkConfig.

William Grant (wgrant) wrote :

I can't find obvious differences in the DB state, so I wonder if this is just protocol incompatibility. Unfortunately that's difficult to test due to 2.3.0 agents not working on a model upgraded from 2.2.6 (bug #1737107).

Changed in juju:
milestone: 2.3.1 → none
Tim Penhey (thumper) on 2017-12-10
Changed in juju:
milestone: none → 2.3.2
William Grant (wgrant) wrote :

This seems to be a protocol incompatibility. After upgrading the controller to 2.3.1, 2.2.6 agents still couldn't network-get. Once those agents were upgraded to 2.3.1, network-get worked fine.
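
Since the failure shows up when 2.2.x agents talk to a 2.3.x controller, a small check for that mismatch may help when deciding which models to upgrade. This is only a sketch: series_of and model_lags_controller are hypothetical helper names, and the assumption that any model on an older major.minor series is affected comes from the reports in this bug, not from the Juju source.

```shell
# Hypothetical helpers, not part of Juju.

# Reduce a version such as "2.3.1" or "2.4-beta1" to its "major.minor" series.
series_of() {
    major=${1%%.*}
    rest=${1#*.}
    minor=${rest%%.*}
    minor=${minor%%[!0-9]*}
    echo "$major.$minor"
}

# Succeed if the model's agent version ($2) is on an older series
# than the controller's version ($1).
model_lags_controller() {
    c=$(series_of "$1")
    m=$(series_of "$2")
    cmaj=${c%.*}; cmin=${c#*.}
    mmaj=${m%.*}; mmin=${m#*.}
    [ "$mmaj" -lt "$cmaj" ] || { [ "$mmaj" -eq "$cmaj" ] && [ "$mmin" -lt "$cmin" ]; }
}

# Example use (assumes "juju model-config -m MODEL agent-version" prints
# the model's agent version; "mymodel" is a placeholder):
#   if model_lags_controller 2.3.1 "$(juju model-config -m mymodel agent-version)"; then
#       juju upgrade-juju -m mymodel --agent-version 2.3.1
#   fi
```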

Changed in juju:
assignee: Witold Krecicki (wpk) → Eric Claude Jones (ecjones)
Eric Claude Jones (ecjones) wrote :
Changed in juju:
status: Triaged → In Progress
Changed in juju:
status: In Progress → Fix Committed
Felipe Reyes (freyes) on 2018-01-10
tags: added: sts
Alvaro Uría (aluria) on 2018-01-10
tags: added: canonical-bootstack
removed: sts
Mario Splivalo (mariosplivalo) wrote :

I hit the same issue upgrading my environment from 2.2.5 to 2.3.1.
I have two models, one with openstack on trusty and one with openstack on xenial.
Upgrading the controller model to 2.3.1 put all of the units in the trusty model into an error state (the xenial model was fine).
Upgrading both models to 2.3.1 resolved all of the issues (xenial was fine before, but trusty one got fixed too).

2.3.2 will fix this so that upgrading 2.2.X => 2.3.2 in the controller
won't break the existing 2.2 models. But yes, the current workaround is to
upgrade all models to 2.3.1. 2.3.2 should be out this week.

On Tue, Jan 16, 2018 at 2:30 PM, Mario Splivalo <
<email address hidden>> wrote:

> I hit the same issue upgrading my environment from 2.2.5 to 2.3.1.
> I have two models, one with openstack on trusty and one with openstack on
> xenial.
> Upgrading the controller model to 2.3.1 rendered all of the units on
> trusty model to be in error state (xenial model was fine).
> Upgrading both models to 2.3.1 resolved all of the issues (xenial was fine
> before, but trusty one got fixed too).

Changed in juju:
status: Fix Committed → Fix Released
William Grant (wgrant) wrote :

2.3.2, 2.3.3 and 2.3.4 are still broken.

$ juju bootstrap --agent-version=2.2.6 --bootstrap-series=xenial
$ juju deploy cs:mysql
$ juju run --unit=mysql/0 "network-get --primary-address cluster"
10.50.79.120
$ juju upgrade-juju -m controller --agent-version 2.3.2
started upgrade to 2.3.2
$ juju run --unit=mysql/0 "network-get --primary-address cluster"
ERROR no network config found for binding "cluster"
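
The check in the reproduction above can be wrapped in a small helper so the same test can be repeated for several units and bindings after a controller upgrade. This is a sketch only: check_binding is a hypothetical name, and treating either a non-zero exit status or output starting with "ERROR" as a failure is an assumption about how `juju run` reports the problem.

```shell
# Hypothetical helper: report whether network-get resolves a binding
# for a unit. Usage: check_binding UNIT BINDING
check_binding() {
    unit=$1
    binding=$2
    if out=$(juju run --unit="$unit" "network-get --primary-address $binding" 2>&1); then
        case $out in
            ERROR*) ;;  # fall through to the failure report below
            *) echo "ok $unit $binding $out"; return 0 ;;
        esac
    fi
    echo "FAIL $unit $binding: $out"
    return 1
}

# Example, using the unit and binding from the reproduction:
#   check_binding mysql/0 cluster
```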

Ian Booth (wallyworld) on 2018-03-15
Changed in juju:
status: Fix Released → Triaged
milestone: 2.3.2 → 2.3.5
Eric Claude Jones (ecjones) wrote :
Changed in juju:
status: Triaged → Fix Committed
Ian Booth (wallyworld) on 2018-03-16
Changed in juju:
milestone: 2.3.5 → 2.4-beta1
John A Meinel (jameinel) wrote :

Marking this as Fix Released in 2.4 because there hasn't been a 2.4 release without this fix, and it is fixed in the branch.

Changed in juju:
status: Fix Committed → Fix Released