juju upgrade connection shutdown unknown series

Bug #1517632 reported by Wayne Witzel III
This bug affects 3 people
Affects          Status      Importance   Assigned to   Milestone
Canonical Juju   Won't Fix   High         Unassigned
juju-core        Won't Fix   Critical     Unassigned
1.25             Won't Fix   Critical     Unassigned

Bug Description

As seen in
    http://reports.vapour.ws/releases/issue/558c2d4b749a562bca8600d6

Juju 1.x fails to upgrade when it sees a series it does not know about. The client and/or the state-server check distro-info-data. The connection is shut down when the streams have an agent for a series that is not listed in distro-info-data.

Reproduce:

On a host that does not know about the current juju devel series (zesty), attempt an upgrade of a state-server.

juju upgrade-juju --version 1.20.10
available tools:
    1.25.10-centos7-amd64
    1.25.10-precise-amd64
    1.25.10-precise-arm64
    1.25.10-precise-ppc64el
    1.25.10-precise-s390x
    1.25.10-trusty-amd64
    1.25.10-trusty-arm64
    1.25.10-trusty-ppc64el
    1.25.10-trusty-s390x
    1.25.10-win10-amd64
    1.25.10-win2012-amd64
    1.25.10-win2012hv-amd64
    1.25.10-win2012hvr2-amd64
    1.25.10-win2012r2-amd64
    1.25.10-win7-amd64
    1.25.10-win8-amd64
    1.25.10-win81-amd64
    1.25.10-xenial-amd64
    1.25.10-xenial-arm64
    1.25.10-xenial-ppc64el
    1.25.10-xenial-s390x
    1.25.10-yakkety-amd64
    1.25.10-yakkety-arm64
    1.25.10-yakkety-ppc64el
    1.25.10-yakkety-s390x
ERROR connection is shut down

The log files show nothing after the sync and upload, just the shutdown of the connection.

WORKAROUND:
On the client's host, run:
sudo apt-get install distro-info-data

Changed in juju-core:
milestone: none → 1.26-alpha2
Changed in juju-core:
status: New → Triaged
importance: Undecided → High
Ian Booth (wallyworld) wrote :

Some experimentation has shown the following.

$ juju --version
1.25.1-wily-amd64

Start out with agent-stream=debug in environments.yaml

$ juju bootstrap --upload-tools
$ juju upgrade-juju --version 1.26-alpha1

The above works.

However, if we bootstrap without the agent-stream set to debug, and then after bootstrap:

$ juju set-env agent-version=debug
$ juju upgrade-juju --version 1.26-alpha1
ERROR cmd supercommand.go:448 no matching tools available

Looking at the server side logs, we see simplestreams data is being read from
"https://streams.canonical.com/juju/tools/streams/v1/index2.sjson"

but there's a claim that information is missing:

machine-0: 2015-11-19 05:31:16 DEBUG juju.environs.simplestreams simplestreams.go:429 read metadata index at "https://streams.canonical.com/juju/tools/streams/v1/index2.sjson"
machine-0: 2015-11-19 05:31:16 DEBUG juju.environs.simplestreams simplestreams.go:433 skipping index "https://streams.canonical.com/juju/tools/streams/v1/index2.sjson" because of missing information: "content-download" data not found

Examining the JSON data received by Juju shows that the content-download field is not missing and the data is as expected.

Further investigation required.

Ian Booth (wallyworld) wrote :

Ah, ignore me, I mistyped the stream name. It should have been "devel", and when I use that everything works as expected.

i.e.
- bootstrap 1.25 with upload tools
- upgrade to 1.26-alpha1 without upload tools

Ian Booth (wallyworld) wrote :

Looking at upgrade-juju, it queries the agent to get a list of available tools and then proceeds to choose one. If --version is specified when invoking the upgrade, that specific version is searched for.

The list of tools returned by the agent appears incorrect because:
1. it only includes tools for 1.21 and 1.22
2. it includes an entry for tools with version 0.0.0 and series "" and arch "" (obviously empty)

Previous logging on a custom built 1.25 agent does show that the metadata from simplestreams is incorrect. See https://bugs.launchpad.net/juju-core/+bug/1507867/comments/26
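
The selection step described above (take the list of tools from the agent, then look for the requested version) can be sketched roughly as follows. This is an illustration only, not the upgrade-juju implementation: the tool type and the parseTool and matchVersion helpers are hypothetical, and the sample names in main are made up. Tool names are split from the right because the version itself may contain a hyphen (for example 1.26-alpha1), and entries with an empty series or arch, like the 0.0.0 entry mentioned above, are rejected.

```go
package main

import (
	"fmt"
	"strings"
)

// tool is a hypothetical stand-in for one entry of the agent's tools list.
type tool struct {
	Version, Series, Arch string
}

// parseTool splits a name such as "1.26-alpha1-trusty-amd64" into its
// version, series and arch. The last two dash-separated fields are taken
// as series and arch; everything before them is the version.
func parseTool(name string) (tool, error) {
	parts := strings.Split(name, "-")
	n := len(parts)
	if n < 3 {
		return tool{}, fmt.Errorf("malformed tools name %q", name)
	}
	t := tool{
		Version: strings.Join(parts[:n-2], "-"),
		Series:  parts[n-2],
		Arch:    parts[n-1],
	}
	if t.Version == "" || t.Series == "" || t.Arch == "" {
		return tool{}, fmt.Errorf("empty field in tools name %q", name)
	}
	return t, nil
}

// matchVersion returns the well-formed tools whose version equals want.
func matchVersion(names []string, want string) []tool {
	var out []tool
	for _, name := range names {
		t, err := parseTool(name)
		if err != nil {
			continue // skip malformed or empty entries
		}
		if t.Version == want {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	// Illustrative names only; not taken from a real agent response.
	names := []string{"1.21.3-trusty-amd64", "1.22.8-trusty-amd64", "0.0.0--", "1.26-alpha1-trusty-amd64"}
	for _, t := range matchVersion(names, "1.26-alpha1") {
		fmt.Printf("%s %s %s\n", t.Version, t.Series, t.Arch)
	}
}
```

With a list that only contains 1.21 and 1.22 tools, a search for 1.26-alpha1 in this sketch returns nothing, which matches the "no matching tools available" symptom.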

Ian Booth (wallyworld) wrote :

Since we were seeing Connection Closed when using upload-tools as a workaround for not being able to download tools from simplestreams, I attempted to get the upgrade working via a one-off backdoor workaround. Steps:

- create a custom juju client which uploads tools without the incremented build number
  - this creates a state cached copy of tools which look like an officially downloaded version
- create a custom juju client which patches the result of a FindTools() API call
  - this works around the issue where the server is not reading the correct simplestreams data

The net result of the above is that we can go:

juju upgrade-juju --version 1.26-alpha1

and it will act as if the official tools are found (no upload-tools).

Sadly, it gets to the same point as when upload-tools is used - a version of tools that Juju is happy with is found, the tools that will be used are logged to stdout, but then a connection closed error occurs before the upgrade can be initiated.

Changed in juju-core:
milestone: 1.26-alpha2 → 1.26-beta1
Changed in juju-core:
assignee: nobody → Dave Cheney (dave-cheney)
Changed in juju-core:
assignee: Dave Cheney (dave-cheney) → nobody
Changed in juju-core:
assignee: nobody → Cheryl Jennings (cherylj)
Cheryl Jennings (cherylj) wrote :

I can't recreate using the steps above, nor in the case of 1.22.8 (with upload-tools) -> 1.26-alpha1. Will need to get with Wayne / Ian to get logs from a recreate.

Changed in juju-core:
assignee: Cheryl Jennings (cherylj) → Dave Cheney (dave-cheney)
Dave Cheney (dave-cheney) wrote :

I believe I have reproduced the issue, but at the same time, it does not appear to be a bug.

lucky(~/src/github.com/juju/juju) % juju set-env agent-stream=devel
WARNING key "agent-stream" is not defined in the current environment configuration: possible misspelling
lucky(~/src/github.com/juju/juju) % juju upgrade-juju --version 1.26-alpha1
available tools:
    1.26-alpha1-centos7-amd64
    1.26-alpha1-precise-amd64
    1.26-alpha1-precise-i386
    1.26-alpha1-trusty-amd64
    1.26-alpha1-trusty-arm64
    1.26-alpha1-trusty-i386
    1.26-alpha1-trusty-ppc64el
    1.26-alpha1-vivid-amd64
    1.26-alpha1-vivid-arm64
    1.26-alpha1-vivid-i386
    1.26-alpha1-vivid-ppc64el
    1.26-alpha1-wily-amd64
    1.26-alpha1-wily-arm64
    1.26-alpha1-wily-armhf
    1.26-alpha1-wily-i386
    1.26-alpha1-wily-ppc64el
    1.26-alpha1-win2012-amd64
    1.26-alpha1-win2012hv-amd64
    1.26-alpha1-win2012hvr2-amd64
    1.26-alpha1-win2012r2-amd64
    1.26-alpha1-win7-amd64
    1.26-alpha1-win8-amd64
    1.26-alpha1-win81-amd64
best version:
    1.26-alpha1

Dave Cheney (dave-cheney) wrote :

Just seems to get worse with master

lucky(~/src/github.com/juju/juju) % juju set-env agent-stream=devel
2015/11/25 13:11:46 warning: discarding cookies in invalid format (error: json: cannot unmarshal object into Go value of type []cookiejar.entry)
2015/11/25 13:11:47 warning: discarding cookies in invalid format (error: json: cannot unmarshal object into Go value of type []cookiejar.entry)

Dave Cheney (dave-cheney) wrote :

Cannot reproduce the bug. Despite various warnings, the upgrade worked as expected.

lucky(~/src/github.com/juju/juju) % juju status
environment: ap-southeast-2
machines:
  "0":
    agent-state: started
    agent-version: 1.26-alpha1
    dns-name: 54.253.14.19
    instance-id: i-9d563442
    instance-state: running
    series: precise
    hardware: arch=amd64 cpu-cores=1 cpu-power=300 mem=3840M root-disk=8192M availability-zone=ap-southeast-2a
    state-server-member-status: has-vote
services: {}

Changed in juju-core:
assignee: Dave Cheney (dave-cheney) → nobody
Cheryl Jennings (cherylj) wrote :

Jill is working on testing an upgrade to either 1.25.1 or 1.26-alpha1 in their environment. We'll need her logs if she runs into this issue again.

Changed in juju-core:
assignee: nobody → Cheryl Jennings (cherylj)
Jill Rouleau (jillrouleau) wrote :

Attached logs for upgrade attempt following https://github.com/wwitzel3/juju-upgrade-hack/blob/master/README.md
Data tried: https://pastebin.canonical.com/144844/
Result was a mongo stack trace, so it could be my edited js on this one?

Jill Rouleau (jillrouleau) wrote :

A cleaned-up setting.js for the above process worked for the most part; however, out of 128 agents, I have 10 still on 1.22.8 and 59 with errors.
Logs for affected units/machines at https://private-fileshare.canonical.com/~jillr/logs-2015-11-25.tgz

Changed in juju-core:
milestone: 1.26-beta1 → 2.0-alpha2
Cheryl Jennings (cherylj) wrote :

I am working with the test team to try and debug this issue, as they are now seeing it in CI tests.

Changed in juju-core:
importance: High → Critical
Curtis Hovey (sinzui)
tags: added: upgrade-juju
summary: - juju upgrade-juju after upload-tools fails
+ juju upgrade-juju can fail if host has outdated distro-info
summary: - juju upgrade-juju can fail if host has outdated distro-info
+ juju upgrade-juju can fail if client has outdated distro-info
Cheryl Jennings (cherylj) wrote : Re: juju upgrade-juju can fail if client has outdated distro-info

This problem occurs when the distro-info on the client running the upgrade-juju command is out of date compared with that of the state server. The state server will send back a list of tools that includes series that the client doesn't know about.

This will cause the unmarshal code to fail in the rpc layer, which causes the connection to shut down. At this point, the client cannot request the upgrade as the connection has died.

This problem manifests in two ways:
1 - When uploading tools with upgrade-juju, you will see the "connection is shut down" error.
2 - When running upgrade-juju without upload-tools, the upgrade will fail with "no matching tools available", even if there are matching tools available in the agent-stream.

The workaround for this issue is to update the distro-info on the client running the upgrade command (the distro-info-data package).
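
To illustrate the failure mode described in this comment, here is a minimal sketch of how a single unknown series can make the decoding of an entire tools list fail. It is not Juju's rpc or version code: the binary type, the knownSeries set (standing in for what the client learns from distro-info-data) and the decodeTools helper are all assumptions made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// knownSeries stands in for the series the client knows from its local
// distro-info-data. "zesty" is deliberately absent.
var knownSeries = map[string]bool{
	"precise": true, "trusty": true, "xenial": true, "yakkety": true,
}

// binary is a hypothetical version-series-arch value.
type binary struct {
	Number, Series, Arch string
}

// UnmarshalJSON decodes a string like "1.25.10-trusty-amd64" and rejects
// any series that is not in knownSeries.
func (b *binary) UnmarshalJSON(data []byte) error {
	var s string
	if err := json.Unmarshal(data, &s); err != nil {
		return err
	}
	parts := strings.Split(s, "-")
	n := len(parts)
	if n < 3 {
		return fmt.Errorf("malformed binary version %q", s)
	}
	series := parts[n-2]
	if !knownSeries[series] {
		return fmt.Errorf("unknown series %q", series)
	}
	*b = binary{Number: strings.Join(parts[:n-2], "-"), Series: series, Arch: parts[n-1]}
	return nil
}

// decodeTools decodes a JSON array of tools. One bad element fails the
// whole call, so the caller gets no usable list at all.
func decodeTools(payload []byte) ([]binary, error) {
	var tools []binary
	if err := json.Unmarshal(payload, &tools); err != nil {
		return nil, err
	}
	return tools, nil
}

func main() {
	payload := []byte(`["1.25.10-trusty-amd64", "1.25.10-zesty-amd64"]`)
	tools, err := decodeTools(payload)
	fmt.Println("tools:", tools, "error:", err)
}
```

In this sketch the trusty entry is perfectly valid, yet the caller receives only an error because the zesty entry cannot be decoded. That mirrors why updating distro-info-data on the client, rather than changing anything on the state server, is the suggested workaround.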

Changed in juju-core:
importance: Critical → High
Changed in juju-core:
milestone: 2.0-alpha2 → 2.0-alpha3
Changed in juju-core:
milestone: 2.0-alpha3 → 2.0-beta4
Changed in juju-core:
milestone: 2.0-beta4 → 2.0.1
Curtis Hovey (sinzui)
Changed in juju-core:
assignee: Cheryl Jennings (cherylj) → nobody
affects: juju-core → juju
Changed in juju:
milestone: 2.0.1 → none
milestone: none → 2.0.1
Changed in juju-core:
importance: Undecided → Critical
status: New → Won't Fix
Curtis Hovey (sinzui)
Changed in juju:
milestone: 2.0.1 → none
Curtis Hovey (sinzui)
summary: - juju upgrade-juju can fail if client has outdated distro-info
+ upgrade-juju connection shutdown unknown series
summary: - upgrade-juju connection shutdown unknown series
+ juju upgrade connection shutdown unknown series
Curtis Hovey (sinzui)
description: updated
Curtis Hovey (sinzui)
tags: added: ci regression
description: updated
Heather Lanigan (hmlanigan) wrote :

This hasn't been seen in many years. Work has been done around juju and distro info in the interim. Please file a new bug if seen again.

Changed in juju:
status: Triaged → Won't Fix