controller fails to bring up jujud machine

Bug #1871224 reported by Daniel Bidwell
This bug affects 3 people
Affects: Canonical Juju
Status: Fix Released
Importance: Medium
Assigned to: Heather Lanigan
Milestone: (not listed)

Bug Description

'/var/lib/juju/tools/machine-0/jujud' machine --data-dir '/var/lib/juju' --machine-id 0 --debug -v
generates:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xbed89b]

goroutine 185 [running]:
github.com/juju/juju/agent.(*configInternal).MongoInfo(0xc000afc2c0, 0xc000408090, 0x10)
 /workspace/_build/src/github.com/juju/juju/agent/agent.go:848 +0x1fb
github.com/juju/juju/cmd/jujud/agent.openStatePool(0x48c05a0, 0xc000afc2c0, 0x6fc23ac00, 0xdf8475800, 0x0, 0x4011e88, 0xc000408090, 0x0, 0xc0004080a0, 0x0, ...)
 /workspace/_build/src/github.com/juju/juju/cmd/jujud/agent/machine.go:1190 +0x63
github.com/juju/juju/cmd/jujud/agent.(*MachineAgent).initState(0xc0009fe360, 0x48c05a0, 0xc000afc2c0, 0xc0009fe360, 0x48c05a0, 0xc000afc2c0)
 /workspace/_build/src/github.com/juju/juju/cmd/jujud/agent/machine.go:1020 +0x189
github.com/juju/juju/worker/state.Manifold.func1(0x47e2ba0, 0xc000ab9e60, 0x1b, 0xc000463380, 0x1, 0x1)
 /workspace/_build/src/github.com/juju/juju/worker/state/manifold.go:88 +0x1e1
github.com/juju/juju/vendor/gopkg.in/juju/worker.v1/dependency.(*Engine).runWorker.func1(0x0, 0x0, 0x0, 0x0)
 /workspace/_build/src/github.com/juju/juju/vendor/gopkg.in/juju/worker.v1/dependency/engine.go:504 +0x355
github.com/juju/juju/vendor/gopkg.in/juju/worker.v1/dependency.(*Engine).runWorker.func2(0x3adb4e0, 0xc000acefc0)
 /workspace/_build/src/github.com/juju/juju/vendor/gopkg.in/juju/worker.v1/dependency/engine.go:508 +0x62
github.com/juju/juju/vendor/gopkg.in/juju/worker.v1/dependency.(*Engine).runWorker(0xc000368380, 0x3e9511a, 0x5, 0x989680, 0xc0001fd980, 0xc000ab9e60)
 /workspace/_build/src/github.com/juju/juju/vendor/gopkg.in/juju/worker.v1/dependency/engine.go:539 +0x116
created by github.com/juju/juju/vendor/gopkg.in/juju/worker.v1/dependency.(*Engine).requestStart
 /workspace/_build/src/github.com/juju/juju/vendor/gopkg.in/juju/worker.v1/dependency/engine.go:414 +0x50c

The controller is down. It fails with versions 2.7.2-bionic-amd64, 2.7.0, 2.6.9, and 2.6.8, so it does not seem to be version specific.

bootstrap-params are:
controller-config:
  api-port: 17070
  api-port-open-delay: 2s
  audit-log-capture-args: false
  audit-log-exclude-methods:
  - ReadOnlyMethods
  audit-log-max-backups: 10
  audit-log-max-size: 300M
  auditing-enabled: true
  ca-cert: |
    -----BEGIN CERTIFICATE-----
    MIIDrTCCApWgAwIBAgIVANJsr/4oM80frVZeSTZ1TSqbMjeIMA0GCSqGSIb3DQEB
    CwUAMG4xDTALBgNVBAoTBGp1anUxLjAsBgNVBAMMJWp1anUtZ2VuZXJhdGVkIENB
    IGZvciBtb2RlbCAianVqdS1jYSIxLTArBgNVBAUTJGYxZTIzNmNkLTYyOTItNDA3
    YS04NWU5LWQxNTJjYzA3YTVhYzAeFw0xOTAyMTgwMjI3MDBaFw0yOTAyMjUwMjI2
    NTlaMG4xDTALBgNVBAoTBGp1anUxLjAsBgNVBAMMJWp1anUtZ2VuZXJhdGVkIENB
    IGZvciBtb2RlbCAianVqdS1jYSIxLTArBgNVBAUTJGYxZTIzNmNkLTYyOTItNDA3
    YS04NWU5LWQxNTJjYzA3YTVhYzCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoC
    ggEBAJpXjTerhY3/x9P6HSlo1tWNOZKlyXwvcVLb31FWEzED/RyQxhPYJMd8v6Qu
    akj1Gij/SLQaO9cqsDqx2g7CNcIf7D5npCGZZHpSzFf9yNc+zgPs7SqSCRXC8z9L
    I1gGFLVj8McMfpW9ac6UivPAo0rV31/YSDi7+XPk4oLfbW7kBKeWIRePMKjxMHcP
    WZBfagE5izW7VaNh4jNNW+DKfbFPh2AA5lthvvZ7dLCKmpzdE0V4Gen1ZjkNv52B
    TwVrGJ4+B5pg93xgItVjQLie8bWTXuwLfcOX8dw4VoVZwOdbWmEhIZQaIZLid2Qm
    pmC+Z4dPWHce/JYi6Hir00PYYUsCAwEAAaNCMEAwDgYDVR0PAQH/BAQDAgKkMA8G
    A1UdEwEB/wQFMAMBAf8wHQYDVR0OBBYEFJGpxuFK2/78FXljV0rkrwofsyn0MA0G
    CSqGSIb3DQEBCwUAA4IBAQBOklvjR50HCNGq216FH77yFLC7EQRE0PLo6+4/fwq4
    5DJntNsicaK2nix0cUKr8CKKPt0LwVeaOWqNiiOTNCpWyGrF7R/Y0+98cLboI8Ot
    rpJgukfj967qBIZQohT2YaZ7SmnQl73LHigtR4/TYbtQHGmT1+hXe1komEz4KQXr
    tH3V2MzUADTQu/T9dEUWzMAS5cLIEwrWjmN8BuYTbEklwKBvDYuk/nmP6rrdOewl
    /OezsL5ULOboHPM7hjfdwougrXV6BbtIiok9WeIfUGmb0aKrhy5o7KK+1KTwG5sM
    j26Z8K6xy923yNVJzmLDcxGN4O1L+I1MMnO0p4ARimby
    -----END CERTIFICATE-----
  charmstore-url: https://api.jujucharms.com/charmstore
  controller-uuid: 2b10c3d9-a7ab-426b-8d88-b53ee35da73d
  max-logs-age: 72h
  max-logs-size: 4096M
  max-prune-txn-batch-size: 1000000
  max-prune-txn-passes: 100
  max-txn-log-size: 10M
  metering-url: https://api.jujucharms.com/omnibus/v3
  prune-txn-query-count: 1000
  prune-txn-sleep-time: 10ms
  set-numa-control-policy: false
  state-port: 37017
controller-model-config:
  agent-metadata-url: ""
  agent-stream: released
  agent-version: 2.5.1
  apt-ftp-proxy: ""
  apt-http-proxy: ""
  apt-https-proxy: ""
  apt-mirror: ""
  apt-no-proxy: ""
  authorized-keys: |
    ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDEf8cyk73qasDA8NenREv2w7mh8EiTXlpdHUr5UgOwVqWTIonI8GcpETzJGcP+mizZUQh/D1DQn2K5EUkeS0wFrMTWbbFDCtfRF/Wd4HDZ8UNJxxG1X5KcnhvKLeBAnFwSraR0IlEBWgkE/p7P6JdvSujVOqCcRVqoGnZJRCHmtikt1RKsg3ZeJPQXZERh4Zm+iEQobrJJnuFlQxdNL9QXmdVhWEw4QW/Gvf/3Dau/ga9uK/0bJLUptymckl33WzuWvK3tNkAI+xuqFnIcVlDF9lRWD6uapo2ayizqIjIx/de7xfgUN+945s1eUBlFg/oHKMZjIVFcrkd/ZSLU7dkT juju-client-key
    ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDU8V64uPSRxvs+FS3LmsWjEqEcI8LKHqBIwjTjC0SqAk3rt4j8CA44jUTsILfJkitveaAtTU/qhY3/+u6kRh0KnpvihKoPm2//g5YvDn3rJsexz72ab0fDkw1KmQIJW6rnzgMYG3kkrVivxGfP5nxkS2N3cSX8qkkRnXY/wFNq7AdZEWSN3VZO5jwIIgHEtWYoImsYO2AEmaXS485EQHR2TmBQ3tWS44IyFSANc6nNkJRCSwxVhblPi8aS1z6TLpN0MRmzzOdE+AAm+IR30o3TkjQbtTClVP2QCVypZUJz9LtlTaUGgd9wyZm4ZqaTcW6oUSQS7NMuMN7e5cZo+mcn bidwell@samwise
  automatically-retry-hooks: true
  backup-dir: ""
  cloudinit-userdata: ""
  container-image-metadata-url: ""
  container-image-stream: released
  container-inherit-properties: ""
  container-networking-method: ""
  datastore: lin-os-sansvpn
  default-series: bionic
  development: false
  disable-network-management: false
  egress-subnets: ""
  enable-disk-uuid: true
  enable-os-refresh-update: true
  enable-os-upgrade: true
  external-network: ""
  fan-config: ""
  firewall-mode: instance
  ftp-proxy: ""
  http-proxy: ""
  https-proxy: ""
  ignore-machine-addresses: false
  image-metadata-url: ""
  image-stream: released
  juju-ftp-proxy: ""
  juju-http-proxy: ""
  juju-https-proxy: ""
  juju-no-proxy: 127.0.0.1,localhost,::1
  logforward-enabled: false
  logging-config: <root>=WARNING;unit=DEBUG
  max-action-results-age: 336h
  max-action-results-size: 5G
  max-status-history-age: 336h
  max-status-history-size: 5G
  name: controller
  net-bond-reconfigure-delay: 17
  no-proxy: 127.0.0.1,localhost,::1
  primary-network: Netjuju
  provisioner-harvest-mode: destroyed
  proxy-ssh: false
  resource-tags: {}
  snap-http-proxy: ""
  snap-https-proxy: ""
  snap-store-assertions: ""
  snap-store-proxy: ""
  ssl-hostname-verification: true
  test-mode: false
  transmit-vendor-metrics: true
  type: vsphere
  update-status-hook-interval: 5m
  uuid: b8b79990-5179-44eb-86d1-948768e24bf1
controller-model-version: 1
hosted-model-config:
  datastore: lin-os-sansvpn
  name: default
  primary-network: Netjuju
  uuid: 94d68902-1ccd-4b24-81b7-3e196568a762
bootstrap-machine-instance-id: juju-e24bf1-0
bootstrap-machine-constraints:
  mem: 3584
bootstrap-machine-hardware:
  arch: amd64
  mem: 3584
  rootdisk: 8192
model-constraints: {}
custom-image-metadata: "null"
controller-cloud: |
  name: myvscloud
  type: vsphere
  auth-types: [userpass]
  endpoint: 143.207.48.76
  regions:
    New Datacenter:
      endpoint: 143.207.48.76
controller-cloud-region: New Datacenter
controller-cloud-credential-name: bidwell
controller-cloud-credential:
  auth-type: userpass
...
Is there a way to bootstrap this up again without losing data and machines?

Ian Booth (wallyworld) wrote :

It looks like it's missing api address info in the agent.conf file under /var/lib/juju/agents/machine-0
That I think is set up via userdata at bootstrap. What does the agent.conf file look like?

re: "Is there a way to bootstrap this up again without losing data and machines?"
Are you saying this is an existing deployment that has gone bad?

Can you provide a little more info about the steps to get to this point?

Daniel Bidwell (bidwell) wrote : Re: [Bug 1871224] Re: controller failes to bring up jujud machine

You are correct. There is no apiaddresses line.
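
One way to confirm this on a controller is to scan agent.conf for a top-level `apiaddresses:` key. The following is a minimal Python sketch, not part of Juju: the function name `has_api_addresses` is hypothetical, and it uses a plain line scan rather than a YAML parser, so it only detects the key when it is unindented.

```python
def has_api_addresses(conf_text: str) -> bool:
    """Return True if the text contains a top-level `apiaddresses:` key."""
    for line in conf_text.splitlines():
        # Top-level keys are not indented. Nested keys and block-scalar
        # content (such as the certificates) are indented, so they are
        # never mistaken for the key we are looking for.
        if line.startswith("apiaddresses:"):
            return True
    return False


# A shortened stand-in for the agent.conf shown in this report.
sample = "# format 2.0\ntag: machine-0\ndatadir: /var/lib/juju\n"
print("apiaddresses present:", has_api_addresses(sample))
```

To run it against a real file, read the contents of agent.conf (assumed here to live in /var/lib/juju/agents/machine-0) and pass the text to the function.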

On Tue, 2020-04-07 at 04:23 +0000, Ian Booth wrote:
> It looks like it's missing api address info in the agent.conf file
> under /var/lib/juju/agents/machine-0
> That I think is set up via userdata at bootstrap. What does the
> agent.conf file look like?

Here is the agent.conf file:

# format 2.0
tag: machine-0
datadir: /var/lib/juju
logdir: /var/log/juju
metricsspooldir: /var/lib/juju/metricspool
nonce: user-admin:bootstrap
jobs:
- JobManageModel
- JobHostUnits
upgradedToVersion: 2.7.2
cacert: |
  -----BEGIN CERTIFICATE-----
  MIIDrTCCApWgAwIBAgIVANJsr/4oM80frVZeSTZ1TSqbMjeIMA0GCSqGSIb3DQEB
  CwUAMG4xDTALBgNVBAoTBGp1anUxLjAsBgNVBAMMJWp1anUtZ2VuZXJhdGVkIENB
  IGZvciBtb2RlbCAianVqdS1jYSIxLTArBgNVBAUTJGYxZTIzNmNkLTYyOTItNDA3
  YS04NWU5LWQxNTJjYzA3YTVhYzAeFw0xOTAyMTgwMjI3MDBaFw0yOTAyMjUwMjI2
  NTlaMG4xDTALBgNVBAoTBGp1anUxLjAsBgNVBAMMJWp1anUtZ2VuZXJhdGVkIENB
  IGZvciBtb2RlbCAianVqdS1jYSIxLTArBgNVBAUTJGYxZTIzNmNkLTYyOTItNDA3
  YS04NWU5LWQxNTJjYzA3YTVhYzCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoC
  ggEBAJpXjTerhY3/x9P6HSlo1tWNOZKlyXwvcVLb31FWEzED/RyQxhPYJMd8v6Qu
  akj1Gij/SLQaO9cqsDqx2g7CNcIf7D5npCGZZHpSzFf9yNc+zgPs7SqSCRXC8z9L
  I1gGFLVj8McMfpW9ac6UivPAo0rV31/YSDi7+XPk4oLfbW7kBKeWIRePMKjxMHcP
  WZBfagE5izW7VaNh4jNNW+DKfbFPh2AA5lthvvZ7dLCKmpzdE0V4Gen1ZjkNv52B
  TwVrGJ4+B5pg93xgItVjQLie8bWTXuwLfcOX8dw4VoVZwOdbWmEhIZQaIZLid2Qm
  pmC+Z4dPWHce/JYi6Hir00PYYUsCAwEAAaNCMEAwDgYDVR0PAQH/BAQDAgKkMA8G
  A1UdEwEB/wQFMAMBAf8wHQYDVR0OBBYEFJGpxuFK2/78FXljV0rkrwofsyn0MA0G
  CSqGSIb3DQEBCwUAA4IBAQBOklvjR50HCNGq216FH77yFLC7EQRE0PLo6+4/fwq4
  5DJntNsicaK2nix0cUKr8CKKPt0LwVeaOWqNiiOTNCpWyGrF7R/Y0+98cLboI8Ot
  rpJgukfj967qBIZQohT2YaZ7SmnQl73LHigtR4/TYbtQHGmT1+hXe1komEz4KQXr
  tH3V2MzUADTQu/T9dEUWzMAS5cLIEwrWjmN8BuYTbEklwKBvDYuk/nmP6rrdOewl
  /OezsL5ULOboHPM7hjfdwougrXV6BbtIiok9WeIfUGmb0aKrhy5o7KK+1KTwG5sM
  j26Z8K6xy923yNVJzmLDcxGN4O1L+I1MMnO0p4ARimby
  -----END CERTIFICATE-----
statepassword: y2pKSQ1UBoewM/e0lnQQ0bNn
controller: controller-2b10c3d9-a7ab-426b-8d88-b53ee35da73d
model: model-b8b79990-5179-44eb-86d1-948768e24bf1
oldpassword: 4d7663b918f7706f1e47a8e045b042a0
loggingconfig: <root>=WARNING;unit=DEBUG
values:
  AGENT_SERVICE_NAME: jujud-machine-0
  CONTAINER_TYPE: ""
  NUMA_CTL_PREFERENCE: "false"
  PROVIDER_TYPE: vsphere
controllercert: |
  -----BEGIN CERTIFICATE-----
  MIIEbzCCA1egAwIBAgIVAPbDl4cXbNltju+GxHc380bUT9HoMA0GCSqGSIb3DQEB
  CwUAMG4xDTALBgNVBAoTBGp1anUxLjAsBgNVBAMMJWp1anUtZ2VuZXJhdGVkIENB
  IGZvciBtb2RlbCAianVqdS1jYSIxLTArBgNVBAUTJGYxZTIzNmNkLTYyOTItNDA3
  YS04NWU5LWQxNTJjYzA3YTVhYzAeFw0yMDAyMjEwMTI1MjBaFw0zMDAyMjgwMTI1
  MjBaMBsxDTALBgNVBAoTBGp1anUxCjAIBgNVBAMMASowggGiMA0GCSqGSIb3DQEB
  AQUAA4IBjwAwggGKAoIBgQCmz5DvDEZ9li+ZJ0+5zGghfyBa3SX1urFy15KaXOGz
  3FPpOrzQy+0RYhJxqryDCJGV+MZlIfvomCUo+HYDwD9xraR1tie4lqGRb+FtRfAg
  IEmA+pCGJbjvovYRt64dUnGIUu83ZYupDGwGFuwtQfxZolB8PuJ+qiupuY57wEbi
  fTDwUt/M/BbmIZ6atxlM+X10myeDtiNIEKTNyCj7sJ5UO22umdsHS8iFXs6r4jQx
  ch4djlUUZW1sgAqHn0hUTN3eXym0DCp0LiuQCX+SEaJK9eZH97qbpcjVRgFNGVJN
  Tr32Jw1SCvEq+gfYPbBaiC4sIm7gX89uCbVRS6FkohCLukkmq09y78eYYznXC21s
  32icKK5kG6yvv+jHgYHkKejvhXjyRObsGIlgJ5syOhqOxLcgV0g7CqTVpNu5ffCe
  wBqTSdhxFs2sCjxirdApuJudwhMuxmsvV2MdfW0ycXK/3UhQ2V69Lya...

Ian Booth (wallyworld) wrote : Re: controller failes to bring up jujud machine

I currently have no idea how the api addresses went missing.
Adding them back manually should work for now.
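
The manual repair suggested here amounts to putting an `apiaddresses` entry back into agent.conf. The sketch below is an illustration in Python, not a Juju tool. `add_api_addresses` is a hypothetical helper, the list-of-`host:port` layout is an assumption about the agent.conf format, and `10.0.0.10` is a placeholder for the controller's real address (17070 is the `api-port` from the bootstrap parameters in the bug description). It would be prudent to back up agent.conf before changing the real file.

```python
def add_api_addresses(conf_text: str, addresses: list[str]) -> str:
    """Append a top-level apiaddresses list if the config lacks one."""
    if not addresses:
        raise ValueError("at least one host:port address is required")
    lines = conf_text.splitlines()
    if any(line.startswith("apiaddresses:") for line in lines):
        return conf_text  # already present; leave the text untouched
    # Assumed layout: a YAML list of host:port strings under the key.
    block = ["apiaddresses:"] + [f"- {addr}" for addr in addresses]
    return "\n".join(lines + block) + "\n"


# Placeholder address; substitute the controller's actual IP.
sample = "# format 2.0\ntag: machine-0\ndatadir: /var/lib/juju\n"
print(add_api_addresses(sample, ["10.0.0.10:17070"]))
```

The helper only appends the key when it is missing, so running it twice does not duplicate the entry.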
I'll target the bug to a 2.8 milestone so we can try and figure out how to reproduce.

Changed in juju:
milestone: none → 2.8.1
importance: Undecided → Medium
status: New → Triaged
Tim Penhey (thumper)
summary: - controller failes to bring up jujud machine
+ controller fails to bring up jujud machine
Changed in juju:
milestone: 2.8.1 → 2.8-next
Ian Booth (wallyworld)
Changed in juju:
milestone: 2.8-next → 2.9.1
Ian Booth (wallyworld)
Changed in juju:
milestone: 2.9.1 → 2.9.2
Changed in juju:
milestone: 2.9.2 → 2.9.3
Changed in juju:
milestone: 2.9.3 → 2.9.4
Changed in juju:
milestone: 2.9.4 → 2.9.5
Ian Booth (wallyworld)
Changed in juju:
milestone: 2.9.5 → 2.9.6
Changed in juju:
milestone: 2.9.6 → 2.9.7
Changed in juju:
assignee: nobody → Heather Lanigan (hmlanigan)
status: Triaged → In Progress
Heather Lanigan (hmlanigan) wrote :

This PR resolves the panic, but not the root cause of this reported issue:
https://github.com/juju/juju/pull/13112

Heather Lanigan (hmlanigan) wrote :

Additional logging was added in https://github.com/juju/juju/pull/11854. Bug 1888453 is a duplicate of this bug.

Changed in juju:
status: In Progress → Fix Committed
Changed in juju:
status: Fix Committed → Fix Released