`juju wait-for` panic: runtime error: invalid memory address or nil pointer dereference

Bug #2040554 reported by Simon Déziel
This bug affects 1 person.

| Affects | Status | Importance | Assigned to | Milestone |
| --- | --- | --- | --- | --- |
| Canonical Juju | Fix Released | Medium | Jordan Barrett | |

Bug Description

To wait for a quiet environment, I'm calling `juju wait-for` as:

```
juju wait-for model ci-testing '--query=life=="alive" && status=="available" && len(applications) > 0 && forEach(applications, app => app.status == "active") && len(units) > 0 && forEach(units, unit => unit.workload-status == "active" && unit.agent-status == "idle")'
```

This works most of the time but it caused a panic in this CI run: https://github.com/canonical/charm-lxd/actions/runs/6640019456/job/18040502333 (see "Exercise lxd-https relation"):

```
+ juju remove-application https-client
will remove application https-client
- will remove unit https-client/0
+ juju_wait
+ juju wait-for model ci-testing '--query=life=="alive" && status=="available" && len(applications) > 0 && forEach(applications, app => app.status == "active") && len(units) > 0 && forEach(units, unit => unit.workload-status == "active" && unit.agent-status == "idle")'
model "ci-testing" found, waiting...
model "ci-testing" found, waiting...
model "ci-testing" found, waiting...
model "ci-testing" found, waiting...
model "ci-testing" found, waiting...
model "ci-testing" found, waiting...
model "ci-testing" found, waiting...
model "ci-testing" found, waiting...
model "ci-testing" found, waiting...
model "ci-testing" found, waiting...
model "ci-testing" found, waiting...
model "ci-testing" found, waiting...
model "ci-testing" is running
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x138 pc=0x494e71e]

goroutine 1 [running]:
github.com/juju/juju/cmd/juju/waitfor.UnitScope.GetIdentValue({{0xc000b05110, 0xc000b05140}, 0x0, 0xc000b05320}, {0xc0003d40dd, 0xc})
 /build/snapcraft-juju-da55d99d296df0fff85c9f11a6a5ab0b/parts/juju/build/cmd/juju/waitfor/unit.go:259 +0x5fe
github.com/juju/juju/cmd/juju/waitfor.outputModelSummary({0x6443d40?, 0xc00010e010}, {0xc000b05110?, 0xc000b05140?}, 0xc00026e140, 0xc000b052f0, 0xc000b05350, 0xc000b05320)
 /build/snapcraft-juju-da55d99d296df0fff85c9f11a6a5ab0b/parts/juju/build/cmd/juju/waitfor/model.go:465 +0xbd8
github.com/juju/juju/cmd/juju/waitfor.(*modelCommand).Run.func1()
 /build/snapcraft-juju-da55d99d296df0fff85c9f11a6a5ab0b/parts/juju/build/cmd/juju/waitfor/model.go:155 +0x1d6
github.com/juju/juju/cmd/juju/waitfor.(*modelCommand).Run(0xc000b02480, 0xc000119740)
 /build/snapcraft-juju-da55d99d296df0fff85c9f11a6a5ab0b/parts/juju/build/cmd/juju/waitfor/model.go:174 +0x455
github.com/juju/juju/cmd/modelcmd.(*modelCommandWrapper).Run(0xc000b04270, 0xc000acf898?)
 /build/snapcraft-juju-da55d99d296df0fff85c9f11a6a5ab0b/parts/juju/build/cmd/modelcmd/modelcommand.go:663 +0x123
github.com/juju/juju/cmd/modelcmd.(*baseCommandWrapper).Run(0xc0003c0f80, 0x6?)
 /build/snapcraft-juju-da55d99d296df0fff85c9f11a6a5ab0b/parts/juju/build/cmd/modelcmd/base.go:554 +0xaf
github.com/juju/cmd/v3.(*SuperCommand).Run(0xc0001902c0, 0xc000119740)
 /build/snapcraft-juju-da55d99d296df0fff85c9f11a6a5ab0b/parts/juju/build/vendor/github.com/juju/cmd/v3/supercommand.go:514 +0x378
github.com/juju/cmd/v3.(*SuperCommand).Run(0xc000780dc0, 0xc000119740)
 /build/snapcraft-juju-da55d99d296df0fff85c9f11a6a5ab0b/parts/juju/build/vendor/github.com/juju/cmd/v3/supercommand.go:514 +0x378
github.com/juju/cmd/v3.Main({0x64a83e8, 0xc000780dc0}, 0xc000119740, {0xc00033c7c0, 0x4, 0x4})
 /build/snapcraft-juju-da55d99d296df0fff85c9f11a6a5ab0b/parts/juju/build/vendor/github.com/juju/cmd/v3/cmd.go:468 +0x25d
github.com/juju/juju/cmd/juju/commands.jujuMain.Run({0xc00011c140?}, {0xc000100050, 0x5, 0x5})
 /build/snapcraft-juju-da55d99d296df0fff85c9f11a6a5ab0b/parts/juju/build/cmd/juju/commands/main.go:213 +0x97f
github.com/juju/juju/cmd/juju/commands.Main(...)
 /build/snapcraft-juju-da55d99d296df0fff85c9f11a6a5ab0b/parts/juju/build/cmd/juju/commands/main.go:126
main.main()
 /build/snapcraft-juju-da55d99d296df0fff85c9f11a6a5ab0b/parts/juju/build/cmd/juju/main.go:27 +0x72
+ true
+ juju exec --unit lxd/leader -- lxc config trust list --format csv
```

This was with the Juju snap, version 3.1.6. The status before the panic was:

```
+ juju status --relations
Model Controller Cloud/Region Version SLA Timestamp
ci-testing local localhost/localhost 3.1.6 unsupported 13:14:02Z

App Version Status Scale Charm Channel Rev Exposed Message
https-client active 1 https-client 0 no
lxd active 1 lxd 0 no

Unit Workload Agent Machine Public address Ports Message
https-client/0* active idle 1 10.173.35.2
lxd/0* active idle 0 10.173.35.236 8443/tcp

Machine State Address Inst id Base AZ Message
0 started 10.173.35.236 juju-b3f4b1-0 ubuntu@20.04 Running
1 started 10.173.35.2 juju-b3f4b1-1 ubuntu@22.04 Running
```

Joseph Phillips (manadart) wrote :

This can happen due to AllWatcher backing state, where the status is not known.

The panic comes from the line that reads `m.UnitInfo.AgentStatus.Current`.

I think we just need to handle a nil value and return "unknown" or some such.
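A minimal Go sketch of the kind of nil handling suggested here. The `UnitInfo` and `StatusInfo` types and the `agentStatus` helper are simplified stand-ins invented for illustration, not the real Juju types or the actual fix. The guard checks both the unit info pointer and its agent status before reading `Current`.

```go
package main

import "fmt"

// StatusInfo and UnitInfo are hypothetical, simplified stand-ins for
// illustration only. They are not the real Juju types.
type StatusInfo struct {
	Current string
}

type UnitInfo struct {
	Name        string
	AgentStatus *StatusInfo
}

// agentStatus returns the unit's current agent status, or "unknown" when
// either the unit info or its agent status has not been populated yet.
func agentStatus(info *UnitInfo) string {
	if info == nil || info.AgentStatus == nil {
		return "unknown"
	}
	return info.AgentStatus.Current
}

func main() {
	// No unit info at all: previously this shape of access would panic.
	fmt.Println(agentStatus(nil))

	// Unit info present, but no agent status recorded.
	fmt.Println(agentStatus(&UnitInfo{Name: "lxd/0"}))

	// Fully populated unit.
	fmt.Println(agentStatus(&UnitInfo{
		Name:        "lxd/0",
		AgentStatus: &StatusInfo{Current: "idle"},
	}))
}
```

With a guard like this, a query such as `unit.agent-status == "idle"` would simply evaluate to false while the status is unknown, and the wait would continue instead of crashing.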

Changed in juju:
status: New → Triaged
importance: Undecided → High
milestone: none → 3.1.7
assignee: nobody → Jack Shaw (jack-shaw)
Changed in juju:
milestone: 3.1.7 → 3.1.8
Harry Pidcock (hpidcock)
Changed in juju:
milestone: 3.1.8 → 3.3.3
Ian Booth (wallyworld)
Changed in juju:
milestone: 3.3.3 → 3.3.4
Changed in juju:
milestone: 3.3.4 → 3.3.5
Changed in juju:
milestone: 3.3.5 → 3.3.6
Changed in juju:
milestone: 3.3.6 → 3.4.4
Harry Pidcock (hpidcock)
Changed in juju:
assignee: Jack Shaw (jack-shaw) → Jordan Barrett (barrettj12)
Harry Pidcock (hpidcock)
Changed in juju:
status: Triaged → In Progress
Jordan Barrett (barrettj12) wrote :

https://github.com/juju/juju/pull/17474 will improve the error message so it's not as scary for users.

The underlying cause of the bug is still unknown, so we'll leave this bug open until we can work out why the UnitInfo is sometimes nil.

In any case, the AllWatcher will be removed soon.

Changed in juju:
status: In Progress → Triaged
importance: High → Medium
assignee: Jordan Barrett (barrettj12) → nobody
Changed in juju:
milestone: 3.4.4 → 3.4.5
Harry Pidcock (hpidcock) wrote :

If you encounter "internal error: UnitInfo is missing", this is the same bug.

Changed in juju:
milestone: 3.4.5 → 3.4.4
status: Triaged → Fix Committed
assignee: nobody → Jordan Barrett (barrettj12)
Changed in juju:
status: Fix Committed → Fix Released