nil pointer dereference in 'relation-set'

Bug #1882127 reported by John A Meinel
12
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Canonical Juju
Fix Released
Critical
Achilleas Anagnostopoulos
2.7
Fix Released
Critical
Achilleas Anagnostopoulos

Bug Description

Reported on IRC

2020-06-04 17:43:32 DEBUG worker.uniter.jujuc server.go:204 running hook tool "relation-set"
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0x1edbb43]

goroutine 450 [running]:
github.com/juju/juju/worker/uniter/runner/jujuc.(*RelationSetCommand).Run(0xc0001ead90, 0xc000726e10, 0x0, 0x0)
        /workspace/_build/src/github.com/juju/juju/worker/uniter/runner/jujuc/relation-set.go:166 +0x183
github.com/juju/juju/worker/uniter/runner/jujuc.(*cmdWrapper).Run(0xc000351380, 0xc000726e10, 0xc000726e10, 0x0)
        /workspace/_build/src/github.com/juju/juju/worker/uniter/runner/jujuc/server.go:307 +0x3d
github.com/juju/juju/vendor/github.com/juju/cmd.Main(0x3725680, 0xc000351380, 0xc000726e10, 0xc000a4da00, 0x4, 0x4, 0x2)
        /workspace/_build/src/github.com/juju/juju/vendor/github.com/juju/cmd/cmd.go:379 +0x2c3
github.com/juju/juju/worker/uniter/runner/jujuc.(*Jujuc).Main(0xc0001e93e0, 0xc00089a080, 0x3d, 0xc00089a0c0, 0x34, 0xc0009203c0, 0xc, 0xc000a4da00, 0x4, 0x4, ...)
        /workspace/_build/src/github.com/juju/juju/worker/uniter/runner/jujuc/server.go:207 +0x7cb
reflect.Value.call(0xc000806a20, 0xc0008e6260, 0x13, 0x31220fd, 0x4, 0xc000b43f18, 0x3, 0x3, 0xc00070aa80, 0xc000920240, ...)
        /snap/go/5365/src/reflect/value.go:447 +0x461
reflect.Value.Call(0xc000806a20, 0xc0008e6260, 0x13, 0xc000b49f18, 0x3, 0x3, 0x30, 0x2e7b020, 0xc0003dce01)
        /snap/go/5365/src/reflect/value.go:308 +0xa4
net/rpc.(*service).call(0xc0007cee40, 0xc000373c70, 0xc000920228, 0xc000920240, 0xc000106680, 0xc00050cf00, 0x2ec1d00, 0xc0006b5f80, 0x199, 0x2a51be0, ...)
        /snap/go/5365/src/net/rpc/server.go:384 +0x14e
created by net/rpc.(*Server).ServeCodec
        /snap/go/5365/src/net/rpc/server.go:481 +0x42b

This is during something like

def send_connection(self, port, rest_port, host=None):
        # Expose common settings via app relation data from a leader unit.
        if self.model.unit.is_leader():
            app_data = self._relation.data[self.model.app]
            app_data['port'] = str(port)
            app_data['rest_port'] = str(rest_port)
            app_data['host'] = host

Even if we didn't do the leader check, we should be getting an error, not a panic.

Revision history for this message
Facundo Batista (facundo) wrote :

David later changed the code to do str() also con host, but still the same issue.

I asked him to repr() those three values after str(), he got:

2020-06-04 18:10:03 INFO juju-log zookeeper:1: '2181'
2020-06-04 18:10:03 INFO juju-log zookeeper:1: '2181'
2020-06-04 18:10:03 INFO juju-log zookeeper:1: '10.152.183.129'

Which looks fine...

Revision history for this message
John A Meinel (jameinel) wrote :

This was seen in Juju 2.7.6

Ian Booth (wallyworld)
Changed in juju:
milestone: none → 2.8.1
assignee: nobody → Achilleas Anagnostopoulos (achilleasa)
status: Triaged → Fix Committed
Revision history for this message
Ian Booth (wallyworld) wrote :

The committed fix correctly reports any error that occurs instead of ignoring it and getting a nil pointer. If there's an underlying error, that will now be reported so the fix doesn't guarantee that the reported scenario will work, just that the error will be reported.

Revision history for this message
John A Meinel (jameinel) wrote :

Do we have a PR?
I'm guessing it is this one: https://github.com/juju/juju/pull/11634

But I'd like confirmation.

Revision history for this message
Achilleas Anagnostopoulos (achilleasa) wrote :

@jameinel We did some refactoring work to fix https://bugs.launchpad.net/juju/+bug/1854348 for 2.7 in the following PR: https://github.com/juju/juju/pull/11234 which I believe should have already taken care of the issue. I think this is the one Ian is referring to? Note that the PR you linked targets 2.8 (2.7 does not contain any of the state-in-controller changes) whereas according to the IRC report, this was seen with 2.7.6.

AFAICT Simon was trying to reproduce this with 2.7.6 but was not able to. We have also introduced integration tests for unit/application databag RW operations (https://github.com/juju/juju/blob/develop/tests/suites/relations/relation_data_exchange.sh) which all pass with 2.7.6. I am wondering whether the initial reporter on IRC upgraded their controller but forgot to also upgrade their model.

Revision history for this message
Ian Booth (wallyworld) wrote :

I don't think it was the above PR.
I looked at 2.7.6 and saw there was a shadowed err variable that masked an error getting application settings. This was fixed in tip of 2.7 in a small change which address some lint issues.

Changed in juju:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.