MAAS temporarily reporting no instance while bringing up a new instance.

Bug #1773117 reported by MichaelEino
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Canonical Juju
Incomplete
Low
Unassigned
MAAS
Expired
Undecided
Unassigned

Bug Description

#1731108 might be related!

Just performed a bootstrap to a KVMs on MAAS v2.3.3
added a new model "deployment"
installed postgresql-HA
then returned to controller model and enable-ha I got this many times:

machine-0: 10:46:03 ERROR juju.worker.dependency "machiner" manifold worker returned unexpected error: cannot update observed network config: cannot get network interfaces of "4s33ky": instance "4s33ky" not found (not found)

u@MAAS:~# juju status
Model Controller Cloud/Region Version SLA
controller jujumaas maaslab 2.3.7 unsupported
Machine State DNS Inst id Series AZ Message
0 started x.x.x.42 4s33ky xenial default Deployed
1 started x.x.x.40 efdqxc xenial default Deployed
2 started x.x.x.41 fw3dw7 xenial default Deployed

u@MAAS:~# juju controllers --refresh
Controller Model User Access Cloud/Region Models Machines HA Version
jujumaas* controller admin superuser maaslab 3 6 3 2.3.7

More log:

machine-0: 10:49:28 ERROR juju.worker.dependency "machiner" manifold worker returned unexpected error: cannot update observed network config: cannot get network interfaces of "4s33ky": instance "4s33ky" not found (not found)
machine-0: 10:49:28 DEBUG juju.worker.dependency stack trace:
cannot get network interfaces of "4s33ky": instance "4s33ky" not found (not found)
github.com/juju/juju/rpc/client.go:149:
github.com/juju/juju/api/apiclient.go:924:
github.com/juju/juju/api/machiner/machine.go:122:
github.com/juju/juju/worker/machiner/machiner.go:169: cannot update observed network config
machine-2: 10:49:29 DEBUG juju.worker.peergrouper controller machines in state: []string{"0", "1", "2"}
machine-0: 10:49:30 DEBUG juju.worker.peergrouper controller machines in state: []string{"0", "1", "2"}
machine-0: 10:49:31 DEBUG juju.apiserver <- [1] machine-0 {"request-id":170114,"type":"Agent","version":2,"request":"GetEntities","params":"'params redacted'"}
machine-0: 10:49:31 DEBUG juju.apiserver -> [1] machine-0 1.249135ms {"request-id":170114,"response":"'body redacted'"} Agent[""].GetEntities
machine-0: 10:49:31 WARNING juju.cmd.jujud determining kvm support: INFO: Your CPU does not support KVM extensions
KVM acceleration can NOT be used
: exit status 1
no kvm containers possible
machine-0: 10:49:31 DEBUG juju.apiserver <- [1] machine-0 {"request-id":170115,"type":"Provisioner","version":5,"request":"Life","params":"'params redacted'"}
machine-0: 10:49:31 DEBUG juju.apiserver -> [1] machine-0 1.100905ms {"request-id":170115,"response":"'body redacted'"} Provisioner[""].Life
machine-0: 10:49:31 DEBUG juju.apiserver <- [1] machine-0 {"request-id":170116,"type":"Provisioner","version":5,"request":"SetSupportedContainers","params":"'params redacted'"}
machine-0: 10:49:31 DEBUG juju.apiserver -> [1] machine-0 61.834742ms {"request-id":170116,"response":"'body redacted'"} Provisioner[""].SetSupportedContainers
machine-0: 10:49:31 DEBUG juju.worker start "0-container-watcher"
machine-0: 10:49:31 INFO juju.worker start "0-container-watcher"
machine-0: 10:49:31 DEBUG juju.worker "0-container-watcher" started
machine-0: 10:49:31 DEBUG juju.cmd.jujud upgrades done, starting worker "0-container-watcher"
machine-0: 10:49:31 DEBUG juju.apiserver <- [1] machine-0 {"request-id":170117,"type":"Machiner","version":1,"request":"Life","params":"'params redacted'"}
machine-0: 10:49:31 DEBUG juju.apiserver <- [1] machine-0 {"request-id":170118,"type":"Provisioner","version":5,"request":"WatchContainers","params":"'params redacted'"}
machine-0: 10:49:31 DEBUG juju.apiserver -> [1] machine-0 2.5669ms {"request-id":170117,"response":"'body redacted'"} Machiner[""].Life
machine-0: 10:49:31 DEBUG juju.apiserver -> [1] machine-0 2.609106ms {"request-id":170118,"response":"'body redacted'"} Provisioner[""].WatchContainers
machine-0: 10:49:31 DEBUG juju.apiserver <- [1] machine-0 {"request-id":170119,"type":"Machiner","version":1,"request":"SetProviderNetworkConfig","params":"'params redacted'"}
machine-0: 10:49:31 DEBUG juju.apiserver <- [1] machine-0 {"request-id":170120,"type":"StringsWatcher","version":1,"id":"19876","request":"Next","params":"'params redacted'"}
machine-0: 10:49:31 DEBUG juju.provider.maas opening model "controller".
machine-0: 10:49:31 DEBUG juju.apiserver -> [1] machine-0 488.675989ms {"request-id":170119,"response":"'body redacted'"} Machiner[""].SetProviderNetworkConfig
machine-0: 10:49:31 DEBUG juju.worker killing runner 0xc4203bb040
machine-0: 10:49:31 INFO juju.worker runner is dying
machine-0: 10:49:31 DEBUG juju.worker killing "0-container-watcher"
machine-0: 10:49:31 DEBUG juju.cmd.jujud stopping so killing worker "0-container-watcher"
machine-0: 10:49:31 DEBUG juju.apiserver <- [1] machine-0 {"request-id":170121,"type":"StringsWatcher","version":1,"id":"19876","request":"Stop","params":"'params redacted'"}
machine-0: 10:49:31 DEBUG juju.apiserver -> [1] machine-0 352.256µs {"request-id":170121,"response":"'body redacted'"} StringsWatcher["19876"].Stop
machine-0: 10:49:31 DEBUG juju.apiserver -> [1] machine-0 490.115139ms {"request-id":170120,"error":"watcher has been stopped","error-code":"stopped","response":"'body redacted'"} StringsWatcher["19876"].Next
machine-0: 10:49:31 INFO juju.worker stopped "0-container-watcher", err: <nil>
machine-0: 10:49:31 DEBUG juju.worker "0-container-watcher" done: <nil>
machine-0: 10:49:31 DEBUG juju.worker no restart, removing "0-container-watcher" from known workers
machine-0: 10:49:31 DEBUG juju.worker.dependency "unconverted-api-workers" manifold worker stopped: setting controller network config: cannot set controller provider network config: cannot get network interfaces of "4s33ky": instance "4s33ky" not found
machine-0: 10:49:31 ERROR juju.worker.dependency "unconverted-api-workers" manifold worker returned unexpected error: setting controller network config: cannot set controller provider network config: cannot get network interfaces of "4s33ky": instance "4s33ky" not found
machine-0: 10:49:31 DEBUG juju.worker.dependency stack trace:
cannot get network interfaces of "4s33ky": instance "4s33ky" not found
github.com/juju/juju/cmd/jujud/agent/machine.go:793: cannot set controller provider network config
github.com/juju/juju/cmd/jujud/agent/machine.go:756: setting controller network config
github.com/juju/juju/worker/fortress/occupy.go:63:
github.com/juju/juju/cmd/jujud/agent/engine/housing.go:93:
machine-0: 10:49:31 DEBUG juju.apiserver <- [1] machine-0 {"request-id":170122,"type":"Agent","version":2,"request":"ModelConfig","params":"'params redacted'"}
machine-0: 10:49:31 DEBUG juju.apiserver -> [1] machine-0 1.536168ms {"request-id":170122,"response":"'body redacted'"} Agent[""].ModelConfig
machine-0: 10:49:31 DEBUG juju.worker.dependency "machiner" manifold worker started
machine-0: 10:49:31 DEBUG juju.apiserver <- [1] machine-0 {"request-id":170123,"type":"Machiner","version":1,"request":"Life","params":"'params redacted'"}
machine-0: 10:49:31 DEBUG juju.apiserver -> [1] machine-0 1.283401ms {"request-id":170123,"response":"'body redacted'"} Machiner[""].Life
machine-0: 10:49:31 DEBUG juju.network no lxc bridge addresses to filter for machine
machine-0: 10:49:31 DEBUG juju.network "lxdbr0" has addresses [fe80::d01b:3cff:fec6:fe50/64 fe80::1/64]
machine-0: 10:49:31 DEBUG juju.network cannot get "virbr0" addresses: route ip+net: no such network interface (ignoring)
machine-0: 10:49:31 DEBUG juju.network including address local-machine:127.0.0.1 for machine
machine-0: 10:49:31 DEBUG juju.network including address local-cloud:x.x.x.42 for machine
machine-0: 10:49:31 DEBUG juju.network including address local-machine:::1 for machine
machine-0: 10:49:31 DEBUG juju.network addresses after filtering: [local-machine:127.0.0.1 local-cloud:x.x.x.42 local-machine:::1]
machine-0: 10:49:31 INFO juju.worker.machiner setting addresses for "machine-0" to [local-machine:127.0.0.1 local-cloud:x.x.x.42 local-machine:::1]

description: updated
Revision history for this message
John A Meinel (jameinel) wrote : Re: [Bug 1773117] [NEW] ERROR cannot update observed network config
Download full text (19.1 KiB)

Is this something that eventually worked, or is this something that
completely blocked enable-ha from continuing?
That looks like we're getting a failure asking MAAS about the details of
"4s33ky" which seems very odd given that it is exactly the machine that we
are running on.

I don't see the error trace in here, but it seems like it would be part of:
instId, err := m.InstanceId()
if err != nil {
return nil, errors.Trace(err)
}

interfaceInfos, err := netEnviron.NetworkInterfaces(api.callContext, instId)

which ultimately leads down to:
instances, err := environ.acquiredInstances(ctx, []instance.Id{instId})
if err != nil {
// This path can never trigger on MAAS 2, but MAAS 2 doesn't
// return an error for a machine not found, it just returns
// empty results. The clause below catches that.
if maasErr, ok := errors.Cause(err).(gomaasapi.ServerError); ok &&
maasErr.StatusCode == http.StatusNotFound {
return nil, errors.NotFoundf("instance %q", instId)
}
return nil, errors.Annotatef(err, "getting instance %q", instId)
}
if len(instances) == 0 {
return nil, errors.NotFoundf("instance %q", instId)
}
inst := instances[0]
return inst, nil

...
args := gomaasapi.MachinesArgs{
AgentName: environ.uuid,
SystemIDs: instanceIdsToSystemIDs(ids),
}

...
machines, err := environ.maasController.Machines(args)

And I'm guessing somewhere in there is where we are getting "not found".
(You should be able to see the call stack if you used
logging-config=juju.apiserver.common=TRACE on the controller model, but
there is enough context in those messages that we probably know where the
error is coming from. (most likely MAAS is returning an empty list when we
are probing for the machine, and I don't know why that would be.)

On Thu, May 24, 2018 at 12:54 PM, MichaelEino <email address hidden>
wrote:

> Public bug reported:
>
> #1731108 might be related!
>
> Just performed a bootstrap to a KVMs on MAAS v2.3.3
> added a new model "deployment"
> installed postgresql-HA
> then returned to controller model and enable-ha I got this many times:
>
> machine-0: 10:46:03 ERROR juju.worker.dependency "machiner" manifold
> worker returned unexpected error: cannot update observed network config:
> cannot get network interfaces of "4s33ky": instance "4s33ky" not found
> (not found)
>
> u@MAAS:~# juju status
> Model Controller Cloud/Region Version SLA
> controller jujumaas maaslab 2.3.7 unsupported
> Machine State DNS Inst id Series AZ Message
> 0 started x.x.x.42 4s33ky xenial default Deployed
> 1 started x.x.x.40 efdqxc xenial default Deployed
> 2 started x.x.x.41 fw3dw7 xenial default Deployed
>
> u@MAAS:~# juju controllers --refresh
> Controller Model User Access Cloud/Region Models Machines
> HA Version
> jujumaas* controller admin superuser maaslab 3 6
> 3 2.3.7
>
> More log:
>
> machine-0: 10:49:28 ERROR juju.worker.dependency "machiner" manifold
> worker returned unexpected error: cannot update observed network config:
> cannot get network interfaces of "4s33ky": instance "4s33ky" not found (not
> found)
> m...

Revision history for this message
MichaelEino (michaeleino) wrote : Re: ERROR cannot update observed network config
Download full text (77.1 KiB)

Hi John.. thanks for you analysis, the error didn't block the enable-ha from continuing.. the action successfully finished, and the controller is in HA mode with no issues.

I have enabled the logging conf as you described

Here is some samples:

machine-1: 19:04:09 DEBUG juju.worker.logger reconfiguring logging from "<root>=DEBUG;unit=DEBUG" to "juju.apiserver.common=TRACE;unit=DEBUG"
machine-0: 19:04:12 TRACE juju.apiserver.common registered unnamed resource: 81618
machine-0: 19:04:12 TRACE juju.apiserver.common.networkingcommon observed network config of machine "0": [{DeviceIndex:1 MACAddress: CIDR:127.0.0.0/8 MTU:65536 ProviderId: ProviderSubnetId: ProviderSpaceId: ProviderAddressId: ProviderVLANId: VLANTag:0 InterfaceName:lo ParentInterfaceName: InterfaceType:loopback Disabled:false NoAutoStart:false ConfigType:loopback Address:127.0.0.1 DNSServers:[] DNSSearchDomains:[] GatewayAddress: Routes:[] IsDefaultGateway:false} {DeviceIndex:1 MACAddress: CIDR:::1/128 MTU:65536 ProviderId: ProviderSubnetId: ProviderSpaceId: ProviderAddressId: ProviderVLANId: VLANTag:0 InterfaceName:lo ParentInterfaceName: InterfaceType:loopback Disabled:false NoAutoStart:false ConfigType:loopback Address:::1 DNSServers:[] DNSSearchDomains:[] GatewayAddress: Routes:[] IsDefaultGateway:false} {DeviceIndex:2 MACAddress:52:54:00:64:83:6c CIDR:x.x.x.0/24 MTU:1500 ProviderId: ProviderSubnetId: ProviderSpaceId: ProviderAddressId: ProviderVLANId: VLANTag:0 InterfaceName:ens3 ParentInterfaceName: InterfaceType:ethernet Disabled:false NoAutoStart:false ConfigType:static Address:x.x.x.42 DNSServers:[] DNSSearchDomains:[] GatewayAddress:x.x.x.251 Routes:[] IsDefaultGateway:true} {DeviceIndex:2 MACAddress:52:54:00:64:83:6c CIDR: MTU:1500 ProviderId: ProviderSubnetId: ProviderSpaceId: ProviderAddressId: ProviderVLANId: VLANTag:0 InterfaceName:ens3 ParentInterfaceName: InterfaceType:ethernet Disabled:false NoAutoStart:false ConfigType:manual Address: DNSServers:[] DNSSearchDomains:[] GatewayAddress:x.x.x.251 Routes:[] IsDefaultGateway:true} {DeviceIndex:3 MACAddress:d2:1b:3c:c6:fe:50 CIDR: MTU:1500 ProviderId: ProviderSubnetId: ProviderSpaceId: ProviderAddressId: ProviderVLANId: VLANTag:0 InterfaceName:lxdbr0 ParentInterfaceName: InterfaceType:bridge Disabled:false NoAutoStart:false ConfigType:manual Address: DNSServers:[] DNSSearchDomains:[] GatewayAddress: Routes:[] IsDefaultGateway:false} {DeviceIndex:3 MACAddress:d2:1b:3c:c6:fe:50 CIDR: MTU:1500 ProviderId: ProviderSubnetId: ProviderSpaceId: ProviderAddressId: ProviderVLANId: VLANTag:0 InterfaceName:lxdbr0 ParentInterfaceName: InterfaceType:bridge Disabled:false NoAutoStart:false ConfigType:manual Address: DNSServers:[] DNSSearchDomains:[] GatewayAddress: Routes:[] IsDefaultGateway:false}]
machine-0: 19:04:12 WARNING juju.cmd.jujud determining kvm support: INFO: Your CPU does not support KVM extensions
KVM acceleration can NOT be used
: exit status 1
no kvm containers possible
machine-0: 19:04:12 TRACE juju.apiserver.common.networkingcommon SetProviderNetworkConfig {Entities:[{Tag:machine-0}]}
machine-0: 19:04:12 TRACE juju.apiserver.common registered unnamed resource: 81619
machine-0: 19:04:12 TRACE juju.apiserver.com...

Revision history for this message
John A Meinel (jameinel) wrote :

So the stacktrace ends up being:
machine-0: 19:04:28 TRACE juju.apiserver.common server RPC error [
{github.com/juju/juju/apiserver/common/networkingcommon/networkconfigapi.go:192: }
{github.com/juju/juju/apiserver/common/networkingcommon/networkconfigapi.go:78: cannot get network interfaces of "4s33ky"}
{github.com/juju/juju/provider/maas/interfaces.go:330: }
{github.com/juju/juju/provider/maas/environ.go:1756: instance "4s33ky" not found}
]

and that line is specifically:
 if len(instances) == 0 {
  return nil, errors.NotFoundf("instance %q", instId)
 }

So somehow:
func (environ *maasEnviron) getInstance(instId instance.Id) (instance.Instance, error) {
 instances, err := environ.acquiredInstances([]instance.Id{instId})

with that instance id, really is returning no instances.

John A Meinel (jameinel)
Changed in juju:
importance: Undecided → Medium
status: New → Triaged
summary: - ERROR cannot update observed network config
+ MAAS temporarily reporting no instance while bringing up a new instance.
Revision history for this message
Andres Rodriguez (andreserl) wrote :

Looking at the comments of this bug, it would look as if juju is expecting a machine with a specific ID, but we have not visibility or background information from MAAS. As such, can you clarify a few things:

Since you are mentioning #1731108, was this a MAAS upgrade from 1.9 to 2.3 ?
What are the machines that you have in MAAS? e.g. maas 'user' machines read
Are there any failures in MAAS? Can you attach the logs? /var/log/maas/*.log

Thanks!

Changed in maas:
status: New → Incomplete
Revision history for this message
Adam Collard (adam-collard) wrote :

This bug has not seen any activity in the last 6 months, so it is being automatically closed.

If you are still experiencing this issue, please feel free to re-open.

MAAS Team

Changed in maas:
status: Incomplete → Expired
Changed in juju:
status: Triaged → Incomplete
Ian Booth (wallyworld)
tags: removed: maas
Revision history for this message
Canonical Juju QA Bot (juju-qa-bot) wrote :

This bug has not been updated in 2 years, so we're marking it Low importance. If you believe this is incorrect, please update the importance.

Changed in juju:
importance: Medium → Low
tags: added: expirebugs-bot
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.