[2.2b3] mDNS observer traceback

Bug #1673916 reported by Mark Shuttleworth
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
MAAS
High
Mike Pontillo
2.1
High
Mike Pontillo

Bug Description

Just saw this in rackd.log:

Traceback (most recent call last):
Failure: twisted.internet.error.ProcessTerminated: A process has ended with a probable error condition: process ended with exit code 1.
2017-03-18 02:01:21 provisioningserver.utils.services: [info] mDNS observation process started.
2017-03-18 02:01:22 -: [critical] Unhandled Error
 Traceback (most recent call last):
   File "/usr/lib/python3/dist-packages/twisted/python/log.py", line 101, in callWithLogger
     return callWithContext({"system": lp}, func, *args, **kw)
   File "/usr/lib/python3/dist-packages/twisted/python/log.py", line 84, in callWithContext
     return context.call({ILogContext: newCtx}, func, *args, **kw)
   File "/usr/lib/python3/dist-packages/twisted/python/context.py", line 118, in callWithContext
     return self.currentContext().callWithContext(ctx, func, *args, **kw)
   File "/usr/lib/python3/dist-packages/twisted/python/context.py", line 81, in callWithContext
     return func(*args,**kw)
 --- <exception caught here> ---
   File "/usr/lib/python3/dist-packages/twisted/internet/posixbase.py", line 597, in _doReadOrWrite
     why = selectable.doRead()
   File "/usr/lib/python3/dist-packages/twisted/internet/process.py", line 280, in doRead
     return fdesc.readFromFD(self.fd, self.dataReceived)
   File "/usr/lib/python3/dist-packages/twisted/internet/fdesc.py", line 94, in readFromFD
     callback(output)
   File "/usr/lib/python3/dist-packages/twisted/internet/process.py", line 283, in dataReceived
     self.proc.childDataReceived(self.name, data)
   File "/usr/lib/python3/dist-packages/twisted/internet/process.py", line 937, in childDataReceived
     self.proto.childDataReceived(name, data)
   File "/usr/lib/python3/dist-packages/twisted/internet/protocol.py", line 604, in childDataReceived
     self.outReceived(data)
   File "/usr/lib/python3/dist-packages/provisioningserver/utils/services.py", line 79, in outReceived
     self.outLineReceived(line)
   File "/usr/lib/python3/dist-packages/provisioningserver/utils/services.py", line 107, in outLineReceived
     self.objectReceived(obj)
   File "/usr/lib/python3/dist-packages/provisioningserver/utils/services.py", line 110, in objectReceived
     self._callback([obj])
   File "/usr/lib/python3/dist-packages/provisioningserver/rackdservices/networks_monitoring_service.py", line 73, in reportMDNSEntries
     d.addCleanup(lambda client: client(
 builtins.AttributeError: 'Deferred' object has no attribute 'addCleanup'

2017-03-18 02:01:24 provisioningserver.utils.services: [info] observe-mdns: Traceback (most recent call last):
2017-03-18 02:01:24 provisioningserver.utils.services: [info] observe-mdns: File "/usr/sbin/maas-rack", line 85, in <module>
2017-03-18 02:01:24 provisioningserver.utils.services: [info] observe-mdns: main()
2017-03-18 02:01:24 provisioningserver.utils.services: [info] observe-mdns: File "/usr/sbin/maas-rack", line 81, in main
2017-03-18 02:01:24 provisioningserver.utils.services: [info] observe-mdns: run()
2017-03-18 02:01:24 provisioningserver.utils.services: [info] observe-mdns: File "/usr/sbin/maas-rack", line 52, in run
2017-03-18 02:01:24 provisioningserver.utils.services: [info] observe-mdns: from provisioningserver.__main__ import main
2017-03-18 02:01:24 provisioningserver.utils.services: [info] observe-mdns: File "/usr/lib/python3/dist-packages/provisioningserver/__main__.py", line 44, in <module>
2017-03-18 02:01:24 provisioningserver.utils.services: [info] observe-mdns: main()
2017-03-18 02:01:24 provisioningserver.utils.services: [info] observe-mdns: File "/usr/lib/python3/dist-packages/provisioningserver/utils/script.py", line 97, in __call__
2017-03-18 02:01:24 provisioningserver.utils.services: [info] observe-mdns: self.execute(argv)
2017-03-18 02:01:24 provisioningserver.utils.services: [info] observe-mdns: File "/usr/lib/python3/dist-packages/provisioningserver/utils/script.py", line 92, in execute
2017-03-18 02:01:24 provisioningserver.utils.services: [info] observe-mdns: args.handler.run(args)
2017-03-18 02:01:24 provisioningserver.utils.services: [info] observe-mdns: File "/usr/lib/python3/dist-packages/provisioningserver/utils/avahi.py", line 213, in run
2017-03-18 02:01:24 provisioningserver.utils.services: [info] observe-mdns: _observe_mdns(reader, output, args.verbose)
2017-03-18 02:01:24 provisioningserver.utils.services: [info] observe-mdns: File "/usr/lib/python3/dist-packages/provisioningserver/utils/avahi.py", line 143, in _observe_mdns
2017-03-18 02:01:24 provisioningserver.utils.services: [info] observe-mdns: print(json.dumps(event), file=output, flush=True)
2017-03-18 02:01:24 provisioningserver.utils.services: [info] observe-mdns: BrokenPipeError: [Errno 32] Broken pipe
2017-03-18 02:01:24 provisioningserver.utils.services: [info] observe-mdns: Exception ignored in: <_io.TextIOWrapper name='<stdout>' encoding='UTF-8'>
2017-03-18 02:01:24 provisioningserver.utils.services: [info] observe-mdns: BrokenPipeError: [Errno 32] Broken pipe
2017-03-18 02:01:25 provisioningserver.utils.services: [critical] mDNS observation process failed.

Traceback (most recent call last):
Failure: twisted.internet.error.ProcessTerminated: A process has ended with a probable error condition: process ended with exit code 1.

Related branches

Revision history for this message
Mike Pontillo (mpontillo) wrote :

Thanks for the bug report. The traceback is helpful, but doesn't make the root cause obvious. What happens if you do the following on the rack controller where this occurred?

    maas-rack observe-mdns

It *should* print out a list of JSON objects, one per line, such as:

    {"hostname": "fearsome-fawn", "interface": "eth0", "address": "192.168.122.5"}

If it does anything other than that, let's take a step back and see what happens if you run:

    /usr/bin/avahi-browse --all --resolve --no-db-lookup --parsable --no-fail

Thanks in advance.

Changed in maas:
status: New → Incomplete
Revision history for this message
Thiago Martins (martinx) wrote :

I'm also seeing this error message in my fresh installed MaaS Next.

-
root@maas-1:~# maas-rack observe-mdns
{"interface": "ens3", "hostname": "NASF6A00D", "address": "10.250.0.10"}
{"interface": "ens3", "hostname": "NASFABDE1", "address": "10.250.0.11"}
....
-

Here is the rackd.log with the error:

https://paste.ubuntu.com/24221996/

Revision history for this message
Thiago Martins (martinx) wrote :

BTW, I'm using:

maas 2.2.0~beta3+bzr5815-0ubuntu1~16.04.1

Changed in maas:
status: Incomplete → Confirmed
importance: Undecided → High
status: Confirmed → Triaged
Revision history for this message
Mike Pontillo (mpontillo) wrote :

This appears to be because we're calling addCleanup() instead of addCallback() in the reportMDNSEntries() function in rackdservices/networks_monitoring_service.py.

I still didn't understand why this bug doesn't surface for everyone in the world (which is why I held the bug in Incomplete state). Because I have been testing 3 different MAAS servers and haven't seen the issue. But now that I look closer at the code, I think this is because there is a separate shim for reporting mDNS entries depending on whether or not the observer is running via the rack or the region. The rack-based observer seems to have the bug, while the region observer is bug-free.

Ultimately, I think this bug would most often be seen in environments with standalone rack controllers.

Looks like this got introduced in MAAS 2.2 due to the getClientNow() cleanup. Previous to that change, the code simply did:

        client = self.clientService.getClient()
        return client(
            ReportMDNSEntries, system_id=client.localIdent, mdns=mdns)

Revision history for this message
Mike Pontillo (mpontillo) wrote :

Ouch; looks like this bug was also merged to MAAS 2.1, via the following merge:

https://code.launchpad.net/~blake-rouse/maas/rpc-client-now-2.1/+merge/317249

Changed in maas:
status: Triaged → In Progress
assignee: nobody → Mike Pontillo (mpontillo)
Revision history for this message
Thiago Martins (martinx) wrote :

This is also happening with:

maas-2.1.4+bzr5591-0ubuntu1~16.04.1

Changed in maas:
status: In Progress → Fix Committed
Changed in maas:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers