[2.2] rackd is not refreshing its commissioning information

Bug #1705508 reported by Andres Rodriguez
Affects   Status         Importance   Assigned to   Milestone
MAAS      Fix Released   Critical     Lee Trager
2.2       Fix Released   Critical     Lee Trager

Bug Description

I have a rack controller that is not refreshing its commissioning information. The only error seen in the logs is:

maas.log:Jul 20 14:59:21 polite-bedbug maas.refresh: [error] Error during controller refresh: HTTP error [503]

I also see errors like this in regiond.log:

2017-07-20 14:59:54 stderr: [error] request to http://127.0.0.1:5240/MAAS/metadata/2012-03-01/ failed. sleeping 32.: HTTP Error 503: Service Unavailable

FWIW, rackd.conf:
maas_url: http://10.245.136.4:5240/MAAS

description: updated
Changed in maas:
importance: Undecided → Critical
milestone: none → 2.3.0
status: New → Triaged
Andres Rodriguez (andreserl) wrote :

It seems that rackd tries to access 127.0.0.1 by default instead of whatever is configured in /etc/maas/rackd.conf.

description: updated
Andres Rodriguez (andreserl) wrote :

I hardcoded an IP address in /usr/lib/python3/dist-packages/provisioningserver/refresh/__init__.py. I no longer see the errors in the logs, but commissioning doesn't seem to have run at all, or if it has, the information hasn't been updated.

Andres Rodriguez (andreserl) wrote :

After multiple restarts and changes while doing other things, it finally refreshed (probably about 5 hours after I installed MAAS), so there's definitely an issue, especially given comment #1.

Andres Rodriguez (andreserl) wrote :

Note that it refreshed with the workaround from comment #2 in place.

Lee Trager (ltrager) wrote :

I just built master and did a fresh install, and my setup script failed. In rackd.log the error below is repeated a number of times. /etc/maas/rackd.conf and /etc/maas/regiond.conf are both owned by root:maas with their permissions set to 640. Setting both to 660 fixed the problem for me.

Did packaging change the permissions of these files? I used make package-dev to build the deb files.
If the permissions are correct, MAAS needs to open these files read-only. The error below happens when we try to touch the file; we could touch it only if it doesn't exist.

PermissionError: [Errno 13] Permission denied: '/etc/maas/rackd.conf'
Traceback (most recent call last):
  File "/usr/bin/twistd3", line 18, in <module>
    run()
  File "/usr/lib/python3/dist-packages/twisted/scripts/twistd.py", line 29, in run
    app.run(runApp, ServerOptions)
  File "/usr/lib/python3/dist-packages/twisted/application/app.py", line 617, in run
    runApp(config)
  File "/usr/lib/python3/dist-packages/twisted/scripts/twistd.py", line 25, in runApp
    _SomeApplicationRunner(config).run()
  File "/usr/lib/python3/dist-packages/twisted/application/app.py", line 348, in run
    self.application = self.createOrGetApplication()
  File "/usr/lib/python3/dist-packages/twisted/application/app.py", line 408, in createOrGetApplication
    ser = plg.makeService(self.config.subOptions)
  File "/usr/lib/python3/dist-packages/provisioningserver/plugin.py", line 200, in makeService
    with ClusterConfiguration.open() as config:
  File "/usr/lib/python3.5/contextlib.py", line 59, in __enter__
    return next(self.gen)
  File "/usr/lib/python3/dist-packages/provisioningserver/config.py", line 697, in open
    with cls.backend.open(filepath) as store:
  File "/usr/lib/python3.5/contextlib.py", line 59, in __enter__
    return next(self.gen)
  File "/usr/lib/python3/dist-packages/provisioningserver/config.py", line 549, in open
    touch(path)
  File "/usr/lib/python3/dist-packages/provisioningserver/config.py", line 345, in touch
    os.close(os.open(path, os.O_CREAT | os.O_APPEND, mode))
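
The suggestion above (touch the config file only when it does not exist) could be sketched roughly as follows. This is an illustration only, not the MAAS implementation: the helper name touch_if_missing and the default mode are assumptions made for this example.

```python
import os


def touch_if_missing(path, mode=0o640):
    """Create an empty file at `path` only if nothing exists there yet.

    Hypothetical helper, not part of provisioningserver.config.
    Returns True if the file was created, False if it already existed.
    """
    try:
        # O_EXCL makes the call fail with FileExistsError instead of
        # opening an existing file, so an existing file is never touched.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, mode)
    except FileExistsError:
        return False
    os.close(fd)
    return True
```

With a helper like this, an existing config file is left alone and can then be opened read-only by the caller.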

Andres Rodriguez (andreserl) wrote : Re: [Bug 1705508] Re: [2.2] rackd is not refreshing its commissioning information

Nothing has changed on the MAAS side, and I don't think this is related, given that the MAAS I tested against for this bug is 2.2.0 from the archive.

That said, the code fixes the config file permissions if they are wrong.


Andres Rodriguez (andreserl) wrote :

@Lee, let me restate. The issue is present and is not related to the one you came across. I have a MAAS that has been running fine for over 24 hours, and the rack cannot update its commissioning information because it is trying to use '127.0.0.1' instead of the maas_url configured in /etc/maas/rackd.conf.

Changed in maas:
assignee: nobody → Lee Trager (ltrager)
Lee Trager (ltrager) wrote :

Can you post the logs of the rack controller this is happening on? http://127.0.0.1:5240/ is the default value when maas_url isn't found in the config file, which is why I thought the two might have been related.

While trying to debug this issue I came across lp:1705774. Reverting 48efa8062b866c9b1b82fdca6be2c0e7067e472e solved that issue for me, and I'm wondering if that commit is causing this error as well.
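
The fallback behaviour described in this comment can be illustrated with a minimal sketch. It assumes rackd.conf holds simple "key: value" lines, as in the excerpt in the bug description; the function name read_maas_url and the exact default URL are assumptions for this example, not the real MAAS code.

```python
# Assumed default for illustration; the thread only states that a
# 127.0.0.1:5240 URL is used when maas_url is not found.
DEFAULT_MAAS_URL = "http://127.0.0.1:5240/MAAS"


def read_maas_url(conf_text, default=DEFAULT_MAAS_URL):
    """Return the maas_url value from config text, or `default` if absent.

    Hypothetical helper: a deliberately simple line parser, not a full
    YAML loader.
    """
    for line in conf_text.splitlines():
        # Split on the first colon only, so the URL's own colons survive.
        key, sep, value = line.partition(":")
        if sep and key.strip() == "maas_url":
            value = value.strip()
            if value:
                return value
    return default
```

If the config file cannot be read, or the key is missing, a reader like this silently falls back to the loopback address, which would match the 127.0.0.1 requests seen in the logs.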

Changed in maas:
status: Triaged → Incomplete
Andres Rodriguez (andreserl) wrote :

Lee,

rackd.conf, as I said, points to an IP. There is nothing strange in the logs other than what I already mentioned. Everything else works as expected (deploys, commissioning, etc.).


Lee Trager (ltrager) wrote :

Is a region controller also running on that machine? When a refresh occurs on a machine that is both a region and a rack controller, the region process runs the refresh and sends the data to itself over HTTP (localhost).
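
As a rough illustration of the behaviour described here, the choice of metadata endpoint might look like the sketch below. The function name metadata_url, the is_region flag and the loopback base URL are assumptions for this example; only the /metadata/2012-03-01/ path and the 127.0.0.1:5240 address come from the log lines in this report.

```python
LOCAL_REGION_URL = "http://127.0.0.1:5240/MAAS"  # assumed loopback base URL


def metadata_url(maas_url, is_region=False):
    """Pick the metadata endpoint a controller refresh would post to.

    Hypothetical sketch: a combined region+rack controller talks to
    itself over localhost, while a rack-only controller uses the
    maas_url from its configuration.
    """
    base = LOCAL_REGION_URL if is_region else maas_url
    return base.rstrip("/") + "/metadata/2012-03-01/"
```

Under this reading, a rack-only controller sending requests to 127.0.0.1 would suggest that either the region code path or the default maas_url is being used by mistake.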

Lee Trager (ltrager)
Changed in maas:
status: Incomplete → In Progress
Changed in maas:
status: In Progress → Fix Committed
Changed in maas:
milestone: 2.3.0 → 2.3.0alpha1
Changed in maas:
status: Fix Committed → Fix Released