If a machine is booted manually when in status "Declared" or "Ready", TFTP server tracebacks

Bug #1064212 reported by Julian Edwards
This bug affects 7 people
Affects: MAAS
Status: Fix Released
Importance: High
Assigned to: Julian Edwards

Bug Description

To recreate:
1. Successfully enlist a node
2. Turn the node on so that it PXE boots.
3. See the error as below in the pserv.log.

This happens because the node is not yet in the right state to be booted, so the API request for the PXE config fails. This should be handled more gracefully, perhaps with a SAY that tells the user to accept the node first.

2012-10-09 16:27:38+1000 [-] Starting factory <HTTPClientFactory: http://localhost/MAAS/api/1.0/pxeconfig/?mac=e4-11-5b-13-7b-36>
2012-10-09 16:27:38+1000 [HTTPPageGetter,client] Unhandled error in Deferred:
2012-10-09 16:27:38+1000 [HTTPPageGetter,client] Unhandled Error
        Traceback (most recent call last):
          File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 368, in callback
            self._startRunCallbacks(result)
          File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 464, in _startRunCallbacks
            self._runCallbacks()
          File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 551, in _runCallbacks
            current.result = callback(current.result, *args, **kw)
          File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 1101, in gotResult
            _inlineCallbacks(r, g, deferred)
        --- <exception caught here> ---
          File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 1043, in _inlineCallbacks
            result = result.throwExceptionIntoGenerator(g)
          File "/usr/lib/python2.7/dist-packages/twisted/python/failure.py", line 382, in throwExceptionIntoGenerator
            return g.throw(self.type, self.value, self.tb)
          File "/usr/lib/python2.7/dist-packages/tftp/protocol.py", line 55, in _startSession
            fs_interface = yield self.backend.get_reader(datagram.filename)
          File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 551, in _runCallbacks
            current.result = callback(current.result, *args, **kw)
          File "/usr/lib/python2.7/dist-packages/provisioningserver/tftp.py", line 169, in generate_config
            kernel_params=kernel_params, **params)
          File "/usr/lib/python2.7/dist-packages/provisioningserver/pxe/config.py", line 83, in render_pxe_config
            kernel_params.subarch)
          File "/usr/lib/python2.7/dist-packages/provisioningserver/pxe/config.py", line 70, in get_pxe_template
            "No PXE template found in %r!" % template_dir)
        exceptions.AssertionError: No PXE template found in '/usr/lib/python2.7/dist-packages/provisioningserver/pxe'!
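
The SAY suggestion above could look roughly like the following. This is a minimal sketch, not MAAS code: the function name `render_not_ready_config`, the message wording, and the fallback to `LOCALBOOT 0` are all assumptions made for illustration. `SAY` and `LOCALBOOT` are standard PXELINUX configuration directives.

```python
def render_not_ready_config(mac, status):
    """Return a PXELINUX config that explains why the node is not booted.

    Hypothetical helper: the name, message text and local-boot fallback
    are illustrative assumptions, not taken from MAAS.
    """
    message = "MAAS: node {0} is in status {1!r} and cannot be booted yet.".format(
        mac, status
    )
    lines = [
        # Boot the "local" label by default once the message is shown.
        "DEFAULT local",
        # SAY prints a single line of text on the node's console.
        "SAY " + message,
        "LABEL local",
        # Assumption: hand control back to the normal local boot order.
        "  LOCALBOOT 0",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_not_ready_config("e4-11-5b-13-7b-36", "Declared"))
```

Serving a small config like this instead of failing the request would put the explanation on the node's console, where the person who powered it on is most likely to see it.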


Changed in maas:
status: New → Triaged
importance: Undecided → Low
Julian Edwards (julian-edwards) wrote :

I am raising the priority of this because it also slows down enlisting over TFTP: the node requests its MAC address, which results in the above log traceback and a delay while the page fetch times out.

Changed in maas:
importance: Low → High
Julian Edwards (julian-edwards) wrote :

Actually - it's not the case, so lowering again.

Changed in maas:
importance: High → Low
Richard Brady (brady) wrote :

Same applies when in Ready state too.

Nick Moffitt (nick-moffitt) wrote :

As brady notes, this also applies in the ready state, and it still happens in quantal maas.

Andres Rodriguez (andreserl) wrote :

Howdy!

If the machine is in "Declared" status and you manually boot it so that it PXE boots against the MAAS server, then the PXE boot should fail and MAAS should do nothing, because the machine needs to be "Accepted" first in order for MAAS to instruct it to run the commissioning stage.

However, the fail log above should probably error out gracefully rather than showing a traceback.
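
Erroring out gracefully, as suggested here, might be sketched as follows. This is not the code in `provisioningserver`: the names `NoPXETemplate`, `find_template` and `template_or_warning` are hypothetical, and the standard `logging` module is used only to keep the example self-contained. The idea is to raise a dedicated exception instead of a bare `AssertionError`, and to turn it into a one-line warning at the call site.

```python
import logging
import os

logger = logging.getLogger("pxe-sketch")


class NoPXETemplate(Exception):
    """Raised when no candidate template exists (hypothetical name)."""


def find_template(template_dir, candidates):
    """Return the path of the first candidate template that exists."""
    for name in candidates:
        path = os.path.join(template_dir, name)
        if os.path.isfile(path):
            return path
    raise NoPXETemplate("No PXE template found in %r" % (template_dir,))


def template_or_warning(template_dir, candidates, mac):
    """Look up a template; log a single warning line instead of a traceback."""
    try:
        return find_template(template_dir, candidates)
    except NoPXETemplate as error:
        logger.warning("Not serving PXE config to %s: %s", mac, error)
        return None
```

A caller that receives `None` can then decide what to send back to the node without an unhandled error reaching the log.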

Ramon Acedo (ramon-linux-labs) wrote :

+1 on comment #5 from Andres. The log files should warn that the node in question is in Declared or Ready state and it must be allocated before it can be started again.

This happens again and again, especially during labs and tests, and handling the error gracefully would save troubleshooting time.

Julian Edwards (julian-edwards) wrote :

Andres, do you know a way of sending some PXE config that will make the machine power off again?

summary: - If a machine is booted manually when in status "declared", TFTP server
- tracebacks
+ If a machine is booted manually when in status "Declared" or "Ready",
+ TFTP server tracebacks
Changed in maas:
importance: Low → High
Changed in maas:
milestone: none → 14.04
status: Triaged → In Progress
assignee: nobody → Julian Edwards (julian-edwards)
Changed in maas:
status: In Progress → Fix Committed
Jorge Niedbalski (niedbalski) wrote :

Hello,

I am not sure about the current status of this bug. It appears to be 'Fix Committed', but after reading the associated merge request it appears to be just a better exception message.

This is still happening if a machine is booted manually when in status "Declared" or "Ready".

Could you provide a better explanation of why this exception is being raised?

Julian Edwards (julian-edwards) wrote :

The bottom line is that the machine should never be booted manually unless you have a problem with power management and you have already put the node in the right state.

There is nothing maas can do to avoid a human action like this, which is why the exception is raised.

Jorge Niedbalski (niedbalski) wrote :

Thanks @julian-edwards for the clarification.

Changed in maas:
status: Fix Committed → Fix Released