Provisioning failed with UTAHBMProvisioningException: Cobbler command failed: system edit --name=dx-autopilot-nvidia --netboot-enabled=N

Bug #1100176 reported by Jean-Baptiste Lallement
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
UTAH
Triaged
High
Unassigned

Bug Description

Provisioning of a system with an nvidia card failed with:

2013-01-16 02:34:33,387 dx-autopilot-nvidia WARNING: Return code: 1
2013-01-16 02:34:33,387 dx-autopilot-nvidia ERROR: Cobbler command failed: system edit --name=dx-autopilot-nvidia --netboot-enabled=N
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/utah/run.py", line 128, in run_tests
    machine.installclient()
  File "/usr/lib/python2.7/dist-packages/utah/provisioning/provisioning.py", line 405, in installclient
    raise err
UTAHBMProvisioningException: Cobbler command failed: system edit --name=dx-autopilot-nvidia --netboot-enabled=N
2013-01-16 02:34:33,477 dx-autopilot-nvidia ERROR: Failed to install client. Writing empty reports.

Blocking daily publication of unity indicators.

http://10.97.0.1:8080/job/ps-indicators-autopilot-release-testing/label=autopilot-nvidia/74/

Output of the console attached.

Tags: ue-desktop
Revision history for this message
Jean-Baptiste Lallement (jibel) wrote :
Revision history for this message
Javier Collado (javier.collado) wrote :

One thing I believe we're doing pretty bad is that debug logs for utah server
side aren't present in the jenkins jobs artifacts. This makes problems more
difficult to troubleshoot and require investigation just to get the error error
message that should be in the log (a separate bug will be opened to track this
issue).

In this case, what I find looking at the debug log file for the utah server is:

2013-01-16 02:34:33,168 dx-autopilot-nvidia DEBUG: Running command: sudo cobbler system edit --name=dx-autopilot-nvidia --netboot-enabled=N
2013-01-16 02:34:33,168 dx-autopilot-nvidia DEBUG: Output follows:
2013-01-16 02:34:33,373 dx-autopilot-nvidia DEBUG: ### ERROR ###
2013-01-16 02:34:33,373 dx-autopilot-nvidia DEBUG: Unexpected remote error, check the server side logs for further info
2013-01-16 02:34:33,374 dx-autopilot-nvidia DEBUG: <type 'exceptions.KeyError'>:'1IHI2xVY6vlryAQpbOCeeGDIRUIOnAn6PQ=='
2013-01-16 02:34:33,386 dx-autopilot-nvidia DEBUG:
2013-01-16 02:34:33,387 dx-autopilot-nvidia WARNING: Return code: 1
2013-01-16 02:34:33,387 dx-autopilot-nvidia ERROR: Cobbler command failed: system edit --name=dx-autopilot-nvidia --netboot-enabled=N

Since the cobbler command output recommends to look into the cobbler log, I've
done that and what I've found is:

Wed Jan 16 02:24:30 2013 - DEBUG | get_item; ['system', 'dx-autopilot-nvidia']
Wed Jan 16 02:24:30 2013 - DEBUG | done with get_item; ['system', 'dx-autopilot-nvidia']
Wed Jan 16 02:24:30 2013 - DEBUG | REMOTE CLI Authorized; user(?)
Wed Jan 16 02:34:33 2013 - DEBUG | REMOTE expiring token; user(<DIRECT>)
Wed Jan 16 02:34:33 2013 - DEBUG | REMOTE expiring token; user(<DIRECT>)
Wed Jan 16 02:34:33 2013 - DEBUG | REMOTE CLI Authorized; user(?)
Wed Jan 16 02:34:33 2013 - INFO | Exception occured: <type 'exceptions.KeyError'>
Wed Jan 16 02:34:33 2013 - INFO | Exception value: '1IHI2xVY6vlryAQpbOCeeGDIRUIOnAn6PQ=='
Wed Jan 16 02:34:33 2013 - DEBUG | get_item; ['system', 'dx-autopilot-ati']
Wed Jan 16 02:34:33 2013 - INFO | Exception Info:
  File "/usr/lib/python2.7/dist-packages/cobbler/remote.py", line 1944, in _dispatch
    return method_handle(*params)
   File "/usr/lib/python2.7/dist-packages/cobbler/remote.py", line 857, in xapi_object_edit
    self.check_access(token,"xedit_%s" % object_type, token)
   File "/usr/lib/python2.7/dist-packages/cobbler/remote.py", line 1716, in check_access
    validated = self.__validate_token(token)
   File "/usr/lib/python2.7/dist-packages/cobbler/remote.py", line 1660, in __validate_token
    self.__invalidate_expired_tokens()
   File "/usr/lib/python2.7/dist-packages/cobbler/remote.py", line 1626, in __invalidate_expired_tokens
    del self.token_cache[token]

So the real problem here, based on the exception traceback, is that cobbler
tried to remove a token from an internal cache that wasn't there.

I've looked into the cobbler issues in github and haven't found any that
matches this problem, so I'm not sure if it's caused by an unexpected usage of
the cobbler commands or a race condition (I see in logs that ati and nvidia
test cases were running in parallel).

Changed in utah:
status: New → Triaged
importance: Undecided → High
Revision history for this message
Max Brustkern (nuclearbob) wrote :

I've seen more cobbler system edit failures, including here:
http://10.97.0.1:8080/view/Bootspeed/job/bootspeed-raring-desktop-i386-acer-veriton-02/58/console

We have some new cobbler logging in the recent stable version update. I can look into whether this is properly enabled on the bootspeed jobs.

tags: added: ue-desktop
Revision history for this message
Max Brustkern (nuclearbob) wrote :

This is affecting an increasing number of jobs for both indicators and bootspeed. I'm changing the bootspeed jobs to take advantage of the logging improvements we've deployed to the smoke testing jobs, and looking into updating the indicators jobs as well. I can take a look at existing cobbler command logging in the debug log as well.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.