Fail to submit new builds to Jenkins after a spec upgrade

Bug #1419890 reported by Caio Begotti
This bug affects 1 person
Affects: Capomastro
Status: Fix Committed
Importance: Low
Assigned to: Caio Begotti
Milestone:

Bug Description

Capomastro can no longer run builds on Jenkins after a spec upgrade. Apparently this is because /var/lib/jenkins on unit jenkins/0 lives on a persistent volume that is kept across migrations or upgrades of the spec, which somehow leaves it inconsistent with Capomastro's database?

I can reproduce it on Staging with a normal deployment, a few dependency builds, and then a complete redeployment of the spec (using the same Jenkins storage volume). IS has the exact same scenario and possibly would have caught this before, but I decided to file a bug so we can investigate it a bit more when Capomastro goes live.

To be honest the log from this failure does not look related at all, but this is what I get from Celery when this happens:

[2015-02-10 11:06:48,621: INFO/MainProcess] Received task: jenkins.tasks.push_job_to_jenkins[4f71353f-846a-46e6-bdaf-8f02c3a65827]
[2015-02-10 11:06:48,622: DEBUG/MainProcess] TaskPool: Apply <function _fast_trace_task at 0x7f58739ed578> (args:('jenkins.tasks.push_job_to_jenkins', '4f71353f-846a-46e6-bdaf-8f02c3a65827', (1,), {}, {'utc': True, u'is_eager': False, 'chord': None, u'group': None, 'args': (1,), 'retries': 0, u'delivery_info': {u'priority': None, u'redelivered': False, u'routing_key': u'celery', u'exchange': u'celery'}, 'expires': None, u'hostname': 'celery@juju-stg-pes-capomastro-machine-4', 'task': 'jenkins.tasks.push_job_to_jenkins', 'callbacks': None, u'correlation_id': u'4f71353f-846a-46e6-bdaf-8f02c3a65827', 'errbacks': None, 'timelimit': (None, None), 'taskset': None, 'kwargs': {}, 'eta': None, u'reply_to': u'46ea3a5e-48b8-3a64-80ed-edb6b72a9eb8', 'id': '4f71353f-846a-46e6-bdaf-8f02c3a65827', u'headers': {}}) kwargs:{})
[2015-02-10 11:06:48,633: DEBUG/MainProcess] Task accepted: jenkins.tasks.push_job_to_jenkins[4f71353f-846a-46e6-bdaf-8f02c3a65827] pid:18002
[2015-02-10 11:06:48,647: DEBUG/Worker-1] (0.001) SELECT "jenkins_job"."id", "jenkins_job"."server_id", "jenkins_job"."jobtype_id", "jenkins_job"."name" FROM "jenkins_job" WHERE "jenkins_job"."id" = 1 ; args=(1,)
[2015-02-10 11:06:48,649: DEBUG/Worker-1] (0.001) SELECT "jenkins_jenkinsserver"."id", "jenkins_jenkinsserver"."name", "jenkins_jenkinsserver"."url", "jenkins_jenkinsserver"."username", "jenkins_jenkinsserver"."password" FROM "jenkins_jenkinsserver" WHERE "jenkins_jenkinsserver"."id" = 1 ; args=(1,)
[2015-02-10 11:06:48,650: DEBUG/Worker-1] (0.000) SELECT "jenkins_jobtype"."id", "jenkins_jobtype"."name", "jenkins_jobtype"."description", "jenkins_jobtype"."config_xml" FROM "jenkins_jobtype" WHERE "jenkins_jobtype"."id" = 4 ; args=(4,)
[2015-02-10 11:06:48,728: INFO/Worker-1] Starting new HTTP connection (1): 10.35.95.76
[2015-02-10 11:06:48,731: DEBUG/Worker-1] Setting read timeout to None
[2015-02-10 11:06:49,620: DEBUG/Worker-1] "GET /api/python HTTP/1.1" 500 13424
[2015-02-10 11:06:49,641: ERROR/MainProcess] Task jenkins.tasks.push_job_to_jenkins[4f71353f-846a-46e6-bdaf-8f02c3a65827] raised unexpected: ContentDecodingError(DecodeError('Received response with content-encoding: gzip, but failed to decode it.', error('Error -3 while decompressing: incorrect header check',)),)
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/celery/app/trace.py", line 218, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/celery/app/trace.py", line 398, in __protected_call__
    return self.run(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/jenkins/tasks.py", line 36, in push_job_to_jenkins
    client = job.server.get_client()
  File "/usr/lib/python2.7/dist-packages/jenkins/models.py", line 27, in get_client
    self.url, username=self.username, password=self.password)
  File "/usr/lib/python2.7/dist-packages/jenkinsapi/jenkins.py", line 43, in __init__
    JenkinsBase.__init__(self, baseurl)
  File "/usr/lib/python2.7/dist-packages/jenkinsapi/jenkinsbase.py", line 33, in __init__
    self.poll()
  File "/usr/lib/python2.7/dist-packages/jenkinsapi/jenkinsbase.py", line 55, in poll
    self._data = self._poll()
  File "/usr/lib/python2.7/dist-packages/jenkinsapi/jenkinsbase.py", line 59, in _poll
    return self.get_data(url)
  File "/usr/lib/python2.7/dist-packages/jenkinsapi/jenkinsbase.py", line 63, in get_data
    response = requester.get_url(url, params)
  File "/usr/lib/python2.7/dist-packages/jenkinsapi/utils/requester.py", line 89, in get_url
    return requests.get(self._update_url_scheme(url), **requestKwargs)
  File "/usr/lib/python2.7/dist-packages/requests/api.py", line 55, in get
    return request('get', url, **kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/api.py", line 44, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 455, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 558, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/adapters.py", line 394, in send
    r.content
  File "/usr/lib/python2.7/dist-packages/requests/models.py", line 679, in content
    self._content = bytes().join(self.iter_content(CONTENT_CHUNK_SIZE)) or bytes()
  File "/usr/lib/python2.7/dist-packages/requests/models.py", line 621, in generate
    raise ContentDecodingError(e)
ContentDecodingError: ('Received response with content-encoding: gzip, but failed to decode it.', error('Error -3 while decompressing: incorrect header check',))
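One possible reading of this traceback (an assumption, not something confirmed in this report) is that Jenkins answered the request with a 500 error page that was labelled Content-Encoding: gzip but whose body was not actually gzip data. The sketch below uses only the standard zlib module to show how such a body produces the same "incorrect header check" error. The function names are hypothetical and are not part of Capomastro, requests, or jenkinsapi.

```python
import zlib

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def looks_like_gzip(body):
    """Return True if the body starts with the gzip magic bytes."""
    return body[:2] == GZIP_MAGIC


def decode_gzip_body(body):
    """Try to decode a body that was declared as gzip.

    Returns (decoded_bytes, None) on success or (None, error_message)
    when zlib rejects the data.
    """
    try:
        # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
        return zlib.decompress(body, 16 + zlib.MAX_WBITS), None
    except zlib.error as exc:
        return None, str(exc)


if __name__ == "__main__":
    # A plain-text error page mislabelled as gzip (hypothetical content).
    fake_error_page = b"<html>500 Internal Server Error</html>"
    print(looks_like_gzip(fake_error_page))
    print(decode_gzip_body(fake_error_page))
```

If this reading is right, the decoding error in Celery is only a symptom, and the real failure is the HTTP 500 that Jenkins returns for /api/python.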

Jenkins logs the following during this error:

Feb 10, 2015 11:06:48 AM org.eclipse.jetty.util.log.JavaUtilLog warn
WARNING: /api/python
java.util.NoSuchElementException
 at java.util.concurrent.ConcurrentHashMap$HashIterator.nextEntry(ConcurrentHashMap.java:1354)
 at java.util.concurrent.ConcurrentHashMap$ValueIterator.next(ConcurrentHashMap.java:1384)
 at org.kohsuke.stapler.WebApp.getSomeStapler(WebApp.java:275)
 at hudson.init.impl.InstallUncaughtExceptionHandler$1.reportException(InstallUncaughtExceptionHandler.java:27)
 at org.kohsuke.stapler.compression.CompressionFilter.reportException(CompressionFilter.java:63)
 at org.kohsuke.stapler.compression.CompressionFilter.doFilter(CompressionFilter.java:58)
 at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1482)
 at hudson.util.CharacterEncodingFilter.doFilter(CharacterEncodingFilter.java:81)
 at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1474)
 at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:499)
 at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
 at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:533)
 at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
 at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
 at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:428)
 at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
 at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
 at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
 at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
 at org.eclipse.jetty.server.Server.handle(Server.java:370)
 at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
 at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:949)
 at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1011)
 at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
 at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
 at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
 at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:668)
 at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
 at winstone.BoundedExecutorService$1.run(BoundedExecutorService.java:77)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)

Revision history for this message
Caio Begotti (caio1982) wrote :

I will look into this more deeply now, but I suspect it may not be a problem in production because of the way IS upgrades specs, keeping the Jenkins storage volume intact. In my tests I am simply destroying everything and redeploying from scratch, while the storage is kept intact. IS probably won't (and couldn't) just destroy everything and start afresh like that.

description: updated
description: updated
Caio Begotti (caio1982) wrote :

According to IS they have never had this problem before, but that is simply because nobody has had the same scenario we have with Capomastro, so we will need to investigate further. The best idea so far (thanks to Daniel) is to reproduce the issue while keeping a copy of the original /var/lib/jenkins, so we can diff the original against the stale one left after the upgrade and see what changed.
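A minimal sketch of how that comparison could be done, assuming both copies of /var/lib/jenkins are available as local directories. It uses the standard filecmp module; the function name diff_trees and the example paths are hypothetical. Note that filecmp.dircmp compares files shallowly (by type, size and modification time) and skips a few version-control directories by default, so a byte-for-byte diff of the reported files would still be a sensible follow-up.

```python
import filecmp
import os


def diff_trees(old_root, new_root):
    """Compare two directory trees recursively.

    Returns (only_in_old, only_in_new, changed) as sorted lists of
    paths relative to the two roots.
    """
    only_old, only_new, changed = [], [], []

    def walk(cmp, prefix):
        # Entries present on one side only, and files that differ.
        only_old.extend(os.path.join(prefix, n) for n in cmp.left_only)
        only_new.extend(os.path.join(prefix, n) for n in cmp.right_only)
        changed.extend(os.path.join(prefix, n) for n in cmp.diff_files)
        # Recurse into directories that exist on both sides.
        for name, sub in cmp.subdirs.items():
            walk(sub, os.path.join(prefix, name))

    walk(filecmp.dircmp(old_root, new_root), "")
    return sorted(only_old), sorted(only_new), sorted(changed)


if __name__ == "__main__":
    # Hypothetical locations of the saved and the post-upgrade copies.
    old_copy = "/tmp/jenkins-before"
    new_copy = "/tmp/jenkins-after"
    if os.path.isdir(old_copy) and os.path.isdir(new_copy):
        for label, paths in zip(("only in old", "only in new", "changed"),
                                diff_trees(old_copy, new_copy)):
            print(label, paths)
```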

Daniel Manrique (roadmr) wrote :

I'll set this to the future milestone pending some triaging. We decided it's not critical if the deployment is upgraded rather than blasted and redone (but keeping the Jenkins volume). If we do redeploy, we can also recreate the Jenkins volume to get a clean slate, which should help with this problem.

Changed in capomastro:
status: New → Confirmed
importance: Undecided → Low
milestone: none → future
Caio Begotti (caio1982) wrote :

The linked branch contains a workaround for a bug in the Jenkins charm that led to this error in Capomastro deployments. I tested it on Wendigo by creating a dependency, redeploying the service, and then creating another dependency. Both were imported by Jenkins, before and after the redeployment (which caused an error before the code from this branch).

Changed in capomastro:
assignee: nobody → Caio Begotti (caio1982)
status: Confirmed → Fix Committed
Caio Begotti (caio1982) wrote :

Committed/merged in the IS branch of the spec, but this is not on production yet.
