Raindance shouldn't use wget -t0; edge cases here

Bug #1436154 reported by justinsb on 2015-03-25
This bug affects 1 person
Affects: cloudfoundry
Importance: Undecided
Assigned to: Unassigned

Bug Description

I hit a problem where the download was just stalled when deploying CF using the Juju charms.

The error was while downloading buildpack_php-f94e99cdb89dca28742a0f9ae816dcd34375ba97.tgz.

wget should not be run with -t0: that makes it retry indefinitely, so a stalled download never fails. I would expect a generous timeout with a bounded number of retries.

https://github.com/whitmo/raindance/blob/master/raindance/package.py#L86
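
A minimal sketch of what a bounded invocation could look like, not the actual raindance code. The function names and the default values (5 tries, 60 second timeout, 10 second back-off) are assumptions chosen for illustration; only the wget options themselves (--tries, --timeout, --waitretry, --output-document, --no-verbose) are real.

```python
import subprocess


def build_wget_command(url, dest, tries=5, timeout=60, wait_retry=10):
    """Return a wget argument list with bounded retries and timeouts.

    The defaults are illustrative assumptions, not values taken from raindance.
    """
    return [
        "wget",
        "--no-verbose",
        # Give up after this many attempts; 0 would mean "retry forever".
        "--tries={}".format(tries),
        # Seconds allowed for DNS lookup, connect and read before an attempt fails.
        "--timeout={}".format(timeout),
        # Wait up to this many seconds between retries of a failed download.
        "--waitretry={}".format(wait_retry),
        "--output-document={}".format(dest),
        url,
    ]


def fetch(url, dest, **kwargs):
    """Run wget; raises subprocess.CalledProcessError once all attempts fail."""
    subprocess.check_call(build_wget_command(url, dest, **kwargs))
```

With a finite --tries value the hook eventually fails loudly instead of hanging, which lets Juju report the error and retry.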

---

Also, when I killed the wget process, I got the traceback below. It looks like the wrong args were being passed to CalledProcessError (?):

2015-03-19 12:07:27 INFO uaa-relation-joined 2015-03-19 12:07:27 URL:http://cf-packages.s3-website-us-east-1.amazonaws.com/cf/packages/buildpack_php-f94e99cdb89dca28742a0f9ae816dcd34375ba97.tgz [290927831/290927831] -> "/srv/artifacts/cf/packages/buildpack_php-f94e99cdb89dca28742a0f9ae816dcd34375ba97.tgz" [1]
2015-03-19 12:26:37 INFO uaa-relation-joined Exception in thread Thread-1:
2015-03-19 12:26:37 INFO uaa-relation-joined Traceback (most recent call last):
2015-03-19 12:26:37 INFO uaa-relation-joined File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
2015-03-19 12:26:37 INFO uaa-relation-joined self.run()
2015-03-19 12:26:37 INFO uaa-relation-joined File "/usr/lib/python2.7/threading.py", line 763, in run
2015-03-19 12:26:37 INFO uaa-relation-joined self.__target(*self.__args, **self.__kwargs)
2015-03-19 12:26:37 INFO uaa-relation-joined File "/usr/local/lib/python2.7/dist-packages/concurrent/futures/process.py", line 214, in _queue_management_worker
2015-03-19 12:26:37 INFO uaa-relation-joined result_item = result_queue.get(block=True)
2015-03-19 12:26:37 INFO uaa-relation-joined File "/usr/lib/python2.7/multiprocessing/queues.py", line 117, in get
2015-03-19 12:26:37 INFO uaa-relation-joined res = self._recv()
2015-03-19 12:26:37 INFO uaa-relation-joined TypeError: ('__init__() takes at least 3 arguments (1 given)', <class 'subprocess.CalledProcessError'>, ())
2015-03-19 12:26:37 INFO uaa-relation-joined
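
One plausible reading of the TypeError above, an assumption rather than something the log confirms, is that on Python 2.7 a CalledProcessError raised in a worker process cannot be rebuilt in the parent, because unpickling calls its constructor without the required returncode and cmd. A hedged sketch of a workaround is to convert the exception to a plain built-in one before it leaves the worker. The helper names picklable_error and run_in_worker are hypothetical.

```python
import subprocess


def picklable_error(exc):
    """Convert a CalledProcessError into a RuntimeError carrying one string.

    Built-in exceptions with a single message argument pickle cleanly, so
    the parent process can receive them through a multiprocessing queue.
    """
    return RuntimeError(
        "command {!r} exited with status {}".format(exc.cmd, exc.returncode)
    )


def run_in_worker(cmd):
    """Run cmd in a worker; re-raise failures in a picklable form."""
    try:
        subprocess.check_call(cmd)
    except subprocess.CalledProcessError as exc:
        raise picklable_error(exc)
```

The trade-off is that the parent loses the structured returncode and cmd attributes and only sees them in the message text.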

---

Finally, it doesn't delete the partially downloaded file on error, and it doesn't checksum the .tgz to detect that the file is corrupt. So when I retry, I get:

2015-03-19 12:29:02 ERROR juju.worker.uniter uniter.go:486 hook failed: signal: terminated
2015-03-19 12:29:41 INFO juju-log uaa:56: Skipping service restart: metron_agent
2015-03-19 12:29:41 INFO juju-log uaa:56: Restarting service: cloud_controller_ng
2015-03-19 12:30:02 INFO uaa-relation-joined
2015-03-19 12:30:02 INFO uaa-relation-joined gzip: stdin: unexpected end of file
2015-03-19 12:30:02 INFO uaa-relation-joined tar: Unexpected EOF in archive
2015-03-19 12:30:02 INFO uaa-relation-joined tar: Unexpected EOF in archive
2015-03-19 12:30:02 INFO uaa-relation-joined tar: Error is not recoverable: exiting now
2015-03-19 12:30:02 INFO uaa-relation-joined Traceback (most recent call last):
2015-03-19 12:30:02 INFO uaa-relation-joined File "/var/lib/juju/agents/unit-cc-0/charm/hooks/uaa-relation-joined", line 3, in <module>
2015-03-19 12:30:02 INFO uaa-relation-joined job_manager("cloud-controller-v3")
2015-03-19 12:30:02 INFO uaa-relation-joined File "/var/lib/juju/agents/unit-cc-0/charm/hooks/cloudfoundry/jobs.py", line 26, in job_manager
2015-03-19 12:30:02 INFO uaa-relation-joined manage_services(service_name)
2015-03-19 12:30:02 INFO uaa-relation-joined File "/var/lib/juju/agents/unit-cc-0/charm/hooks/cloudfoundry/jobs.py", line 64, in manage_services
2015-03-19 12:30:02 INFO uaa-relation-joined services.ServiceManager(service_def).manage()
2015-03-19 12:30:02 INFO uaa-relation-joined File "/var/lib/juju/agents/unit-cc-0/charm/hooks/charmhelpers/core/services/base.py", line 128, in manage
2015-03-19 12:30:02 INFO uaa-relation-joined self.reconfigure_services()
2015-03-19 12:30:02 INFO uaa-relation-joined File "/var/lib/juju/agents/unit-cc-0/charm/hooks/charmhelpers/core/services/base.py", line 161, in reconfigure_services
2015-03-19 12:30:02 INFO uaa-relation-joined self.fire_event('data_ready', service_name)
2015-03-19 12:30:02 INFO uaa-relation-joined File "/var/lib/juju/agents/unit-cc-0/charm/hooks/charmhelpers/core/services/base.py", line 210, in fire_event
2015-03-19 12:30:02 INFO uaa-relation-joined callback(service_name)
2015-03-19 12:30:02 INFO uaa-relation-joined File "/var/lib/juju/agents/unit-cc-0/charm/hooks/cloudfoundry/tasks.py", line 115, in install_job_artifacts
2015-03-19 12:30:02 INFO uaa-relation-joined for job_file in job:
2015-03-19 12:30:02 INFO uaa-relation-joined File "/usr/local/lib/python2.7/dist-packages/raindance/package.py", line 182, in setup_job
2015-03-19 12:30:02 INFO uaa-relation-joined self.tarextract(package_file, package_target)
2015-03-19 12:30:02 INFO uaa-relation-joined File "/usr/local/lib/python2.7/dist-packages/raindance/package.py", line 160, in tarextract
2015-03-19 12:30:02 INFO uaa-relation-joined subprocess.check_call(['tar', '-xzf', tarball] + list(args))
2015-03-19 12:30:02 INFO uaa-relation-joined File "/usr/lib/python2.7/subprocess.py", line 540, in check_call
2015-03-19 12:30:02 INFO uaa-relation-joined raise CalledProcessError(retcode, cmd)
2015-03-19 12:30:02 INFO uaa-relation-joined subprocess.CalledProcessError: Command '['tar', '-xzf', path(u'/srv/artifacts/cf/packages/buildpack_nodejs-83c9b43b7b340a80dc71c18f9a22cf1b0d05d00b.tgz')]' returned non-zero exit status 2
2015-03-19 12:30:02 ERROR juju.worker.uniter uniter.go:486 hook failed: exit status 1
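
The cleanup and checksum idea described before this log could be sketched roughly as follows. This is an illustration under assumptions, not raindance's behaviour: the helper names are hypothetical, and it assumes an expected SHA-1 digest is available from a trusted source. The 40-character hex suffix in the artifact file names looks like a SHA-1, but whether it is the digest of the file contents is not confirmed here.

```python
import hashlib
import os


def sha1_of(path, chunk_size=1024 * 1024):
    """Return the hex SHA-1 of a file, reading it in chunks to bound memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_or_remove(path, expected_sha1):
    """Return True if path exists and matches expected_sha1.

    Otherwise delete the file (if present) and return False, so that a
    retry downloads it again instead of extracting a truncated archive.
    """
    if os.path.exists(path) and sha1_of(path) == expected_sha1:
        return True
    if os.path.exists(path):
        os.remove(path)
    return False
```

Calling verify_or_remove before tar extraction would turn the "Unexpected EOF in archive" failure into a fresh download on the next hook run.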
