Activity log for bug #1279879

Date Who What changed Old value New value Message
2014-02-13 16:33:16 Matt Bruzek bug added bug
2014-02-13 16:33:16 Matt Bruzek attachment added I ran this test again and machine #2 failed. This is the all-machines.log from the bootstrap node. I was unable to contact machine 2 to download its log files. https://bugs.launchpad.net/bugs/1279879/+attachment/3979841/+files/all-machines.log
2014-02-13 16:44:19 Matt Bruzek description

I am not sure if this is juju-core or juju-deployer. I am writing some juju tests for the charm store, and when I deploy multiple instances of ceph and multiple instances of ceph-osd, an error appears on one of the machines and is visible in juju status. The error is on machine #7:

    error: failed to get list of flavour details
    caused by: Maximum number of attempts (3) reached sending request to https://az-1.region-a.geo-1.compute.hpcloudsvc.com/v1.1/17031369947864/flavors/detail

This error is reproducible, but it is not always machine #7.

$ juju status
environment: hp-mbruzek
machines:
  "0":
    agent-state: started
    agent-version: 1.17.2
    dns-name: 15.185.107.170
    instance-id: "3548727"
    instance-state: ACTIVE
    series: precise
    hardware: arch=amd64 cpu-cores=1 mem=1024M root-disk=30720M
  "1":
    agent-state: started
    agent-version: 1.17.2
    dns-name: 15.185.119.227
    instance-id: "3548835"
    instance-state: ACTIVE
    series: precise
    hardware: arch=amd64 cpu-cores=1 mem=1024M root-disk=30720M
  "2":
    agent-state: started
    agent-version: 1.17.2
    dns-name: 15.185.127.215
    instance-id: "3548839"
    instance-state: ACTIVE
    series: precise
    hardware: arch=amd64 cpu-cores=1 mem=1024M root-disk=30720M
  "3":
    agent-state: started
    agent-version: 1.17.2
    dns-name: 15.185.89.123
    instance-id: "3548843"
    instance-state: ACTIVE
    series: precise
    hardware: arch=amd64 cpu-cores=1 mem=1024M root-disk=30720M
  "4":
    agent-state: started
    agent-version: 1.17.2
    dns-name: 15.185.90.252
    instance-id: "3548847"
    instance-state: ACTIVE
    series: precise
    hardware: arch=amd64 cpu-cores=1 mem=1024M root-disk=30720M
  "5":
    agent-state: started
    agent-version: 1.17.2
    dns-name: 15.185.100.131
    instance-id: "3548845"
    instance-state: ACTIVE
    series: precise
    hardware: arch=amd64 cpu-cores=1 mem=1024M root-disk=30720M
  "6":
    agent-state: started
    agent-version: 1.17.2
    dns-name: 15.185.115.246
    instance-id: "3548851"
    instance-state: ACTIVE
    series: precise
    hardware: arch=amd64 cpu-cores=1 mem=1024M root-disk=30720M
  "7":
    agent-state-info: '(error: failed to get list of flavour details
      caused by: Maximum number of attempts (3) reached sending request to https://az-1.region-a.geo-1.compute.hpcloudsvc.com/v1.1/17031369947864/flavors/detail)'
    instance-id: pending
    series: precise
  "8":
    agent-state: started
    agent-version: 1.17.2
    dns-name: 15.185.90.148
    instance-id: "3548853"
    instance-state: ACTIVE
    series: precise
    hardware: arch=amd64 cpu-cores=1 mem=1024M root-disk=30720M
services:
  ceph:
    charm: local:precise/ceph-104
    exposed: true
    relations:
      mon:
      - ceph
    units:
      ceph/0:
        agent-state: started
        agent-version: 1.17.2
        machine: "1"
        public-address: 15.185.119.227
      ceph/1:
        agent-state: started
        agent-version: 1.17.2
        machine: "2"
        public-address: 15.185.127.215
      ceph/2:
        agent-state: started
        agent-version: 1.17.2
        machine: "3"
        public-address: 15.185.89.123
  ceph-osd:
    charm: local:precise/ceph-osd-14
    exposed: true
    units:
      ceph-osd/0:
        agent-state: started
        agent-version: 1.17.2
        machine: "4"
        public-address: 15.185.90.252
      ceph-osd/1:
        agent-state: started
        agent-version: 1.17.2
        machine: "5"
        public-address: 15.185.100.131
      ceph-osd/2:
        agent-state: started
        agent-version: 1.17.2
        machine: "6"
        public-address: 15.185.115.246
  ceph-osd-sentry:
    charm: local:precise/ceph-osd-sentry-0
    exposed: true
  ceph-radosgw:
    charm: local:precise/ceph-radosgw-25
    exposed: true
    units:
      ceph-radosgw/0:
        agent-state: pending
        machine: "7"
  ceph-radosgw-sentry:
    charm: local:precise/ceph-radosgw-sentry-0
    exposed: true
  ceph-sentry:
    charm: local:precise/ceph-sentry-0
    exposed: true
  relation-sentry:
    charm: local:precise/relation-sentry-0
    exposed: true
    units:
      relation-sentry/0:
        agent-state: started
        agent-version: 1.17.2
        machine: "8"
        open-ports:
        - 9001/tcp
        public-address: 15.185.90.148

The test that I am running is an Amulet test to verify that the ceph charm is working. I believe the following snippet will generate the error. To get amulet:

    sudo add-apt-repository -y ppa:juju/stable
    sudo apt-get update
    sudo apt-get install -y amulet

#!/usr/bin/python3
# This amulet code tests the ceph charm.
import amulet

# The ceph units should be an odd number greater than 3.
scale = 3
# The number of seconds to wait for the environment to set up.
seconds = 900
# Hardcode a uuid for the ceph cluster.
fsid = 'ecbb8960-0e21-11e2-b495-83a88f44db01'
# A ceph-authtool key pregenerated for this test.
cephAuthKey = 'AQA2zfJSUNjaJBAAmxH/PBRkORMkexRD+2eEHg=='
# The device (directory) to use for block storage for the ceph charms.
ceph_device = '/srv/ceph'
# The havana version of ceph supports directories as devices!
havana = 'cloud:precise-updates/havana'
# Create a dictionary of configuration values for the ceph charms.
ceph_configuration = {
    'auth-supported': 'cephx',
    'fsid': fsid,
    'monitor-count': 3,
    'monitor-secret': cephAuthKey,
    'osd-devices': ceph_device,
    'osd-journal': ceph_device,
    'osd-journal-size': 2048,
    'osd-format': 'ext4',
    'osd-reformat': 'yes',  # Setting this value to anything will reformat.
    'source': havana
}
# The device (directory) to use for block storage for the osd charms.
osd_device = '/srv/osd'
# Create a configuration dictionary for the ceph-osd charms.
osd_configuration = {
    'osd-devices': osd_device,
    'source': havana
}
rados_configuration = {
    'source': havana
}

d = amulet.Deployment()
# Add the number of units of ceph to the deployment.
d.add('ceph', units=scale)
# Add the number of ceph-osd units to the deployment.
d.add('ceph-osd', units=scale)
# Add the ceph-radosgw charm to the deployment.
d.add('ceph-radosgw')
# The ceph charm requires configuration to deploy successfully.
d.configure('ceph', ceph_configuration)
# The ceph-osd charm requires configuration to deploy correctly.
d.configure('ceph-osd', osd_configuration)
# Configure the ceph-radosgw charm with the same version of openstack.
d.configure('ceph-radosgw', rados_configuration)
# Relate ceph and ceph-osd.
d.relate('ceph:osd', 'ceph-osd:mon')
# Relate ceph and ceph-radosgw.
d.relate('ceph:radosgw', 'ceph-radosgw:mon')
# Expose ceph.
d.expose('ceph')
# Expose ceph-osd.
d.expose('ceph-osd')
# Expose ceph-radosgw.
d.expose('ceph-radosgw')

# Perform deployment.
try:
    d.setup(timeout=seconds)
    d.sentry.wait(seconds)
except amulet.helpers.TimeoutError:
    message = 'The environment did not setup in %d seconds.' % seconds
    amulet.raise_status(amulet.SKIP, msg=message)
except:
    raise

print('The ceph units successfully deployed!')

The test times out because there was an error with one or more of the machines. I am unable to ssh to the machine in the error state because it does not have a public IP address, so getting the logs from that machine is not possible. Is there any more information that would be helpful for this bug?
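One way the test above could surface the provider error rather than only a bare timeout is to poll juju status from the test and report any machine whose agent-state-info carries an error. The following is only a sketch, not part of the original report: it assumes the juju 1.x CLI is on the PATH, that the output of juju status --format yaml matches the structure shown in the description, and that PyYAML is installed; the helper name check_machine_errors is made up for illustration.

#!/usr/bin/python3
# Sketch only: report machines that juju status shows in an error state,
# so a test failure can include the provider error text (e.g. the
# "Maximum number of attempts (3) reached" message) instead of just a timeout.
# Assumes the juju 1.x CLI is on the PATH and PyYAML is installed.
import subprocess
import yaml


def check_machine_errors():
    """Return a dict of machine id -> agent-state-info for machines reporting errors."""
    output = subprocess.check_output(['juju', 'status', '--format', 'yaml'])
    status = yaml.safe_load(output)
    errors = {}
    for machine_id, machine in status.get('machines', {}).items():
        info = machine.get('agent-state-info', '')
        if 'error' in info:
            errors[machine_id] = info
    return errors


if __name__ == '__main__':
    for machine_id, info in check_machine_errors().items():
        print('machine %s: %s' % (machine_id, info))

In the Amulet test, a call like this could sit inside the except amulet.helpers.TimeoutError block so the raised SKIP message includes the agent-state-info of the stuck machine.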
2014-02-13 18:57:10 Curtis Hovey juju-core: status New Triaged
2014-02-13 18:57:35 Curtis Hovey juju-core: importance Undecided High
2014-02-13 18:58:58 Curtis Hovey juju-core: milestone 1.18.0
2014-02-13 20:08:21 Matt Bruzek tags audit
2014-04-14 10:09:09 Khairul Aizat Kamarudzzaman bug added subscriber Khairul Aizat Kamarudzzaman
2014-05-12 14:35:26 Canonical Juju QA Bot juju-core: milestone 1.20.0 next-stable
2014-06-09 20:18:59 Raghuram Kota tags audit audit hs-arm64
2014-06-09 20:22:00 Raghuram Kota tags audit hs-arm64 arm64 audit hs-arm64
2014-11-04 23:01:42 Curtis Hovey juju-core: importance High Medium
2014-11-04 23:01:50 Curtis Hovey juju-core: milestone 1.21
2016-04-12 20:28:33 Curtis Hovey juju-core: status Triaged Won't Fix