Juju error on HP cloud with Maximum number of attempts (3) reached sending request
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
juju-core | Won't Fix | Medium | Unassigned | |
Bug Description
I am not sure if this is juju-core or juju-deployer.
I am writing some juju tests for the charm store. When I deploy multiple instances of ceph and multiple instances of ceph-osd, one of the machines ends up in an error state that is visible using juju status.
The error is on machine #7.
error: failed to get list of flavour details
caused by: Maximum number of attempts (3) reached sending request to https:/
This error is reproducible, but it is not always machine #7.
$ juju status
environment: hp-mbruzek
machines:
"0":
agent-state: started
agent-version: 1.17.2
dns-name: 15.185.107.170
instance-id: "3548727"
instance-state: ACTIVE
series: precise
hardware: arch=amd64 cpu-cores=1 mem=1024M root-disk=30720M
"1":
agent-state: started
agent-version: 1.17.2
dns-name: 15.185.119.227
instance-id: "3548835"
instance-state: ACTIVE
series: precise
hardware: arch=amd64 cpu-cores=1 mem=1024M root-disk=30720M
"2":
agent-state: started
agent-version: 1.17.2
dns-name: 15.185.127.215
instance-id: "3548839"
instance-state: ACTIVE
series: precise
hardware: arch=amd64 cpu-cores=1 mem=1024M root-disk=30720M
"3":
agent-state: started
agent-version: 1.17.2
dns-name: 15.185.89.123
instance-id: "3548843"
instance-state: ACTIVE
series: precise
hardware: arch=amd64 cpu-cores=1 mem=1024M root-disk=30720M
"4":
agent-state: started
agent-version: 1.17.2
dns-name: 15.185.90.252
instance-id: "3548847"
instance-state: ACTIVE
series: precise
hardware: arch=amd64 cpu-cores=1 mem=1024M root-disk=30720M
"5":
agent-state: started
agent-version: 1.17.2
dns-name: 15.185.100.131
instance-id: "3548845"
instance-state: ACTIVE
series: precise
hardware: arch=amd64 cpu-cores=1 mem=1024M root-disk=30720M
"6":
agent-state: started
agent-version: 1.17.2
dns-name: 15.185.115.246
instance-id: "3548851"
instance-state: ACTIVE
series: precise
hardware: arch=amd64 cpu-cores=1 mem=1024M root-disk=30720M
"7":
agent-
caused by: Maximum number of attempts (3) reached sending request to https:/
ompute.
instance-id: pending
series: precise
"8":
agent-state: started
agent-version: 1.17.2
dns-name: 15.185.90.148
instance-id: "3548853"
instance-state: ACTIVE
series: precise
hardware: arch=amd64 cpu-cores=1 mem=1024M root-disk=30720M
services:
ceph:
charm: local:precise/
exposed: true
relations:
mon:
- ceph
units:
ceph/0:
machine: "1"
ceph/1:
machine: "2"
ceph/2:
machine: "3"
ceph-osd:
charm: local:precise/
exposed: true
units:
ceph-osd/0:
machine: "4"
ceph-osd/1:
machine: "5"
ceph-osd/2:
machine: "6"
ceph-osd-sentry:
charm: local:precise/
exposed: true
ceph-radosgw:
charm: local:precise/
exposed: true
units:
ceph-
machine: "7"
ceph-
charm: local:precise/
exposed: true
ceph-sentry:
charm: local:precise/
exposed: true
relation-sentry:
charm: local:precise/
exposed: true
units:
relation-
machine: "8"
open-ports:
- 9001/tcp
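Because the failing machine changes from run to run, it can help to scan the status for unhealthy machines instead of reading it by eye. The following is a minimal sketch, assuming the status has already been loaded into a Python dict. The helper name `machines_with_errors` is hypothetical, and the key name `agent-state-info` is an assumption, since the key is cut off in the output above.

```python
def machines_with_errors(status):
    """Return {machine_id: reason} for machines that look unhealthy."""
    problems = {}
    for machine_id, info in status.get('machines', {}).items():
        # Assumed key: the error text reported for the machine.
        state_info = str(info.get('agent-state-info', ''))
        if 'error' in state_info:
            problems[machine_id] = state_info
        elif info.get('instance-id') == 'pending':
            # No error text, but the instance was never provisioned.
            problems[machine_id] = 'instance still pending'
    return problems


# Hypothetical, hand-built sample in the shape of the status output.
sample = {
    'machines': {
        '0': {'agent-state': 'started', 'instance-id': '3548727'},
        '7': {
            'agent-state-info':
                '(error: failed to get list of flavour details)',
            'instance-id': 'pending',
        },
    }
}

print(machines_with_errors(sample))
```

With the sample above, only machine "7" is reported, together with its error text.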
The test that I am running is an Amulet test to verify the ceph charm is working. I believe the following snippet will generate the error. To get amulet:
sudo add-apt-repository -y ppa:juju/stable
sudo apt-get update
sudo apt-get install -y amulet
#!/usr/bin/python3
# This amulet code tests the ceph charm.
import amulet
# The number of ceph units should be odd and at least 3.
scale = 3
# The number of seconds to wait for the environment to set up.
seconds = 900
# Hardcode a uuid for the ceph cluster.
fsid = 'ecbb8960-
# A ceph-authtool key pregenerated for this test.
cephAuthKey = 'AQA2zfJSUNjaJB
# The device (directory) to use for block storage for the ceph charms.
ceph_device = '/srv/ceph'
# The havana version of ceph supports directories as devices!
havana = 'cloud:
# Create a dictionary of configuration values for the ceph charms.
ceph_configuration = {
    'auth-
    'fsid': fsid,
    'monitor-
    'monitor-
    'osd-devices': ceph_device,
    'osd-journal': ceph_device,
    'osd-
    'osd-format': 'ext4',
    'osd-reformat': 'yes',  # Setting this value to anything will reformat.
    'source': havana
}
# The device (directory) to use for block storage for the ceph-osd charms.
osd_device = '/srv/osd'
# Create a configuration dictionary for ceph-osd charms.
osd_configuration = {
    'osd-devices': osd_device,
    'source': havana
}
rados_configuration = {
    'source': havana
}
d = amulet.Deployment()
# Add the number of units of ceph to the deployment.
d.add('ceph', units=scale)
# Add the number of ceph-osd units to the deployment
d.add('ceph-osd', units=scale)
# Add ceph-radosgw charm to the deployment.
d.add('
# The ceph charm requires configuration to deploy successfully.
d.configure('ceph', ceph_configuration)
# The ceph-osd charm requires configuration to deploy correctly.
d.configure(
# Configure the ceph-radosgw charm with the same version of openstack
d.configure(
# Relate ceph and ceph-osd.
d.relate(
# Relate ceph and ceph-radosgw
d.relate(
# Expose ceph
d.expose('ceph')
# Expose ceph-osd
d.expose(
# Expose ceph-radosgw
d.expose(
# Perform deployment.
try:
    d.setup(
    d.sentry.
except amulet.
    message = 'The environment did not setup in %d seconds.' % seconds
    amulet.
except:
    raise
print('The ceph units successfully deployed!')
The test times out because there was an error with one or more of the machines. I am unable to ssh to the machine in error because it does not have a public IP address, so getting the logs from that machine is not possible.
Is there any more information that would be helpful to this bug?
description: | updated |
Changed in juju-core: | |
status: | New → Triaged |
importance: | Undecided → High |
milestone: | none → 1.18.0 |
tags: | added: audit |
Changed in juju-core: | |
milestone: | 1.20.0 → next-stable |
tags: | added: hs-arm64 |
tags: | added: arm64 |
Changed in juju-core: | |
importance: | High → Medium |
milestone: | 1.21 → none |
Changed in juju-core: | |
status: | Triaged → Won't Fix |
I'm seeing this on EC2 when I deploy a handful of machines using juju-deployer.
machines:
"0":
agent-state: started
agent-version: 1.17.4
dns-name: ec2-50-18-247-146.us-west-1.compute.amazonaws.com
instance-id: i-2091057c
instance-state: running
series: precise
hardware: arch=amd64 cpu-cores=1 cpu-power=100 mem=1740M root-disk=8192M
"1":
agent-state-info: '(error: cannot set up groups: Request limit exceeded. (RequestLimitExceeded))'
instance-id: pending
series: precise
"2":
agent-state: started
agent-version: 1.17.4
dns-name: ec2-54-219-107-61.us-west-1.compute.amazonaws.com
instance-id: i-819e0add
instance-state: running
series: precise
hardware: arch=amd64 cpu-cores=1 cpu-power=100 mem=1740M root-disk=8192M
"3":
agent-state: started
agent-version: 1.17.4
dns-name: ec2-204-236-184-129.us-west-1.compute.amazonaws.com
instance-id: i-ad9105f1
instance-state: running
series: precise
hardware: arch=amd64 cpu-cores=1 cpu-power=100 mem=1740M root-disk=8192M
"4":
agent-state: started
agent-version: 1.17.4
dns-name: ec2-50-18-99-36.us-west-1.compute.amazonaws.com
instance-id: i-ac9105f0
instance-state: running
series: precise
hardware: arch=amd64 cpu-cores=1 cpu-power=100 mem=1740M root-disk=8192M
"5":
agent-state: started
agent-version: 1.17.4
dns-name: ec2-54-219-226-138.us-west-1.compute.amazonaws.com
instance-id: i-58930704
instance-state: running
series: precise
hardware: arch=amd64 cpu-cores=1 cpu-power=100 mem=1740M root-disk=8192M
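Both reports show a provider API call giving up while several machines were being provisioned at once, which may point to request throttling on the provider side. As an illustration only, here is a generic sketch of a bounded retry with exponential backoff. This is not juju-core's implementation; the function `call_with_backoff` and its parameters are hypothetical names chosen for the example.

```python
import time


def call_with_backoff(request, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call request() up to `attempts` times, doubling the wait each time."""
    last_error = None
    for attempt in range(attempts):
        try:
            return request()
        except Exception as err:
            last_error = err
            # Wait 1x, 2x, 4x ... base_delay, but not after the final try.
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise RuntimeError(
        'Maximum number of attempts (%d) reached' % attempts) from last_error


# Usage with a stand-in request that fails twice, then succeeds.
calls = {'count': 0}


def flaky_request():
    calls['count'] += 1
    if calls['count'] < 3:
        raise IOError('Request limit exceeded')
    return 'ok'


waits = []
result = call_with_backoff(flaky_request, sleep=waits.append)
print(result, waits)
```

Passing `sleep=waits.append` records the delays instead of actually sleeping, which keeps the example fast. A longer delay between attempts gives a throttled API time to recover, whereas three immediate retries would likely all hit the same limit.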