Activity log for bug #1755155

Date Who What changed Old value New value Message
2018-03-12 12:32:09 Sandor Zeestraten bug added bug
2018-03-12 12:34:35 Sandor Zeestraten description
Old value:

# Issue

I upgraded our juju controllers (3x HA, MAAS cloud) from 2.1.2 to 2.3.4 with `juju upgrade-juju -m controller --agent-version 2.3.4`. This took about 15 min and the upgrade seemed to be successful on the controller model. Directly after this, 2 of the charms deployed in another model started reporting hook failures for all of their units. The charms in question are `nova-cloud-controller` and `nova-compute`, deployed on a model called `openstack`. Please note that the juju environment and charms were working fine before the upgrade. I have tested this exact upgrade scenario (2.1.2 to 2.3.4, 3x HA, same cloud, same base deployment of openstack) multiple times in staging previously without running into this issue. Also, when looking at the logs for the controller and the failing juju units, there are some seemingly related connection error messages which weren't there before upgrading. So I think it is safe to say that this is not a charm issue, but a juju issue. Any help would be appreciated.

# Logs

N.B. All the different units show the same error messages, i.e. nova-cloud-controller/0, nova-cloud-controller/1 and nova-cloud-controller/2 have the same messages and the exact same traceback in the logs. The same goes for the controllers and the nova-compute charm, therefore I only added one excerpt from each.

controller

`juju status -m controller --format yaml`: https://pastebin.com/kaxpuL8M
`juju ssh -m controller 0 'sudo less /var/log/juju/machine-0.log'`: https://pastebin.com/L3hFuZvb

All the controllers show some variation of these error messages. The IPs seem to correspond to connections from the juju controller to the nova-cloud-controller units.

nova-cloud-controller

`juju status -m openstack nova-cloud-controller --format yaml`: https://pastebin.com/uiS2BXBX
`juju ssh nova-cloud-controller/0 'sudo less /var/log/juju/unit-nova-cloud-controller-0.log'`: https://pastebin.com/GYKBASUu

nova-compute

`juju status -m openstack nova-compute --format yaml`: https://pastebin.com/6XJcLWNL
`juju ssh -m openstack nova-compute/0 'sudo less /var/log/juju/unit-nova-compute-0.log'`: https://pastebin.com/Pp4PymTa

# Troubleshooting steps

* Tried to restart the juju controllers
* Tried to restart the nova-cloud-controller and nova-compute juju services (i.e. jujud-unit-nova-cloud-controller-0.service, jujud-unit-nova-compute-0.service, etc.)
* Tried to manually run the relation-get command that the charms are failing on:
  * The ones for nova-cloud-controller actually work when run manually with `juju run`:
    $ juju run --unit nova-cloud-controller/0 'relation-get --format=json -r identity-service:28 - keystone/0'
    {"admin_token":"redacted","api_version":"2","auth_host":"keystone.maas","auth_port":"35357","auth_protocol":"http","private-address":"aa.bb.2.130","service_host":"keystone.maas","service_password":"redacted","service_port":"5000","service_protocol":"http","service_tenant":"services","service_tenant_id":"4baaf52f802a47fa8309b56c10b95e6c","service_username":"nova"}
    $ juju run --unit nova-cloud-controller/1 'relation-get --format=json -r identity-service:28 - keystone/0'
    {"admin_token":"redacted","api_version":"2","auth_host":"keystone.maas","auth_port":"35357","auth_protocol":"http","private-address":"aa.bb.2.130","service_host":"keystone.maas","service_password":"redacted","service_port":"5000","service_protocol":"http","service_tenant":"services","service_tenant_id":"4baaf52f802a47fa8309b56c10b95e6c","service_username":"nova"}
    $ juju run --unit nova-cloud-controller/2 'relation-get --format=json -r identity-service:28 - keystone/0'
    {"admin_token":"redacted","api_version":"2","auth_host":"keystone.maas","auth_port":"35357","auth_protocol":"http","private-address":"aa.bb.2.130","service_host":"keystone.maas","service_password":"redacted","service_port":"5000","service_protocol":"http","service_tenant":"services","service_tenant_id":"4baaf52f802a47fa8309b56c10b95e6c","service_username":"nova"}
  * However the ones for nova-compute time out:
    $ juju run --unit nova-compute/0 'relation-get --format=json -r cloud-compute:38 network_manager nova-cloud-controller/0'
    ERROR timed out waiting for result from: unit nova-compute/0
    ...
    $ juju run --unit nova-compute/19 'relation-get --format=json -r cloud-compute:38 network_manager nova-cloud-controller/0'
    ERROR timed out waiting for result from: unit nova-compute/1

As noted above, running the command which is causing these hook failures manually works fine for the nova-cloud-controller units, but not for the nova-compute units. I do not understand why the former's hooks are failing, but the latter is most likely failing because the relation in question involves the nova-cloud-controller units, which are also failing. Also, why is another unrelated production model getting hit by these issues when only the controller model is upgraded?

# Versions

Juju 2.1.2, 2.3.4
MAAS 2.1.2

New value:

# Issue

I upgraded our juju controllers (3x HA, MAAS cloud) from 2.1.2 to 2.3.4 with `juju upgrade-juju -m controller --agent-version 2.3.4`. This took about 15 min and the upgrade seemed to be successful on the controller model. Directly after this, 2 of the charms deployed in another model started reporting hook failures for all of their units. The charms in question are `nova-cloud-controller` and `nova-compute`, deployed on a model called `openstack`. Please note that the juju environment and charms were working fine before the upgrade. I have tested this exact upgrade scenario (2.1.2 to 2.3.4, 3x HA, same cloud, same base deployment of openstack) multiple times in staging previously without running into this issue. Also, when looking at the logs for the controller and the failing juju units, there are some seemingly related connection error messages which weren't there before upgrading. So I think it is safe to say that this is not a charm issue, but a juju issue. Any help would be appreciated.

# Logs

N.B. All the different units show the same error messages, i.e. nova-cloud-controller/0, nova-cloud-controller/1 and nova-cloud-controller/2 have the same messages and the exact same traceback in the logs. The same goes for the controllers and the nova-compute charm, therefore I only added one excerpt from each.

## Controllers

`juju status -m controller --format yaml`: https://pastebin.com/kaxpuL8M
`juju ssh -m controller 0 'sudo less /var/log/juju/machine-0.log'`: https://pastebin.com/L3hFuZvb

All the controllers show some variation of these error messages. The IPs seem to correspond to connections from the juju controller to the nova-cloud-controller units.

## nova-cloud-controller

`juju status -m openstack nova-cloud-controller --format yaml`: https://pastebin.com/uiS2BXBX
`juju ssh nova-cloud-controller/0 'sudo less /var/log/juju/unit-nova-cloud-controller-0.log'`: https://pastebin.com/GYKBASUu

## nova-compute

`juju status -m openstack nova-compute --format yaml`: https://pastebin.com/6XJcLWNL
`juju ssh -m openstack nova-compute/0 'sudo less /var/log/juju/unit-nova-compute-0.log'`: https://pastebin.com/Pp4PymTa

# Troubleshooting steps

* Tried to restart the juju controllers
* Tried to restart the nova-cloud-controller and nova-compute juju services (i.e. jujud-unit-nova-cloud-controller-0.service, jujud-unit-nova-compute-0.service, etc.)
* Tried to manually run the relation-get command that the charms are failing on:
  * The ones for nova-cloud-controller actually work when run manually with `juju run`:
    $ juju run --unit nova-cloud-controller/0 'relation-get --format=json -r identity-service:28 - keystone/0'
    {"admin_token":"redacted","api_version":"2","auth_host":"keystone.maas","auth_port":"35357","auth_protocol":"http","private-address":"aa.bb.2.130","service_host":"keystone.maas","service_password":"redacted","service_port":"5000","service_protocol":"http","service_tenant":"services","service_tenant_id":"4baaf52f802a47fa8309b56c10b95e6c","service_username":"nova"}
    $ juju run --unit nova-cloud-controller/1 'relation-get --format=json -r identity-service:28 - keystone/0'
    {"admin_token":"redacted","api_version":"2","auth_host":"keystone.maas","auth_port":"35357","auth_protocol":"http","private-address":"aa.bb.2.130","service_host":"keystone.maas","service_password":"redacted","service_port":"5000","service_protocol":"http","service_tenant":"services","service_tenant_id":"4baaf52f802a47fa8309b56c10b95e6c","service_username":"nova"}
    $ juju run --unit nova-cloud-controller/2 'relation-get --format=json -r identity-service:28 - keystone/0'
    {"admin_token":"redacted","api_version":"2","auth_host":"keystone.maas","auth_port":"35357","auth_protocol":"http","private-address":"aa.bb.2.130","service_host":"keystone.maas","service_password":"redacted","service_port":"5000","service_protocol":"http","service_tenant":"services","service_tenant_id":"4baaf52f802a47fa8309b56c10b95e6c","service_username":"nova"}
  * However the ones for nova-compute time out:
    $ juju run --unit nova-compute/0 'relation-get --format=json -r cloud-compute:38 network_manager nova-cloud-controller/0'
    ERROR timed out waiting for result from: unit nova-compute/0
    ...
    $ juju run --unit nova-compute/19 'relation-get --format=json -r cloud-compute:38 network_manager nova-cloud-controller/0'
    ERROR timed out waiting for result from: unit nova-compute/1

As noted above, running the command which is causing these hook failures manually works fine for the nova-cloud-controller units, but not for the nova-compute units. I do not understand why the former's hooks are failing, but the latter is most likely failing because the relation in question involves the nova-cloud-controller units, which are also failing. Also, why is another unrelated production model getting hit by these issues when only the controller model is upgraded?

# Versions

Juju 2.1.2, 2.3.4
MAAS 2.1.2
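For reference, restarting the unit agents named in the troubleshooting steps above is normally done through systemd on the deployed machines. A minimal sketch, assuming Ubuntu machines with systemd-managed agents and the jujud service names quoted in the report:

# Restart the jujud unit agent on an affected unit (repeat per unit)
$ juju ssh -m openstack nova-cloud-controller/0 'sudo systemctl restart jujud-unit-nova-cloud-controller-0.service'
$ juju ssh -m openstack nova-compute/0 'sudo systemctl restart jujud-unit-nova-compute-0.service'
# Confirm the agent came back up
$ juju ssh -m openstack nova-compute/0 'sudo systemctl status jujud-unit-nova-compute-0.service'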
2018-03-12 12:37:35 Sandor Zeestraten description
New value:

# Issue

I upgraded our juju controllers (3x HA, MAAS cloud) from 2.1.2 to 2.3.4 with `juju upgrade-juju -m controller --agent-version 2.3.4`. This took about 15 min and the upgrade seemed to be successful on the controller model. Directly after this, 2 of the charms deployed in another model started reporting hook failures for all of their units. The charms in question are `nova-cloud-controller` and `nova-compute`, deployed on a model called `openstack`. Please note that the juju environment and charms were working fine before the upgrade. I have tested this exact upgrade scenario (2.1.2 to 2.3.4, 3x HA, same cloud, same base deployment of openstack) multiple times in staging previously without running into this issue. Also, when looking at the logs for the controller and the failing juju units, there are some seemingly related connection error messages which weren't there before upgrading. So I think it is safe to say that this is not a charm issue, but a juju issue. Any help would be appreciated.

# Logs

N.B. All the different units show the same error messages, i.e. nova-cloud-controller/0, nova-cloud-controller/1 and nova-cloud-controller/2 have the same messages and the exact same traceback in the logs. The same goes for the controllers and the nova-compute charm, therefore I only added one excerpt from each.

## Controllers

`juju status -m controller --format yaml`: https://pastebin.com/kaxpuL8M
`juju ssh -m controller 0 'sudo less /var/log/juju/machine-0.log'`: https://pastebin.com/L3hFuZvb

All the controllers show some variation of these error messages. The IPs seem to correspond to connections from the juju controller to the nova-cloud-controller units.

## nova-cloud-controller

`juju status -m openstack nova-cloud-controller --format yaml`: https://pastebin.com/uiS2BXBX
`juju ssh nova-cloud-controller/0 'sudo less /var/log/juju/unit-nova-cloud-controller-0.log'`: https://pastebin.com/GYKBASUu

## nova-compute

`juju status -m openstack nova-compute --format yaml`: https://pastebin.com/6XJcLWNL
`juju ssh -m openstack nova-compute/0 'sudo less /var/log/juju/unit-nova-compute-0.log'`: https://pastebin.com/Pp4PymTa

# Troubleshooting steps

* Tried to restart the juju controllers
* Tried to restart the nova-cloud-controller and nova-compute juju services (i.e. jujud-unit-nova-cloud-controller-0.service, jujud-unit-nova-compute-0.service, etc.)
* Running the relation-get command manually for nova-cloud-controller:
  $ juju run --unit nova-cloud-controller/0 'relation-get --format=json -r identity-service:28 - keystone/0'
  {"admin_token":"redacted","api_version":"2","auth_host":"keystone.maas","auth_port":"35357","auth_protocol":"http","private-address":"aa.bb.2.130","service_host":"keystone.maas","service_password":"redacted","service_port":"5000","service_protocol":"http","service_tenant":"services","service_tenant_id":"4baaf52f802a47fa8309b56c10b95e6c","service_username":"nova"}
  $ juju run --unit nova-cloud-controller/1 'relation-get --format=json -r identity-service:28 - keystone/0'
  {"admin_token":"redacted","api_version":"2","auth_host":"keystone.maas","auth_port":"35357","auth_protocol":"http","private-address":"aa.bb.2.130","service_host":"keystone.maas","service_password":"redacted","service_port":"5000","service_protocol":"http","service_tenant":"services","service_tenant_id":"4baaf52f802a47fa8309b56c10b95e6c","service_username":"nova"}
  $ juju run --unit nova-cloud-controller/2 'relation-get --format=json -r identity-service:28 - keystone/0'
  {"admin_token":"redacted","api_version":"2","auth_host":"keystone.maas","auth_port":"35357","auth_protocol":"http","private-address":"aa.bb.2.130","service_host":"keystone.maas","service_password":"redacted","service_port":"5000","service_protocol":"http","service_tenant":"services","service_tenant_id":"4baaf52f802a47fa8309b56c10b95e6c","service_username":"nova"}
* Running the relation-get command manually for nova-compute:
  $ juju run --unit nova-compute/0 'relation-get --format=json -r cloud-compute:38 network_manager nova-cloud-controller/0'
  ERROR timed out waiting for result from: unit nova-compute/0
  ...
  $ juju run --unit nova-compute/19 'relation-get --format=json -r cloud-compute:38 network_manager nova-cloud-controller/0'
  ERROR timed out waiting for result from: unit nova-compute/1

As noted above, running the command which is causing these hook failures manually works fine for the nova-cloud-controller units, but not for the nova-compute units. I do not understand why the former's hooks are failing, but the latter is most likely failing because the relation in question involves the nova-cloud-controller units, which are also failing.

# Versions

Juju 2.1.2, 2.3.4
MAAS 2.1.2
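When a `juju run` invocation like the nova-compute ones above times out, it can be worth ruling out the default five-minute limit before digging further. A minimal sketch, assuming your Juju release supports the `--timeout` flag for `juju run` (adjust the duration as needed):

$ juju run --timeout 15m --unit nova-compute/0 'relation-get --format=json -r cloud-compute:38 network_manager nova-cloud-controller/0'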
2018-03-12 13:11:49 Sandor Zeestraten description
New value:

# Issue

I upgraded our juju controllers (3x HA, MAAS cloud) from 2.1.2 to 2.3.4 with `juju upgrade-juju -m controller --agent-version 2.3.4`. This took about 15 min and the upgrade seemed to be successful on the controller model. Directly after this, 2 of the charms deployed in another model started reporting hook failures for all of their units. The charms in question are `nova-cloud-controller` and `nova-compute`, deployed on a model called `openstack`. Please note that the juju environment and charms were working fine before the upgrade. I have tested this exact upgrade scenario (2.1.2 to 2.3.4, 3x HA, same cloud, same base deployment of openstack) multiple times in staging previously without running into this issue. Also, when looking at the logs for the controller and the failing juju units, there are some seemingly related connection error messages which weren't there before upgrading. So I think it is safe to say that this is not a charm issue, but a juju issue. Any help would be appreciated.

# Logs

N.B. All the different units show the same error messages, i.e. nova-cloud-controller/0, nova-cloud-controller/1 and nova-cloud-controller/2 have the same messages and the exact same traceback in the logs. The same goes for the controllers and the nova-compute charm, therefore I only added one excerpt from each.

## Controllers

`juju status -m controller --format yaml`: https://pastebin.com/kaxpuL8M
`juju ssh -m controller 0 'sudo less /var/log/juju/machine-0.log'`: https://pastebin.com/L3hFuZvb

All the controllers show some variation of these error messages. The IPs seem to correspond to connections from the juju controller to the nova-cloud-controller units.

## nova-cloud-controller

`juju status -m openstack nova-cloud-controller --format yaml`: https://pastebin.com/uiS2BXBX
`juju ssh nova-cloud-controller/0 'sudo less /var/log/juju/unit-nova-cloud-controller-0.log'`: https://pastebin.com/GYKBASUu

## nova-compute

`juju status -m openstack nova-compute --format yaml`: https://pastebin.com/6XJcLWNL
`juju ssh -m openstack nova-compute/0 'sudo less /var/log/juju/unit-nova-compute-0.log'`: https://pastebin.com/Pp4PymTa

# Troubleshooting steps

* Tried to restart the juju controllers one by one
* Tried to restart the nova-cloud-controller and nova-compute juju services (i.e. jujud-unit-nova-cloud-controller-0.service, jujud-unit-nova-compute-0.service, etc.)
* Running the relation-get command manually for nova-cloud-controller:
  $ juju run --unit nova-cloud-controller/0 'relation-get --format=json -r identity-service:28 - keystone/0'
  {"admin_token":"redacted","api_version":"2","auth_host":"keystone.maas","auth_port":"35357","auth_protocol":"http","private-address":"aa.bb.2.130","service_host":"keystone.maas","service_password":"redacted","service_port":"5000","service_protocol":"http","service_tenant":"services","service_tenant_id":"4baaf52f802a47fa8309b56c10b95e6c","service_username":"nova"}
  $ juju run --unit nova-cloud-controller/1 'relation-get --format=json -r identity-service:28 - keystone/0'
  {"admin_token":"redacted","api_version":"2","auth_host":"keystone.maas","auth_port":"35357","auth_protocol":"http","private-address":"aa.bb.2.130","service_host":"keystone.maas","service_password":"redacted","service_port":"5000","service_protocol":"http","service_tenant":"services","service_tenant_id":"4baaf52f802a47fa8309b56c10b95e6c","service_username":"nova"}
  $ juju run --unit nova-cloud-controller/2 'relation-get --format=json -r identity-service:28 - keystone/0'
  {"admin_token":"redacted","api_version":"2","auth_host":"keystone.maas","auth_port":"35357","auth_protocol":"http","private-address":"aa.bb.2.130","service_host":"keystone.maas","service_password":"redacted","service_port":"5000","service_protocol":"http","service_tenant":"services","service_tenant_id":"4baaf52f802a47fa8309b56c10b95e6c","service_username":"nova"}
* Running the relation-get command manually for nova-compute:
  $ juju run --unit nova-compute/0 'relation-get --format=json -r cloud-compute:38 network_manager nova-cloud-controller/0'
  ERROR timed out waiting for result from: unit nova-compute/0
  ...
  $ juju run --unit nova-compute/19 'relation-get --format=json -r cloud-compute:38 network_manager nova-cloud-controller/0'
  ERROR timed out waiting for result from: unit nova-compute/1

As noted above, running the command which is causing these hook failures manually works fine for the nova-cloud-controller units, but not for the nova-compute units. I do not understand why the former's hooks are failing, but the latter is most likely failing because the relation in question involves the nova-cloud-controller units, which are also failing.

# Versions

Juju 2.1.2, 2.3.4
MAAS 2.1.2
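Since only the controller model was upgraded, one quick check is whether the agents in the hosted model are still on the old version. A minimal sketch using the same status commands quoted above (the exact YAML keys can vary slightly between Juju releases):

$ juju status -m controller --format yaml | grep -i 'version:'
$ juju status -m openstack --format yaml | grep -i 'version:'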
2018-03-14 10:00:08 John A Meinel juju: importance Undecided High
2018-03-14 10:00:08 John A Meinel juju: status New Incomplete
2018-03-19 04:16:23 John A Meinel marked as duplicate 1697936
2018-03-19 04:18:07 John A Meinel description
New value:

# Quick summary

After upgrading a controller from 2.1 => 2.2 or newer, it is necessary to upgrade the model agents as well, or you may get errors about "unexpected end of JSON input".

# Issue

I upgraded our juju controllers (3x HA, MAAS cloud) from 2.1.2 to 2.3.4 with `juju upgrade-juju -m controller --agent-version 2.3.4`. This took about 15 min and the upgrade seemed to be successful on the controller model. Directly after this, 2 of the charms deployed in another model started reporting hook failures for all of their units. The charms in question are `nova-cloud-controller` and `nova-compute`, deployed on a model called `openstack`. Please note that the juju environment and charms were working fine before the upgrade. I have tested this exact upgrade scenario (2.1.2 to 2.3.4, 3x HA, same cloud, same base deployment of openstack) multiple times in staging previously without running into this issue. Also, when looking at the logs for the controller and the failing juju units, there are some seemingly related connection error messages which weren't there before upgrading. So I think it is safe to say that this is not a charm issue, but a juju issue. Any help would be appreciated.

# Logs

N.B. All the different units show the same error messages, i.e. nova-cloud-controller/0, nova-cloud-controller/1 and nova-cloud-controller/2 have the same messages and the exact same traceback in the logs. The same goes for the controllers and the nova-compute charm, therefore I only added one excerpt from each.

## Controllers

`juju status -m controller --format yaml`: https://pastebin.com/kaxpuL8M
`juju ssh -m controller 0 'sudo less /var/log/juju/machine-0.log'`: https://pastebin.com/L3hFuZvb

All the controllers show some variation of these error messages. The IPs seem to correspond to connections from the juju controller to the nova-cloud-controller units.

## nova-cloud-controller

`juju status -m openstack nova-cloud-controller --format yaml`: https://pastebin.com/uiS2BXBX
`juju ssh nova-cloud-controller/0 'sudo less /var/log/juju/unit-nova-cloud-controller-0.log'`: https://pastebin.com/GYKBASUu

## nova-compute

`juju status -m openstack nova-compute --format yaml`: https://pastebin.com/6XJcLWNL
`juju ssh -m openstack nova-compute/0 'sudo less /var/log/juju/unit-nova-compute-0.log'`: https://pastebin.com/Pp4PymTa

# Troubleshooting steps

* Tried to restart the juju controllers one by one
* Tried to restart the nova-cloud-controller and nova-compute juju services (i.e. jujud-unit-nova-cloud-controller-0.service, jujud-unit-nova-compute-0.service, etc.)
* Running the relation-get command manually for nova-cloud-controller:
  $ juju run --unit nova-cloud-controller/0 'relation-get --format=json -r identity-service:28 - keystone/0'
  {"admin_token":"redacted","api_version":"2","auth_host":"keystone.maas","auth_port":"35357","auth_protocol":"http","private-address":"aa.bb.2.130","service_host":"keystone.maas","service_password":"redacted","service_port":"5000","service_protocol":"http","service_tenant":"services","service_tenant_id":"4baaf52f802a47fa8309b56c10b95e6c","service_username":"nova"}
  $ juju run --unit nova-cloud-controller/1 'relation-get --format=json -r identity-service:28 - keystone/0'
  {"admin_token":"redacted","api_version":"2","auth_host":"keystone.maas","auth_port":"35357","auth_protocol":"http","private-address":"aa.bb.2.130","service_host":"keystone.maas","service_password":"redacted","service_port":"5000","service_protocol":"http","service_tenant":"services","service_tenant_id":"4baaf52f802a47fa8309b56c10b95e6c","service_username":"nova"}
  $ juju run --unit nova-cloud-controller/2 'relation-get --format=json -r identity-service:28 - keystone/0'
  {"admin_token":"redacted","api_version":"2","auth_host":"keystone.maas","auth_port":"35357","auth_protocol":"http","private-address":"aa.bb.2.130","service_host":"keystone.maas","service_password":"redacted","service_port":"5000","service_protocol":"http","service_tenant":"services","service_tenant_id":"4baaf52f802a47fa8309b56c10b95e6c","service_username":"nova"}
* Running the relation-get command manually for nova-compute:
  $ juju run --unit nova-compute/0 'relation-get --format=json -r cloud-compute:38 network_manager nova-cloud-controller/0'
  ERROR timed out waiting for result from: unit nova-compute/0
  ...
  $ juju run --unit nova-compute/19 'relation-get --format=json -r cloud-compute:38 network_manager nova-cloud-controller/0'
  ERROR timed out waiting for result from: unit nova-compute/1

As noted above, running the command which is causing these hook failures manually works fine for the nova-cloud-controller units, but not for the nova-compute units. I do not understand why the former's hooks are failing, but the latter is most likely failing because the relation in question involves the nova-cloud-controller units, which are also failing.

# Versions

Juju 2.1.2, 2.3.4
MAAS 2.1.2
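Following the quick summary above, the missing step was upgrading the agents in the hosted model after the controller. A minimal sketch of the full sequence for this deployment (model name `openstack` and agent version taken from the report; substitute your own target version):

# 1. Upgrade the controller model first
$ juju upgrade-juju -m controller --agent-version 2.3.4
# 2. Then upgrade each hosted model so its agents match the controller
$ juju upgrade-juju -m openstack --agent-version 2.3.4
# 3. Confirm the agents report the new version
$ juju status -m openstack --format yaml | grep -i 'version:'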