Misconfiguration of openstack-origin causes obfuscated charm behaviour

Bug #1607790 reported by Pascal Mazon on 2016-07-29
Affects:
- OpenStack neutron-gateway charm (Importance: Low, Assigned to: Unassigned)
- neutron-gateway (Juju Charms Collection) (Importance: Low, Assigned to: Unassigned)

Bug Description

neutron-plugin-openvswitch-agent should be running but is not
=============================================================

I have the following topology (see attached opentack-juju.svg for graphical view):
- 1 MAAS server (also used as the Juju client), on trusty
- 1 Juju bootstrap node, on xenial
- 1 controller node, on xenial
- 2 compute nodes, on xenial

They're all connected to the same network, with a single NIC each.

I use MAAS Version 1.9.3+bzr4577-0ubuntu1 (trusty1) to manage my machines (VMs).

I bootstrapped Juju using the environment attached in `environments.yaml`.
After registering the machines with Juju (`juju add-machine X.maas`), I manually deploy my charms and relate them.
For convenience, the attached `mybundle.yaml` provides the final configuration as seen in the WebGUI.

Unfortunately, everything appears to run well except neutron-gateway:

      root@maas:~# juju status --format tabular neutron-gateway/0
      [Services]
      NAME STATUS EXPOSED CHARM
      neutron-gateway blocked false cs:xenial/neutron-gateway-1

      [Units]
      ID WORKLOAD-STATE AGENT-STATE VERSION MACHINE PORTS PUBLIC-ADDRESS MESSAGE
      neutron-gateway/0 blocked idle 1.25.6 3 controller.maas Services not running that should be: neutron-plugin-openvswitch-agent

      [Machines]
      ID STATE VERSION DNS INS-ID SERIES HARDWARE
      3 started 1.25.6 controller.maas /MAAS/api/1.0/nodes/node-38449baa-4e7f-11e6-93f2-deadbeef0300/ xenial arch=amd64 cpu-cores=2 mem=4096M availability-zone=default

I pasted the complete status output in `juju-status-tabular.txt` and in yaml format in `juju-status.yaml`.

I used a similar setup on trusty (the only other difference being that mysql was used instead of percona-cluster) and did not hit this issue; all units were OK.

Furthermore, I don't know whether it is related, but I can't spawn a VM on a compute node: the VM ends up in ERROR status, yet I see nothing in either the nova-compute or the nova-cloud-controller logs.
The commands attached in `start_vm.sh` are the ones I ran on nova-compute to try to start my VM (the same commands worked with my earlier trusty setup).

Pascal Mazon (pascal-mazon-u) wrote :
Liam Young (gnuoy) wrote :

Hi Pascal. Sorry you've hit this issue. Could you confirm whether the neutron-plugin-openvswitch-agent service was running on neutron-gateway/0 ? It would be good to know if this is a bug with the service not being started or a bug with the code assessing the service status.

Could you run "nova show <server UUID>" for the guest that failed to boot and attach the logs from /var/log/nova on the compute nodes as well please?

Thanks
Liam

Pascal Mazon (pascal-mazon-u) wrote :

Actually the "neutron-plugin-openvswitch-agent" service does not exist.
However, the "neutron-openvswitch-agent" service is started:

root@controller:~# systemctl status neutron-plugin-openvswitch-agent
● neutron-plugin-openvswitch-agent.service
   Loaded: not-found (Reason: No such file or directory)
   Active: inactive (dead)
root@controller:~# systemctl status neutron-openvswitch-agent
● neutron-openvswitch-agent.service - Openstack Neutron Open vSwitch Plugin Agent
   Loaded: loaded (/lib/systemd/system/neutron-openvswitch-agent.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2016-07-28 10:38:32 UTC; 5 days ago
 Main PID: 17963 (neutron-openvsw)
    Tasks: 1
   Memory: 36.8M
      CPU: 4min 59.762s

Regarding the nova show, here it comes:

+--------------------------------------+-------------------------------------------------+
| Property | Value |
+--------------------------------------+-------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | - |
| OS-EXT-SRV-ATTR:hypervisor_hostname | - |
| OS-EXT-SRV-ATTR:instance_name | instance-00000001 |
| OS-EXT-STS:power_state | 0 |
| OS-EXT-STS:task_state | - |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | - |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2016-08-03T08:05:25Z |
| fault | {"message": "404 Not Found |
| | |
| | The resource could not be found. |
| | |
| | |
| | Neutron server returns request_ids: ['req-0e122380-69c8-4d9a-ac4f-c0e099c40415']", "code": 500, "details": " File \"/usr/lib/python2.7/dist-packages/nova/compute/manager.py\", line 1926, in _do_build_and_run_instance |
| | filter_properties) |
| | File \"/usr/lib...


James Page (james-page) wrote :

Hi Pascal

neutron-openvswitch-agent is the correct agent, but for some reason the charm is assuming that neutron-plugin-openvswitch-agent should be running (the old name for the same agent).

I did think this might be due to the neutron point release into Xenial (8.1.2), but the charm version you are using has the required changes to support mapping 8.1 -> mitaka, so that should not be the case.
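For reference, the version-to-release mapping James mentions could look roughly like this (a simplified, illustrative sketch only, not the actual charm-helpers code; the names `VERSION_MAP` and `codename_from_version` are made up for this example):

```python
# Illustrative sketch: map a neutron package version to an OpenStack
# release codename by matching on the major.minor prefix. The real
# charm-helpers code differs; this only demonstrates the idea.
VERSION_MAP = {
    '7.0': 'liberty',
    '8.0': 'mitaka',
    '8.1': 'mitaka',   # the 8.1.x point release still maps to mitaka
    '9.0': 'newton',
}

def codename_from_version(pkg_version):
    """Return the release codename for a package version string."""
    major_minor = '.'.join(pkg_version.split('.')[:2])
    return VERSION_MAP.get(major_minor)
```

With a mapping like this, the Xenial point release 8.1.2 would still resolve to mitaka, which is why James rules it out as the cause.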

James Page (james-page) wrote :

As the charm is not correctly detecting the neutron version, I suspect it's rendering the wrong configuration file for the OpenStack Mitaka release, resulting in your error.

Could you provide the log files for the charm from /var/log/juju/unit-neutron-gateway-0.log please?

Pascal Mazon (pascal-mazon-u) wrote :

Here comes the requested unit-neutron-gateway-0.log.

James Page (james-page) wrote :

OK - figured this out; your bundle contains:

      "openstack-origin": "cloud:xenial-liberty"

for all services; this is not a valid origin configuration for Xenial (which only supports Mitaka or Newton right now).

As a result, the trusty UCA for liberty gets enabled, which in itself is OK (as Xenial package versions are higher), but the charm generates configuration for Liberty, not Mitaka, which includes checking for the older agent name.

OpenStack release is driven directly from the openstack-origin value.

That said, this fails in a subtle, non-obvious way, which points to a code smell to me.

James Page (james-page) wrote :

To be clear, only the following openstack-origin values are supported for Xenial today:

 openstack-origin: distro
 openstack-origin: cloud:xenial-newton
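For anyone hitting the same problem, a corrected bundle entry would look something like this (a sketch only; the charm revision is the one from the report, and the surrounding bundle structure is assumed):

```yaml
neutron-gateway:
  charm: "cs:xenial/neutron-gateway-1"
  options:
    # 'distro' selects the default OpenStack release for the series,
    # i.e. Mitaka on Xenial.
    openstack-origin: "distro"
```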

summary: - neutron-plugin-openvswitch-agent should be running but is not
+ Misconfiguration of openstack-origin causes obfuscated charm behaviour
James Page (james-page) wrote :

Currently, if the value of openstack-origin is not recognised, then the code will pattern match openstack release names, hence the behaviour for liberty, not mitaka as intended.
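The fallback behaviour described above might be sketched like this (illustrative only, not the actual charm code; `KNOWN_RELEASES` and `release_from_origin` are hypothetical names):

```python
# Illustrative sketch of the fallback: when openstack-origin is not a
# recognised value, scan the string for any known release codename.
KNOWN_RELEASES = ['icehouse', 'juno', 'kilo', 'liberty', 'mitaka', 'newton']

def release_from_origin(origin):
    """Guess the OpenStack release from an openstack-origin string."""
    for release in KNOWN_RELEASES:
        if release in origin:
            return release
    return None
```

Under this logic, `release_from_origin('cloud:xenial-liberty')` matches 'liberty' even though the series is Xenial, which would explain why the charm rendered Liberty configuration and checked for the old agent name.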

Pascal Mazon (pascal-mazon-u) wrote :

Alright, that makes sense!

Thank you James, I'll get Mitaka running with "distro" origin.

James Page (james-page) on 2016-08-03
Changed in neutron-gateway (Juju Charms Collection):
importance: Undecided → Medium
status: New → Triaged
milestone: none → 16.10
James Page (james-page) on 2016-10-14
Changed in neutron-gateway (Juju Charms Collection):
milestone: 16.10 → 17.01
James Page (james-page) on 2017-01-05
Changed in neutron-gateway (Juju Charms Collection):
importance: Medium → Low
James Page (james-page) on 2017-02-23
Changed in charm-neutron-gateway:
importance: Undecided → Low
status: New → Triaged
Changed in neutron-gateway (Juju Charms Collection):
status: Triaged → Invalid