Plumgrid failed to create pap

Bug #1637649 reported by Ashley Lai
This bug affects 1 person
Affects: plumgrid-director (Juju Charms Collection)
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

OIL sees some passing pipelines with PLUMgrid, but occasionally PLUMgrid fails to create a pap (physical attachment point). There are some errors in the attached neutron log.

2016-10-28 20:02:52,983 [INFO] oil_ci.prepare.network: pap cmd = . /home/ubuntu/envrc.sh; neutron physical-attachment-point-create plumgrid_pap1 --interface hostname=pullman-03,interface_name=eth1
Connection to 10.245.1.61 closed.
2016-10-28 20:03:00,763 [ERROR] oil_ci.prepare.oil_prepper: Failed to prepare deployed cloud
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/oil_ci/prepare/oil_prepper.py", line 160, in do_prep
    net = self.prepare_network()
  File "/usr/lib/python2.7/dist-packages/oil_ci/prepare/oil_prepper.py", line 92, in prepare_network
    self._env)
  File "/usr/lib/python2.7/dist-packages/oil_ci/prepare/network.py", line 337, in prepare_neutron_network
    output = juju2.run_remote_cmd(environment, 'neutron-api/0', cmd)
  File "/usr/lib/python2.7/dist-packages/oil_ci/juju/juju2.py", line 170, in run_remote_cmd
    return check_output(cmd)
  File "/usr/lib/python2.7/subprocess.py", line 574, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
CalledProcessError: Command '['juju-2.0', 'ssh', '-m', u'maas:ci-oil-slave10', 'neutron-api/0', '. /home/ubuntu/envrc.sh; neutron physical-attachment-point-create plumgrid_pap1 --interface hostname=pullman-03,interface_name=eth1']' returned non-zero exit status 1
2016-10-28 20:03:00,785 [INFO] oil_ci.prepare.oil_prepper: Saving preparation artifacts to: ./artifacts
tar: Removing leading `/' from member names
tar: /var/log/juju/machine-0.log: file changed as we read it
tar: /var/log/juju/logsink.log: file changed as we read it
Connection to 10.245.0.170 closed.

https://oil-jenkins.canonical.com/job/pipeline_prepare/604949/

Revision history for this message
Junaid Ali (junaidali) wrote :

Hi Ashley,

Thanks for providing all the necessary information and logs.

In the neutron-server logs (/var/log/neutron-server.log), there is an error message at line 3942:
PLUMgridException: PLUMgrid Plugin Error: No Physical Interface/Gateway found in PLUMgrid nodes

which says that the interface provided doesn't exist on the gateway node, or (a case which shouldn't happen) that the gateway node is not deployed.
So I assume the issue occurred because the value the pipeline provided for the external-interfaces config of the plumgrid-gateway charm isn't correct. You can reproduce this issue by specifying a non-existent interface as the external-interface and running the pap command.
The issue will be resolved if the correct interface is set and the pap command is re-run.
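
As a rough sketch of that fix from the OIL side, assuming a juju client recent enough to have the juju config command (older 2.0 clients spelled it set-config) and assuming eth1 is the interface that actually exists on the gateway node (the value is illustrative):

# point the charm at the correct gateway interface
juju config plumgrid-gateway external-interfaces=eth1

# then re-run the pap command from the neutron-api unit, as in the failing pipeline
juju-2.0 ssh -m maas:ci-oil-slave10 neutron-api/0 '. /home/ubuntu/envrc.sh; neutron physical-attachment-point-create plumgrid_pap1 --interface hostname=pullman-03,interface_name=eth1'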

Changed in plumgrid-director (Juju Charms Collection):
status: New → Invalid
Revision history for this message
Junaid Ali (junaidali) wrote :

The issue can also occur when the wrong hostname is provided while creating the physical attachment point.

So, there can be three causes of this issue if plumgrid-gateway is deployed (a quick way to check each of them is sketched after the list):

   - A wrong external-interfaces config value was set while deploying the charm.
   - A wrong interface name was provided while creating the physical attachment point.
   - A wrong hostname was provided while creating the physical attachment point.
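
A minimal sketch of how these could be ruled out from the juju client, assuming plumgrid-gateway/0 is the gateway unit in question and that the juju config command is available (unit and interface names are illustrative):

# confirm the hostname the gateway node reports matches the hostname= passed to the pap command
juju run --unit plumgrid-gateway/0 'hostname'

# confirm the interface named in interface_name= actually exists on that node
juju run --unit plumgrid-gateway/0 'ip link show eth1'

# confirm the external-interfaces value the charm was deployed with
juju config plumgrid-gateway external-interfaces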

Revision history for this message
Ashley Lai (alai) wrote :

Thanks, Junaid, for looking into it. I've seen several more pipeline failures with the same error. I've checked the server where we deployed the gateway, and eth1 does exist. I will pull one of the failing servers and deploy it manually to see if we can recreate the issue.

Revision history for this message
Ashley Lai (alai) wrote :

For my reference only: prod prepare 605455

Junaid Ali (junaidali)
Changed in plumgrid-director (Juju Charms Collection):
status: Invalid → New
Revision history for this message
Ashley Lai (alai) wrote :

From the debug session with Junaid, the issue is the MTU size. From the plumgrid-gateway node we cannot ping the plumgrid-director node with an MTU of 1580 (1580 is the requirement).

pomeroy is the plumgrid-gateway node.

# ping plumgrid-director node with 1552 does not work
ubuntu@pomeroy:/var/lib/plumgrid/plumgrid-data/conf/pg$ ping -c 4 -s 1552 -M do 10.245.1.102
PING 10.245.1.102 (10.245.1.102) 1552(1580) bytes of data.

# ping plumgrid-director node with 1452 works
ubuntu@pomeroy:/var/lib/plumgrid/plumgrid-data/conf/pg$ ping -c 4 -s 1452 -M do 10.245.1.102
PING 10.245.1.102 (10.245.1.102) 1452(1480) bytes of data.
1460 bytes from 10.245.1.102: icmp_seq=1 ttl=64 time=0.406 ms
1460 bytes from 10.245.1.102: icmp_seq=2 ttl=64 time=0.399 ms
1460 bytes from 10.245.1.102: icmp_seq=3 ttl=64 time=0.382 ms
1460 bytes from 10.245.1.102: icmp_seq=4 ttl=64 time=0.331 ms

ubuntu@pomeroy:/var/lib/plumgrid/plumgrid-data/conf/pg$ cat ifcs.conf
br-eth0 = fabric_core host
eth1 = access_phys
ubuntu@pomeroy:/var/lib/plumgrid/plumgrid-data/conf/pg$ cat plumgrid.conf
plumgrid_ip=10.245.1.102 #plumgrid-director IP
plumgrid_port=8001
mgmt_dev=br-eth0
label=pomeroy
plumgrid_rsync_port=2222
plumgrid_rest_addr=0.0.0.0:9180
fabric_mode=host
plumgrid_syslog_ng_ip=
plumgrid_syslog_ng_port=
plumgrid_monitor_interval=
start_plumgrid_iovisor=yes
start_plumgrid=`/opt/pg/scripts/pg_is_director.sh $plumgrid_ip`
location=

From the plumgrid-edge node, we are able to ping the plumgrid-director node:
ubuntu@sirrush:~$ ping -c 4 -s 1552 -M do 10.245.1.102
PING 10.245.1.102 (10.245.1.102) 1552(1580) bytes of data.
1560 bytes from 10.245.1.102: icmp_seq=1 ttl=64 time=0.350 ms
1560 bytes from 10.245.1.102: icmp_seq=2 ttl=64 time=0.371 ms
1560 bytes from 10.245.1.102: icmp_seq=3 ttl=64 time=0.378 ms
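
For reference, a minimal sketch of how the MTU theory could be confirmed on the gateway node, assuming br-eth0 (the fabric_core interface in ifcs.conf above) is the interface that has to carry the 1580-byte packets and that the underlying NIC and switch ports can carry frames that large too; the persistent fix would go into the MAAS/network configuration rather than an ad-hoc change:

# check the current MTU on the fabric interface
ubuntu@pomeroy:~$ ip link show br-eth0

# temporarily raise it to 1580 to test
ubuntu@pomeroy:~$ sudo ip link set dev br-eth0 mtu 1580

# repeat the path-MTU ping that failed above
ubuntu@pomeroy:~$ ping -c 4 -s 1552 -M do 10.245.1.102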

Revision history for this message
Ashley Lai (alai) wrote :

RT 98567 was opened to increase the MTU size.
