Mageia Linux shorewall prevents installation: 01_config_private_network.sh fails

Bug #1807886 reported by Yannick LE NY
This bug affects 1 person
Affects: openstack-training-labs
Status: New
Importance: Undecided
Assigned to: Roger Luethi
Milestone: (none)

Bug Description

Hello,

The script autostart/01_config_private_network.sh does not work in the controller VM.
The console output reports an ssh problem.

Software components:
* OpenStack Training Labs Rocky (August 2018) (Rocky for Linux, from https://docs.openstack.org/training_labs/ )
* Mageia 6 Linux distribution with all updates ( https://www.mageia.org/en/ )
* VirtualBox 5.2.22
* ssh: OpenSSH_7.5p1, OpenSSL 1.0.2q 20 Nov 2018
* ISO file used by OpenStack Training Labs Rocky: ubuntu-18.04.1-server-amd64.iso

I have the following errors :

In the console output :
...
INFO Starting VM controller with headless GUI
.INFO Waiting for ssh server in VM controller to respond at 127.0.0.1:2230.
.............INFO Connected to ssh server.
.....INFO Start autostart/00_config_public_network.sh
................................................................................
...............................................................INFO done
.INFO Start autostart/01_config_private_network.sh
.........................................................................
........ERROR ssh returned status 1.
ERROR Script failure: 01_config_private_network
ERROR Script failed. Exiting.
[user@localhost labs]$

There are more details in these log files:

1)

in logs\035_01_config_private_network.auto file :
....
Waiting for interface qr-* in router namespace...qr-2ec82bbf-b8@if12
Setting a gateway on the public network on the router.
Waiting for interface qg-* in router namespace....qg-f98def78-de@if13
Listing network namespaces.
qrouter-5419dd26-812f-44f3-bc40-fa9f890314e0 (id: 2)
qdhcp-fb772cd5-4a43-45b9-af27-b8a6b6748076 (id: 1)
qdhcp-8bfcc377-7117-406f-92e2-95e56e93b07e (id: 0)
Sourcing the admin credentials.
Getting the router's IP address in the public network.
openstack port list --router router
+--------------------------------------+------+-------------------+------------------------------------------------------------------------------+--------+
| ID | Name | MAC Address | Fixed IP Addresses | Status |
+--------------------------------------+------+-------------------+------------------------------------------------------------------------------+--------+
| 2ec82bbf-b85c-4790-a1ad-d3e1f51c0f52 | | fa:16:3e:c0:e8:92 | ip_address='172.16.1.1', subnet_id='b3e76aae-42cd-4f57-b9f2-14df1ea6c522' | ACTIVE |
| f98def78-deb3-4fd4-a78f-f58c3fa36648 | | fa:16:3e:25:0f:dd | ip_address='203.0.113.125', subnet_id='2012f902-1244-41cb-8e61-44354fe71981' | BUILD |
+--------------------------------------+------+-------------------+------------------------------------------------------------------------------+--------+
Waiting for ping reply from public router IP (203.0.113.125)....................ERROR No reply from public router IP in 20 seconds, aborting.

2)

in logs\stacktrain.log file :

5560 08:28:19.653 stacktrain.core.autostart INFO Start autostart/00_config_public_network.sh
5560 08:28:19.653 stacktrain.core.ssh DEBUG vm_ssh: ssh -q -i /home/user/openstack/training-labs/labs/lib/osbash-ssh-keys/osbash_key -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o ConnectTimeout=10 -o ControlPath=none -p 2230 osbash@127.0.0.1 bash autostart/00_config_public_network.sh && rm -vf autostart/00_config_public_network.sh
5560 08:28:19.654 stacktrain.core.ssh DEBUG Writing live log for ssh call at /home/user/openstack/training-labs/labs/log/034_00_config_public_network.auto.
5560 08:30:43.693 stacktrain.core.autostart INFO done
5560 08:30:43.693 stacktrain.core.ssh DEBUG vm_ssh: ssh -q -i /home/user/openstack/training-labs/labs/lib/osbash-ssh-keys/osbash_key -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o ConnectTimeout=10 -o ControlPath=none -p 2230 osbash@127.0.0.1 mkdir -p autostart
5560 08:30:44.182 stacktrain.core.ssh DEBUG Copying from
 /home/user/openstack/training-labs/labs/autostart/01_config_private_network.sh
 to
 osbash@127.0.0.1:autostart (port: 2230)
5560 08:30:44.626 stacktrain.core.autostart INFO Start autostart/01_config_private_network.sh
5560 08:30:44.627 stacktrain.core.ssh DEBUG vm_ssh: ssh -q -i /home/user/openstack/training-labs/labs/lib/osbash-ssh-keys/osbash_key -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o ConnectTimeout=10 -o ControlPath=none -p 2230 osbash@127.0.0.1 bash autostart/01_config_private_network.sh && rm -vf autostart/01_config_private_network.sh
5560 08:30:44.627 stacktrain.core.ssh DEBUG Writing live log for ssh call at /home/user/openstack/training-labs/labs/log/035_01_config_private_network.auto.
5560 08:32:05.766 stacktrain.core.ssh ERROR ssh returned status 1.
5560 08:32:05.766 stacktrain.core.autostart ERROR Script failure: 01_config_private_network
18320 08:32:05.820 stacktrain.core.autostart ERROR Script failed. Exiting.
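
To make timing in a log like this easier to read, here is a minimal Python sketch that extracts how long a script ran before the first ERROR record. It is not part of stacktrain. The line layout (PID, time, logger, level, message) is an assumption inferred from the excerpt above, and the names `parse_line` and `seconds_until_first_error` are hypothetical.

```python
import re
from datetime import datetime

# Assumed record layout, inferred from the excerpt above:
#   <pid> <HH:MM:SS.mmm> <logger> <LEVEL> <message>
LINE_RE = re.compile(r"^(\d+) (\d{2}:\d{2}:\d{2}\.\d{3}) (\S+) ([A-Z]+) (.*)$")


def parse_line(line):
    """Return (timestamp, logger, level, message), or None for continuation lines."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    _pid, stamp, logger, level, message = match.groups()
    return datetime.strptime(stamp, "%H:%M:%S.%f"), logger, level, message


def seconds_until_first_error(lines, script):
    """Seconds between 'Start autostart/<script>' and the next ERROR record."""
    started = None
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue  # e.g. the indented path lines of a "Copying from" record
        stamp, _logger, level, message = record
        if level == "INFO" and message == "Start autostart/" + script:
            started = stamp
        elif level == "ERROR" and started is not None:
            return (stamp - started).total_seconds()
    return None


# Three records copied from the log above.
sample = [
    "5560 08:30:44.626 stacktrain.core.autostart INFO Start autostart/01_config_private_network.sh",
    " to",
    "5560 08:32:05.766 stacktrain.core.ssh ERROR ssh returned status 1.",
]
print(seconds_until_first_error(sample, "01_config_private_network.sh"))
```

For the records above, the script ran for roughly 81 seconds before ssh returned status 1. Most of the script therefore completed, and only the final ping check failed.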

Can you fix this problem?

Thank you.

Roger Luethi (rl-o)
Changed in labs:
assignee: nobody → Roger Luethi (rl-o)
Yannick LE NY (yleny) wrote :

I increased the value of the $cnt variable, at line 189 of scripts\config_private_network.sh, from 20 to 600 seconds, but the problem persists.

When I ping manually, I also get no reply from the public router IP (203.0.113.125).
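
The observation above fits a general pattern: a longer timeout only helps when the target is slow to come up, not when its replies are being dropped. The Python sketch below mirrors that wait-with-timeout pattern for illustration. The real check lives in a shell script and may differ. `wait_for` and `ping_once` are hypothetical names, and the `ping -c 1 -W 1` flags assume the Linux iputils ping.

```python
import subprocess
import time


def wait_for(probe, timeout_s, interval_s=1.0, sleep=time.sleep, clock=time.monotonic):
    """Call probe() until it returns True or timeout_s has elapsed.

    Returns True on success and False on timeout. `sleep` and `clock`
    are injectable so the loop can be exercised without real delays.
    """
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


def ping_once(ip):
    """Send one ICMP echo request with the system ping (iputils flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "1", ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return result.returncode == 0


# Example (not executed here):
#   wait_for(lambda: ping_once("203.0.113.125"), timeout_s=20)
```

If packets to 203.0.113.125 are filtered somewhere along the path, `probe()` never succeeds and the loop returns False whether `timeout_s` is 20 or 600.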

Yannick LE NY (yleny) wrote :

Hello Roger,

Do you need more information from me (software versions, log files, etc.) to investigate the problem?

Are there other tests I can run to help you find the cause?

Best Regards

Thank you.

Roger Luethi (rl-o) wrote :

I have tried and, so far, failed to reproduce the problem. I have downloaded the code from http://tarballs.openstack.org/training-labs/dist/labs-stable-rocky.tgz, did not change anything about the configuration and let it run ("./st.py -b base", "./st.py -b cluster").

The test system was Ubuntu LTS 18.04.1 with VirtualBox 5.2.22.

I am a bit puzzled.

Yannick LE NY (yleny) wrote :

For me the steps are :

$ cd /home/user/openstack/training-labs/
$ wget http://tarballs.openstack.org/training-labs/dist/labs-stable-rocky.tgz
$ tar xzvf labs-stable-rocky.tgz
$ cd labs
$ ./st.py -b cluster (as in the documentation here: https://wiki.openstack.org/wiki/Documentation/training-labs#Building_the_cluster )

> ("./st.py -b base", "./st.py -b cluster").
However, I did not run "./st.py -b base" before "./st.py -b cluster".
Does that matter?

Roger Luethi (rl-o) wrote :

Sorry, I missed your comment. No, the "base" command (creating a basedisk) is implied in the "cluster" command (if the basedisk is missing, it gets built automatically). I have a habit of running "base" for testing because it recreates the basedisk which may help find errors created by changes in the upstream software repositories.

Have you tried Queens instead of Rocky? This would allow us to rule out the new Bionic image as the problem.

Yannick LE NY (yleny) wrote :

Hello Roger,

1)
> Have you tried Queens instead of Rocky?
Yes, Queens is worse than Rocky: the Python scripts produce a lot of errors.

2)
I found the problem.
It is the firewall.

On the Mageia Linux distribution, the Shorewall firewall is enabled by default.
In the Mageia Control Center (similar to the Windows Control Panel), in the system services and daemons section, I disabled the shorewall and shorewall6 services.
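
The same state can be checked from a script. The Python sketch below asks systemd whether the two services named above are active. It assumes the host manages them as systemd units called `shorewall` and `shorewall6`, which this report does not confirm. The function names are hypothetical.

```python
import subprocess


def service_is_active(name, run=subprocess.run):
    """True if `systemctl is-active --quiet <name>` exits with status 0."""
    try:
        result = run(["systemctl", "is-active", "--quiet", name], check=False)
    except FileNotFoundError:
        # No systemctl on this host, so the unit state cannot be determined.
        return False
    return result.returncode == 0


def active_firewall_services(names=("shorewall", "shorewall6"), run=subprocess.run):
    """Return the subset of `names` that systemd reports as active."""
    return [name for name in names if service_is_active(name, run)]


# Example (not executed here):
#   for name in active_firewall_services():
#       print("WARNING: firewall service is active:", name)
```

The `run` parameter exists only so the check can be exercised without a real systemd. An empty result means no listed unit is active, or that `systemctl` is unavailable.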

----

And now the command line "./st.py -b cluster" works fine :

INFO config_public_network.sh -> 00_config_public_network.sh
INFO config_private_network.sh -> 01_config_private_network.sh
INFO Starting VM controller with headless GUI
.INFO Waiting for ssh server in VM controller to respond at 127.0.0.1:2230.
.............INFO Connected to ssh server.
....INFO Start autostart/00_config_public_network.sh
.......................................................................................................................................................INFO done
.INFO Start autostart/01_config_private_network.sh
.................................................................INFO done
INFO Processing of scripts successful.
INFO Shutting down VM controller.
INFO Waiting for shutdown of VM controller.
...............................................................................................INFO Machine powered off.
INFO Shutting down VM compute1.
INFO Waiting for shutdown of VM compute1.
...............INFO Machine powered off.
INFO Starting VM compute1 with headless GUI
.INFO Waiting for ssh server in VM compute1 to respond at 127.0.0.1:2232.
............INFO Connected to ssh server.
.INFO Processing of scripts successful.
INFO Starting VM controller with headless GUI
.INFO Waiting for ssh server in VM controller to respond at 127.0.0.1:2230.
............INFO Connected to ssh server.
..INFO Processing of scripts successful.
INFO Cluster build took 1486 seconds

ssh login to compute1 and controller works.
Login to the Horizon web GUI works.

-----

Would it be possible to add a prerequisite-check script to the OpenStack training labs source code, launched by the st.py Python script at startup?
If prerequisites are missing, it would show warnings listing what needs to be fixed.

Best regards.

Roger Luethi (rl-o) wrote :

Hi Yannick

Thank you for your response. This is certainly an interesting data point. I am not sure how to check for problematic firewall rules in a generic way (other than doing the check that produced the error for you). I will have to look into the way Mageia configures shorewall. Maybe we could just add a hint regarding firewall in the error message you encountered. Thanks again.
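
Both ideas from this thread, a prerequisite check at startup and a firewall hint in the error message, could look roughly like the Python sketch below. None of this is existing stacktrain code. `check_tool`, `preflight`, `FIREWALL_HINT` and `ping_failure_message` are hypothetical names, and the tool list (`VBoxManage`, `ssh`) is an assumption about what such a check might cover.

```python
import shutil


def check_tool(name, which=shutil.which):
    """Return a warning string if `name` is not on PATH, else None."""
    if which(name) is None:
        return "'{}' was not found on PATH.".format(name)
    return None


def preflight(checks):
    """Run each zero-argument check and collect the warnings it returns."""
    warnings = []
    for check in checks:
        message = check()
        if message:
            warnings.append(message)
    return warnings


FIREWALL_HINT = (
    "Hint: a host firewall (for example shorewall on Mageia) "
    "may be blocking traffic to the VM networks."
)


def ping_failure_message(ip, timeout_s):
    """Build the ping error text from the log, with the firewall hint appended."""
    return "No reply from public router IP ({}) in {} seconds, aborting. {}".format(
        ip, timeout_s, FIREWALL_HINT
    )


# Example (not executed here):
#   for warning in preflight([lambda: check_tool("VBoxManage"),
#                             lambda: check_tool("ssh")]):
#       print("WARNING", warning)
```

Appending a hint to the existing error avoids having to detect firewall rules generically, which is the difficulty raised above. The preflight list can grow one check at a time.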

Roger Luethi (rl-o)
summary: - VM Controller's Script 01_config_private_network.sh in Openstack
- Training Labs Rocky does not work because ssh problem
+ Mageia Linux shorewall prevents installation:
+ 01_config_private_network.sh fails