SSH / SCP to VM failed using NAMESPACE while IP is assigned by Horizon

Bug #1849221 reported by Ricardo Perez
This bug affects 1 person
Affects    Status   Importance  Assigned to  Milestone
StarlingX  Invalid  Medium      YaoLe

Bug Description

Brief Description
-----------------
In an External Storage configuration (2+2+2), SSH or SCP from the compute node to the VM through the network NAMESPACE is not possible. The terminal shows a "Temporary failure in name resolution" error, while the Horizon "VM Console" tab shows a "no bootable device" error.

Severity
--------
Critical: System/Feature is not usable due to the defect

Steps to Reproduce
------------------

1.- Follow the steps described here to set up a Duplex Configuration:
https://wiki.openstack.org/w/index.php?title=StarlingX/Containers/InstallationOnStandardStorage&oldid=171102

2.- Add the following property to the flavor that you are going to use to create VMs:
openstack flavor list
openstack flavor show <Specific_Flavor_for_VM_Creation>
openstack flavor set <Flavor_ID> --property hw:mem_page_size=large

3.- Create an image
openstack image create --container-format bare --disk-format qcow2 --file cirros-0.4.0-x86_64-disk.img cirros

4.- Create a security group
openstack security group create security1
openstack security group rule create --ingress --protocol icmp --remote-ip 0.0.0.0/0 security1
openstack security group rule create --ingress --protocol tcp --remote-ip 0.0.0.0/0 security1
openstack security group rule create --ingress --protocol udp --remote-ip 0.0.0.0/0 security1

5.- Create a VM
openstack server create --image cirros --flavor m1_large --network public-net0 --security-group security1 richocirros4

6.- Run the following commands to reach the VM through the network NAMESPACE:

controller-0:~# IP=`openstack server list --name richo1 -f value -c Networks | awk '{ split($1, v, "="); print v[2]}'`

controller-0:~# NAMESPACE=$(ip netns | grep $(neutron net-list --name public-net0 -f value -c id))

controller-0:~# sudo ip netns exec $NAMESPACE scp <file_name> cirros@$IP:~/
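
The IP extraction in the commands above splits the Networks column on "=". A minimal sketch of that parsing step as a reusable function, assuming the column has the "network=address" form that the awk expression expects. extract_ip is a hypothetical helper name, not part of any OpenStack tool:

```shell
# Hypothetical helper: extract the first address from a "net=addr[, addr]"
# string, the form assumed for `openstack server list -f value -c Networks`.
extract_ip() {
    # Drop everything up to the first '=', then anything after a comma,
    # semicolon, or space (a second address, if one is listed).
    printf '%s\n' "$1" | sed -e 's/^[^=]*=//' -e 's/[,; ].*$//'
}

# Example use on controller-0 (needs OpenStack credentials; not run here):
#   IP=$(extract_ip "$(openstack server list --name richo1 -f value -c Networks)")
#   sudo ip netns exec "$NAMESPACE" scp <file_name> "cirros@$IP:~/"
```

Quoting "$IP" and "$NAMESPACE" keeps an empty or multi-word value from silently changing the command line.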

Expected Behavior
------------------
You should be able to perform SSH / SCP using NAMESPACE after the VM is created and you see the IP address in Horizon.

Actual Behavior
----------------
"Temporary failure in name resolution" error message is seen in the terminal, while Horizon shows "no bootable device" error message on "VM Console Tab".

Reproducibility
---------------
<Reproducible/Intermittent/Seen once>
The issue is 100% reproducible

System Configuration
--------------------
<External Storage (2+2+2)>

Branch/Pull Time/Commit
-----------------------

controller-0:~# cat /etc/build.info
###
### StarlingX
### Built from master
###

OS="centos"
SW_VERSION="19.09"
BUILD_TARGET="Host Installer"
BUILD_TYPE="Formal"
BUILD_ID="20191018T013000Z"

JOB="STX_build_master_master"
<email address hidden>"
BUILD_NUMBER="285"
BUILD_HOST="starlingx_mirror"
BUILD_DATE="2019-10-18 01:30:00 +0000"

Last Pass
---------
STX 2.0 regression testing, after following the advice to create a security group. Details can be found in this bug: https://bugs.launchpad.net/starlingx/+bug/1837797

Timestamp/Logs
--------------
compute-1:~$ sudo virsh list
Password:
 Id   Name                State
-----------------------------------
 4    instance-00000017   running

compute-1:~$ sudo ip netns exec qdhcp-4c4f9541-c39f-43ce-b438-63156437db33 ping cirros@192.168.101.195
ping: cirros@192.168.101.96: Temporary failure in name resolution
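
Note that ping takes a bare host name or address, so a "user@host" argument is handed to the resolver as a host name. That likely explains the "Temporary failure in name resolution" message in this particular log, and it says nothing by itself about SSH reachability. A small sketch that strips the user part before pinging. host_part is a hypothetical helper, and $NAMESPACE is assumed to be set as in the reproduction steps:

```shell
# Hypothetical helper: return the host portion of a "user@host" target.
host_part() {
    # ${1##*@} removes the longest prefix ending in '@';
    # a bare host or address is returned unchanged.
    printf '%s\n' "${1##*@}"
}

# Example (not run here):
#   sudo ip netns exec "$NAMESPACE" ping -c 3 "$(host_part cirros@192.168.101.195)"
```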

compute-0:~$ free -h
              total        used        free      shared  buff/cache   available
Mem:           187G         27G        155G         14M        4.7G        158G
Swap:            0B          0B          0B

Test Activity
-------------
[Regression Testing]

Ricardo Perez (richomx) wrote :

All nodes full log files.

Ghada Khalil (gkhalil) wrote :

Assigning to Yao Le to investigate as he looked at this the last time

tags: added: stx.distro.openstack stx.networking
Changed in starlingx:
assignee: nobody → YaoLe (yaole)
Ghada Khalil (gkhalil) wrote :

Marking as stx.3.0 until further investigation. This is failing a regression TC which passed in stx.2.0

tags: added: stx.3.0
Changed in starlingx:
importance: Undecided → Medium
status: New → Triaged
Elio Martinez (elio1979) wrote :

Following the steps described in this bug, I was able to ping the instance from the compute node in a 2+2 configuration, and in duplex as well. The problem seems to be related to port 22, which returns the following message:
#ssh -i /home/sysadmin/.ssh/id_rsa.pub cirros@192.168.101.156
Unable to negotiate with 192.168.101.156 port 22: no matching key exchange method found. Their offer: diffie-hellman-group1-sha1,diffie-hellman-group14-sha1.

Elio Martinez (elio1979) wrote :

My workaround (WA) so far is to specify the key exchange algorithm and the cipher explicitly:

#ssh -i /home/sysadmin/.ssh/id_rsa.pub -oKexAlgorithms=+diffie-hellman-group1-sha1 -c 3des-cbc cirros@192.168.101.156
The authenticity of host '192.168.101.156 (192.168.101.156)' can't be established.
RSA key fingerprint is SHA256:Xxc2OEVpKJbPNY7JdodXPnF7Avk8f30xL7T62wRzaq0.
RSA key fingerprint is MD5:25:a1:4f:51:9c:63:86:47:e1:2a:a5:7a:87:7d:a1:b4.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.101.156' (RSA) to the list of known hosts.
$
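
The same options can be scoped to a single guest in ~/.ssh/config instead of being typed on every invocation. A sketch that prints such a stanza, assuming an OpenSSH client that still ships these legacy algorithms. legacy_ssh_stanza is a hypothetical helper name, and the cirros user is taken from the commands above:

```shell
# Hypothetical helper: print an ssh_config stanza that enables the legacy
# key exchange and cipher for one host only.
legacy_ssh_stanza() {
    # $1: address of the guest, e.g. 192.168.101.156
    cat <<EOF
Host $1
    User cirros
    KexAlgorithms +diffie-hellman-group1-sha1
    Ciphers 3des-cbc
EOF
}

# Example (not run here):
#   legacy_ssh_stanza 192.168.101.156 >> ~/.ssh/config
```

Limiting the stanza to one Host entry avoids weakening the algorithm selection for every other connection made from the same account.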

Ricardo Perez (richomx) wrote :

Thanks to Elio's help, we have found a Work Around for the External configuration. Here are the steps:

Note: Avoid using the m1.tiny flavor; it isn't working properly.

1.- Follow the steps described here to set up a Duplex Configuration:
https://wiki.openstack.org/w/index.php?title=StarlingX/Containers/InstallationOnStandardStorage&oldid=171102

2.- Log in to each of the computes and create an SSH keypair:
ssh-keygen -b 4096
Use no passphrase and no password, just the simple version (we haven't tried with a passphrase or password).

3.- Copy the file located at each one of your computes at ~/.ssh/id_rsa.pub to your controller-0 (or the active one from where you will launch your VMs)

4.- Create a keypair on your controller-0 (or the active one) from each of the files that you copied from your computes, for example:
controller-0:~# openstack keypair create --private-key /home/sysadmin/id_rsa_compute_0.pub richo_key_c0
+-------------+-------------------------------------------------+
| Field       | Value                                           |
+-------------+-------------------------------------------------+
| fingerprint | c8:01:93:a5:4b:a6:ea:c5:2e:6e:d1:31:59:a5:39:d0 |
| name        | richo_key_c0                                    |
| user_id     | 76813b4fa2bd497d9c3069487993d51e                |
+-------------+-------------------------------------------------+
controller-0:~# openstack keypair create --private-key /home/sysadmin/id_rsa_compute_1.pub richo_key_c1
+-------------+-------------------------------------------------+
| Field       | Value                                           |
+-------------+-------------------------------------------------+
| fingerprint | 65:e7:c2:f5:a9:71:e9:ca:38:4e:4f:7e:0b:fc:ae:b4 |
| name        | richo_key_c1                                    |
| user_id     | 76813b4fa2bd497d9c3069487993d51e                |
+-------------+-------------------------------------------------+

5.- Add the following property to the flavor that you are going to use to create VMs. Use any flavor other than m1.tiny:
openstack flavor list
openstack flavor show <Specific_Flavor_for_VM_Creation>
openstack flavor set <Flavor_ID> --property hw:mem_page_size=large

6.- Create an image
openstack image create --container-format bare --disk-format qcow2 --file cirros-0.4.0-x86_64-disk.img cirros

7.- Create a security group
openstack security group create security1
openstack security group rule create --ingress --protocol icmp --remote-ip 0.0.0.0/0 security1
openstack security group rule create --ingress --protocol tcp --remote-ip 0.0.0.0/0 security1
openstack security group rule create --ingress --protocol udp --remote-ip 0.0.0.0/0 security1

8.- Create a VM
controller-0:~# openstack server create --image cirros --flavor m1.tiny --network public-net0 --security-group security1 --key-name richo_key_c0 richo
+-------------------------------------+------------------------------------------------+
| Field | Value |
+-------------------------------------+------------------------------------------------+
| OS-DCF:diskConfig ...

Elio Martinez (elio1979) wrote :

With the following ISO, the IPs are not configured inside the instance either, regardless of the WA.

###
### StarlingX
### Built from master
###

OS="centos"
SW_VERSION="19.09"
BUILD_TARGET="Host Installer"
BUILD_TYPE="Formal"
BUILD_ID="20191101T013000Z"

JOB="STX_build_master_master"
<email address hidden>"
BUILD_NUMBER="303"
BUILD_HOST="starlingx_mirror"

Elio Martinez (elio1979) wrote :

All the configs have the same problem now.

Le, Huifeng (hle2) wrote :

@Elio, I believe your issue had been resolved by https://bugs.launchpad.net/starlingx/+bug/1851414

@Ricardo Perez, with the workaround (under Elio's help), do you think this is still an issue for you? Thanks!

Ricardo Perez (richomx) wrote :

@Huifeng, it's still an issue because, although Horizon shows an IP assigned to the VM, you still have to go inside the VM and add it manually.

After that, you might be able to do the ping / ssh / scp. However, I believe this is broken functionality, because the end user should not have to go through each created VM to manually assign the IP shown by Horizon.

YaoLe (yaole) wrote :

@Ricardo Perez

Hi, are you using ovs-dpdk in your starlingX?

Could you capture some logs:
ip netns
ip netns exec $DHCP_NAMESPACE ifconfig
Run below command in VM:
dhclient $INTERFACE
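
A rough sketch of how the host-side part of this request could be collected in one pass. It assumes the DHCP namespaces follow the qdhcp-<network id> naming seen earlier in this report and that `ip netns` may append an "(id: N)" suffix to each name. dhcp_namespaces is a hypothetical helper:

```shell
# Hypothetical helper: read `ip netns` output on stdin and print only the
# qdhcp namespace names, dropping any trailing "(id: N)" field.
dhcp_namespaces() {
    awk '$1 ~ /^qdhcp-/ { print $1 }'
}

# Example on the node hosting the DHCP namespaces (not run here):
#   ip netns | dhcp_namespaces | while read -r ns; do
#       echo "== $ns"
#       sudo ip netns exec "$ns" ifconfig
#   done
```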

YaoLe (yaole)
Changed in starlingx:
status: Triaged → In Progress
Le, Huifeng (hle2) wrote :

@Ricardo Perez, as discussed in the network team meeting, it was agreed that the SSH connection failure is expected behavior, so this issue can be closed. Elio will work with you to refine the test case. Thanks!

Changed in starlingx:
status: In Progress → Invalid