Error deploying nodes with the ceph-osd and compute roles

Bug #1400621 reported by progserega
This bug affects 1 person
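Affects | Status | Importance | Assigned to | Milestone
-------------------|---------|------------|---------------------------|----------
Fuel for OpenStack | Invalid | High | Fuel Library (Deprecated) |
5.1.x | Invalid | High | Dmitry Borodaenko |
6.0.x | Invalid | High | Fuel Library (Deprecated) |
6.1.x | Invalid | High | Dmitry Borodaenko |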

Bug Description

Deploying nodes with the ceph-osd and compute roles fails on both Ubuntu and CentOS in Fuel 5.1.

progserega (progserega-n) wrote :

I tried CentOS and I tried without HA; the result is always the same: the deployment fails.

Revision history for this message
progserega (progserega-n) wrote :

On the 6.0 tech preview this bug does not occur with the same configuration.

Revision history for this message
Sergey Vasilenko (xenolog) wrote :

It looks like the SSH keys were not distributed properly or in time.

Changed in fuel:
milestone: none → 5.1.2
importance: Undecided → High
assignee: nobody → Fuel Library Team (fuel-library)
status: New → Confirmed
Changed in fuel:
status: Confirmed → Incomplete
status: Incomplete → Confirmed
Revision history for this message
Sergey Vasilenko (xenolog) wrote :

Tatyanka_Leontov
17:35 xenolog: There are too few details in the issue, but anyway, today I deployed 5.1.1-48 with Ceph on Ubuntu (5 single ceph nodes - 1 ceph with controller + 1 with compute - on compute) with Neutron and bonding, and it works like a charm
Tatyanka_Leontov
17:36 Also, if we look at our 5.1.1 testing table (where a lot of deployments passed), we do not find any problems there, so we need more details on how to reproduce it on 5.1.1. For now I prefer to move the issue to the Incomplete state for 5.1.2.

Revision history for this message
progserega (progserega-n) wrote :

Could you please tell me where I can get 5.1.2 or 5.1.1-48?

https://wiki.openstack.org/wiki/Fuel says that the ISO can be downloaded from https://fuel-jenkins.mirantis.com/view/ISO/

But at that link I found only 5.1.1-38. I will try to test that release today.

Revision history for this message
progserega (progserega-n) wrote :

> It looks like the SSH keys were not distributed properly or in time.
I tried SSH from the Fuel server to every node, and each login succeeded with the SSH key, without a password. But SSH from a node back to the Fuel server asks for a password:

[root@fuel ~]#
[root@fuel ~]# fuel node
id | status | name | cluster | ip | mac | roles | pending_roles | online
---|--------|------------------|---------|-----------|-------------------|-------------------|---------------|-------
2 | ready | Untitled (ec:54) | 1 | 10.20.0.4 | 0e:05:aa:81:ab:4b | controller | | True
1 | error | Untitled (99:b4) | 1 | 10.20.0.3 | 92:15:33:89:f8:42 | ceph-osd, compute | | True
3 | error | Untitled (15:74) | 1 | 10.20.0.5 | a6:50:db:48:85:47 | ceph-osd, compute | | True
[root@fuel ~]# ssh root@10.20.0.4
Warning: Permanently added '10.20.0.4' (RSA) to the list of known hosts.
Welcome to Ubuntu 12.04.4 LTS (GNU/Linux 3.11.0-18-generic x86_64)

 * Documentation: https://help.ubuntu.com/
Last login: Tue Dec 9 08:26:47 2014 from 10.20.0.2
root@node-2:~# logout
Connection to 10.20.0.4 closed.
[root@fuel ~]# ssh root@10.20.0.3
Warning: Permanently added '10.20.0.3' (RSA) to the list of known hosts.
Welcome to Ubuntu 12.04.4 LTS (GNU/Linux 3.11.0-18-generic x86_64)

 * Documentation: https://help.ubuntu.com/
New release '14.04.1 LTS' available.
Run 'do-release-upgrade' to upgrade to it.

Last login: Tue Dec 9 08:26:37 2014 from 10.20.0.2
root@node-1:~# logout
Connection to 10.20.0.3 closed.
[root@fuel ~]# ssh root@10.20.0.5
Warning: Permanently added '10.20.0.5' (RSA) to the list of known hosts.
Welcome to Ubuntu 12.04.4 LTS (GNU/Linux 3.11.0-18-generic x86_64)

 * Documentation: https://help.ubuntu.com/
New release '14.04.1 LTS' available.
Run 'do-release-upgrade' to upgrade to it.

Last login: Tue Dec 9 08:26:52 2014 from 10.20.0.2
root@node-3:~# logout
Connection to 10.20.0.5 closed.
[root@fuel ~]# ssh root@10.20.0.3
Warning: Permanently added '10.20.0.3' (RSA) to the list of known hosts.
Welcome to Ubuntu 12.04.4 LTS (GNU/Linux 3.11.0-18-generic x86_64)

 * Documentation: https://help.ubuntu.com/
New release '14.04.1 LTS' available.
Run 'do-release-upgrade' to upgrade to it.

Last login: Tue Dec 9 22:17:59 2014 from 10.20.0.2
root@node-1:~# ssh root^C
root@node-1:~# logout
Connection to 10.20.0.3 closed.
[root@fuel ~]# ifconfig eth0
eth0 Link encap:Ethernet HWaddr 00:04:23:D2:EE:5E
          inet addr:10.20.0.2 Bcast:10.20.0.255 Mask:255.255.255.0
          inet6 addr: fe80::204:23ff:fed2:ee5e/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:1181325 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1753247 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:269790531 (257.2 MiB) TX bytes:1957243053 (1.8 GiB)

[root@fuel ~]# ssh root@10.20.0.3
Warning: Permanently added '10.20.0.3' (RSA) to the list of known hosts.
Welcome to Ubuntu 12.04.4 LTS (GNU/Linux 3.11.0-18-generic x86_64)

 * Documentation: https://help.ubuntu.com/
New release '14.04.1 LTS'...

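The manual key check in the transcript above can be repeated without any risk of a password prompt. The following is only a minimal sketch, not part of Fuel: the function names are made up for illustration, and it assumes an OpenSSH client, whose `BatchMode=yes` option makes `ssh` fail instead of asking for a password. The addresses are the ones from the `fuel node` listing.

```python
import subprocess


def ssh_check_command(host, user="root", timeout=5):
    """Build an ssh command that fails instead of prompting for a password.

    Hypothetical helper for illustration; BatchMode and ConnectTimeout are
    standard OpenSSH client options.
    """
    return [
        "ssh",
        "-o", "BatchMode=yes",
        "-o", f"ConnectTimeout={timeout}",
        f"{user}@{host}",
        "true",  # trivial remote command; only the exit status matters
    ]


def key_login_works(host, user="root", timeout=5):
    """Return True if a key-based, non-interactive login to host succeeds."""
    try:
        result = subprocess.run(
            ssh_check_command(host, user, timeout),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout + 5,
        )
    except (OSError, subprocess.TimeoutExpired):
        # ssh binary missing, or the connection hung past the timeout
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # Node addresses taken from the `fuel node` output above.
    for host in ("10.20.0.4", "10.20.0.3", "10.20.0.5"):
        status = "key login OK" if key_login_works(host) else "key login FAILED"
        print(host, status)
```

Running the same script on a node, with the Fuel server address 10.20.0.2 as the target, would check the reverse direction that asked for a password.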

Revision history for this message
progserega (progserega-n) wrote :

I hit the same problem on fuel-5.1.1.

Revision history for this message
progserega (progserega-n) wrote :

In the logs I found:
Wed Dec 10 06:59:03 +0000 2014 /Stage[main]/Ceph::Conf/Exec[ceph-deploy config pull]/returns (notice): ssh: connect to host node-1 port 22: No route to host
Wed Dec 10 06:59:03 +0000 2014 /Stage[main]/Ceph::Conf/Exec[ceph-deploy config pull]/returns (notice): [ceph_deploy.config][ERROR ] Unable to pull /etc/ceph/ceph.conf from node-1
Wed Dec 10 06:59:03 +0000 2014 /Stage[main]/Ceph::Conf/Exec[ceph-deploy config pull]/returns (notice): [ceph_deploy][ERROR ] GenericError: Failed to fetch config from 1 hosts
Wed Dec 10 06:59:03 +0000 2014 Puppet (err): ceph-deploy --overwrite-conf config pull node-1 returned 1 instead of one of [0]

What does this mean?

Is Fuel (the deploy server) trying to connect to the host "node-1"? Is node-1 the host's domain name? Does it have to resolve?
[root@fuel ~]# ping node-1
ping: unknown host node-1

But
[root@fuel ~]# ifconfig eth0
eth0 Link encap:Ethernet HWaddr 00:04:23:D2:EE:5E
          inet addr:10.20.0.2 Bcast:10.20.0.255 Mask:255.255.255.0
          inet6 addr: fe80::204:23ff:fed2:ee5e/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:5120626 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2109147 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1083947918 (1.0 GiB) TX bytes:2815177366 (2.6 GiB)

[root@fuel ~]# cat /etc/resolv.conf
search rs.int
domain rs.int
nameserver 10.20.0.2

But if the problem is name resolution, why does the controller node deploy successfully while only the compute nodes fail?
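The two symptoms above are different failures: `ping: unknown host node-1` is a name resolution failure, while `ssh: connect to host node-1 port 22: No route to host` is a connection error, which ssh can only report once it already has an address to connect to. A small sketch that tells the two cases apart is given below. It is an illustration only: the function names are hypothetical, and it would have to be run on the machine where the failing `ceph-deploy config pull` command ran (presumably the node being deployed, not necessarily the Fuel server).

```python
import socket


def resolve(hostname):
    """Return the IPv4 address for hostname, or None if it does not resolve."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


def port_open(address, port=22, timeout=3.0):
    """Return True if a TCP connection to address:port can be opened."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:  # covers "No route to host", refused, and timeouts
        return False


def diagnose(hostname, port=22, resolver=resolve, prober=port_open):
    """Report whether a host fails at name resolution or at the connection.

    resolver and prober are injectable so the logic can be exercised
    without network access.
    """
    address = resolver(hostname)
    if address is None:
        return f"{hostname}: name does not resolve"
    if not prober(address, port):
        return f"{hostname}: resolves to {address}, but port {port} is unreachable"
    return f"{hostname}: resolves to {address}, port {port} reachable"


if __name__ == "__main__":
    # Host names as they appear in the shell prompts and the Puppet log above.
    for name in ("node-1", "node-2", "node-3"):
        print(diagnose(name))
```

If a name resolves but port 22 is unreachable, the problem is more likely in routing or in which network the resolved address belongs to than in DNS itself.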

Revision history for this message
Stanislaw Bogatkin (sbogatkin) wrote :

For 6.1 branch this bug was incomplete for more than 4 weeks. We cannot investigate it further so we are setting the status to Invalid. If you think it is not correct, please feel free to provide requested information and reopen the bug, and we will look into it further.

Revision history for this message
Vitaly Sedelnik (vsedelnik) wrote :

Invalid for 5.1.1-updates as well - not enough info to reproduce
