Duplicate IPs during deploy on 90 nodes env

Bug #1378000 reported by Sergey Galkin
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone
Fuel for OpenStack | Fix Committed | High | Łukasz Oleś |
5.1.x | Fix Committed | High | Łukasz Oleś |
Mitaka | Invalid | Undecided | Fuel Sustaining |

Bug Description

api: '1.0'
astute_sha: f5fbd89d1e0e1f22ef9ab2af26da5ffbfbf24b13
auth_required: true
build_id: 2014-10-02_14-17-49
build_number: '17'
feature_groups:
- mirantis
fuellib_sha: 46ad455514614ec2600314ac80191e0539ddfc04
fuelmain_sha: ce6a2871734bb40e09a6f61e9d007bb7e324fada
nailgun_sha: eb8f2b358ea4bb7eb0b2a0075e7ad3d3a905db0d
ostf_sha: 64cb59c681658a7a55cc2c09d079072a41beb346
production: docker
release: 5.1.1

Steps to reproduce
1. Start deploying a cluster with Murano, Sahara, and Ceilometer in HA mode, with 3 controllers and compute nodes that also carry the Cinder LVM role, on CentOS, using 50 of the 90 available nodes.

The deployment fails with several compute nodes offline, because some nodes have been given the same IP address.

In my case

[root@fuel ~]# fuel nodes | grep '10.20.0.3 '
2 | deploying | compute_6 | 1 | 10.20.0.3 | 0c:c4:7a:1d:97:7c | cinder, compute | | False
29 | discover | Untitled (92:da) | None | 10.20.0.3 | 0c:c4:7a:1d:92:da | | | True
87 | discover | Untitled (92:58) | None | 10.20.0.3 | b2:3c:9d:8f:fc:46 | | | True

[root@fuel ~]# fuel nodes | grep '10.20.0.6 '
10 | deploying | compute_2 | 1 | 10.20.0.6 | 0c:c4:7a:1d:93:bc | cinder, compute | | True
23 | discover | Untitled (92:7a) | None | 10.20.0.6 | 0c:c4:7a:1d:92:7a | | | True
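
The manual grep above can be generalized to scan the whole node list at once. The following is a minimal sketch, assuming the pipe-separated layout shown in the `fuel nodes` output in this report; the function name `find_duplicate_ips` is hypothetical and not part of the Fuel CLI.

```python
import ipaddress
from collections import defaultdict


def find_duplicate_ips(table_text):
    """Return {ip: [rows]} for every IPv4 address that appears on more than one row.

    Each row is split on '|'; the first field that parses as an IPv4 address
    is treated as the node IP. Header and separator lines are skipped because
    none of their fields parse as an address.
    """
    rows_by_ip = defaultdict(list)
    for line in table_text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        for field in fields:
            try:
                ip = ipaddress.IPv4Address(field)
            except ValueError:
                continue
            rows_by_ip[str(ip)].append(fields)
            break
    return {ip: rows for ip, rows in rows_by_ip.items() if len(rows) > 1}


# Rows copied from the output in this report.
SAMPLE = """\
2 | deploying | compute_6 | 1 | 10.20.0.3 | 0c:c4:7a:1d:97:7c | cinder, compute | | False
29 | discover | Untitled (92:da) | None | 10.20.0.3 | 0c:c4:7a:1d:92:da | | | True
87 | discover | Untitled (92:58) | None | 10.20.0.3 | b2:3c:9d:8f:fc:46 | | | True
10 | deploying | compute_2 | 1 | 10.20.0.6 | 0c:c4:7a:1d:93:bc | cinder, compute | | True
23 | discover | Untitled (92:7a) | None | 10.20.0.6 | 0c:c4:7a:1d:92:7a | | | True
"""

if __name__ == "__main__":
    for ip, rows in sorted(find_duplicate_ips(SAMPLE).items()):
        print(ip, [row[0] for row in rows])
```

In practice the text would come from saving the output of `fuel nodes` to a file and reading it back, rather than from an inline sample.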

Snapshot attached

Tags: scale
Sergey Galkin (sgalkin) wrote :
Changed in fuel:
milestone: none → 6.0
Changed in fuel:
assignee: nobody → Fuel Python Team (fuel-python)
importance: Undecided → High
Sergey Galkin (sgalkin) wrote :

Increasing the PXE network to /16 does not help.

eth0 Link encap:Ethernet HWaddr 52:54:00:C1:D8:EE
          inet addr:10.20.0.2 Bcast:10.20.255.255 Mask:255.255.0.0
          inet6 addr: fe80::5054:ff:fec1:d8ee/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:56963168 errors:0 dropped:0 overruns:0 frame:0
          TX packets:45493126 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:5669525789 (5.2 GiB) TX bytes:76746845514 (71.4 GiB)

[root@fuel ~]# fuel nodes | grep 10.20.1.188
93 | error | compute_23 | 1 | 10.20.1.188 | 0c:c4:7a:1d:92:7a | cinder, compute | | False
29 | discover | Untitled (92:ae) | None | 10.20.1.188 | 0c:c4:7a:1d:92:ae | | | True

[root@fuel ~]# fuel nodes
id | status | name | cluster | ip | mac | roles | pending_roles | online
---|-------------|------------------|---------|-------------|-------------------|-------------------|---------------|-------
20 | provisioned | compute_13 | 1 | 10.20.0.14 | 0c:c4:7a:1d:92:ba | cinder, compute | | True
82 | error | compute_28 | 1 | 10.20.0.46 | 0c:c4:7a:1d:f3:5e | cinder, compute | | True
25 | provisioned | compute_29 | 1 | 10.20.0.17 | 0c:c4:7a:1d:92:68 | cinder, compute | | True
26 | provisioned | compute_12 | 1 | 10.20.0.18 | 0c:c4:7a:1d:92:a8 | cinder, compute | | True
27 | provisioned | compute_22 | 1 | 10.20.0.19 | 0c:c4:7a:1d:91:64 | cinder, compute | | True
93 | error | compute_23 | 1 | 10.20.1.188 | 0c:c4:7a:1d:92:7a | cinder, compute | | False
11 | provisioned | compute_20 | 1 | 10.20.0.7 | 00:25:90:eb:de:90 | cinder, compute | | True
39 | provisioned | compute_21 | 1 | 10.20.0.27 | 0c:c4:7a:1d:92:c8 | cinder, compute | | True
17 | provisioned | compute_26 | 1 | 10.20.0.11 | 0c:c4:7a:1d:92:9a | cinder, compute | | True
66 | error | compute_27 | 1 | 10.20.0.38 | 0c:c4:7a:1d:92:a4 | cinder, compute | | True
89 | error | compute_24 | 1 | 10.20.1.81 | 0c:c4:7a:1d:f1:d4 | cinder, compute | | False
33 | provisioned | compute_25 | 1 | 10.20.0.23 | 0c:c4:7a:1d:92:d6 | cinder, compute | | True
57 | provisioned | compute_9 | 1 | 10.20.0.35 | 0c:c4:7a:1d:f6:42 | cinder, compute | | True
31 | provisioned | compute_8 | 1 | 10.20.0.21 | 0c:c4:7a:1d:94:00 | cinder, compute | | True
34 | provisioned | compute_3 | 1 | 10.20.0.24 | 0c:c4:7a:1d:91:4c | cinder, compute | | True
12 | provisioned | compute_2 | 1 | 10.20.0.8 | 00:25:90:eb:d8:06 | cinder, compute | | True
10 | provisioned | compute_1 | 1 | 10.20.0.6 | 0c:c4:7a:1d:95:7a | ci...

Dima Shulyak (dshulyak) wrote :

Hi, is it possible to get access to the live environment with this problem?
Or can you at least provide the following information:
1. dnsmasq leases
2. all cobbler profiles
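
One way the requested lease data could be used is to cross-check it against the IPs Nailgun reports for each node. The sketch below assumes the usual dnsmasq leases layout (expiry time, MAC, IP, hostname, client ID, separated by whitespace); the function names and the sample values are hypothetical, and the location of the leases file on the Fuel master is not stated in this report.

```python
def parse_leases(leases_text):
    """Parse dnsmasq lease lines into {mac: ip}.

    Lines with fewer than three fields (for example the 'duid ...' line
    used for DHCPv6) are skipped.
    """
    leases = {}
    for line in leases_text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue
        _expiry, mac, ip = parts[:3]
        leases[mac.lower()] = ip
    return leases


def mismatched_nodes(leases, node_ips):
    """Return {mac: (reported_ip, leased_ip)} where the two sources disagree.

    node_ips is a {mac: ip} mapping built from the node list; MACs without
    a lease are ignored.
    """
    return {
        mac: (ip, leases[mac])
        for mac, ip in node_ips.items()
        if mac in leases and leases[mac] != ip
    }


# Hypothetical sample data, for illustration only.
LEASES = """\
1413150000 0c:c4:7a:00:00:01 10.20.0.3 node-a *
1413150001 0c:c4:7a:00:00:02 10.20.0.4 node-b *
"""
NODES = {"0c:c4:7a:00:00:01": "10.20.0.3", "0c:c4:7a:00:00:02": "10.20.0.3"}

if __name__ == "__main__":
    print(mismatched_nodes(parse_leases(LEASES), NODES))
```

A node whose reported IP differs from its lease would point at stale data on the Nailgun side rather than at the DHCP server.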

Dima Shulyak (dshulyak) wrote :

The problem is on the Nailgun side. Somehow Nailgun sends several nodes with the same IP address.
Are you using a build with custom patches, or is this what we have in master?

Changed in fuel:
status: New → Confirmed
status: Confirmed → Triaged
Aleksey Kasatkin (alekseyk-ru) wrote :

The Nailgun commit is the head of stable/5.1.

Dima Shulyak (dshulyak)
Changed in fuel:
status: Triaged → Confirmed
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-library (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/127946

Łukasz Oleś (loles) wrote :

It looks like these MAC addresses cause conflicts:
0c:c4:7a:1d:91:64
0c:c4:7a:1d:93:da
0c:c4:7a:1d:90:fe
0c:c4:7a:1d:92:76

Changed in fuel:
status: Confirmed → In Progress
assignee: Fuel Python Team (fuel-python) → Łukasz Oleś (loles)
OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/127946
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=de060cc37f232870bf1f699b3973fc425e9abe05
Submitter: Jenkins
Branch: master

commit de060cc37f232870bf1f699b3973fc425e9abe05
Author: Łukasz Oleś <email address hidden>
Date: Sun Oct 12 22:05:54 2014 +0200

    Add dhcp-sequential-ip option to dnsmasq

    For many simultaneously DHCPDISCOVER requests dnsmasq
    can offer the same IP for two different MAC addresses.
    This option prevents it by assigning IPs one by one
    instead of using hashing algorithm.

    Change-Id: Iff3c42d21e1f1c09cb9eab5f07dbb066508dcb56
    Related-bug: 1378000
    Related-bug: 1376680
    Related-bug: 1379917
    Blueprint: 100-nodes-support
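
The difference the commit message describes can be illustrated with a toy model. This is not dnsmasq's code: it assumes a simplified hash-based picker that ignores offers already handed out (the situation the commit message attributes to many simultaneous DHCPDISCOVER requests), and compares it with one-by-one allocation. The function names and the use of CRC32 as the hash are assumptions made for the illustration.

```python
import ipaddress
import zlib


def hashed_offer(mac, network):
    """Toy model: derive a host address from a hash of the MAC alone.

    Nothing here remembers earlier offers, so two MACs whose hashes land
    on the same slot are offered the same address.
    """
    usable = network.num_addresses - 2  # exclude network and broadcast
    index = zlib.crc32(mac.encode("ascii")) % usable
    return network.network_address + 1 + index


def sequential_offers(macs, network):
    """Toy model: hand out host addresses one by one, in request order."""
    return dict(zip(macs, network.hosts()))


if __name__ == "__main__":
    # A /28 has 14 usable hosts; 15 clients must collide under hashing.
    small = ipaddress.ip_network("10.20.0.0/28")
    macs = ["0c:c4:7a:1d:92:%02x" % i for i in range(15)]
    hashed = [hashed_offer(mac, small) for mac in macs]
    print("distinct hashed offers:", len(set(hashed)), "of", len(macs))

    large = ipaddress.ip_network("10.20.0.0/24")
    seq = sequential_offers(macs, large)
    print("distinct sequential offers:", len(set(seq.values())), "of", len(macs))
```

The pool is deliberately undersized in the hashed case so that a collision is guaranteed; on a real network the collision is a matter of probability, and the sequential mode avoids it only while the pool still has free addresses.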

Łukasz Oleś (loles)
Changed in fuel:
status: In Progress → Fix Committed
Łukasz Oleś (loles)
Changed in fuel:
status: Fix Committed → In Progress
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-library (stable/5.1)

Related fix proposed to branch: stable/5.1
Review: https://review.openstack.org/128611

Łukasz Oleś (loles)
Changed in fuel:
status: In Progress → Fix Committed
tags: added: scale
OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-library (stable/5.1)

Reviewed: https://review.openstack.org/128611
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=1a785608f7b45af20a165b7cae1a5e2f0a4d63e0
Submitter: Jenkins
Branch: stable/5.1

commit 1a785608f7b45af20a165b7cae1a5e2f0a4d63e0
Author: Łukasz Oleś <email address hidden>
Date: Sun Oct 12 22:05:54 2014 +0200

    Add dhcp-sequential-ip option to dnsmasq

    For many simultaneously DHCPDISCOVER requests dnsmasq
    can offer the same IP for two different MAC addresses.
    This option prevents it by assigning IPs one by one
    instead of using hashing algorithm.

    Change-Id: Iff3c42d21e1f1c09cb9eab5f07dbb066508dcb56
    Related-bug: 1378000
    Related-bug: 1376680
    Related-bug: 1379917
    Blueprint: 100-nodes-support

Alexander Gordeev (a-gordeev) wrote :

Still getting duplicated IPs for a couple of nodes. Fuel 9.0 (9.1)

[root@fuel ~]# grep 10.21.8.171 fresh.node.list
| 1078 | Untitled (d1:28) | discover | ubuntu | [] | 10.21.8.171 | 52:54:00:6c:d1:28 | None | Standard PC (i440FX + PIIX, 1996) | False |
| 39 | Untitled (39:04) | discover | ubuntu | [] | 10.21.8.171 | 52:54:00:2a:39:04 | None | Standard PC (i440FX + PIIX, 1996) | True |
[root@fuel ~]#

dnsmasq.conf contains the dhcp-sequential-ip option. Perhaps it doesn't work with more than 1000 nodes.

There are more than 200 GB of logs, so I can't attach all of them. The env is still online, but we need to react ASAP; I have no idea how long it will be kept.
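
The check that dnsmasq.conf contains the option can be scripted. This is a minimal sketch, assuming the standard dnsmasq configuration syntax (one option per line, `#` starting a comment); the function name is hypothetical, and the sketch looks only at the text it is given, not at included files or command-line flags.

```python
def dnsmasq_option_enabled(conf_text, option):
    """Return True if `option` appears as an active (uncommented) line."""
    for raw_line in conf_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        name = line.split("=", 1)[0].strip()  # options may carry '=value'
        if name == option:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical config fragment, for illustration only.
    conf = "interface=eth0\ndhcp-sequential-ip\n"
    print(dnsmasq_option_enabled(conf, "dhcp-sequential-ip"))
```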

Alexander Gordeev (a-gordeev) wrote :

Marked as Invalid; created a new one instead: https://bugs.launchpad.net/fuel/+bug/1630299
