network-verification fails with more than 500 VLANs

Bug #1397140 reported by Łukasz Oleś
This bug affects 3 people
Affects             Status         Importance  Assigned to  Milestone
Fuel for OpenStack  Fix Released   Medium      Łukasz Oleś
6.0.x               Fix Committed  High        Łukasz Oleś
6.1.x               Won't Fix      Medium      Unassigned

Bug Description

When an environment is configured to use Neutron with VLAN segmentation, the user may set a very large VLAN range in "Neutron L2 settings", even 4000 VLANs.

With that many VLANs, network verification fails because of timeouts.

To fix this, we need to increase the timeouts in astute and in network_checker. The timeout values should be calculated from the number of VLANs and network interfaces.

Places to change:
https://github.com/stackforge/fuel-astute/blob/master/lib/astute/network.rb#L118
https://github.com/stackforge/fuel-web/blob/master/network_checker/network_checker/net_check/api.py#L77
Currently the 'duration' value is not sent, so the protocol should be changed to include it.
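
The protocol change described above could look roughly like the sketch below. It is only an illustration, not the actual astute or network_checker code: the JSON layout, the function names, and the fallback value are all hypothetical. Only the 'duration' key comes from this report. The receiver falls back to a fixed value when the sender does not include 'duration', so older senders would keep working.

```python
import json

# Hypothetical fallback used when a message carries no 'duration'.
# The real default in network_checker may differ.
FALLBACK_DURATION = 20


def encode_request(interfaces, duration):
    """Serialize a (hypothetical) verification request that includes 'duration'."""
    return json.dumps({"interfaces": interfaces, "duration": duration})


def decode_duration(raw, fallback=FALLBACK_DURATION):
    """Read 'duration' from a request, falling back if the sender omitted it."""
    message = json.loads(raw)
    return message.get("duration", fallback)
```

Sending the value explicitly lets the side that knows the VLAN and interface counts decide how long the other side should listen, instead of both sides relying on a hard-coded constant.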

Łukasz Oleś (loles)
Changed in fuel:
assignee: nobody → Fuel Python Team (fuel-python)
assignee: Fuel Python Team (fuel-python) → Fuel Astute Team (fuel-astute)
summary: - network-verification fail with more than 500 VLANs
+ network-verification fails with more than 500 VLANs
tags: added: scale
Revision history for this message
Dima Shulyak (dshulyak) wrote :

Timeouts can be increased, and that will probably fix the problem for a while, but it is not really a solution.
Previously it worked fine with a large number of VLANs, such as 4000. So I guess there was also a large number of nodes,
and the network checker failed because of the large amount of traffic it had to parse.

We need to look for ways to reduce the amount of effective traffic or to parallelize the parsing.
One more thought I had is to use LLDP for basic network verification, but that requires design work.

Revision history for this message
Łukasz Oleś (loles) wrote :

As I wrote, we should not simply increase the timeout values but calculate them from the number of VLANs and interfaces. For example, the timeout for 100 VLANs should be something like 100 * 0.2, and for 1000 VLANs 1000 * 0.2.
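
A minimal sketch of that scaling, assuming the 0.2 factor is a per-VLAN cost in seconds (the comment does not state the unit) and that the VLANs on every interface are counted. The function name and the input layout are hypothetical and do not come from network_checker.

```python
def verification_timeout(vlans_by_iface, seconds_per_vlan=0.2):
    """Scale the timeout with the total number of VLAN checks.

    vlans_by_iface maps an interface name to the VLAN IDs checked on it.
    seconds_per_vlan is the 0.2 factor from the comment; treating it as
    seconds is an assumption.
    """
    checks = sum(len(vlans) for vlans in vlans_by_iface.values())
    return checks * seconds_per_vlan


# 100 VLANs on one interface, then 1000, matching the examples in the comment.
print(verification_timeout({"eth0": range(100)}))
print(verification_timeout({"eth0": range(1000)}))
```

With this formula, the timeout grows linearly, so a range of 4000 VLANs on two interfaces would get eight times the timeout of 1000 VLANs on one.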

Parallelization is part of another bug, https://bugs.launchpad.net/fuel/+bug/1378500. I have a working solution but cannot test it in the scale lab because of problems with deployment.

Revision history for this message
Vladimir Sharshov (vsharshov) wrote :

Guys, we could not prepare a fix for this bug for 6.0.1. It looks like a serious change that also needs a good QA cycle.

tags: added: module-astute
Revision history for this message
Vladimir Sharshov (vsharshov) wrote :

As far as I know, network verification is planned to be completely rewritten in 7.0. Moving this to 7.0.

Changed in fuel:
milestone: 6.1 → 7.0
no longer affects: fuel/7.0.x
Revision history for this message
Mike Scherbakov (mihgen) wrote :

Lukasz, can you update us on this bug? Did the parallelization done as part of bug #1378500 actually help here?

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-web (master)

Fix proposed to branch: master
Review: https://review.openstack.org/184256

Changed in fuel:
assignee: Fuel Astute Team (fuel-astute) → Łukasz Oleś (loles)
status: Triaged → In Progress
Revision history for this message
Łukasz Oleś (loles) wrote :

Mike, no, it doesn't help here. You can easily reproduce it with 2 nodes: just set 4000 VLANs in the Neutron L2 settings and it will fail.

Now that I know much more about the network checker, I have fixed it. I think it's safe to merge in 6.1, after review of course.

Mike Scherbakov (mihgen)
tags: added: release-notes
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-web (master)

Reviewed: https://review.openstack.org/184256
Committed: https://git.openstack.org/cgit/stackforge/fuel-web/commit/?id=6f8ff912a2e9d5eb1cca2aa5bda9dc09f5e0f8e2
Submitter: Jenkins
Branch: master

commit 6f8ff912a2e9d5eb1cca2aa5bda9dc09f5e0f8e2
Author: Łukasz Oleś <email address hidden>
Date: Tue May 19 14:08:51 2015 +0000

    Make sure that all VLANs will be checked at least once

    Change-Id: I5ea5a0c4d141072cfce058df141c8846cc9a327a
    Closes-bug: #1397140

Changed in fuel:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-docs (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/223478

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on fuel-docs (master)

Change abandoned by Evgeny Konstantinov (<email address hidden>) on branch: master
Review: https://review.openstack.org/223478
Reason: Invalid

tags: added: on-verification
Revision history for this message
Dmitriy Kruglov (dkruglov) wrote :

Verified on MOS 7.0, custom ISO. The issue is not reproduced.

ISO info:
VERSION:
  feature_groups:
    - mirantis
  production: "docker"
  release: "7.0"
  openstack_version: "2015.1.0-7.0"
  api: "1.0"
  build_number: "1260"
  build_id: "2015-10-09_12-02-12"
  nailgun_sha: "edbae54d510edbaa1d379e9523febe5a0e5acd41"
  python-fuelclient_sha: "486bde57cda1badb68f915f66c61b544108606f3"
  fuel-agent_sha: "50e90af6e3d560e9085ff71d2950cfbcca91af67"
  fuel-nailgun-agent_sha: "d7027952870a35db8dc52f185bb1158cdd3d1ebd"
  astute_sha: "6c5b73f93e24cc781c809db9159927655ced5012"
  fuel-library_sha: "713698e88c6e1e4ed9ebad759a21266890898d57"
  fuel-ostf_sha: "2cd967dccd66cfc3a0abd6af9f31e5b4d150a11c"
  fuelmain_sha: "a65d453215edb0284a2e4761be7a156bb5627677"

Changed in fuel:
status: Fix Committed → Fix Released
tags: removed: on-verification
tags: added: wontfix-feature