sysfs fails to start after setting cpu affinity >32 cpus

Bug #1461306 reported by Andrew Woodward
This bug affects 2 people
Affects             Status         Importance  Assigned to                Milestone
Fuel for OpenStack  Fix Committed  Critical    Eric Jacobs
6.0.x               Invalid        Critical    Fuel Library (Deprecated)

Bug Description

sysfs fails to start after the CPU affinity is set when $::processorcount is greater than 32.

https://review.openstack.org/#/c/184312/ added sysfs, but on some hardware the CPU affinity mask is set to 'ffffffffff', which causes sysfs to fail to start, and the deployment fails.

The value for cpu_affinity' /var/log/puppet.log
2015-06-02 20:26:54 +0000 /Stage[main]/Main/Notify[The value for cpu_affinity_hex is ffffffffff] (info): Starting to evaluate the resource
2015-06-02 20:26:54 +0000 Puppet (notice): The value for cpu_affinity_hex is ffffffffff
2015-06-02 20:26:54 +0000 /Stage[main]/Main/Notify[The value for cpu_affinity_hex is ffffffffff]/message (notice): defined 'message' as 'The value for cpu_affinity_hex is ffffffffff'
2015-06-02 20:26:54 +0000 /Stage[main]/Main/Notify[The value for cpu_affinity_hex is ffffffffff] (debug): The container Class[Main] will propagate my refresh event
2015-06-02 20:26:54 +0000 /Stage[main]/Main/Notify[The value for cpu_affinity_hex is ffffffffff] (info): Evaluated in 0.00 seconds
2015-06-02 20:27:15 +0000 /Stage[main]/Main/Notify[The value for cpu_affinity_hex is ffffffffff] (info): Starting to evaluate the resource
2015-06-02 20:27:15 +0000 Puppet (notice): The value for cpu_affinity_hex is ffffffffff

Andrew Woodward (xarses)
Changed in fuel:
status: Confirmed → In Progress
assignee: nobody → Eric Jacobs (ejacobs)
description: updated
Dmitry Ilyin (idv1985) wrote :

ffffffffff means 40 cpus! Are there really 40 cpus?
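
The arithmetic behind that reading of the mask can be checked with a short Python sketch. It assumes the value is a plain hexadecimal bitmask with one bit per CPU; the function names are hypothetical and are not taken from fuel-library.

```python
def cpus_in_mask(hex_mask: str) -> int:
    """Count the CPUs selected by a hexadecimal affinity bitmask."""
    return bin(int(hex_mask, 16)).count("1")


def full_mask(cpu_count: int) -> str:
    """Build a hex mask with one bit set for each of the first cpu_count CPUs."""
    return format((1 << cpu_count) - 1, "x")


print(cpus_in_mask("ffffffffff"))  # 10 hex digits * 4 bits = 40
print(full_mask(40))               # ffffffffff
```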

Dmitry Ilyin (idv1985) wrote :

Perhaps it's hyper-threading or something. Maybe it would be better to use physicalprocessorcount instead of processorcount.

Dmitry Ilyin (idv1985) wrote :

And we should also exclude net interfaces, I guess.

Dmitry Ilyin (idv1985) wrote :

I mean the vnet interfaces created by the compute node's VMs.

Eric Jacobs (ejacobs) wrote :

Yes, there are really 40 cpus in /proc/cpuinfo. I'm not sure if it's hyperthreading or physical. I've seen boxes with 64 physical cores before. To get around this, I just discard all cpus >32 (for now).
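
The workaround described here, discarding any CPUs beyond the 32nd, might look roughly like this in Python. This is only an illustration under stated assumptions: the actual change lives in fuel-library's Puppet code, which is not shown in this thread, and `capped_affinity_hex` and `MAX_CPUS` are hypothetical names.

```python
MAX_CPUS = 32  # cap described in this thread; the real manifest may apply it differently


def capped_affinity_hex(processor_count: int, max_cpus: int = MAX_CPUS) -> str:
    """Return a hex affinity mask covering at most max_cpus CPUs."""
    usable = min(processor_count, max_cpus)
    return format((1 << usable) - 1, "x")


print(capped_affinity_hex(40))  # ffffffff (32 bits instead of 40)
print(capped_affinity_hex(8))   # ff
```

With a cap like this, a 40-CPU host yields an 8-digit mask rather than the 10-digit 'ffffffffff' seen in the puppet log above.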

Andrew Woodward (xarses)
summary: - sysfs fails to start after setting cpu afinity
+ sysfs fails to start after setting cpu afinity >32 cpus
description: updated
Alex Schultz (alex-schultz) wrote :
Changed in fuel:
assignee: Eric Jacobs (ejacobs) → Sergey Vasilenko (xenolog)
Changed in fuel:
assignee: Sergey Vasilenko (xenolog) → Eric Jacobs (ejacobs)
Bogdan Dobrelya (bogdando) wrote :

6.0.x looks like a good candidate for the backport, see https://bugs.launchpad.net/fuel/+bug/1401925

Bogdan Dobrelya (bogdando) wrote :

Oh, my mistake: this is won't fix for 6.0.x, as the related bug is actually https://bugs.launchpad.net/fuel/+bug/1456587 , which is also won't fix for 6.0.x.

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/187801
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=87f429928fe6e91952698a34fa356bbb6918eee0
Submitter: Jenkins
Branch: master

commit 87f429928fe6e91952698a34fa356bbb6918eee0
Author: Eric Jacobs <email address hidden>
Date: Tue Jun 2 18:40:36 2015 -0400

    Don't use more than 32 cpus for network queue affinities

    IF we have more than 32 cpus, discard any cpus > 32 for network queue
    cpu affinities.

    Closes-Bug: 1461306
    Change-Id: I6aff2af39f25a32c710d5cdf9708dad6b2a201c1

Changed in fuel:
status: In Progress → Fix Committed
Andrew Woodward (xarses) wrote :

sysfs wasn't added to 6.0, so it's invalid for it, not won't fix.
