CPU/memory affinity info missing

Bug #1251209 reported by Dave Love
This bug affects 1 person
Affects: procenv
Status: Fix Released
Importance: Undecided
Assigned to: James Hunt

Bug Description

There's currently no report of CPU/memory affinity I can see.
It's probably useful at least to report the cpus_allowed_list and mems_allowed_list from
proc's status file on Linux, but it would be better to use the hwloc library to report it portably
in terms of logical topology, if you don't mind the extra dependency. I could do an
hwloc version.
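
For illustration, something as simple as the following would pull those two fields out on Linux (a minimal sketch, not a proposed implementation; field names as in proc(5), error handling kept to a bare minimum):

  #include <stdio.h>
  #include <string.h>

  int main(void)
  {
      FILE *f = fopen("/proc/self/status", "r");
      char  line[256];

      if (!f)
          return 1;

      /* print only the affinity-related fields */
      while (fgets(line, sizeof(line), f)) {
          if (!strncmp(line, "Cpus_allowed_list:", 18) ||
              !strncmp(line, "Mems_allowed_list:", 18))
              fputs(line, stdout);
      }

      fclose(f);
      return 0;
  }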

Revision history for this message
James Hunt (jamesodhunt) wrote :

Hi Dave,

> There's currently no report of CPU/memory affinity I can see.
You're right!

> It's probably useful at least to report the cpus_allowed_list and
> mems_allowed_list from proc's status file on Linux, but it would be
> better to use the hwloc library to report it portably
> in terms of logical topology, if you don't mind the extra dependency. I
> could do an hwloc version.
I've now got some code to display cpu affinity (using sched_getaffinity()).
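
(For reference, the general shape of such a call - a sketch using the fixed-size cpu_set_t, which holds 1024 CPUs in glibc, and not necessarily what procenv does:)

  #define _GNU_SOURCE
  #include <sched.h>
  #include <stdio.h>

  int main(void)
  {
      cpu_set_t set;
      int cpu;

      CPU_ZERO(&set);

      /* pid 0 = the calling process */
      if (sched_getaffinity(0, sizeof(set), &set) < 0) {
          perror("sched_getaffinity");
          return 1;
      }

      printf("affinity list:");
      for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
          if (CPU_ISSET(cpu, &set))
              printf(" %d", cpu);
      printf("\n");

      return 0;
  }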

NUMA looks slightly more awkward for 2 reasons:

1) Although I've got some code that uses get_mempolicy(), that syscall requires libnuma (!) seemingly in my case only to give
    the correct platform-specific syscall number for get_mempolicy(). This seems to be somewhat overkill but I'll take a closer
    look tomorrow.

2) Testing: get_mempolicy() returns ENOSYS on non-NUMA systems (like mine :) Do you have access to NUMA hardware?

Would the above suffice do you think or is there a compelling reason to use libhwloc? As you've deduced, I'm trying to minimise dependencies on additional libraries. That said, I don't want to reinvent too many wheels and have been wondering about libcpuid for displaying proper virtualisation details (although this doesn't seem to be widely packaged yet).
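
(For what it's worth, a hypothetical sketch of calling get_mempolicy(2) directly via syscall(2), sidestepping libnuma. It only compiles where <sys/syscall.h> defines SYS_get_mempolicy - which is exactly the syscall-number problem described in point 1 - and it prints the raw mode number rather than decoding the MPOL_* constants:)

  #include <stdio.h>
  #include <unistd.h>
  #include <sys/syscall.h>

  int main(void)
  {
  #ifdef SYS_get_mempolicy
      int mode = -1;

      /* NULL nodemask/addr, flags 0: query the calling thread's policy */
      if (syscall(SYS_get_mempolicy, &mode, NULL, 0, NULL, 0) < 0) {
          perror("get_mempolicy");   /* ENOSYS on non-NUMA kernels */
          return 1;
      }
      printf("memory policy mode: %d\n", mode);
      return 0;
  #else
      fputs("SYS_get_mempolicy not available on this platform\n", stderr);
      return 1;
  #endif
  }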

Revision history for this message
Dave Love (fx-gnu) wrote :

James Hunt <email address hidden> writes:

> I've now got some code to display cpu affinity (using sched_getaffinity()).
>
> NUMA looks slightly more awkward for 2 reasons:
>
> 1) Although I've got some code that uses get_mempolicy(), that syscall
> requires libnuma (!)

I didn't realize that, but I don't use it directly.

> seemingly in my case only to give
> the correct platform-specific syscall number for get_mempolicy(). This seems to be somewhat overkill but I'll take a closer
> look tomorrow.
>
> 2) Testing: get_mempolicy() returns ENOSYS on non-NUMA systems (like
> mine :) Do you have access to NUMA hardware?

Around 350 systems of various types -- I do HPC and recommend procenv
for debugging possible environmental problems with batch jobs. The CPU
binding(/affinity) is actually a lot more important for my purposes than
memory affinity.

I can certainly test if it helps.

> Would the above suffice do you think or is there a compelling reason to
> use libhwloc?

It's possibly nice to have the logical values, but probably the only
compelling reason is if you want to support other kernels (even kfreebsd
in the Debian world?). Anything that basically says what cores you're
bound to would be useful, so that should suffice.

Revision history for this message
James Hunt (jamesodhunt) wrote :

Good news - I've now got access to some NUMA hardware :)

Changed in procenv:
assignee: nobody → James Hunt (jamesodhunt)
status: New → In Progress
Revision history for this message
James Hunt (jamesodhunt) wrote :

Hi Dave,

The latest lp:procenv will now dump cpu affinity and NUMA details (I've ended up adding a dependency on libnuma as it's rather too awkward to not use this library :)

Please could you try a build on one of your NUMA systems and see if the output looks sane?
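
(For anyone following along, this is roughly the kind of query the libnuma v2 API allows - a sketch using <numa.h> and -lnuma, not procenv's actual code:)

  #include <stdio.h>
  #include <numa.h>

  int main(void)
  {
      struct bitmask *allowed;
      int node;

      if (numa_available() < 0) {
          fputs("NUMA not available\n", stderr);
          return 1;
      }

      printf("possible nodes  : %d\n", numa_num_possible_nodes());
      printf("configured nodes: %d\n", numa_num_configured_nodes());

      /* nodes this process may allocate memory from */
      allowed = numa_get_mems_allowed();
      printf("allowed nodes   :");
      for (node = 0; node < numa_num_possible_nodes(); node++)
          if (numa_bitmask_isbitset(allowed, node))
              printf(" %d", node);
      printf("\n");

      numa_bitmask_free(allowed);
      return 0;
  }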

Changed in procenv:
status: In Progress → Fix Committed
Revision history for this message
Dave Love (fx-gnu) wrote :

Thanks. I'll try to try it tomorrow, but it might be Monday.

Revision history for this message
Dave Love (fx-gnu) wrote :

James Hunt <email address hidden> writes:

> Hi Dave,
>
> The latest lp:procenv will now dump cpu affinity and NUMA details (I've
> ended up adding a dependency on libnuma as it's rather too awkward to
> not use this library :)
>
> Please could you try a build on one of your NUMA systems and see if the
> output looks sane?

Initially, I had to move it to a system with a more recent autoconf to
bootstrap but got

  configure.ac:59: option `serial-tests' not recognized

trying to reconf using autoconf 2.69/automake 1.11.6. I removed the
option and carried on.

Building it under RHEL6 and running on our cluster's Sandy Bridge login
node I get sensible-looking output with -C, but if I run it on a compute
node I get:

  cpu:
    number: 1 of 16
    affinity list: (null)

The only significant difference between them I think is that the login
node has hyperthreading on and the compute node which fails to fill in
the list doesn't. A Westmere node without hyperthreading looks OK. I
don't have time to debug the failed case currently.

The other AMD systems I have are mostly under RHEL5, and procenv won't
build on that. I'll send a fix for 0.27 separately, but the bzr version
has the additional failures:

  gcc -DHAVE_CONFIG_H -I. -I.. -pedantic -std=gnu99 -Wall -Wunused -fstack-protector -Wformat -Werror=format-security -g -O2 -MT procenv-procenv.o -MD -MP -MF .deps/procenv-procenv.Tpo -c -o procenv-procenv.o `test -f 'procenv.c' || echo './'`procenv.c
  procenv.c: In function ‘show_cpu_affinities’:
  procenv.c:1913: warning: implicit declaration of function ‘CPU_ALLOC’
  procenv.c:1913: warning: assignment makes pointer from integer without a cast
  procenv.c:1916: warning: implicit declaration of function ‘CPU_ALLOC_SIZE’
  procenv.c:1917: warning: implicit declaration of function ‘CPU_ZERO_S’
  procenv.c:1986: warning: implicit declaration of function ‘CPU_FREE’
  procenv.c: In function ‘show_numa_memory’:
  procenv.c:3075: warning: implicit declaration of function ‘numa_num_possible_nodes’
  procenv.c:3076: warning: implicit declaration of function ‘numa_num_configured_nodes’
  procenv.c:3078: warning: implicit declaration of function ‘numa_get_mems_allowed’
  procenv.c:3078: warning: assignment makes pointer from integer without a cast
  procenv.c:3082: error: dereferencing pointer to incomplete type
  procenv.c:3083: warning: implicit declaration of function ‘numa_bitmask_isbitset’
  procenv.c: In function ‘show_linux_cpu’:
  procenv.c:5600: warning: implicit declaration of function ‘sched_getcpu’

The hwloc source might show what to do more portably; it works on RH5,
at least.

HTH.

Revision history for this message
James Hunt (jamesodhunt) wrote :

Thanks for testing. Yes, serial-tests is a pita: https://lists.gnu.org/archive/html/automake/2013-01/msg00060.html

I'll take a closer look at all this early next week.

Revision history for this message
James Hunt (jamesodhunt) wrote :

I've now resolved the serial-tests issue (I think), if you want to re-test lp:procenv?

Also, would it be possible to see examples of /proc/self/status on the RHEL 5+6 NUMA systems? CentOS 5.10 at least seems to have the older v1 libnuma API (which explains those numa function errors above), but I haven't looked to see whether it was simply limited to what that kernel was able to provide in terms of queryables.

The CPU_ALLOC errors I think come from the fact that CentOS 5.10 doesn't provide a new enough libc to deal with dynamically sized cpu sets.
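
(For context, this is the dynamic CPU-set API (glibc 2.7 and later) that those warnings refer to - a sketch only; older glibc such as CentOS 5's offers just the fixed-size cpu_set_t macros:)

  #define _GNU_SOURCE
  #include <sched.h>
  #include <stdio.h>

  int main(void)
  {
      int        num_cpus = 2048;   /* arbitrary upper bound for the sketch */
      cpu_set_t *set  = CPU_ALLOC(num_cpus);
      size_t     size = CPU_ALLOC_SIZE(num_cpus);
      int        cpu;

      if (!set)
          return 1;

      CPU_ZERO_S(size, set);
      if (sched_getaffinity(0, size, set) == 0) {
          for (cpu = 0; cpu < num_cpus; cpu++)
              if (CPU_ISSET_S(cpu, size, set))
                  printf("%d ", cpu);
          printf("\n");
      }

      CPU_FREE(set);
      return 0;
  }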

Revision history for this message
Dave Love (fx-gnu) wrote :
  • status (808 bytes, application/octet-stream)
  • status (920 bytes, application/octet-stream)

James Hunt <email address hidden> writes:

> I've now resolved the serials-tests issue (I think) if you want to re-
> test lp:procenv?
>
> Also, would it be possible to see examples of /proc/self/status on the
> RHEL 5+6 NUMA systems? CentOS 5.10 at least seems to have the older
> v1 libnuma API (which explains those numa function errors above), but I
> haven't looked to see whether it was simply limited to what that kernel
> was able to provide in terms of queryables.
>
> The CPU_ALLOC errors I think come from the fact that CentOS 5.10 doesn't
> provide a new enough libc to deal with dynamically sized cpu sets.

I think 5.10 has the necessary info. Hwloc works fine with it for
everything I've done with topology and affinity. Here are status file
examples.

Revision history for this message
James Hunt (jamesodhunt) wrote :

I'm rather confused by the null affinity list you're seeing in #6 as I explicitly handle that scenario. Did you definitely take the latest lp:procenv? I'm also confused by why libnuma is giving you different results to hwloc since they both use the same system call as far as I can see. TBH, I'd rather not use either library but libnuma is at least simple - I have no idea at this point how to use hwloc to query numa details. I get the feeling that in theory it would be rather overkill to link to hwloc just to obtain numa stats?

Revision history for this message
Dave Love (fx-gnu) wrote :

You wrote:

> I'm rather confused by the null affinity list you're seeing in #6 as I
> explicitly handle that scenario. Did you definitely take the latest
> lp:procenv?

Yes, at the time. Anyhow I've now updated and debugged it.

> I'm also confused by why libnuma is giving you different
> results to hwloc since they both use the same system call as far as I
> can see. TBH, I'd rather not use either library but libnuma is at least
> simple - I have no idea at this point how to use hwloc to query numa
> details.

I'm not sure what you mean by numa details. I can't remember the API,
but it's straightforward to get the binding of a process, should you
need it.

> I get the feeling that in theory it would be rather overkill to
> link to hwloc just to obtain numa stats?

The advantage of hwloc is portability, I guess, and treatment of the
topology logically.

Anyhow. With the current code on the 16-core system with hyperthreads
off that failed, the getaffinity call returns EINVAL and that isn't
tested correctly.

hwloc says that the initial size may not be big enough, and iterates
until the call works. This works for me. I don't know whether the
GNU_BSD/HURD case should be treated the same.
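
(A sketch of that retry approach - start small and double the set size while sched_getaffinity() fails with EINVAL; not hwloc's literal code, just the shape of it:)

  #define _GNU_SOURCE
  #include <errno.h>
  #include <sched.h>
  #include <stdio.h>

  int main(void)
  {
      int        num_cpus = 64;
      cpu_set_t *set;
      size_t     size;

      for (;;) {
          set = CPU_ALLOC(num_cpus);
          if (!set)
              return 1;
          size = CPU_ALLOC_SIZE(num_cpus);
          CPU_ZERO_S(size, set);

          if (sched_getaffinity(0, size, set) == 0)
              break;                 /* success */

          CPU_FREE(set);
          if (errno != EINVAL)
              return 1;              /* a genuine failure */
          num_cpus *= 2;             /* set too small: grow and retry */
      }

      printf("%d of %d CPUs in affinity mask\n",
             CPU_COUNT_S(size, set), num_cpus);
      CPU_FREE(set);
      return 0;
  }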

Revision history for this message
James Hunt (jamesodhunt) wrote :

Hi Dave,

OK - I think you've convinced me that hwloc is the way to go here. If you're happy to work on a branch, great; otherwise I may need some guidance on determining the correct calls, as frankly I'm finding the API to hwloc somewhat opaque :)

Revision history for this message
Dave Love (fx-gnu) wrote :

James Hunt <email address hidden> writes:

> Hi Dave,
>
> OK - I think you've convinced me that hwloc is the way to go here. If
> you're happy to work on a branch, great; otherwise I may need some
> guidance on determining the correct calls, as frankly I'm finding the API
> to hwloc somewhat opaque :)

I didn't really intend to do the convincing. It's maybe relevant if you
want to support the kernels that hwloc does, or present bindings in
"logical" format, but I'm not sure it's worth it otherwise. I'm happy
with the patch I made, which solves the problem.

The hwloc thing that's the most relevant example is hwloc-ps, which
shows the bindings in use on the system. (I think there ought to be a
"binding of process P" option, rather than mapping over all processes,
but there currently isn't.)
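
(For completeness, a sketch of what "binding of process P" looks like through the hwloc API itself, using hwloc_get_proc_cpubind(); the PID handling below is just an example:)

  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>
  #include <hwloc.h>

  int main(int argc, char *argv[])
  {
      hwloc_topology_t topology;
      hwloc_bitmap_t   set;
      char            *str = NULL;
      pid_t            pid = argc > 1 ? (pid_t)atoi(argv[1]) : getpid();

      hwloc_topology_init(&topology);
      hwloc_topology_load(topology);

      set = hwloc_bitmap_alloc();

      /* current CPU binding of the given process */
      if (hwloc_get_proc_cpubind(topology, pid, set, 0) == 0) {
          hwloc_bitmap_list_asprintf(&str, set);
          printf("process %ld bound to cpus: %s\n", (long)pid, str);
          free(str);
      }

      hwloc_bitmap_free(set);
      hwloc_topology_destroy(topology);
      return 0;
  }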

Revision history for this message
James Hunt (jamesodhunt) wrote :

Hi Dave - sorry: I had completely missed the fact you'd attached that patch! Now applied to lp:procenv if you want to re-test.

fyi, I plan to tweak the tests a little and put out a release in the next few days.

Thanks again!

James Hunt (jamesodhunt)
Changed in procenv:
status: Fix Committed → Fix Released