libatlas not using vector instructions - large performance impact
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Ubuntu on IBM z Systems | Fix Released | High | Canonical Foundations Team |
atlas (Ubuntu) | Fix Released | Undecided | Dimitri John Ledkov |
Bug Description
The libatlas library delivered with Ubuntu 18.04.1 is built for zEC12. No alternative library that exploits the vector instructions is available for z13 and z14. The source package from Ubuntu seems to have the z13 patches applied.
---uname output---
Linux m42lp10 4.15.0-29-generic #31-Ubuntu SMP Tue Jul 17 15:42:24 UTC 2018 s390x s390x s390x GNU/Linux
---Additional Hardware Info---
standard Ubuntu install
Machine Type = z13, z14
---Debugger---
A debugger is not configured
---Steps to Reproduce---
Install libatlas, call any of the standard BLAS routines, and observe that no vector instructions are used.
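One rough way to check this without stepping through a debugger is to disassemble the installed library with `objdump -d` and count vector mnemonics. The sketch below is only a heuristic and rests on two assumptions: that instruction lines in the disassembly are tab-separated as address, raw bytes, mnemonic, operands, and that z/Architecture vector-facility mnemonics begin with `v`. The function name and the sample text are made up for illustration.

```python
def count_vector_insns(disassembly: str) -> int:
    """Count instructions whose mnemonic starts with 'v' (assumed heuristic)."""
    count = 0
    for line in disassembly.splitlines():
        fields = line.split("\t")
        # Instruction lines are assumed to look like
        # "<addr>:\t<raw bytes>\t<mnemonic>\t<operands>"; skip everything else.
        if len(fields) < 3 or not fields[0].strip().endswith(":"):
            continue
        mnemonic = fields[2].strip().split(" ")[0]
        if mnemonic.startswith("v"):
            count += 1
    return count


# Illustrative disassembly snippet, not taken from the real libatlas.
sample = (
    "0000000000001000 <kernel>:\n"
    "    1000:\te7 00 10 00 00 06 \tvl\t%v0,0(%r1)\n"
    "    1006:\tb3 1c 00 02       \tmdbr\t%f0,%f2\n"
    "    100a:\t07 fe             \tbr\t%r14\n"
)

print(count_vector_insns(sample))  # prints 1
```

A count of zero over the whole library would be consistent with the report above; a non-zero count would only show that some vector instructions are present, not that the hot BLAS kernels use them.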
Userspace tool common name: libatlas
The userspace tool has the following bit modes: 64 bit
Userspace deb: atlas package in Ubuntu
Userspace tool obtained from project website: na
tags: | added: architecture-s39064 bugnameltc-173087 severity-high targetmilestone-inin--- |
Changed in ubuntu: | |
assignee: | nobody → Skipper Bug Screeners (skipper-screen-team) |
affects: | ubuntu → linux (Ubuntu) |
tags: |
added: targetmilestone-inin1804 removed: targetmilestone-inin--- |
affects: | linux (Ubuntu) → atlas (Ubuntu) |
tags: | added: universe |
Changed in ubuntu-z-systems: | |
importance: | Undecided → High |
Changed in atlas (Ubuntu): | |
assignee: | Skipper Bug Screeners (skipper-screen-team) → Dimitri John Ledkov (xnox) |
Changed in ubuntu-z-systems: | |
status: | New → Triaged |
Canonical focuses on having a single library build for each architecture in its archive, containing all possible optimizations (LP 1702917, #4).
If a separate z13-optimized library is desired, it would need to be placed elsewhere, for example in a PPA.
I think optimizations should ideally be addressed in the upstream code, for example with (#ifndef) approaches like HW_CAPS or with S390_ALTERNATIVE macros.
Any thoughts and opinions?
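To make the runtime-detection idea concrete, here is a minimal sketch of how a single build could carry both code paths and choose one at startup. It is an illustration of the dispatch pattern only, not how ATLAS is structured: the kernel names are hypothetical, both kernels are plain Python stand-ins, and detection is assumed to read the `features` line of `/proc/cpuinfo`, where s390x kernels are expected to list `vx` when the vector facility is available. In C the same decision would more likely be made from the hardware capability bits returned by `getauxval(AT_HWCAP)`.

```python
def parse_features(cpuinfo_text):
    """Return the set of flags from the 'features' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "features":
            return set(value.split())
    return set()


def dot_generic(x, y):
    """Baseline kernel: stands in for the zEC12-compatible code path."""
    return sum(a * b for a, b in zip(x, y))


def dot_vx(x, y):
    """Placeholder for a vector-facility kernel; same result, hypothetical name."""
    return sum(a * b for a, b in zip(x, y))


def select_dot(features):
    """Pick the kernel once, based on what the running machine reports."""
    return dot_vx if "vx" in features else dot_generic


# Illustrative 'features' lines, not captured from real machines.
zec12_like = "features\t: esan3 zarch stfle msa ldisp eimm dfp edat etf3eh highgprs te\n"
z13_like = "features\t: esan3 zarch stfle msa ldisp eimm dfp edat etf3eh highgprs te vx\n"

print(select_dot(parse_features(zec12_like)).__name__)  # dot_generic
print(select_dot(parse_features(z13_like)).__name__)    # dot_vx
```

With this shape the archive would still ship one library per architecture, and machines without the vector facility would keep running the baseline path.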