OpenSSL CPU detection for AMD Ryzen CPUs

Bug #1674399 reported by Eric Desrochers on 2017-03-20
This bug affects 2 people
Affects            Status                     Importance  Assigned to      Milestone
openssl (Ubuntu)   Status tracked in Artful
  Xenial                                      Medium      Eric Desrochers
  Yakkety                                     Medium      Eric Desrochers
  Zesty                                       Medium      Eric Desrochers
  Artful                                      Medium      Eric Desrochers

Bug Description

[Impact]

* Context:

AMD added support for the SHA Extensions[1] (CPU flag: sha_ni[2]) to its processors starting with the Ryzen[3] CPU. Note that Ryzen CPUs are 64-bit only (confirmed with an AMD representative). On Ryzen, the current OpenSSL version still calls the SSSE3 SHA routine: a number of extensions are effectively masked on Ryzen, so SHA hashing shows no improvement.

[1] /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 23
model : 1
model name : AMD Ryzen 5 1600 Six-Core Processor
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 mwaitx hw_pstate vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 clzero arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold

[2] - sha_ni: SHA1/SHA256 Instruction Extensions

[3] - https://en.wikipedia.org/wiki/Ryzen
...
All models support: x87, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AES, CLMUL, AVX, AVX2, FMA, CVT16/F16C, ABM, BMI1, BMI2, SHA.[5]
...

* Program that performs the CPUID check:

Reference :
https://software.intel.com/en-us/articles/intel-sha-extensions

... Availability of the Intel® SHA Extensions on a particular processor can be determined by checking the SHA CPUID bit in CPUID.(EAX=07H, ECX=0):EBX.SHA [bit 29]. The following C function, using inline assembly, performs the CPUID check:

--
int CheckForIntelShaExtensions() {
   int a, b, c, d;

   // Look for CPUID.7.0.EBX[29]
   // EAX = 7, ECX = 0
   a = 7;
   c = 0;

   asm volatile ("cpuid"
        :"=a"(a), "=b"(b), "=c"(c), "=d"(d)
        :"a"(a), "c"(c)
       );

   // Intel® SHA Extensions feature bit is EBX[29]
   return ((b >> 29) & 1);
}
--

On a CPU with sha_ni the function returns 1; otherwise it returns 0.
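A slightly more defensive variant of the same check, offered as a sketch rather than as what OpenSSL itself does. It assumes a GCC or Clang toolchain whose `<cpuid.h>` provides `__get_cpuid_count`, which returns 0 when the requested leaf is above the maximum supported leaf, so EBX is never read for a leaf the CPU does not implement. The function name `cpu_has_sha_ext` is hypothetical.

```c
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Hypothetical helper: returns 1 if CPUID.(EAX=07H, ECX=0):EBX[29] is set,
 * 0 otherwise (including non-x86 builds and CPUs without leaf 7). */
int cpu_has_sha_ext(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

    /* __get_cpuid_count() returns 0 if leaf 7 is above the maximum
     * basic leaf reported by the processor. */
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return 0;

    /* SHA Extensions feature bit is EBX[29]. */
    return (int)((ebx >> 29) & 1u);
#else
    return 0;
#endif
}
```

Printing the return value from a small `main` should give 1 on a CPU that lists sha_ni in /proc/cpuinfo and 0 elsewhere.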

[Test Case]

 * Reproducible with the Xenial/Zesty/Artful releases.

 * Generate a checksum of a big file (e.g. a 5GB file) with openssl
 $ time /usr/bin/openssl dgst -sha256 /var/tmp/5Gfile
SHA256(/var/tmp/5Gfile)= 8d448d81521cbc1bfdc04dd199d448bd3c49374221007bd0846d8d39a70dd4f8

real 0m12.835s
user 0m12.344s
sys 0m0.484s

* Openssl speed
$ openssl speed sha1
Doing sha1 for 3s on 16 size blocks: 9969152 sha1's in 3.00s
Doing sha1 for 3s on 64 size blocks: 8019164 sha1's in 3.00s
Doing sha1 for 3s on 256 size blocks: 5254219 sha1's in 2.99s
Doing sha1 for 3s on 1024 size blocks: 2217067 sha1's in 3.00s
Doing sha1 for 3s on 8192 size blocks: 347842 sha1's in 3.00s
OpenSSL 1.0.2g 1 Mar 2016
built on: reproducible build, date unspecified
options:bn(64,64) rc4(8x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: gcc -I. -I.. -I../include -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -Wa,--noexecstack -m64 -DL_ENDIAN -O3 -Wall -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
sha1            53168.81k   171075.50k   449859.55k   756758.87k   949840.55k

Performance is clearly better with the patch, which takes advantage of the SHA extensions. (See the Regression Potential section for results with the patch.)
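For readers comparing runs, the summary row of `openssl speed` can be recomputed from the raw counts in the "Doing sha1 ..." lines: operations times block size, divided by elapsed seconds, expressed in thousands of bytes per second. The helper below is only an illustration of that arithmetic; the name `speed_kbytes_per_sec` is hypothetical and not part of OpenSSL.

```c
/* Throughput in the unit printed by `openssl speed`:
 * thousands of bytes processed per second.
 *   count      - number of hash operations completed
 *   block_size - bytes hashed per operation
 *   seconds    - measured elapsed time
 */
double speed_kbytes_per_sec(unsigned long count, unsigned int block_size,
                            double seconds)
{
    return (double)count * (double)block_size / seconds / 1000.0;
}
```

For example, 9969152 operations on 16-byte blocks in 3.00s gives about 53168.81k, which matches the first column of the unpatched sha1 row above.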

[Regression Potential]

 * Note : IRC discussion with infinity :
https://bugs.launchpad.net/ubuntu/xenial/+source/openssl/+bug/1674399/comments/8

 * Note from irc discussion with apw and rbasak :
https://bugs.launchpad.net/ubuntu/+source/openssl/+bug/1674399/comments/2

 * It basically allows openssl to take advantage of the SHA extensions (mostly performance-wise) now that new AMD CPUs are starting to have the capability.

* The code checks the CPUID bit to determine whether the SHA instructions are available or not.

* The upstream maintainer's comment confirms that he tested successfully on Intel with and without the SHA extensions.

Reference: https://github.com/openssl/openssl/issues/2848
"I don't have access to Ryzen system, so I didn't test it explicitly on Ryzen. Reporter did confirm it tough. Myself I tested on Intel processors, yes, with/without."

* LP reporter comment :
I, slashd, have tested on a Ryzen system, on a non-Ryzen AMD system, and on an Intel CPU without SHA extensions. The patch gives a significant performance increase on Ryzen thanks to the SHA extensions:
(Note that performance remains the same on CPUs without the SHA extensions (AMD/Intel), as expected, since they cannot benefit from them.)

[Tested on a Ryzen CPU]
# Generated a checksum of a big file (e.g. 5GB file) with openssl
 $ time /usr/bin/openssl dgst -sha256 /var/tmp/5Gfile
SHA256(/var/tmp/5Gfile)= 8d448d81521cbc1bfdc04dd199d448bd3c49374221007bd0846d8d39a70dd4f8

real 0m3.471s
user 0m2.956s
sys 0m0.516s

# Openssl speed
$ openssl speed sha1
Doing sha1 for 3s on 16 size blocks: 12081890 sha1's in 3.00s
Doing sha1 for 3s on 64 size blocks: 11563950 sha1's in 3.00s
Doing sha1 for 3s on 256 size blocks: 8375101 sha1's in 3.00s
Doing sha1 for 3s on 1024 size blocks: 3987643 sha1's in 3.00s
Doing sha1 for 3s on 8192 size blocks: 678036 sha1's in 3.00s
OpenSSL 1.0.2g 1 Mar 2016
built on: reproducible build, date unspecified
options:bn(64,64) rc4(8x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: gcc -I. -I.. -I../include -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -Wa,--noexecstack -m64 -DL_ENDIAN -O3 -Wall -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
sha1            64436.75k   246697.60k   714675.29k  1361115.48k  1851490.30k

[Other Info]

* Debian Bug :
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=861145

* Upstream issue :
https://github.com/openssl/openssl/issues/2848

* Upstream Repository :
https://github.com/openssl/openssl.git

* Upstream Commits :
1aed5e1 crypto/x86*cpuid.pl: move extended feature detection.
## This fix moves extended feature detection past basic feature detection where it belongs.

f8418d8 crypto/x86_64cpuid.pl: move extended feature detection upwards.
## This commit for x86_64cpuid.pl addressed the problem, but messed up processor vendor detection.

Eric Desrochers (slashd) on 2017-03-20
Changed in openssl (Ubuntu):
importance: Undecided → Low
description: updated
tags: added: sts
Changed in openssl (Ubuntu):
assignee: nobody → Eric Desrochers (slashd)
importance: Low → Medium
milestone: none → ubuntu-16.04.2
status: New → Triaged
Eric Desrochers (slashd) wrote :

Here's some context after a conversation about this bug on channel : #ubuntu-release

...
[10:01:50] <slashd> hi SRU, I'm currently working on a case (no LP bug yet).... about an OpenSSL bug on new AMD CPU (Ryzen) released last Feb ... where the SHA Extension routine is not called on AMD Ryzen cores. My question is since this look like H/W enablement ... do you think this could be eligible for SRU in stable release such like Xenial ? or this will only be accepted for devel release ? This is a new CPU but Xenial is there for a couple of years still so maybe future Xenial user running Ryzen CPU may benefit on this eventually...
[10:03:20] <apw> slashd, for me some new functionality like that is ok as long as it is very self-contained so easy to review and confirm is only used on the new h/w
[10:03:52] <apw> one of our main goals is to avoid regressions
[10:04:41] <slashd> apw, make sense, thanks for your input
[10:12:24] <rbasak> The SRU policy does explicitly permit hardware enablement in an LTS IIRC, though I'd expect ~ubuntu-sru to be involved in mitigating risk and making the final risk decision, FWIW.
[10:16:11] <apw> rbasak, right, it would have to be carefully considered once we can see what the diff actually is
[10:16:34] <slashd> rbasak, apw ack, will communicate the info with the proper group
[10:16:50] <apw> with a much greater level of testing and scrutiny than a regular fix only sru
[10:17:12] <slashd> apw, rbasak, FYI I have requested the new CPU from our partner to test in deep
...

Changed in openssl (Ubuntu):
milestone: ubuntu-16.04.2 → none
Eric Desrochers (slashd) on 2017-03-27
Changed in openssl (Ubuntu):
assignee: Eric Desrochers (slashd) → nobody
Eric Desrochers (slashd) on 2017-04-21
description: updated
Eric Desrochers (slashd) on 2017-04-21
Changed in openssl (Ubuntu Xenial):
status: New → Triaged
Changed in openssl (Ubuntu Zesty):
status: New → Triaged
Eric Desrochers (slashd) on 2017-04-23
description: updated
Eric Desrochers (slashd) on 2017-04-23
description: updated
Eric Desrochers (slashd) on 2017-04-24
description: updated
Eric Desrochers (slashd) on 2017-04-24
description: updated
Eric Desrochers (slashd) on 2017-04-24
description: updated
Eric Desrochers (slashd) on 2017-04-24
Changed in openssl (Ubuntu Xenial):
assignee: nobody → Eric Desrochers (slashd)
Changed in openssl (Ubuntu Zesty):
assignee: nobody → Eric Desrochers (slashd)
Changed in openssl (Ubuntu Artful):
assignee: nobody → Eric Desrochers (slashd)
status: Triaged → In Progress
Eric Desrochers (slashd) wrote :

So far my tests reveal the following:

# Note that the tests below were run on a Ryzen system #

[Without patch]
 * Generated a checksum of a big file (e.g. 5GB file) with openssl
 $ time /usr/bin/openssl dgst -sha256 /var/tmp/5Gfile
SHA256(/var/tmp/5Gfile)= 8d448d81521cbc1bfdc04dd199d448bd3c49374221007bd0846d8d39a70dd4f8

real 0m12.835s
user 0m12.344s
sys 0m0.484s

[With patch]
 * Generated a checksum of a big file (e.g. 5GB file) with openssl
 $ time /usr/bin/openssl dgst -sha256 /var/tmp/5Gfile
SHA256(/var/tmp/5Gfile)= 8d448d81521cbc1bfdc04dd199d448bd3c49374221007bd0846d8d39a70dd4f8

real 0m3.471s
user 0m2.956s
sys 0m0.516s
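To put a number on the difference, the two `real` values from `time` can be turned into a speedup ratio. The sketch below is only an illustration; the helper names `parse_time_seconds` and `speedup` are hypothetical, and the parser assumes the `NmS.SSSs` format shown in the output above.

```c
#include <stdio.h>

/* Parse a shell `time` value such as "0m12.835s" into seconds.
 * Returns a negative value if the string does not match that format. */
double parse_time_seconds(const char *s)
{
    int minutes = 0;
    double seconds = 0.0;

    if (sscanf(s, "%dm%lfs", &minutes, &seconds) != 2)
        return -1.0;
    return minutes * 60.0 + seconds;
}

/* Ratio of unpatched to patched elapsed time (higher means faster). */
double speedup(const char *before, const char *after)
{
    return parse_time_seconds(before) / parse_time_seconds(after);
}
```

With the figures above, `speedup("0m12.835s", "0m3.471s")` comes out at roughly 3.7, i.e. the patched package hashes the same 5GB file in a bit more than a quarter of the time.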

description: updated
Eric Desrochers (slashd) wrote :

Another test "Openssl speed"

[Without patch]
 $ openssl speed sha1
Doing sha1 for 3s on 16 size blocks: 9969152 sha1's in 3.00s
Doing sha1 for 3s on 64 size blocks: 8019164 sha1's in 3.00s
Doing sha1 for 3s on 256 size blocks: 5254219 sha1's in 2.99s
Doing sha1 for 3s on 1024 size blocks: 2217067 sha1's in 3.00s
Doing sha1 for 3s on 8192 size blocks: 347842 sha1's in 3.00s
OpenSSL 1.0.2g 1 Mar 2016
built on: reproducible build, date unspecified
options:bn(64,64) rc4(8x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: gcc -I. -I.. -I../include -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -Wa,--noexecstack -m64 -DL_ENDIAN -O3 -Wall -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
sha1            53168.81k   171075.50k   449859.55k   756758.87k   949840.55k

[With patch]
 $ openssl speed sha1
Doing sha1 for 3s on 16 size blocks: 12081890 sha1's in 3.00s
Doing sha1 for 3s on 64 size blocks: 11563950 sha1's in 3.00s
Doing sha1 for 3s on 256 size blocks: 8375101 sha1's in 3.00s
Doing sha1 for 3s on 1024 size blocks: 3987643 sha1's in 3.00s
Doing sha1 for 3s on 8192 size blocks: 678036 sha1's in 3.00s
OpenSSL 1.0.2g 1 Mar 2016
built on: reproducible build, date unspecified
options:bn(64,64) rc4(8x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: gcc -I. -I.. -I../include -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -Wa,--noexecstack -m64 -DL_ENDIAN -O3 -Wall -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
sha1            64436.75k   246697.60k   714675.29k  1361115.48k  1851490.30k

description: updated
Eric Desrochers (slashd) on 2017-04-25
description: updated
Eric Desrochers (slashd) on 2017-04-25
Changed in openssl (Ubuntu Zesty):
status: Triaged → In Progress
Changed in openssl (Ubuntu Xenial):
status: Triaged → In Progress
importance: Undecided → Medium
Changed in openssl (Ubuntu Zesty):
importance: Undecided → Medium
Eric Desrochers (slashd) on 2017-04-25
description: updated
Eric Desrochers (slashd) on 2017-04-26
description: updated
tags: added: sts-sru
Eric Desrochers (slashd) wrote :

[For SRU Verification team]

Context :
Previous IRC discussion with apw/rbasak about this case :
https://bugs.launchpad.net/ubuntu/+source/openssl/+bug/1674399/comments/2

AMD released a new CPU called "Ryzen" that now supports the "Intel SHA extensions" technology.

In the current Ubuntu openssl package, the SHA extensions are masked on Ryzen CPUs. Ryzen is available as a 64-bit CPU only (confirmed with an AMD representative).

There are upstream patches that solve this situation, and my tests revealed that openssl shows a significant performance increase on an AMD CPU with SHA extension capability.

The upstream patches "f8418d8" & "1aed5e1" fix the situation by moving the extended feature detection from label (.Lintel) to label (.Lgeneric) in both the 64-bit (crypto/x86_64cpuid.pl) & 32-bit (crypto/x86cpuid.pl) code, where it belongs now that non-Intel CPUs can also have the capability. I cannot strictly isolate the fix to Ryzen CPUs only, meaning that this change will enable the SHA extensions for all CPUs that support the functionality, whereas prior to this patch they were strictly reserved for Intel CPUs.
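The effect of moving the detection can be modelled in a few lines of C. This is only a simplified sketch of the decision logic described above, not a translation of the perlasm in crypto/x86_64cpuid.pl; the function names, the `max_leaf` parameter, and the `EBX7_SHA_BIT` macro are assumptions made for illustration.

```c
#include <string.h>

#define EBX7_SHA_BIT (1u << 29)  /* CPUID.(EAX=07H, ECX=0):EBX[29] */

/* Before the patches (simplified model): leaf 7 is only consulted on the
 * Intel-specific path, so other vendors never get the SHA bit recorded. */
unsigned int ext_features_vendor_gated(const char *vendor,
                                       unsigned int max_leaf,
                                       unsigned int ebx7)
{
    if (strcmp(vendor, "GenuineIntel") != 0)
        return 0;
    return (max_leaf >= 7) ? ebx7 : 0;
}

/* After the patches (simplified model): leaf 7 is consulted on the generic
 * path for any vendor whose maximum basic leaf is at least 7. */
unsigned int ext_features_generic(const char *vendor,
                                  unsigned int max_leaf,
                                  unsigned int ebx7)
{
    (void)vendor;  /* vendor no longer gates the extended features */
    return (max_leaf >= 7) ? ebx7 : 0;
}
```

In this model a CPU reporting vendor "AuthenticAMD" with EBX[29] set gets no SHA bit from the vendor-gated version but does from the generic one, while an Intel CPU gets the same answer from both, which is the behaviour change the SRU has to be tested for.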

Since I don't foresee many users running the 32-bit openssl package on a 64-bit CPU with the SHA extensions, and putting myself in an SRU mindset, I propose the following:

- SRU the 64-bit and 32-bit patch in development release (Artful/17.10)
- SRU the 64 bit only in Stable release (Xenial/Zesty)
  - 32-bit code : Remain the same, thus no behavioural change
  - 64-bit code : Enable sha extension for 64-bit CPU that has the capabilities.

The tests that I propose before "verification-done" (while the package is in -proposed):

-> Do a performance test using openssl speed[1] & generate a checksum on a big file[2] (e.g. 5GB) and capture the metric to compare before and after the patch on a 64-bit AMD system WITH sha extension capability (Ryzen).
-> Do a performance test using openssl speed[1] & generate a checksum on a big file[2] (e.g. 5GB) and capture the metric to compare before and after the patch on a 64-bit AMD system WITHOUT sha extension capability.
-> Do a performance test using openssl speed[1] & generate a checksum on a big file[2] (e.g. 5GB) and capture the metric to compare before and after the patch on a 64-bit Intel system WITH sha extension capability.
-> Do a performance test using openssl speed[1] & generate a checksum on a big file[2] (e.g. 5GB) and capture the metric to compare before and after the patch on a 64-bit Intel system WITHOUT sha extension capability.
-> In all the above tests, perf record/annotate could also be used to validate whether the SHA instructions have been used or not in each scenario, to conclude that everything is working as expected.

No extra testing will be needed for the 32-bit package since no 32-bit code will be modified.
On the other hand, if a user for some reason wants to use 32-bit openssl on a 64-bit CPU with SHA extension capability, then this user won't get the performance benefit of the 64-bit code fix.

Considering that this request is a "HW enablement", thus not a bugfix, and given the above notes... would this be eligible for SRU in a stable release?

[1] - Openssl speed measurement using sha1
$...


Eric Desrochers (slashd) wrote :

Attaching Artful debdiff

tags: added: patch
Eric Desrochers (slashd) wrote :

Here are the highlights of the discussion I had in #ubuntu-release with infinity about my proposal in comment #6.

<slashd>For SRU, I had a talk with apw and rbasak about this bug a couples weeks ago LP: #1674399, could you please look at this bug and based on the Descriptions and comment #6 if this looks eligible for SRU in Stable release ? (note that this is a HW enablement, not a bug, this is why I'm requesting you to have a look at it) thanks in advance.

<infinity> slashd: I disagree with your reasoning for not fixing both 64 and 32.

<infinity> slashd: Lots of people run 64/32 multiarch and would benefit from fixing both.

<slashd> infinity, I'm fine with fixing 32bit, I proposed that approach cause apw wanted to self-contained the fix as much as possible

<infinity> That doesn't really contain it much. ;)

<slashd> infinity, so what if I do the same proposition but including 32-bit in stable release, would that work for you ?

<infinity> slashd: Conceptually, I have no issues with the plan (other than the "please do 32-bit too" comment).

<slashd> infinity, sure, I'm actually glad you are keen to see the 32-bit portion included

<infinity> slashd: Upload away, IMO.

<slashd> infinity, I'll then start the upload for Artful, note that starting next week I'll be gone for 2 weeks for sprints, and won't be able to do much testing, so do you think it's preferable we only start the SRU when I get back or we upload this week and worst case it will languish in -proposed for ~2weeks which will allow ppls to test with no stress (if any volunteer)

<infinity> slashd: I think letting it fester in proposed for two weeks to see if we get random negative feedback is entirely fine. Obviously, I'll delete/revert it if it breaks anything, but I don't need you around for that.

<infinity> slashd: 2 weeks of random user installations plus you executing a more precision test plan should give us solid confidence.

description: updated
Eric Desrochers (slashd) wrote :

zesty_openssl_lp1674399.debdiff

Eric Desrochers (slashd) wrote :

xenial_openssl_lp1674399.debdiff

Changed in openssl (Ubuntu Artful):
status: In Progress → Triaged
status: Triaged → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package openssl - 1.0.2g-1ubuntu12

---------------
openssl (1.0.2g-1ubuntu12) artful; urgency=medium

  * crypto/x86*cpuid.pl: move extended feature detection. (LP: #1674399)
    This fix moves extended feature detection past basic feature
    detection where it belongs. 32-bit counterpart is harmonized too.

 -- Eric Desrochers <email address hidden> Tue, 25 Apr 2017 18:16:18 -0400

Changed in openssl (Ubuntu Artful):
status: Fix Committed → Fix Released
Eric Desrochers (slashd) on 2017-04-27
Changed in openssl (Ubuntu Yakkety):
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Eric Desrochers (slashd)
Eric Desrochers (slashd) on 2017-04-28
description: updated
Eric Desrochers (slashd) wrote :

zesty_openssl_lp1674399.debdiff

description: updated
Eric Desrochers (slashd) wrote :

yakkety_openssl_lp1674399.debdiff

Hello Eric, or anyone else affected,

Accepted openssl into yakkety-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/openssl/1.0.2g-1ubuntu9.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in openssl (Ubuntu Yakkety):
status: In Progress → Fix Committed
tags: added: verification-needed
Eric Desrochers (slashd) wrote :

[Verification YAKKETY]

# i386
- Significant performance increase using the yakkety-proposed/i386 package inside a 32-bit LXD container build using a Ryzen CPU with Intel SHA Extension capability.
- Same performance (as expected) using the yakkety-proposed/i386 package on a non SHA Extension Intel CPU (i7-6770HQ) with yakkety-proposed package.

# amd64
- Significant performance increase using the yakkety-proposed/amd64 package on Ryzen CPU with Intel SHA Extension capability.
- Same performance (as expected) using the yakkety-proposed/amd64 package on a non SHA Extension Intel CPU (i7-6770HQ) with yakkety-proposed package.

Note : Unfortunately, neither I nor my colleagues have access to an Intel CPU with SHA Extension capability. Ideally, someone who has access to one would test it.
Otherwise, I think it is safe to rely on the upstream author of the patch, who confirmed it was working as expected on an Intel CPU with SHA extension capability.

Reference : https://github.com/openssl/openssl/issues/2848
"...Myself I tested on Intel processors, yes, with/without...."

==
* Test yakkety-proposed/i386 on a 32-bit LXD container using a non SHA Extension Intel CPU (Version before -proposed pkg):
--
ii libssl1.0.0:i386 1.0.2g-1ubuntu9.1 i386 Secure Sockets Layer toolkit - shared libraries
ii openssl 1.0.2g-1ubuntu9.1 i386 Secure Sockets Layer toolkit - cryptographic utility

# openssl speed sha1
Doing sha1 for 3s on 16 size blocks: 12441833 sha1's in 3.00s
Doing sha1 for 3s on 64 size blocks: 8997589 sha1's in 3.00s
Doing sha1 for 3s on 256 size blocks: 5074636 sha1's in 3.00s
Doing sha1 for 3s on 1024 size blocks: 1904828 sha1's in 3.00s
Doing sha1 for 3s on 8192 size blocks: 304739 sha1's in 3.00s
OpenSSL 1.0.2g 1 Mar 2016
built on: reproducible build, date unspecified
options:bn(64,32) rc4(8x,mmx) des(ptr,risc1,16,long) aes(partial) blowfish(idx)
compiler: cc -I. -I.. -I../include -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -g -O2 -fdebug-prefix-map=/build/openssl-OIx07U/openssl-1.0.2g=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-Bsymbolic-functions -Wl,-z,relro -Wa,--noexecstack -Wall -DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM -DVPAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
sha1            66356.44k   191948.57k   433035.61k   650181.29k   832140.63k

# time openssl dgst -sha256 /var/tmp/5Gfile
SHA256(/var/tmp/5Gfile)= 7f06c62352aebd8125b2a1841e2b9e1ffcbed602f381c3dcb3200200e383d1d5

real 0m15.429s
user 0m14.372s
sys 0m1.052s
==
* Test yakkety-proposed/i386 on a 32-bit LXD container using a non SHA Extension Intel CPU (With -proposed pkg):
--
ii libssl1.0.0:i386 1.0.2g-1ubuntu9.2 i386 Secure Sockets Layer toolkit - shared libraries
ii openssl ...

tags: added: verification-done-yakkety
removed: verification-needed
Łukasz Zemczak (sil2100) wrote :

Hello Eric, or anyone else affected,

Accepted openssl into zesty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/openssl/1.0.2g-1ubuntu11.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in openssl (Ubuntu Zesty):
status: In Progress → Fix Committed
tags: added: verification-needed
Changed in openssl (Ubuntu Xenial):
status: In Progress → Fix Committed
Łukasz Zemczak (sil2100) wrote :

Hello Eric, or anyone else affected,

Accepted openssl into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/openssl/1.0.2g-1ubuntu4.7 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

We have tested these packages in zesty-proposed (openssl-1.0.2g-1ubuntu11.1) and can confirm that the SHA extension codepath is executed correctly and we see the accompanying expected performance improvements.

Thanks!

Eric Desrochers (slashd) wrote :

[Verification XENIAL]

# i386
- Significant performance increase using the xenial-proposed/i386 package inside a 32-bit LXD container build using a Ryzen CPU with Intel SHA Extension capability.
- Same performance (as expected) using the xenial-proposed/i386 package on a non SHA Extension Intel CPU (i7-6770HQ) with xenial-proposed package.

# amd64
- Significant performance increase using the xenial-proposed/amd64 package on Ryzen CPU with Intel SHA Extension capability.
- Same performance (as expected) using the xenial-proposed/amd64 package on a non SHA Extension Intel CPU (i7-6770HQ) with xenial-proposed package.

Note : Unfortunately, neither I nor my colleagues have access to an Intel CPU with SHA Extension capability. Ideally, someone who has access to one would test it.
Otherwise, I think it is safe to rely on the upstream author of the patch, who confirmed it was working as expected on an Intel CPU with SHA extension capability.

Reference : https://github.com/openssl/openssl/issues/2848
"...Myself I tested on Intel processors, yes, with/without...."

==
* Test xenial/i386 on a 32-bit LXD container using a non SHA Extension Intel CPU:
--
ii libssl1.0.0:i386 1.0.2g-1ubuntu4.6 i386 Secure Sockets Layer toolkit - shared libraries
ii openssl 1.0.2g-1ubuntu4.6 i386 Secure Sockets Layer toolkit - cryptographic utility

# openssl speed sha1
Doing sha1 for 3s on 16 size blocks: 12391058 sha1's in 3.00s
Doing sha1 for 3s on 64 size blocks: 8934411 sha1's in 3.00s
Doing sha1 for 3s on 256 size blocks: 5048901 sha1's in 3.00s
Doing sha1 for 3s on 1024 size blocks: 1893157 sha1's in 3.00s
Doing sha1 for 3s on 8192 size blocks: 301374 sha1's in 3.00s
OpenSSL 1.0.2g 1 Mar 2016
built on: reproducible build, date unspecified
options:bn(64,32) rc4(8x,mmx) des(ptr,risc1,16,long) aes(partial) blowfish(idx)
compiler: cc -I. -I.. -I../include -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-Bsymbolic-functions -Wl,-z,relro -Wa,--noexecstack -Wall -DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM -DVPAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
sha1            66085.64k   190600.77k   430839.55k   646197.59k   822951.94k

# time openssl dgst -sha256 /var/tmp/5Gfile
SHA256(/var/tmp/5Gfile)= 7f06c62352aebd8125b2a1841e2b9e1ffcbed602f381c3dcb3200200e383d1d5

real 0m15.518s
user 0m14.428s
sys 0m1.084s
==
* Test xenial-proposed/i386 on a 32-bit LXD container using a non SHA Extension Intel CPU:
--
ii libssl1.0.0:i386 1.0.2g-1ubuntu4.7 i386 Secure Sockets Layer toolkit - shared libraries
ii openssl 1.0.2g-1ubuntu4.7 i386 Secure Sockets Layer toolkit - cryptographic utility

# ope...

tags: added: verification-done-xenial verification-done-zesty
tags: added: ua
removed: sts verification-needed
Eric Desrochers (slashd) wrote :

The same detailed verification testing has been performed for zesty-proposed, with the same results as for Xenial and Yakkety:

[Verification zesty]

# i386
- Significant performance increase using the zesty-proposed/i386 package inside a 32-bit LXD container build using a Ryzen CPU with Intel SHA Extension capability.
- Same performance (as expected) using the zesty-proposed/i386 package on a non SHA Extension Intel CPU (i7-6770HQ) with zesty-proposed package.

# amd64
- Significant performance increase using the zesty-proposed/amd64 package on Ryzen CPU with Intel SHA Extension capability.
- Same performance (as expected) using the zesty-proposed/amd64 package on a non SHA Extension Intel CPU (i7-6770HQ) with zesty-proposed package.

Note : Unfortunately, neither I nor my colleagues have access to an Intel CPU with SHA Extension capability. Ideally, someone who has access to one would test it.
Otherwise, I think it is safe to rely on the upstream author of the patch, who confirmed it was working as expected on an Intel CPU with SHA extension capability.

Reference : https://github.com/openssl/openssl/issues/2848
"...Myself I tested on Intel processors, yes, with/without...."

Additionally, we received feedback from Justin Erenkrantz, an affected user running a Ryzen/Naples CPU.

Please see comment #18 for Justin's feedback:
https://bugs.launchpad.net/ubuntu/+source/openssl/+bug/1674399/comments/18

tags: added: sts-sru-done
removed: sts-sru
William Grant (wgrant) wrote :

libssl1.0.0 1.0.2g-1ubuntu9.2 breaks OpenVPN (2.4.0-5ubuntu1 or 2.3.11-1ubuntu2) connections to Canonical's VPN on my Ryzen 7 1700X desktop running Linux 4.10.0-21-generic. In UDP mode the server stops responding during TLS negotiation, and in TCP mode the server closes the connection at the same stage. Downgrading to ubuntu9.1 fixes it. artful's 1.0.2g-1ubuntu12 is broken in the same way. The HMAC in use by the VPN is SHA-1.

From the server log:

ovpn-tcp[30227]: TCP connection established with [AF_INET]<REDACTED>:44544
ovpn-tcp[30227]: <REDACTED>:44544 TCP connection established with [AF_INET]<REDACTED>:47753
ovpn-tcp[30227]: <REDACTED>:44544 TLS_ERROR: BIO read tls_read_plaintext error: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac
ovpn-tcp[30227]: <REDACTED>:44544 TLS Error: TLS object -> incoming plaintext read error
ovpn-tcp[30227]: <REDACTED>:44544 TLS Error: TLS handshake failed
ovpn-tcp[30227]: <REDACTED>:44544 Fatal TLS error (check_tls_errors_co), restarting

The start of a client log is at http://paste.ubuntu.com/24603459/. Until the connection is closed by the server, it differs from a successful connection only in its keys and session IDs.

tags: added: regression-proposed
William Grant (wgrant) wrote :

Fortunately the OpenSSL test suite also fails when run during the build on Ryzen. It turns out that the AES-NI+SHA-NI AES-CBC+SHA{1,256} implementations are both broken, so https://github.com/openssl/openssl/commit/08d09628d2c9f3ef599399d8cad021a07ab98347 needs to be backported too. I guess nobody's seriously used Ubuntu on Goldmont.

I've uploaded fixed SRU test builds to https://launchpad.net/~wgrant/+archive/ubuntu/experimental/+packages?field.name_filter=openssl, and they all build and test successfully on i386 and amd64 on Ryzen. At least 9.3~ppa1 even lets OpenVPN connect, with accelerated hashing. I don't think we really need to dig up a Goldmont device from somewhere, but if someone has one handy...

Eric Desrochers (slashd) wrote :

Thanks William. I'll mark the proposed package as verification-failed and work on backporting the patch[1] you suggested, which has been shown to fix the issue.

[1] - https://github.com/openssl/openssl/commit/08d09628d2c9f3ef599399d8cad021a07ab98347

Eric

tags: added: verification-failed
removed: verification-done-xenial verification-done-yakkety verification-done-zesty
Eric Desrochers (slashd) wrote :

William, as per our IRC conversation, we have decided that you will do the upload for this specific fix for the 4 releases.

Thanks for your collaboration.

- Eric

Łukasz Zemczak (sil2100) wrote :

Hello Eric, or anyone else affected,

Accepted openssl into zesty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/openssl/1.0.2g-1ubuntu11.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: removed: verification-failed
tags: added: verification-needed
Łukasz Zemczak (sil2100) wrote :

Hello Eric, or anyone else affected,

Accepted openssl into yakkety-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/openssl/1.0.2g-1ubuntu9.3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Łukasz Zemczak (sil2100) wrote :

Hello Eric, or anyone else affected,

Accepted openssl into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/openssl/1.0.2g-1ubuntu4.8 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!
