Python 2.7.12 performance regression

Bug #1638695 reported by Major Hayden on 2016-11-02
This bug affects 22 people
Affects             Importance  Assigned to
python2.7 (Ubuntu)  High        Unassigned
Xenial              High        Unassigned
Zesty               Undecided   Unassigned

Bug Description

SRU: Looks like only the _math.o build without -fPIC makes it into the SRU. There shouldn't be any regression potential when building without -fPIC for the static interpreter. The acceptance criterion is running the benchmarks and seeing no performance regressions.

I work on the OpenStack-Ansible project and we've noticed that testing jobs on 16.04 take quite a bit longer to complete than on 14.04. They complete within an hour on 14.04 but they normally take 90 minutes or more on 16.04. We use the same version of Ansible with both versions of Ubuntu.

After more digging, I tested python performance (using the 'performance' module) on 14.04 (2.7.6) and on 16.04 (2.7.12). There is a significant performance difference between each version of python. That is detailed in a spreadsheet[0].

I began using perf to dig into the differences when running the python performance module and when using Ansible playbooks. CPU migrations (as measured by perf) are doubled in Ubuntu 16.04 when running the same python workloads.

I tried changing some of the kernel.sched sysctl configurables, but they had very little effect on the results.

I compiled python 2.7.12 from source on 14.04 and found the performance to be unchanged there. I'm not entirely sure where the problem might be now.

We also have a bug open in OpenStack-Ansible[1] that provides additional detail. Thanks in advance for any help you can provide!

[0] https://docs.google.com/spreadsheets/d/18MmptS_DAd1YP3OhHWQqLYVA9spC3xLt4PS3STI6tds/edit?usp=sharing
[1] https://bugs.launchpad.net/openstack-ansible/+bug/1637494

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in python2.7 (Ubuntu):
status: New → Confirmed
Matthias Klose (doko) wrote :

> I compiled python 2.7.12 from source on 14.04
> and found the performance to be unchanged there.

unchanged compared to what? the python binaries in 14.04, or 16.04?

could you check with a python version in between? e.g.
https://launchpad.net/~ubuntu-toolchain-r/+archive/ubuntu/ppa

Major Hayden (rackerhacker) wrote :

Hello Matthias,

I'm sorry for the confusion there. What I meant is that I compiled 2.7.12 on 14.04 and found that it had the same performance as 2.7.6 (from the default Ubuntu python package) on 14.04. I also loaded Xenial's kernel on the 14.04 installation and found no performance difference either.

The problem seems to be unique to 2.7.12 on 16.04.

Matthias Klose (doko) wrote :

please try to build using gcc-4.8 on 16.04 LTS (it's still available in the archive)

Major Hayden (rackerhacker) wrote :

I can try that. Just to be clear, you're suggesting to do the following:

1) Install gcc-4.8 on 16.04
2) Compile 2.7.12 with gcc-4.8 on 16.04
3) Re-run tests

Did I get that right?

Matthias Klose (doko) wrote :

yes, setting CC=gcc-4.8 CXX=g++-4.8 ./configure ...

Major Hayden (rackerhacker) wrote :

Thanks for confirming that, Matthias. Testing with GCC 4.8 seemed to yield (mostly) better results. I put the data into a Google Sheet:

  https://goo.gl/9gW82j

Out of the 10 pyperformance tests:

  * 3 tests were actually faster with python compiled w/gcc-4.8
  * 4 tests were slightly slower (but within 5%)
  * 3 tests were ~ 20-25% slower

Overall, these numbers look quite a bit better.
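For reference, the 5% bucketing above can be reproduced with a trivial sketch (the per-test deltas below are made up for illustration; positive means slower with the gcc-4.8 build):

```python
# Made-up per-test percentage deltas; positive = slower, negative = faster.
deltas = {"t1": -3.0, "t2": -1.2, "t3": -0.5, "t4": 1.1, "t5": 2.0,
          "t6": 3.4, "t7": 4.9, "t8": 21.0, "t9": 23.5, "t10": 24.8}

# Bucket the results with the same 5% threshold used in the summary.
faster = [n for n, d in deltas.items() if d < 0]
slightly_slower = [n for n, d in deltas.items() if 0 <= d < 5]
much_slower = [n for n, d in deltas.items() if d >= 5]
print(len(faster), len(slightly_slower), len(much_slower))  # 3 4 3
```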

Jorge Niedbalski (niedbalski) wrote :

Hello,

I am in the process of verifying this performance regression on a 16.04 Xenial machine using the kernel Linux-4.4.0-38.

I ran a locally compiled python 2.7.12 version built with different versions of GCC, 5.3.1 (current) and 4.8.0 both coming from the Ubuntu archives.

The benchmark suite I am using is pyperformance (https://github.com/python/performance); I am running the full test suite with the following command:

$ pyperformance run --python=python2 -o xxx.json

According to the latest test run I did using Python 2.7.12/GCC 4.8 (with GCC 5.3.1 as the baseline), 50% of the tests (32/64) show a significant variance in performance, of which 19/32 are slower (by 5-15%).

Just for information, I am comparing results using the following command:

$ pyperformance compare python-2.7.12-gcc-5.3.1.json python-2.7.12-gcc-4.8.0.json

I am attaching here the current comparison results for analysis.
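The essence of `pyperformance compare` is a per-benchmark ratio of mean timings. A minimal stand-in sketch (the JSON below is a simplified format, not the real pyperformance schema):

```python
import json

# Simplified stand-in data: benchmark name -> mean time in seconds.
# Real pyperformance result files carry much more metadata than this.
baseline = json.loads('{"pickle": 0.0210, "call_method": 0.0120, "nbody": 0.2100}')
candidate = json.loads('{"pickle": 0.0246, "call_method": 0.0118, "nbody": 0.2150}')

for name in sorted(baseline):
    ratio = candidate[name] / baseline[name]
    print("%-12s %.2fx %s" % (name, ratio, "slower" if ratio > 1 else "faster"))
```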

Matthew Thode (prometheanfire) wrote :

This may not be the best comparison, as I don't have gcc 4.8.0 (I could test with gcc 4.8.5 though). I'm also using a different toolchain, with glibc-2.22. But here are my outputs, attached, showing gcc 4.9.3 and 5.4.0.

Changed in python2.7 (Ubuntu):
importance: Undecided → High
assignee: nobody → Jorge Niedbalski (niedbalski)
Jorge Niedbalski (niedbalski) wrote :

Hello,

I have been working to track down the origin of the performance penalty exposed by this bug.

All the tests that I am performing are run on top of a locally compiled python 2.7.12 (from upstream sources, without applying any Ubuntu patches), built with different versions of GCC, 5.3.1 (current) and 4.8.0, both coming from the Ubuntu archives.

I can see important performance differences, as I mentioned in my previous comments (check the full comparison stats), just by switching the GCC version. I decided to focus my investigation on the pickle module, since it seems to be the most affected one, being approximately 1.17x slower between the two GCC versions.
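The pickle workload is easy to isolate outside the full suite for quick A/B runs between interpreter builds. A minimal sketch (the payload is an arbitrary stand-in for the real bm_pickle data):

```python
import pickle
import timeit

# Arbitrary payload standing in for the benchmark data used by bm_pickle.
payload = {"key%d" % i: list(range(10)) for i in range(100)}
data = pickle.dumps(payload, protocol=2)

# Time dumps/loads; run the same snippet under each interpreter build and
# compare the reported timings.
dump_t = timeit.timeit(lambda: pickle.dumps(payload, protocol=2), number=200)
load_t = timeit.timeit(lambda: pickle.loads(data), number=200)
print("dumps: %.4fs  loads: %.4fs" % (dump_t, load_t))
```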

Due to the amount of changes introduced between 4.8.0 and 5.3.1, I decided not to pursue a bisection to identify an offending commit yet, until we can identify which optimization or compile-time change is causing the regression and focus the investigation on that specific area.

My understanding is that the performance penalty caused by the compiler might be related to two factors: an important change in the linked libc, or an optimization made by the compiler in the resulting object.

Since the resulting objects are linked against the same glibc version (2.23), I will not consider that factor as part of the analysis; instead I will focus on analyzing the performance of the objects generated by the compiler.

To follow this approach, I ran the pyperformance suite under a valgrind session excluding all modules except the pickle module, using the default suppressions to avoid missing any reference in the python runtime, with the following arguments:

valgrind --tool=callgrind --instr-atstart=no --trace-children=yes venv/cpython2.7-6ed9b6df9cd4/bin/python -m performance run --python /usr/local/bin/python2.7 -b pickle --inside-venv

I ran this process multiple times with both GCC 4.8.0 and 5.3.1 to produce a large set of callgrind files to analyze. Those callgrind files contain the full execution tree, including all the relocations, jumps, and calls into libc and the python runtime itself, and of course the time spent per function and the number of calls made to it.

I cleaned out all the resulting callgrind files removing the files smaller than 100k and the ones that were not loading the cPickle
extension (https://pastebin.canonical.com/175951/).

Over that set of files I executed callgrind_annotate to generate the stats per function, ordered by each function's exclusive cost. Then, with this script (http://paste.ubuntu.com/23795048/), I added up all the costs per function per GCC version (4.8 and 5.3.1) and calculated the variance in cost between them.

The resulting file contains a tuple with the following format:

function name - gcc 4.8 cost - gcc 5.3.1 cost - variance in percent

As an example:

/home/ubuntu/python/cpython/Objects/tupleobject.c:tupleiter_dealloc 258068.000000 445009.000000 (variance: 0.724387)
/home/ubuntu/python/cpython/Objects/object.c:try_3way_compare 984860.000000 1676351.000000 (variance: 0.702121)
/home/ubuntu/python/cpython/Python/marshal.c:r_object 183524.000000 2...
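The variance figures in these tuples are consistent with a simple relative-slowdown formula, cost(5.3.1) / cost(4.8) - 1:

```python
def variance(cost_gcc48, cost_gcc53):
    # Relative slowdown of the GCC 5.3.1 build over the GCC 4.8 build.
    return cost_gcc53 / cost_gcc48 - 1.0

# Costs from the tupleiter_dealloc and try_3way_compare rows above.
print(variance(258068.0, 445009.0))   # ~0.724387, matching the row above
print(variance(984860.0, 1676351.0))  # ~0.702121, matching the row above
```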


Just a small clarification about Jorge's last comment above (<https://bugs.launchpad.net/ubuntu/+source/python2.7/+bug/1638695/comments/10>):

"I cleaned out all the resulting callgrind files removing the files smaller than 100k and the ones that were not loading the cPickle extension (https://pastebin.canonical.com/175951/)."

That URL is not publicly accessible, here are the commands Jorge ran:

find . -type f -size -100k -exec rm {} \;
for f in $(ack-grep -i pickle | grep callgrind.out | cut -d":" -f1 | uniq); do mv $f $f.pick; done
for f in $(ls *.pick); do callgrind_annotate $f > $f.annotate; done

Louis Bouchard (louis) wrote :

Hello,

Just to clarify something that I have just realized: `pyperformance run -p={some python}` means that {some python} will be used to run pyperformance itself, not to run the benchmarks! So changing -p to point at different builds of python will not produce a proper comparison of those builds.

Louis Bouchard (louis) wrote :

Here are the results of the comparative tests I ran :

https://docs.google.com/spreadsheets/d/1MyNBPVZlBeic1OLqVKe_bcPk2deO_pQs9trIfOFefM0/edit#gid=2034603487

It confirms the assumptions, but unfortunately rebuilding 2.7.12 without -fstack-protector-strong leads to worse performance than the stock 2.7.12 build. I'm continuing my investigation.

Where are these tests being executed? Are these virtual machines or bare-metal instances? If these are VMs, what hypervisor is being used?

Major Hayden (rackerhacker) wrote :

My testing was done on Xen virtual machines, KVM virtual machines, and bare metal.

Louis Bouchard (louis) wrote :

Hello,

The tests are run in LXC containers on a bare-metal server with two physical CPUs, 6 cores each, 2 threads per core (Intel(R) Xeon(R) CPU E5-2640 0 @ 2.50GHz).

Following doko's advice, I have built two new versions, one with LTO optimisation disabled and the other with PGO optimisation disabled. In both cases, it makes things worse.

Ryan Beisner (1chb1n) on 2017-02-21
tags: added: uosci
Louis Bouchard (louis) wrote :

Hello,

Following doko's advice, I ran a set of test with PGO & LTO optimization disabled.

Here are the results : https://docs.google.com/spreadsheets/d/1tTlEOvMypwKwi99XHjvuQFE14_jpBBLy0-Mk6bjkvL0/edit#gid=1169944329

This may bring more light to the investigation, as it appears that with LTO & PGO optimisation disabled on Trusty, the Trusty version becomes slower than the Xenial stock version. Disabling optimisation on Xenial makes little difference though.

So maybe the PGO & LTO optimisation on Trusty is more efficient than on Xenial and leads to better results, hence better performance. Just a thought.

Louis Bouchard (louis) on 2017-02-27
Changed in python2.7 (Ubuntu):
assignee: Jorge Niedbalski (niedbalski) → Louis Bouchard (louis-bouchard)
Louis Bouchard (louis) wrote :

Following the results of the previous comparison, I've used Jorge's profiling example on the 'call_method' bench for Trusty stock, no LTO, no PGO, and for Xenial stock, no PGO and no LTO. Here are the results. Notice the difference between Trusty stock & Trusty nopgo, as opposed to the execution profiles of the other tests.

Trusty Stock
============
callgrind_annotate:
Profiled target: /home/ubuntu/venv/cpython2.7-d0d7712d4e1d/bin/python /home/ubuntu/venv/cpython2.7-d0d7712d4e1d/lib/python2.7/site-packages/performance/benchmarks/bm_call_method.py --worker --pipe 4 --worker-task=0 --samples 3 --warmups 1 --loops 1 --min-time 0.1 (PID 25150, part 1)
4,918,198,605 ???:PyEval_EvalFrameEx'2 [/home/ubuntu/venv/cpython2.7-d0d7712d4e1d/bin/python2.7]
1,180,507,700 ???:0x00000000005368f0 [/home/ubuntu/venv/cpython2.7-d0d7712d4e1d/bin/python2.7]
1,109,707,368 ???:PyObject_GetAttr [/home/ubuntu/venv/cpython2.7-d0d7712d4e1d/bin/python2.7]
  823,065,734 ???:PyFrame_New [/home/ubuntu/venv/cpython2.7-d0d7712d4e1d/bin/python2.7]
  552,755,137 ???:0x00000000004a5c90 [/home/ubuntu/venv/cpython2.7-d0d7712d4e1d/bin/python2.7]
  525,836,692 ???:0x00000000004bedf0 [/home/ubuntu/venv/cpython2.7-d0d7712d4e1d/bin/python2.7]
   12,732,711 ???:PyParser_AddToken [/home/ubuntu/venv/cpython2.7-d0d7712d4e1d/bin/python2.7]
    6,120,934 ???:PyDict_GetItem [/home/ubuntu/venv/cpython2.7-d0d7712d4e1d/bin/python2.7]
    4,700,333 ???:0x00000000004bc0e0 [/home/ubuntu/venv/cpython2.7-d0d7712d4e1d/bin/python2.7]
    4,647,564 ???:0x00000000004afe90'2 [/home/ubuntu/venv/cpython2.7-d0d7712d4e1d/bin/python2.7]
    3,724,240 ???:PyObject_Free [/home/ubuntu/venv/cpython2.7-d0d7712d4e1d/bin/python2.7]
    3,526,112 ???:PyDict_SetItem [/home/ubuntu/venv/cpython2.7-d0d7712d4e1d/bin/python2.7]
    3,407,575 ???:0x0000000000571fd0 [/home/ubuntu/venv/cpython2.7-d0d7712d4e1d/bin/python2.7]
    3,304,198 ???:PyObject_Hash [/home/ubuntu/venv/cpython2.7-d0d7712d4e1d/bin/python2.7]
    3,055,436 ???:0x00000000005495a0 [/home/ubuntu/venv/cpython2.7-d0d7712d4e1d/bin/python2.7]
    2,796,306 ???:0x0000000000535070 [/home/ubuntu/venv/cpython2.7-d0d7712d4e1d/bin/python2.7]

Trusty nopgo:
=============
Profiled target: /home/ubuntu/venv/cpython2.7-d217262e7ee7/bin/python /home/ubuntu/venv/cpython2.7-d217262e7ee7/lib/python2.7/site-packages/performance/benchmarks/bm_call_method.py --worker --pipe 4 --worker-task=0 --samples 3 --warmups 1 --loops 1 --min-time 0.1 (PID 28073, part 1)
5,362,602,828 ???:PyEval_EvalFrameEx'2 [/home/ubuntu/venv/cpython2.7-d217262e7ee7/bin/python2.7]
1,250,195,637 ???:0x0000000000585e90 [/home/ubuntu/venv/cpython2.7-d217262e7ee7/bin/python2.7]
  890,479,191 ???:PyFrame_New [/home/ubuntu/venv/cpython2.7-d217262e7ee7/bin/python2.7]
  836,574,419 ???:PyObject_GenericGetAttr [/home/ubuntu/venv/cpython2.7-d217262e7ee7/bin/python2.7]
  552,808,267 ???:0x000000000049ef00 [/home/ubuntu/venv/cpython2.7-d217262e7ee7/bin/python2.7]
  539,318,922 ???:0x0000000000493710 [/home/ubuntu/venv/cpython2.7-d217262e7ee7/bin/python2.7]
  488,401,927 ???:_PyType_Lookup [/home/ubuntu/venv/cpython2.7-d217262e7ee7/bin/python2.7]
  258,028,053 ???:PyObject_GetAttr [/home/ubunt...

Louis Bouchard (louis) wrote :

Here is the pastebin for better readability : http://paste.ubuntu.com/24078834/

Louis Bouchard (louis) on 2017-05-10
Changed in python2.7 (Ubuntu):
assignee: Louis Bouchard (louis) → nobody
Joe Gordon (jogo) wrote :

Any updates on this? Are there plans to release a faster python build for Xenial?

Elvis Pranskevichus (elprans) wrote :

After much testing I found what is causing the regression in 16.04 and later. There are several distinct causes, attributable to choices made in debian/rules and to changes in GCC.

Cause #1: the decision to compile `Modules/_math.c` with `-fPIC` *and* link it statically into the python executable [1]. This causes the majority of the slowdown. This may be a bug in GCC or simply a constraint, I didn't find anything specific on this topic, although there are a lot of old bug reports regarding the interaction of -fPIC with -flto.

Cause #2: the enablement of `fpectl` [2], specifically the passing of `--with-fpectl` to `configure`. fpectl is disabled in python.org builds by default and its use is discouraged. Yet Debian builds enable it unconditionally, and it seems to cause a significant performance degradation. It's much less noticeable on 14.04 with GCC 4.8.0, but on more recent releases the performance difference seems to be larger.

Plausible Cause #3: stronger stack smashing protection in 16.04, which uses -fstack-protector-strong, whereas 14.04 and earlier used -fstack-protector (with less performance overhead).

Also, debian/rules limits the scope of PGO's PROFILE_TASK to 377 test suites vs upstream's 397, which affects performance somewhat negatively, but this is not definitive. What are the reasons behind the trimming of the tests used for PGO?

Without fpectl, and without -fPIC on _math.c, 2.7.12 built on 16.04 is slower than stock 2.7.6 on 14.04 by about 0.9% in my pyperformance runs [3]. This is in contrast to a whopping 7.95% slowdown when comparing stock versions.

Finally, a vanilla Python 2.7.12 build using GCC 5.4.0, default CFLAGS, default PROFILE_TASK and default Modules/Setup.local consistently runs faster in benchmarks than 2.7.6 (by about 0.7%), but I was not able to pinpoint the exact reason for that.

Note: the percentages above are the relative change in the geometric mean of pyperformance benchmark results.

[1] https://git.launchpad.net/~usd-import-team/ubuntu/+source/python2.7/tree/debian/rules?h=ubuntu/xenial-updates#n421

[2] https://git.launchpad.net/~usd-import-team/ubuntu/+source/python2.7/tree/debian/rules?h=ubuntu/xenial-updates#n117

[3] https://docs.google.com/spreadsheets/d/1L3_gxe-AOYJsXFwGZgFko8jaChB0dFPjK5oMO5T5vj4/edit?usp=sharing
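The summary statistic used above (relative change in the geometric mean of benchmark results) can be sketched as follows; the per-benchmark ratios are hypothetical, not the actual measured numbers:

```python
import math

# Hypothetical per-benchmark time ratios (candidate / baseline); >1 is slower.
ratios = [1.02, 0.97, 1.05, 1.01, 0.99]

# Geometric mean of the ratios, reported as a percentage change.
geomean = math.exp(sum(math.log(r) for r in ratios) / len(ratios))
print("relative change: %+.2f%%" % ((geomean - 1.0) * 100.0))
```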

Major Hayden (rackerhacker) wrote :

Thanks for the deep dive, Elvis! :) Is it possible to adjust some of these settings in the Ubuntu packages, or is this just the way it will be going forward?

Elvis Pranskevichus (elprans) wrote :

We'll need the package maintainers to chime in on this. Attached is a patch that disables harmful settings.

The attachment "0001-Disable-fpectl-and-fPIC-on-Modules-_math.c.patch" seems to be a patch. If it isn't, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of the ~ubuntu-reviewers, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issues please contact him.]

tags: added: patch
Matthias Klose (doko) wrote :

thanks for the detailed analysis.

 - #1: I'm now stopping building the _fpectl module for the upcoming
   17.10 release. I'm hesitant to disable it for 16.04.

 - #2: 2.7.11-6: That's a fix done a year ago, I can't remember
   why I changed that. I'll try to remember ...
   _math.c is mentioned twice as a source file, same as
   timemodule.c

 - #3: if the above change is necessary, then yes, it should only
   be done for the shared builds, not the static ones.

   but starting with 17.04 we are building with -fPIE by default,
   which turns on PIC for everything again. So it is likely that
   you will see a decrease in performance again, unless the
   compiler gets a little bit better in newer Ubuntu releases.

I'll look at #2 and try to come up with a non-invasive approach.

Matthias Klose (doko) on 2017-08-31
Changed in python2.7 (Ubuntu Xenial):
status: New → Confirmed
importance: Undecided → High
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package python2.7 - 2.7.14~rc1-3ubuntu1

---------------
python2.7 (2.7.14~rc1-3ubuntu1) artful; urgency=medium

  * Regenerate the _PyFPE breaks list for Ubuntu.

 -- Matthias Klose <email address hidden> Tue, 05 Sep 2017 20:19:52 +0200

Changed in python2.7 (Ubuntu):
status: Confirmed → Fix Released
Dariusz Gadomski (dgadomski) wrote :

Matthias, I have run a series of pyperformance benchmarks [1] to compare the influence of the factors listed by Elvis on Xenial and Artful. All runs were done on the same machine (bare metal) with a fresh Ubuntu cloud image.

My observations confirm that both changes, fpectl and -fPIC for the _math.c module, bring significant improvements over the corresponding versions without them. I also replaced -fstack-protector-strong with -fstack-protector and observed even better results in the benchmark, although in the examined scope its impact is not as significant as the former two factors.

The combination of all three factors makes the results close to what we can observe on Trusty.

I believe backporting the fpectl and _math.c changes also to Xenial is worth considering.
The -fstack-protector setting brings performance improvement, but it also creates some security doubts.

[1] http://pyperformance.readthedocs.io
[2] https://docs.google.com/spreadsheets/d/1R83NQ7xzIfzFMVdbrh-zqK_iBuPcuhWa6KdTYPibFmE/edit?usp=sharing

Matthias Klose (doko) wrote :

thanks for doing that!

more interesting numbers would be:

 - artful with -fno-PIE -no-pie for the static build
 - xenial with just no_fpic

the reason I'm asking for the latter is that you'll break a lot of packages, which would need to be rebuilt:

  $ wc -l debian/pyfpe-breaks.Debian
  70 debian/pyfpe-breaks.Debian

plus you would make every extension in PPA's and third party repositories unusable.

list of breaking packages (version numbers not updated for xenial):

cython (<< 0.26-2.1),
epigrass (<= 2.4.7-1),
invesalius-bin (<= 3.1.1-1),
macs (<= 2.1.1.20160309-1),
printrun (<= 0~20150310-5),
pycorrfit (<= 1.0.0+dfsg-1),
pyscanfcs (<= 0.2.3-3),
python-acora (<= 2.0-2+b1),
python-adios (<= 1.12.0-3),
python-astroml-addons (<= 0.2.2-4),
python-astropy (<= 2.0.1-2),
python-astroscrappy (<= 1.0.5-1+b1),
python-bcolz (<= 1.1.0+ds1-4+b1),
python-breezy (<= 3.0.0~bzr6772-1),
python-bzrlib (<= 2.7.0+bzr6622-7),
python-cartopy (<= 0.14.2+dfsg1-2+b1),
python-cogent (<= 1.9-11),
python-cutadapt (<= 1.13-1+b1),
python-cypari2 (<= 1.0.0-3),
python-dipy-lib (<= 0.12.0-1),
python-djvu (<= 0.8-2),
python-fabio (<= 0.4.0+dfsg-2+b1),
python-falcon (<= 1.0.0-2+b1),
python-fiona (<= 1.7.9-1),
python-fpylll (<= 0.2.4+ds-3),
python-grib (<= 2.0.2-2),
python-gssapi (<= 1.2.0-1+b1),
python-h5py (<= 2.7.0-1+b1),
python-healpy (<= 1.10.3-2+b1),
python-htseq (<= 0.6.1p1-4),
python-imobiledevice (<= 1.2.0+dfsg-3.1),
python-kivy (<= 1.9.1-1+b1),
python-libdiscid (<= 1.0-1+b1),
python-liblo (<= 0.10.0-3+b1),
python-llfuse (<= 1.2+dfsg-1+b1),
python-lxml (<< 3.8.0-2),
python-meliae (<= 0.4.0+bzr199-3),
python-netcdf4 (<= 1.2.9-1+b1),
python-nipy-lib (<= 0.4.1-1),
python-numpy (<< 1:1.12.1-3.1),
python-pandas-lib (<= 0.20.3-1),
python-petsc4py (<= 3.7.0-3+b1),
python-pybloomfiltermmap (<= 0.3.15-0.1+b1),
python-pyfai (<= 0.13.0+dfsg-1+b1),
python-pygame-sdl2 (<= 6.99.12.4-1),
python-pygpu (<= 0.6.9-2),
python-pymca5 (<= 5.1.3+dfsg-1+b1),
python-pymssql (<= 2.1.3+dfsg-1+b1),
python-pyresample (<= 1.5.0-3+b1),
python-pysam (<= 0.11.2.2+ds-3),
python-pysph (<= 0~20160514.git91867dc-4),
python-pywt (<= 0.5.1-1.1+b1),
python-rasterio (<= 0.36.0-2+b2),
python-renpy (<= 6.99.12.4+dfsg-1),
python-scipy (<< 0.18.1-2.1),
python-sfepy (<= 2016.2-2),
python-sfml (<= 2.2~git20150611.196c88+dfsg-4),
python-shapely (<= 1.6.1-1),
python-skimage-lib (<= 0.12.3-9+b1),
python-sklearn-lib (<= 0.19.0-1),
python-specutils (<= 0.2.2-1+b1),
python-statsmodels-lib (<= 0.8.0-3),
python-stemmer (<= 1.3.0+dfsg-1+b7),
python-tables-lib (<= 3.3.0-5+b1),
python-tinycss (<= 0.4-1+b1),
python-tk (<< 2.7.14~rc1-1~),
python-wheezy.template (<= 0.1.167-1.1+b1),
python-yt (<= 3.3.3-2+b1),
sagemath (<= 8.0-5),
xpra (<= 0.17.6+dfsg-1),

Dariusz Gadomski (dgadomski) wrote :

Thanks for the explanation Matthias. I have added the Xenial variant you asked for to the spreadsheet.

The artful results will follow once I'm back, after a couple of days out.

Dariusz Gadomski (dgadomski) wrote :

I have managed to prepare the static build without PIE on top of the latest artful version [1].

I have added the results to the same spreadsheet. What I've found particularly interesting are the results of the python_startup & python_startup_no_site tests. In subsequent runs (the result in the spreadsheet was for the second run of the testsuite) the improvement is really significant.

I believe this is thanks to the fact that the relative addresses don't need to be patched before running the binary.

In case of the rest of the tests: there are some small improvements as well as some minor performance decreases.

[1] ppa:dgadomski/pyperf (python2.7 - 2.7.14-2ubuntu3~lp1638695~4)

Dariusz Gadomski (dgadomski) wrote :

Hello Matthias. Is there any progress with applying those features to Xenial? Please let me know if you need any testing to be done.

Tyler Hicks (tyhicks) wrote :

I don't feel like the change from -fstack-protector-strong to -fstack-protector should be made. The performance testing results in the spreadsheet don't suggest that the change positively impacts performance in a meaningful way. -fstack-protector-strong slightly outperforms -fstack-protector in some situations and slightly underperforms in others, suggesting that the difference is within the noise threshold. I'd strongly prefer that we continue to use -fstack-protector-strong.

Seth Arnold (seth-arnold) wrote :

How long did the benchmarks actually take? The sum of the runtimes appears to be about 11 seconds. Is that correct? Is that long enough to draw useful conclusions from the results?

Thanks

Dariusz Gadomski (dgadomski) wrote :

Seth: those values were somehow calculated from a number of runs. A single pyperformance benchmark run took ~20 minutes and I repeated each of them 3 times.

I still have the 'raw' outputs of pyperformance if needed. From those I see that there are at least 3 values for each test, and there is also a time named 'warmup' for each of the tests.

Attaching one of them as example.
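The shape Dariusz describes (a warmup plus several measured values per test) can be summarized with a short sketch; the numbers below are hypothetical, not taken from the attached output:

```python
import statistics

# Hypothetical samples for one test: a warmup value (discarded) followed
# by the measured values, as pyperformance reports them.
warmup, values = 0.254, [0.231, 0.229, 0.233]

# Aggregate the measured values only; the warmup is excluded.
mean = statistics.mean(values)
stdev = statistics.stdev(values)
print("%.3f +- %.3f s" % (mean, stdev))
```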

Dariusz Gadomski (dgadomski) wrote :

Xenial pyperformance results with -fstack-protector-strong changed to -fstack-protector.

Matthias Klose (doko) on 2017-12-04
description: updated

Hello Major, or anyone else affected,

Accepted python2.7 into zesty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/python2.7/2.7.13-2ubuntu0.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-zesty to verification-done-zesty. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-zesty. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in python2.7 (Ubuntu Zesty):
status: New → Fix Committed
tags: added: verification-needed verification-needed-zesty
Łukasz Zemczak (sil2100) wrote :

Hello Major, or anyone else affected,

Accepted python2.7 into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/python2.7/2.7.12-1ubuntu0~16.04.3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-xenial to verification-done-xenial. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-xenial. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in python2.7 (Ubuntu Xenial):
status: Confirmed → Fix Committed
tags: added: verification-needed-xenial
Matthias Klose (doko) wrote :

_math.o is now built without -fPIC for the static builds.

tags: added: verification-done-zesty
removed: verification-needed-zesty
tags: added: verification-done-xenial
removed: verification-needed-xenial
tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package python2.7 - 2.7.12-1ubuntu0~16.04.3

---------------
python2.7 (2.7.12-1ubuntu0~16.04.3) xenial-proposed; urgency=medium

  * Some performance improvements: LP: #1638695.
    - Build the _math.o object file without -fPIC for static builds.
  * Rename md5_* functions to _Py_md5_*. Closes: #868366. LP: #1734109.
  * Explicitly use the system python for byte compilation in postinst scripts.
    LP: #1682934.
  * Fix issue #22636: Avoid shell injection problems with
    ctypes.util.find_library(). LP: #1512068.

 -- Matthias Klose <email address hidden> Mon, 04 Dec 2017 15:50:18 +0100

Changed in python2.7 (Ubuntu Xenial):
status: Fix Committed → Fix Released

The verification of the Stable Release Update for python2.7 has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
