Poor mutex and semaphore performance

Bug #1058864 reported by Dave Johansen
This bug affects 1 person
Affects          Status     Importance  Assigned to  Milestone
linux (Ubuntu)   Confirmed  Low         Unassigned

Bug Description

I noticed the results of this blog post:
http://mr-edd.co.uk/blog/sad_state_of_osx_pthread_mutex_t
and I was curious if the same sort of issue existed with the linux kernel and if it would shed any light on Bug #1058854
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1058854

The results don't appear to be as bad as those seen on Mac OS X, but they are definitely a lot poorer than I expected/hoped they would be. The code that builds on Linux is now available on that blog post and I created a git repo with the history here:
https://github.com/daveisfera/test_mutex

Here are the results from running it on the same HP Pavilion dm4 as the results from Bug #1058854:
mutex
 1 thread real 0m 0.514s user 0m 0.512s sys 0m 0.000s
 2 threads real 0m 4.382s user 0m 5.176s sys 0m 3.544s
 4 threads real 0m 9.953s user 0m 9.709s sys 0m22.417s
 8 threads real 0m20.984s user 0m14.973s sys 1m 8.256s
16 threads real 0m38.725s user 0m28.426s sys 2m 5.484s

benaphore
 1 thread real 0m 0.324s user 0m 0.312s sys 0m 0.008s
 2 threads real 0m 6.372s user 0m 6.380s sys 0m 5.572s
 4 threads real 0m17.820s user 0m12.273s sys 0m42.495s
 8 threads real 0m40.266s user 0m25.626s sys 2m14.400s
16 threads real 1m40.090s user 0m52.939s sys 5m45.802s

mutex2
 1 thread real 0m 0.325s user 0m 0.320s sys 0m 0.004s
 2 threads real 0m 0.730s user 0m 0.816s sys 0m 0.384s
 4 threads real 0m 2.328s user 0m 3.388s sys 0m 5.704s
 8 threads real 0m 4.096s user 0m 5.328s sys 0m10.705s
16 threads real 0m 7.584s user 0m 9.609s sys 0m20.401s
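
To make the scaling in the tables above easier to read, here is a minimal sketch that converts the time-style durations to seconds and reports how the wall-clock time grows relative to the single-thread run. The helper names (to_seconds, slowdown) are made up for illustration and are not part of the test program. The report does not say whether each thread does a fixed amount of work or the work is split between threads, so the ratios are only descriptive.

```python
import re


def to_seconds(text):
    """Convert a duration such as '1m 8.256s' or '0m20.984s' to seconds."""
    match = re.fullmatch(r"\s*(\d+)m\s*([\d.]+)s\s*", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# Wall-clock ("real") times for the mutex case, copied from the table above.
mutex_real = {
    1: "0m 0.514s",
    2: "0m 4.382s",
    4: "0m 9.953s",
    8: "0m20.984s",
    16: "0m38.725s",
}


def slowdown(times):
    """Return each run's wall-clock time as a multiple of the 1-thread run."""
    base = to_seconds(times[1])
    return {threads: to_seconds(t) / base for threads, t in times.items()}


for threads, factor in slowdown(mutex_real).items():
    print(f"{threads:2d} threads: {factor:5.1f}x the single-thread time")
```

With the mutex figures above, the 16-thread run takes roughly 75 times as long as the single-thread run.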
---
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.24.
ApportVersion: 2.0.1-0ubuntu13
Architecture: amd64
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: PCH [HDA Intel PCH], device 0: STAC92xx Analog [STAC92xx Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: dlj 1876 F.... pulseaudio
Card0.Amixer.info:
 Card hw:0 'PCH'/'HDA Intel PCH at 0xc0800000 irq 50'
   Mixer name : 'Intel CougarPoint HDMI'
   Components : 'HDA:111d76e0,103c1793,00100102 HDA:80862805,103c1793,00100000'
   Controls : 22
   Simple ctrls : 10
DistroRelease: Ubuntu 12.04
InstallationMedia: Ubuntu 12.04.1 LTS "Precise Pangolin" - Release amd64 (20120823.1)
MachineType: Hewlett-Packard HP Pavilion dm4 Notebook PC
Package: linux (not installed)
ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.2.0-31-generic root=UUID=162256F02256D479 loop=/hostname/disks/root.disk ro quiet splash vt.handoff=7
ProcVersionSignature: Ubuntu 3.2.0-31.50-generic 3.2.28
RelatedPackageVersions:
 linux-restricted-modules-3.2.0-31-generic N/A
 linux-backports-modules-3.2.0-31-generic N/A
 linux-firmware 1.79.1
StagingDrivers: rts_pstor mei
Tags: precise running-unity staging
Uname: Linux 3.2.0-31-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm cdrom dip lpadmin plugdev sambashare sudo
dmi.bios.date: 01/17/2012
dmi.bios.vendor: Hewlett-Packard
dmi.bios.version: F.08
dmi.board.asset.tag: Base Board Asset Tag
dmi.board.name: 1793
dmi.board.vendor: Hewlett-Packard
dmi.board.version: 41.1C
dmi.chassis.asset.tag: Chassis Asset Tag
dmi.chassis.type: 10
dmi.chassis.vendor: Hewlett-Packard
dmi.chassis.version: Chassis Version
dmi.modalias: dmi:bvnHewlett-Packard:bvrF.08:bd01/17/2012:svnHewlett-Packard:pnHPPaviliondm4NotebookPC:pvr068E100002204710000622100:rvnHewlett-Packard:rn1793:rvr41.1C:cvnHewlett-Packard:ct10:cvrChassisVersion:
dmi.product.name: HP Pavilion dm4 Notebook PC
dmi.product.version: 068E100002204710000622100
dmi.sys.vendor: Hewlett-Packard

Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1058864

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Dave Johansen (davejohansen) wrote : AcpiTables.txt

apport information

tags: added: apport-collected precise running-unity staging
description: updated
Dave Johansen (davejohansen) wrote : AlsaDevices.txt

apport information

Dave Johansen (davejohansen) wrote : AplayDevices.txt

apport information

Dave Johansen (davejohansen) wrote : BootDmesg.txt

apport information

Dave Johansen (davejohansen) wrote : CRDA.txt

apport information

Dave Johansen (davejohansen) wrote : Card0.Amixer.values.txt

apport information

Dave Johansen (davejohansen) wrote : Card0.Codecs.codec.0.txt

apport information

Dave Johansen (davejohansen) wrote : Card0.Codecs.codec.3.txt

apport information

Dave Johansen (davejohansen) wrote : CurrentDmesg.txt

apport information

Dave Johansen (davejohansen) wrote : IwConfig.txt

apport information

Dave Johansen (davejohansen) wrote : Lspci.txt

apport information

Dave Johansen (davejohansen) wrote : Lsusb.txt

apport information

Dave Johansen (davejohansen) wrote : PciMultimedia.txt

apport information

Dave Johansen (davejohansen) wrote : ProcCpuinfo.txt

apport information

Dave Johansen (davejohansen) wrote : ProcInterrupts.txt

apport information

Dave Johansen (davejohansen) wrote : ProcModules.txt

apport information

Dave Johansen (davejohansen) wrote : PulseList.txt

apport information

Dave Johansen (davejohansen) wrote : RfKill.txt

apport information

Dave Johansen (davejohansen) wrote : UdevDb.txt

apport information

Dave Johansen (davejohansen) wrote : UdevLog.txt

apport information

Dave Johansen (davejohansen) wrote : WifiSyslog.txt

apport information

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.6 kernel[0] (Not a kernel in the daily directory) and install both the linux-image and linux-image-extra .deb packages.

Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. Please only remove that one tag and leave the other tags. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text.

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.6-quantal/

Changed in linux (Ubuntu):
importance: Undecided → Medium
tags: added: needs-upstream-testing
Dave Johansen (davejohansen) wrote :

I'm running 12.04 so I gave the 3.4 kernel a try ( http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.4-precise/ ). It appears that the performance is about the same, if not worse, with 3.4.

Here are the results:
mutex
 1 thread real 0m 0.515s user 0m 0.508s sys 0m 0.004s
 2 threads real 0m 4.305s user 0m 4.848s sys 0m 3.728s
 4 threads real 0m12.149s user 0m 9.553s sys 0m33.058s
 8 threads real 0m23.016s user 0m15.909s sys 1m15.709s
16 threads real 0m54.892s user 0m36.570s sys 3m 2.347s

benaphore
 1 thread real 0m 0.319s user 0m 0.316s sys 0m 0.000s
 2 threads real 0m 6.692s user 0m 6.880s sys 0m 5.824s
 4 threads real 0m21.479s user 0m15.269s sys 0m59.448s
 8 threads real 0m43.105s user 0m26.278s sys 2m25.581s
16 threads real 1m22.735s user 0m55.827s sys 4m34.185s

mutex2
 1 thread real 0m 0.322s user 0m 0.316s sys 0m 0.004s
 2 threads real 0m 0.843s user 0m 1.088s sys 0m 0.564s
 4 threads real 0m 2.323s user 0m 3.516s sys 0m 5.648s
 8 threads real 0m 4.184s user 0m 5.576s sys 0m11.049s
16 threads real 0m 8.070s user 0m10.841s sys 0m21.253s

tags: added: kernel-bug-exists-upstream
removed: needs-upstream-testing
Dave Johansen (davejohansen) wrote :

I also ran it with the 3.6 kernel ( http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.6-quantal/ ) and just like with 3.4, the performance is about the same, if not worse.

Here are the results:
mutex
 1 thread real 0m 0.545s user 0m 0.520s sys 0m 0.000s
 2 threads real 0m 3.550s user 0m 3.636s sys 0m 3.380s
 4 threads real 0m14.357s user 0m11.073s sys 0m40.387s
 8 threads real 0m29.686s user 0m19.409s sys 1m34.174s
16 threads real 0m59.138s user 0m38.114s sys 3m17.548s

benaphore
 1 thread real 0m 0.318s user 0m 0.316s sys 0m 0.000s
 2 threads real 0m 7.124s user 0m 6.332s sys 0m 7.248s
 4 threads real 0m19.999s user 0m15.173s sys 0m54.255s
 8 threads real 0m41.673s user 0m26.182s sys 2m19.821s
16 threads real 1m17.915s user 0m55.843s sys 4m15.032s

mutex2
 1 thread real 0m 0.319s user 0m 0.312s sys 0m 0.004s
 2 threads real 0m 0.823s user 0m 0.880s sys 0m 0.724s
 4 threads real 0m 2.921s user 0m 3.892s sys 0m 7.612s
 8 threads real 0m 4.272s user 0m 4.984s sys 0m11.709s
16 threads real 0m 8.097s user 0m 9.245s sys 0m22.981s

Dave Johansen (davejohansen) wrote :

I installed the kernel with the BFS scheduler from https://launchpad.net/~chogydan/+archive/ppa and ran the test. It appears that the two schedulers count user/system time differently. Overall, BFS appears slightly better than CFS in this test, but it doesn't show any significant improvement or resolve the underlying issue.

Here are the results:
mutex
 1 thread real 0m 0.511s user 0m 0.509s sys 0m 0.000s
 2 threads real 0m 3.005s user 0m 3.624s sys 0m 2.307s
 4 threads real 0m11.707s user 0m40.539s sys 0m 0.297s
 8 threads real 0m19.150s user 1m 9.498s sys 0m 0.331s
16 threads real 0m34.370s user 2m12.551s sys 0m 0.735s

benaphore
 1 thread real 0m 0.334s user 0m 0.331s sys 0m 0.000s
 2 threads real 0m 4.842s user 0m 9.066s sys 0m 0.162s
 4 threads real 0m17.386s user 0m59.935s sys 0m 0.404s
 8 threads real 0m39.090s user 2m 0.398s sys 0m35.538s
16 threads real 1m16.636s user 3m14.865s sys 1m50.923s

mutex2
 1 thread real 0m 0.317s user 0m 0.315s sys 0m 0.001s
 2 threads real 0m 0.843s user 0m 0.990s sys 0m 0.683s
 4 threads real 0m 2.087s user 0m 1.610s sys 0m 6.571s
 8 threads real 0m 3.648s user 0m 2.663s sys 0m11.841s
16 threads real 0m 6.681s user 0m 4.381s sys 0m22.244s
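
Because the two schedulers seem to split CPU time between user and sys differently, comparing the sum of the two may be more meaningful than comparing either column alone. The sketch below does that for the mutex case. It assumes the CFS baseline is the stock 3.2.0-31 result from the bug description, which this comment does not state explicitly, and the helper names are made up for illustration.

```python
def secs(minutes, seconds):
    """Convert a minutes/seconds pair from the tables into seconds."""
    return minutes * 60 + seconds


# (user, sys) CPU time for the "mutex" case, keyed by thread count.
# Assumption: CFS figures are the stock 3.2.0-31 results from the bug
# description; BFS figures are the results in this comment.
cfs = {
    1: (secs(0, 0.512), secs(0, 0.000)),
    2: (secs(0, 5.176), secs(0, 3.544)),
    4: (secs(0, 9.709), secs(0, 22.417)),
    8: (secs(0, 14.973), secs(1, 8.256)),
    16: (secs(0, 28.426), secs(2, 5.484)),
}
bfs = {
    1: (secs(0, 0.509), secs(0, 0.000)),
    2: (secs(0, 3.624), secs(0, 2.307)),
    4: (secs(0, 40.539), secs(0, 0.297)),
    8: (secs(1, 9.498), secs(0, 0.331)),
    16: (secs(2, 12.551), secs(0, 0.735)),
}


def total_cpu(results):
    """Sum user and sys time for each thread count."""
    return {n: user + sys_time for n, (user, sys_time) in results.items()}


def sys_share(results):
    """Fraction of total CPU time that was accounted as sys time."""
    shares = {}
    for n, (user, sys_time) in results.items():
        total = user + sys_time
        shares[n] = sys_time / total if total else 0.0
    return shares


for n in sorted(cfs):
    print(
        f"{n:2d} threads: "
        f"CFS total {total_cpu(cfs)[n]:7.3f}s (sys {sys_share(cfs)[n]:.0%}), "
        f"BFS total {total_cpu(bfs)[n]:7.3f}s (sys {sys_share(bfs)[n]:.0%})"
    )
```

At 16 threads the totals are of the same order under both schedulers, while the share attributed to sys is very different, which is consistent with the accounting difference noted above.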

Dave Johansen (davejohansen) wrote :

As another data point, I loaded the Ubuntu 8.04-4 x86 LiveCD and rebuilt the executable and ran the test. I then ran the same executable on the Ubuntu 7.10 x86 LiveCD and the Ubuntu 12.04 x64 Install (after installing ia32-libs) and here are the results. I chose those releases because those were the ones that straddled the switch to the Completely Fair Scheduler (CFS). In the mutex and benaphore cases, there definitely appears to be a regression in performance between Ubuntu 7.10 and Ubuntu 8.04. In the mutex2 case, Ubuntu 8.04 seems to outperform 7.10.

By and large, Ubuntu 12.04 seems to be in line with Ubuntu 8.04, but the results in the benaphore case appear to have further regressed.

Here are the actual results:

Ubuntu 7.10
mutex
real 0m 0.671s user 0m 0.672s sys 0m 0.000s
real 0m 3.641s user 0m 4.128s sys 0m 2.476s
real 0m 9.673s user 0m10.845s sys 0m16.433s
real 0m18.666s user 0m18.949s sys 0m45.251s
real 0m41.603s user 0m45.671s sys 1m59.703s

benaphore
real 0m 0.433s user 0m 0.432s sys 0m 0.000s
real 0m 4.441s user 0m 2.912s sys 0m 4.476s
real 0m14.732s user 0m10.349s sys 0m31.250s
real 0m31.096s user 0m23.193s sys 1m24.241s
real 0m54.901s user 0m42.991s sys 2m55.435s

mutex2
real 0m 0.397s user 0m 0.396s sys 0m 0.004s
real 0m 1.154s user 0m 1.668s sys 0m 0.552s
real 0m 3.840s user 0m 8.001s sys 0m 7.244s
real 0m 5.401s user 0m11.845s sys 0m 9.621s
real 0m11.053s user 0m24.422s sys 0m19.685s

Ubuntu 8.04
mutex
real 0m 0.581s user 0m 0.580s sys 0m 0.000s
real 0m 4.565s user 0m 5.696s sys 0m 3.400s
real 0m14.395s user 0m14.137s sys 0m33.518s
real 0m29.346s user 0m24.862s sys 1m31.506s
real 0m50.856s user 0m46.871s sys 2m36.462s

benaphore
real 0m 0.331s user 0m 0.328s sys 0m 0.000s
real 0m 5.629s user 0m 3.340s sys 0m 4.684s
real 0m22.760s user 0m17.581s sys 0m55.155s
real 0m32.677s user 0m27.390s sys 1m42.750s
real 1m 7.969s user 1m 0.220s sys 3m31.481s

mutex2
real 0m 0.402s user 0m 0.404s sys 0m 0.000s
real 0m 0.841s user 0m 0.980s sys 0m 0.548s
real 0m 2.578s user 0m 4.140s sys 0m 6.024s
real 0m 4.329s user 0m 6.556s sys 0m10.573s
real 0m 9.003s user 0m13.417s sys 0m22.329s

Ubuntu 12.04
mutex
real 0m 0.515s user 0m 0.512s sys 0m 0.000s
real 0m 4.001s user 0m 4.820s sys 0m 3.148s
real 0m11.198s user 0m11.669s sys 0m25.246s
real 0m19.938s user 0m16.861s sys 1m 1.536s
real 0m48.183s user 0m37.994s sys 2m33.446s

benaphore
real 0m 0.319s user 0m 0.320s sys 0m 0.000s
real 0m 4.868s user 0m 5.760s sys 0m 3.900s
real 0m19.777s user 0m16.613s sys 0m46.351s
real 0m48.226s user 0m34.098s sys 2m37.834s
real 1m38.550s user 1m10.000s sys 5m22.892s

mutex2
real 0m 0.337s user 0m 0.332s sys 0m 0.000s
real 0m 0.875s user 0m 0.988s sys 0m 0.744s
real 0m 2.082s user 0m 3.124s sys 0m 5.148s
real 0m 4.114s user 0m 5.368s sys 0m10.825s
real 0m 8.069s user 0m10.785s sys 0m21.289s
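
The comparison between releases can be put in numbers with a short sketch like the one below, which expresses the wall-clock time of the last row in each block as a percentage change relative to Ubuntu 7.10. The rows above are not labelled, so treating the last row as the 16-thread run (following the 1/2/4/8/16 order of the earlier tables) is an assumption, as are the helper names.

```python
# Wall-clock ("real") seconds from the last row of each block above.
# Assumption: rows follow the 1/2/4/8/16-thread order used in the earlier
# tables, so the last row of each block is the 16-thread run.
real_last_row = {
    "mutex": {"7.10": 41.603, "8.04": 50.856, "12.04": 48.183},
    "benaphore": {"7.10": 54.901, "8.04": 60 + 7.969, "12.04": 60 + 38.550},
    "mutex2": {"7.10": 11.053, "8.04": 9.003, "12.04": 8.069},
}


def percent_change(times, baseline="7.10"):
    """Percentage change in wall-clock time relative to the baseline release."""
    base = times[baseline]
    return {
        release: (t / base - 1.0) * 100.0
        for release, t in times.items()
        if release != baseline
    }


change = {case: percent_change(times) for case, times in real_last_row.items()}

for case, per_release in change.items():
    for release, pct in per_release.items():
        print(f"{case:9s} {release:5s}: {pct:+6.1f}% vs 7.10")
```

Under that assumption, the benaphore case is roughly 24% slower on 8.04 and roughly 80% slower on 12.04 than on 7.10, the mutex case is roughly 22% and 16% slower, and mutex2 is roughly 19% and 27% faster.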

penalvch (penalvch) wrote :

Dave Johansen, could you please provide the full computer model as noted on the sticker (ex. HP Pavilion dm4-2015dx Entertainment Notebook PC)?

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Dave Johansen (davejohansen) wrote :

I no longer have access to this hardware to get the full computer model, but the referenced test programs should be able to reproduce the issue on most, if not all, machines (it has worked on every one I've tried it on to date).

penalvch (penalvch)
Changed in linux (Ubuntu):
importance: Medium → Low
status: Incomplete → Confirmed