x86, amd: Avoid cache aliasing penalties on AMD family 15h

Bug #862583 reported by Tim Gardner
This bug affects 2 people
Affects          Status         Importance   Assigned to   Milestone
linux (Ubuntu)   Fix Released   Undecided    Tim Gardner
Oneiric          Fix Released   Undecided    Tim Gardner
Precise          Fix Released   Undecided    Tim Gardner

Bug Description

This patch provides performance tuning for the "Bulldozer" CPU. With its
    shared instruction cache there is a chance of generating an excessive
    number of cache cross-invalidates when running specific workloads on the
    cores of a compute module.

    This excessive number of cross-invalidations can be observed if cache
    lines backed by shared physical memory alias in bits [14:12] of their
    virtual addresses, as those bits are used for the index generation.

    This patch addresses the issue by clearing all the bits in the [14:12]
    slice of the file mapping's virtual address at generation time, thus
    forcing those bits the same for all mappings of a single shared library
    across processes and, in doing so, avoids instruction cache aliases.
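
The bit arithmetic described above can be sketched in user space as follows. This is an illustration only, not the kernel's implementation: the names `ALIAS_MASK`, `index_bits`, and `align_va` are hypothetical, the two base addresses are made up, and rounding up or down depending on allocation direction is an assumption made for the sketch. For a page-aligned address, clearing bits [14:12] is the same as aligning it to a 32 KiB boundary.

```python
PAGE_SHIFT = 12
ALIAS_MASK = 0b111 << PAGE_SHIFT  # bits [14:12], i.e. 0x7000


def index_bits(va: int) -> int:
    """Return bits [14:12] of a virtual address."""
    return (va & ALIAS_MASK) >> PAGE_SHIFT


def align_va(addr: int, topdown: bool = False) -> int:
    """Clear bits [14:12] of a page-aligned address.

    Assumption for this sketch: round down when allocating top-down,
    round up otherwise, so the result stays on the expected side of addr.
    """
    if not topdown:
        addr += ALIAS_MASK
    return addr & ~ALIAS_MASK


# Hypothetical base addresses of the same shared library in two processes.
base_a = 0x7F3A_1234_5000
base_b = 0x7F9C_0BAD_2000

# Before alignment the two mappings differ in bits [14:12].
print(index_bits(base_a), index_bits(base_b))

# After alignment both mappings have bits [14:12] cleared.
print(hex(align_va(base_a)), hex(align_va(base_b)))
```

Because every mapping of the library ends up with the same value in bits [14:12], the same physical line is indexed identically regardless of which process maps it.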

    It also adds the command line option "align_va_addr=(32|64|on|off)" with
    which virtual address alignment can be enabled for 32-bit or 64-bit x86
    individually, or both, or be completely disabled.
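
One way to picture the four option values is as a pair of per-mode flags. The sketch below is illustrative only: the flag names `ALIGN_32` and `ALIGN_64`, the function `parse_align_va_addr`, and the representation of "off" as zero are assumptions, not the kernel's actual parser. It returns `None` when the option is absent because the description above does not state a default.

```python
from typing import Optional

ALIGN_32 = 1 << 0  # hypothetical flag: align mappings of 32-bit processes
ALIGN_64 = 1 << 1  # hypothetical flag: align mappings of 64-bit processes

# Accepted values of align_va_addr=, as listed in the patch description.
_VALUES = {
    "32": ALIGN_32,
    "64": ALIGN_64,
    "on": ALIGN_32 | ALIGN_64,
    "off": 0,
}


def parse_align_va_addr(cmdline: str) -> Optional[int]:
    """Return the flag set selected by align_va_addr=, or None if absent."""
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "align_va_addr" and sep:
            if value not in _VALUES:
                raise ValueError(f"unsupported align_va_addr value: {value!r}")
            return _VALUES[value]
    return None


# Example: a command line that enables alignment for 64-bit only.
flags = parse_align_va_addr("ro quiet align_va_addr=64")
print(bool(flags & ALIGN_32), bool(flags & ALIGN_64))
```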

    This change leaves virtual region address allocation on other families
    and/or vendors unaffected.

Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 862583

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Tim Gardner (timg-tpi) wrote :

We have recently discovered an issue with the Bulldozer instruction cache
which may cause certain workloads to experience performance regressions.
The thread at http://marc.info/?l=linux-kernel&m=131134057927055&w=3
discusses the problem and the fix.

We have patches queued for 3.2; they should also apply to 3.0:

1. http://git.kernel.org/tip/dfb09f9b7ab03fd367740e541a5caf830ed56726
2. http://git.kernel.org/tip/a110b5ec7371592eac856ac5c22dc7b518952d44
3. http://git.kernel.org/tip/8fa8b035085e7320c15875c1f6b03b290ca2dd66

and then

http://git.kernel.org/tip/9387f774d61b01ab71bade85e6d0bfab0b3419bd

We would very much like these patches to get into 11.10. As soon as
kernel.org is revived, please consider integrating them.

Tim Gardner (timg-tpi) wrote :

SRU Justification

Impact: Some workloads cause an excessive number of cache cross-invalidates

Patch description: Avoid cache aliasing penalties on AMD family 15h

Changed in linux (Ubuntu Oneiric):
assignee: nobody → Tim Gardner (timg-tpi)
status: Incomplete → In Progress
Herton R. Krzesinski (herton) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-oneiric' to 'verification-done-oneiric'.

If verification is not done by one week from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-oneiric
Tim Gardner (timg-tpi) wrote :

Tested on the affected CPU. It does not appear to cause any regressions, though I'll have to depend on the expertise of the AMD engineers as to whether it improves performance.

tags: added: verification-done-oneiric
removed: verification-needed-oneiric
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.0.0-13.22

---------------
linux (3.0.0-13.22) oneiric-proposed; urgency=low

  [Herton R. Krzesinski]

  * Release Tracking Bug
    - LP: #884847

  [ Herton Ronaldo Krzesinski ]

  * Revert "SAUCE: Add a new entry (413c:8197) to Bluetooth USB device ID
    table"

linux (3.0.0-13.21) oneiric-proposed; urgency=low

  [Herton R. Krzesinski]

  * Release Tracking Bug
    - LP: #876701

  [ Leann Ogasawara ]

  * Revert "SAUCE: ata: make DVD drive recognisable on systems with
    Sandybridge CPT chipset"
    - LP: #737388, #782389, #794642
  * SAUCE: drm/radeon/kms: Fix logic error in DP HPD handler
    - LP: #860868

  [ Ming Lei ]

  * SAUCE: [media] uvcvideo: Set alternate setting 0 on resume if the bus
    has been reset
    - LP: #816484
  * SAUCE: ata_piix: make DVD Drive recognisable on systems with Intel
    Sandybridge chipsets(v2)
    - LP: #737388, #782389, #794642

  [ Seth Forshee ]

  * SAUCE: acer-wmi: Add wireless quirk for Lenovo 3000 N200
    - LP: #857297

  [ Tim Gardner ]

  * SAUCE: Add a new entry (413c:8197) to Bluetooth USB device ID table
    - LP: #854399
  * [Config] Enable ftrace support in the mac80211 layer
    - LP: #865171
  * SAUCE: usb/core/devio.c: Check for printer class specific request
    - LP: #872711
  * SAUCE: xHCI: AMD isoc link TRB chain bit quirk
    - LP: #872811

  [ Upstream Kernel Changes ]

  * Revert "rt2x00: Serialize TX operations on a queue."
    - LP: #868628
  * Revert "rt2x00: fix crash in rt2800usb_write_tx_desc"
    - LP: #868628
  * Revert "rt2x00: fix crash in rt2800usb_get_txwi"
    - LP: #868628
  * Revert "rt2x00: Move rt2800_txdone and rt2800_txdone_entry_check to
    rt2800usb."
    - LP: #868628
  * Revert "sfc: Use write-combining to reduce TX latency" and follow-ups
    - LP: #868628
  * Revert "drm/radeon/kms: fix typo in r100_blit_copy"
    - LP: #868628
  * x86, amd: Avoid cache aliasing penalties on AMD family 15h
    - LP: #862583
  * x86: Add a BSP cpu_dev helper
    - LP: #862583
  * x86, amd: Move BSP code to cpu_dev helper
    - LP: #862583
  * x86-32, amd: Move va_align definition to unbreak 32-bit build
    - LP: #862583
  * Make TASKSTATS require root access, CVE-2011-2494
    - LP: #866021
    - CVE-2011-2494
  * kernel/printk: do not turn off bootconsole in printk_late_init() if
    keep_bootcon
    - LP: #868628
  * rapidio: fix use of non-compatible registers
    - LP: #868628
  * arch/powerpc/sysdev/fsl_rio.c: correct IECSR register clear value
    - LP: #868628
  * ASoC: soc-jack: Fix checking return value of request_any_context_irq
    - LP: #868628
  * ASoC: ad193x: fix registers definition
    - LP: #868628
  * ASoC: ad193x: fix dac word len setting
    - LP: #868628
  * omap-serial: Allow IXON and IXOFF to be disabled.
    - LP: #868628
  * serial: 8250_pnp: add Intermec CV60 touchscreen device
    - LP: #868628
  * 8250_pci: add support for Rosewill RC-305 4x serial port card
    - LP: #868628
  * 8250: Fix race condition in serial8250_backup_timeout().
    - LP: #868628
  * tty: Add "spi:" prefix for spi modalias
    - LP: #868628
  * TTY: pty, fix pty counting
    - LP: #868628
  * USB: ftdi_sio: add Calao r...

Changed in linux (Ubuntu Oneiric):
status: In Progress → Fix Released
Tim Gardner (timg-tpi)
Changed in linux (Ubuntu Precise):
status: In Progress → Fix Released
Shahar Or (mightyiam) wrote :

Could this be causing frame drops, display stuttering, or display lag?

It is a long shot, but it is my only lead so far: I changed my CPU to the FX-4100, and the display has been dropping frames ever since.

I won't give more details here; they are all in Bug #927918. I'll just say that I'm at a loss... Perhaps a BIOS bug...

Thanks and Blessings,
Shahar
