mgag200 driver hangs on HP ProLiant Gen 8 platform

Bug #1042903 reported by Narinder Gupta on 2012-08-28
This bug affects 1 person
Affects          Status      Importance  Assigned to      Milestone
Linux            Incomplete  Medium
linux (Ubuntu)               Medium      Leann Ogasawara
Quantal                      Medium      Leann Ogasawara
Raring                       Medium      Leann Ogasawara

Bug Description

Quantal hangs on the HP ProLiant Gen8 platform. To reproduce, install Quantal, reboot the system, and observe the hang. After triaging, we found that the new mgag200 driver built in Quantal was causing the hang. After blacklisting the mgag200 driver, the system boots without any issue. Precise works fine as it does not include the mgag200 driver.
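
The blacklisting step mentioned above is normally done with a one-line modprobe configuration entry. The following is a minimal sketch, not taken from this report, of how such an entry could be generated and checked. The file name /etc/modprobe.d/blacklist-mgag200.conf mentioned in the comments is an assumption for illustration, and the helper names are hypothetical.

```python
def blacklist_entry(module: str) -> str:
    """Build the modprobe.d line that stops a module from being auto-loaded."""
    return f"blacklist {module}\n"


def is_blacklisted(modprobe_conf_text: str, module: str) -> bool:
    """Return True if the text contains an active 'blacklist <module>' line.

    Blank lines and lines starting with '#' are skipped, matching the
    modprobe.d file format.
    """
    for raw in modprobe_conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) == 2 and fields[0] == "blacklist" and fields[1] == module:
            return True
    return False


# Hypothetical contents of /etc/modprobe.d/blacklist-mgag200.conf
sample = "# workaround for LP: #1042903\n" + blacklist_entry("mgag200")
print(is_blacklisted(sample, "mgag200"))
```

The check is deliberately done on text rather than on the live system, so it can be run anywhere without root privileges.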

visibility: private → public
affects: hp → linux
Changed in linux:
assignee: nobody → Leann Ogasawara (leannogasawara)
importance: Undecided → Medium
status: New → In Progress

After an extensive series of tests, we've determined that the following kernel config change is the root cause of the issue:

commit 682f607fd4a57d273db9ecc360589a8fa9cec8e6
Author: Andy Whitcroft <email address hidden>
Date: Fri Jul 20 21:33:43 2012 +0100

    UBUNTU: [Config] Enable CONFIG_DRM_AST/_CIRRUS_QEMU/_MGAG200

The above config change enabled the mgag200 driver which is described as follows:

config DRM_MGAG200
        tristate "Kernel modesetting driver for MGA G200 server engines"
        depends on DRM && PCI && EXPERIMENTAL
        select FB_SYS_FILLRECT
        select FB_SYS_COPYAREA
        select FB_SYS_IMAGEBLIT
        select DRM_KMS_HELPER
        select DRM_TTM
        help
         This is a KMS driver for the MGA G200 server chips, it
         does not support the original MGA G200 or any of the desktop
         chips. It requires 0.3.0 of the modesetting userspace driver,
         and a version of mga driver that will fail on KMS enabled
         devices.

This driver is marked as EXPERIMENTAL. Our default config policy is to enable device-specific experimental options to promote testing and feedback. However, should any experimental option produce adverse effects, we will disable it. As such, I've built a test kernel with CONFIG_DRM_MGAG200 disabled and we have confirmed this resolves the issue [1]. I've committed this change to the Quantal kernel git repository. It should be included in the subsequent 3.5.0-13.14 or newer upload. Unfortunately, this may not land in time for the Quantal Beta-1 release, but it will land well in advance of 12.10 being final. A temporary workaround, aside from running the test kernel [1], is to blacklist the mgag200 driver.

[1] http://people.canonical.com/~ogasawara/lp1039178/
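
To confirm whether a given kernel was built with the driver enabled, one option is to inspect its config file (on Ubuntu this is usually shipped as /boot/config-<kernel release>). The sketch below is an illustration only: the helper name and the two sample config snippets are assumptions, not output from the kernels discussed here.

```python
def kconfig_state(config_text: str, option: str) -> str:
    """Return the value of a kernel config option ('y', 'm', ...),
    'not set' if it is explicitly disabled, or 'absent' if it is missing."""
    for raw in config_text.splitlines():
        line = raw.strip()
        if line == f"# {option} is not set":
            return "not set"
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return "absent"


# Hypothetical config excerpts before and after the driver was disabled.
before = "CONFIG_DRM=m\nCONFIG_DRM_MGAG200=m\n"
after = "CONFIG_DRM=m\n# CONFIG_DRM_MGAG200 is not set\n"

print(kconfig_state(before, "CONFIG_DRM_MGAG200"))
print(kconfig_state(after, "CONFIG_DRM_MGAG200"))
```

In practice the text would be read from the config file of the kernel under test instead of the inline samples.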

affects: linux → linux (Ubuntu)
Changed in linux (Ubuntu):
status: In Progress → Fix Committed

I have filed a bug in the kernel.org Bugzilla. Details:

Bug 46591 <https://bugzilla.kernel.org/show_bug.cgi?id=46591> - Matrox g200 driver mgag200 hangs on HP ProLiant Gen8 Platform

--
Thanks and Regards,
Narinder Gupta (PMP) <email address hidden>
Technical Account Manager
Canonical, Ltd. narindergupta [irc.freenode.net]
+1.281.736.5150 narindergupta2007[skype]

Ubuntu- Linux for human beings | www.ubuntu.com | www.canonical.com

Changed in linux:
importance: Unknown → Medium
status: Unknown → Confirmed

I wanted to relay the following here in the bug report...I was notified of a commit in the latest upstream v3.5.3 release. I thought it would be interesting to test the latest Ubuntu 3.5.0-13.13 which was rebased to this latest v3.5.3 upstream stable release. Narinder and I didn't originally test this latest Ubuntu kernel as I only uploaded it yesterday. Anyways, I see the following commit included there:

commit be67f30591ffe86e250955645a97e3d68f5a2cdf
Author: Dave Airlie <email address hidden>
Date: Thu Aug 9 15:00:15 2012 +1000

    drm/mgag200: fix G200ER pll picking algorithm

    commit 9830605d4c070b16ec5c24a75503877cc7698409 upstream.

    The original code was misported from the X driver,

    a) an int went to unsigned int, breaking the downward counting testm code
    b) the port did the vco/computed clock bits completely wrong.

    This fixes an infinite loop on modprobe on some Dell servers with the G200ER
    chipset variant.

    Found in internal testing.

    Signed-off-by: Dave Airlie <email address hidden>
    Signed-off-by: Greg Kroah-Hartman <email address hidden>

So the issue which was being seen might be fixed by the above commit, which would allow us to keep the mgag200 driver enabled. I figured it wouldn't hurt to test anyways. Thanks.
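
Point (a) of the commit message describes a classic bug class: a counter that was a signed int becomes unsigned, so a "count down until below zero" loop can never terminate. The sketch below only illustrates that bug class and is not the driver's code. It emulates 32-bit unsigned arithmetic with a mask, borrows the variable name testm from the commit message, and adds a hypothetical limit parameter so the demonstration stops.

```python
MASK32 = 0xFFFFFFFF  # emulate a 32-bit unsigned int


def count_down_signed(start: int) -> int:
    """Signed counter: the loop ends once testm drops below zero."""
    iterations = 0
    testm = start
    while testm >= 0:
        iterations += 1
        testm -= 1
    return iterations


def count_down_unsigned(start: int, limit: int) -> int:
    """Unsigned counter: decrementing 0 wraps to 4294967295, so the
    condition 'testm >= 0' is always true. 'limit' caps the demo."""
    iterations = 0
    testm = start & MASK32
    while testm >= 0 and iterations < limit:
        iterations += 1
        testm = (testm - 1) & MASK32
    return iterations


print(count_down_signed(8))          # visits 8, 7, ..., 0 and stops
print(count_down_unsigned(8, 1000))  # only stops because of the cap
```

Without the cap, the unsigned variant would spin forever, which is consistent with the infinite loop on modprobe that the commit message reports.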

Narinder Gupta (narindergupta) wrote :

The HP offshore team tried the 3.5.0-13.13 kernel and still saw the same
problem.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.5.0-13.14

---------------
linux (3.5.0-13.14) quantal; urgency=low

  [ Leann Ogasawara ]

  * [Config] Disable CONFIG_DRM_MGAG200
    - LP: #1042903

  [ Upstream Kernel Changes ]

  * [media] uvcvideo: Reset the bytesused field when recycling an erroneous
    buffer
    - LP: #1042809
 -- Tim Gardner <email address hidden> Tue, 28 Aug 2012 08:43:55 -0400

Changed in linux (Ubuntu Quantal):
status: Fix Committed → Fix Released
Changed in linux:
status: Confirmed → Incomplete
Changed in linux (Ubuntu Raring):
status: Fix Released → In Progress
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.8.0-0.4

---------------
linux (3.8.0-0.4) raring; urgency=low

  [ Leann Ogasawara ]

  * [Config] Update CONFIG_TOUCHSCREEN_EGALAX build annotation
  * [Config] Update CONFIG_IIO build annotation
  * [Config] Update CONFIG_TOUCHSCREEN_EETI annotation
  * [Config] Remove CONFIG_SPI_DW_MMIO annotation
  * [Config] Remove CONFIG_SPI_PL022 annotation
  * [Config] Update CONFIG_EZX_PCAP annotation
  * [Config] Update CONFIG_SENSORS_AK8975 annotation
  * [Config] Disable CONFIG_DRM_MGAG200
    - LP: #1042903
 -- Leann Ogasawara <email address hidden> Mon, 14 Jan 2013 10:01:50 -0800

Changed in linux (Ubuntu Raring):
status: In Progress → Fix Released

The verification of this Stable Release Update has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
