[ivb] [WORTMANN TERRA MOBILE ULTRABOOK 1450 II] System freeze after high memory usage

Bug #1137817 reported by rsoika
This bug affects 2 people
Affects: linux (Ubuntu)
Status: Expired
Importance: Medium
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

I have a Wortmann Terra Mobile 1450 II:
http://www.wortmann.de/de-de/product/aa_terra_mobile_nb/1220285/terra-mobile-ultrabook-1450-ii.aspx

It has frozen randomly on Ubuntu 12.10 64-bit since I installed the system in Dec. 2012. When the system freezes, neither mouse nor keyboard input works, and REISUB has no effect. The screen is corrupted and no longer updates. The only keyboard function that still works is Fn+F9 (switch display on/off). I have to power the system off hard.

The freeze occurs both when the system runs on battery and when it is plugged in. When the freeze occurs with the power adapter plugged in, the system switches off as soon as I disconnect the power adapter - even if the battery is fully charged.
But power management in general seems to work perfectly in Ubuntu 12.10.

To me it looks as if the freeze often occurs after high memory usage.
I can use the system with a few applications (Thunderbird / Firefox) for a long time (several hours) without any freeze.
But when I start Java programming, the probability that the problem occurs increases. This means:
I start Eclipse 4.2, MySQL and GlassFish 3.2 Server, open a lot of browser windows, and read and write many files to disk.
In this case memory usage increases from less than 1 GB to more than 3 GB (of 8 GB RAM in total) - I do not know if this is relevant, but it may be.
I have observed the situation for more than two months. As I use my system for work, I have freezes once or twice a day - mostly when I have a lot of work ;-) I develop JEE server business applications, so when the system freezes there is really no heavy graphics load. I am using GNOME Shell.
The freeze seems to be completely random - I cannot reproduce it.

I cannot be sure, but it seems to me that since yesterday the system shuts down a few seconds after it freezes. I noticed this because I tried to log in via ssh to collect the contents of /sys/kernel/debug/dri/0/i915_error_state. But as the system shut down, I was not able to check the content.
---
ApportVersion: 2.9.2-0ubuntu8
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: ubuntu 4258 F.... pulseaudio
                      ubuntu 14314 F.... pulseaudio
CasperVersion: 1.331
DistroRelease: Ubuntu 13.04
LiveMediaBuild: Ubuntu 13.04 "Raring Ringtail" - Release amd64 (20130424)
MachineType: To be filled by O.E.M. To be filled by O.E.M.
MarkForUpload: True
Package: linux (not installed)
ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: noprompt cdrom-detect/try-usb=true persistent file=/cdrom/preseed/username.seed boot=casper initrd=/casper/initrd.lz quiet splash -- maybe-ubiquity
ProcVersionSignature: Ubuntu 3.8.0-19.29-generic 3.8.8
RelatedPackageVersions:
 linux-restricted-modules-3.8.0-19-generic N/A
 linux-backports-modules-3.8.0-19-generic N/A
 linux-firmware 1.106
RfKill:
 0: phy0: Wireless LAN
  Soft blocked: no
  Hard blocked: no
Tags: raring
Uname: Linux 3.8.0-19-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm cdrom dip lpadmin plugdev sambashare sudo
dmi.bios.date: 09/21/2012
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: X300V TR.2
dmi.board.asset.tag: To be filled by O.E.M.
dmi.board.name: ChiefRiver
dmi.board.vendor: INTEL Corporation
dmi.board.version: To be filled by O.E.M.
dmi.chassis.asset.tag: To Be Filled By O.E.M.
dmi.chassis.type: 3
dmi.chassis.vendor: To Be Filled By O.E.M.
dmi.chassis.version: To Be Filled By O.E.M.
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrX300VTR.2:bd09/21/2012:svnTobefilledbyO.E.M.:pnTobefilledbyO.E.M.:pvrTobefilledbyO.E.M.:rvnINTELCorporation:rnChiefRiver:rvrTobefilledbyO.E.M.:cvnToBeFilledByO.E.M.:ct3:cvrToBeFilledByO.E.M.:
dmi.product.name: To be filled by O.E.M.
dmi.product.version: To be filled by O.E.M.
dmi.sys.vendor: To be filled by O.E.M.

Revision history for this message
rsoika (ralph-soika) wrote :

This is the Xorg.0.log collected after a restart

Revision history for this message
rsoika (ralph-soika) wrote :

This is the Xorg.failsafe.log collected after a restart

Revision history for this message
Chris Wilson (ickle) wrote :

As you are already using 12.10, it would be worthwhile to install the v3.8 kernel from raring as that has a number of known IvyBridge fixes - including a few hard hangs (for rather rare GPU issues).

Revision history for this message
rsoika (ralph-soika) wrote :

I have already tried the following 3.8 kernels in the past:
3.8.0
3.8.0-rc3
3.8.0-rc4
3.8.0-rc5
3.8.0-rc7
3.8.0-944_drm-intel-nightly-2013-02-15

But I had no luck. All show the same behavior. That was the reason why I continued with the official version 3.5.0-25.
But from now on I will work with 3.8.0 again.
Can I provide any other helpful data?

Revision history for this message
Chris Wilson (ickle) wrote :

Hmm, that I think should rule out the major problems that we know about. One way to debug some hard hangs is through a netconsole. If you can and it turns out to be a simple case, that would be ideal. However, the usual approach is to look for other oddities and build up hypotheses from there. It's slow progress, and more often than not accidentally fixed in the meantime.

Revision history for this message
rsoika (ralph-soika) wrote :

I installed 3.8.1 now. I will report it if the situation improves.

Revision history for this message
rsoika (ralph-soika) wrote :

After I installed 3.8.1 the system seemed to be much more stable. I worked two days without any freeze. But today I got another freeze :-(
I was working with an external monitor connected via HDMI when the freeze occurred.
This time the screen was not corrupted - just frozen. Also, the system did not shut down as it did before with kernel version 3.5.0-25.
But again I was unable to connect via ssh because the system had lost its network connection. So I was unable to collect the missing file /sys/kernel/debug/dri/0/i915_error_state.
To get this file after a freeze and a hard reset, I have now added a script to the /etc/rcS.d directory that copies the file into my home directory at boot time. Please let me know if this makes sense and how I can collect other missing information.

I have also installed the new 3.8.2 kernel version today.
Let's see what happens....
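
One caveat on the boot-time copy described above: debugfs files such as i915_error_state are held in kernel memory, so a copy made after a reset will most likely find only the empty state. A possible alternative is to poll the file while the system is still running and save it as soon as it holds something. The following is a minimal sketch in Python, not a tested tool. The function name, the destination directory and the "no error state collected" marker are assumptions for illustration, and it can only help if the GPU hang is recorded before the machine locks up completely.

```python
import time
from pathlib import Path


def snapshot_error_state(source, dest_dir, empty_marker="no error state collected"):
    """Copy the error state to a timestamped file if it holds real data.

    Returns the path of the saved copy, or None if there was nothing to save.
    `empty_marker` is an assumption about what the driver prints when no
    hang has been recorded.
    """
    try:
        text = Path(source).read_text(errors="replace")
    except OSError:
        return None  # file missing or not readable (debugfs usually needs root)
    stripped = text.strip()
    if not stripped or stripped.lower().startswith(empty_marker):
        return None
    out_dir = Path(dest_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    dest = out_dir / time.strftime("i915_error_state-%Y%m%d-%H%M%S.txt")
    dest.write_text(text)
    return dest
```

Run as root from cron or a small loop, for example snapshot_error_state("/sys/kernel/debug/dri/0/i915_error_state", "/home/rsoika/i915-dumps"), where the destination path is only an example.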

Revision history for this message
Chris Wilson (ickle) wrote :

The likelihood is that the system is completely frozen - if it can't respond to pings, then it is not processing interrupts and is completely dead. Often it dies so quickly that it doesn't even manage to send an oops over netconsole.

If the hangs keep on happening, try booting with i915.i915_enable_rc6=0.

bugbot (bugbot)
tags: added: freeze
Revision history for this message
rsoika (ralph-soika) wrote :

Since I installed kernel version 3.8.2 I have not seen the freeze I described here. So for the moment I believe (hope) this kernel solves the problem.
I have had only two unexpected shutdowns when the system was idle for a long time. But this is a completely different situation from the one I described, so I do not think it is related.

I will continue watching the system some more days and report when the freeze occurs again.

Can you tell me what the kernel option "i915.i915_enable_rc6=0" does? Should I enable the option now?

Revision history for this message
Chris Wilson (ickle) wrote :

rc6 is a power-saving mode of the GPU, and can save roughly 9W. However, we have had many reports linking it to GPU hangs and the occasional system hang - which is why it is one of the first things to test.

Revision history for this message
rsoika (ralph-soika) wrote :

Unfortunately I had a freeze again today running 3.8.2 :-(
I have also added the boot option "i915.i915_enable_rc6=0".
The freeze occurred only a few minutes after I booted in the morning. Again I had a lot of memory usage from starting Eclipse and the GlassFish server.
However, it feels as if the system is much more stable with kernel version 3.8.2.

Revision history for this message
rsoika (ralph-soika) wrote :

Today I had again two freezes with kernel 3.8.2

Revision history for this message
Chris Wilson (ickle) wrote : Re: [ivb] System freeze after high memory usage

So not rc6 related - that rules out the most likely suspect for the GPU. Let the machine run memtest overnight or over the weekend, just to rule out a bad stick of memory. Then if you have VTd enabled, intel_iommu=off will be useful to test, and perhaps pcie_aspm=off.

summary: - System freeze after high memory usage (Ubuntu 12.10 64-bit on ivy bridge
- HD4000)
+ [ivb] System freeze after high memory usage
Revision history for this message
rsoika (ralph-soika) wrote :

Many thanks for your help.
I ran memtest twice before I posted this issue. No defects were detected.
What did you mean by '..if you have VTd enabled'?

I will now try the following boot option:
GRUB_CMDLINE_LINUX="i8042.noloop intel_iommu=off"

It can take some days until I post the results. The freeze now mostly occurs only when I use the external monitor in my office. When working with the internal laptop display only, the system seems to be quite stable. But I also had one freeze in this situation some days ago.

Revision history for this message
Chris Wilson (ickle) wrote :

VTd is the 'virtualised device' acceleration within the CPU - it has been very problematic with the igfx so far, but supposedly Ivybridge works. intel_iommu=off disables support and thus allows the GPU to have direct access to memory without going through a DMA remapper.

Revision history for this message
rsoika (ralph-soika) wrote :

I had no luck. Also with intel_iommu=off and pcie_aspm=off I had a freeze again yesterday and today. I used the following boot option:

GRUB_CMDLINE_LINUX="i8042.noloop intel_iommu=off pcie_aspm=off"

I noticed that the system reboots after some seconds. Do you think we have a chance to collect some useful data during the reboot?

I will now try kernel version v3.9-rc3-raring
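
Since several boot options are being combined here, it can help to confirm after each reboot that the running kernel actually received them (on Ubuntu, changes to GRUB_CMDLINE_LINUX in /etc/default/grub only take effect after running update-grub). The sketch below compares a kernel command line, as found in /proc/cmdline, against a list of expected options. The function names are made up for illustration, and quoted values containing spaces are not handled.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def missing_options(cmdline, expected):
    """Return the expected options ('name=value' or bare 'name') not in cmdline."""
    active = parse_cmdline(cmdline)
    missing = []
    for option in expected:
        name, sep, value = option.partition("=")
        # A bare option only needs to be present; name=value must match exactly.
        if name not in active or (sep and active[name] != value):
            missing.append(option)
    return missing
```

A possible check after reboot: missing_options(open("/proc/cmdline").read(), ["i8042.noloop", "intel_iommu=off", "pcie_aspm=off"]) should return an empty list if all three options are active.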

Revision history for this message
Chris Wilson (ickle) wrote :

The system reboots? Interesting, that implies a panic and not necessarily an outright hard hang. Do you have a wired network connection and could setup netconsole?
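
For the receiving side of a netconsole setup, any UDP listener on a second machine will do. The following is a rough sketch of one in Python, assuming netconsole's default target port 6666 and hypothetical function names; the sender-side module parameter is described in the kernel's netconsole documentation and is not shown here.

```python
import socket


def open_listener(host="0.0.0.0", port=6666):
    """Open a UDP socket for incoming netconsole datagrams.

    6666 is assumed as the target port; pass port=0 to let the OS pick one.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_messages(sock, count, timeout=5.0):
    """Collect up to `count` datagrams as text; stop early on timeout."""
    sock.settimeout(timeout)
    messages = []
    for _ in range(count):
        try:
            data, _addr = sock.recvfrom(65535)
        except socket.timeout:
            break
        messages.append(data.decode("utf-8", errors="replace").rstrip("\n"))
    return messages
```

On the second machine one could call open_listener() once and then loop over receive_messages(listener, 1, timeout=60.0), appending each line to a file so that the last kernel messages before a freeze are preserved.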

Revision history for this message
rsoika (ralph-soika) wrote :

Yes, I had a wired network connection, but I could not connect via ssh. It seems that the network is also lost immediately when the system freezes. So I can do nothing except wait for the reboot.

Revision history for this message
rsoika (ralph-soika) wrote :

Today I had a total freeze again (kernel 3.8.3).
This time I observed the system in more detail.
When the system freezes I have no chance to do any keyboard input or to connect remotely (ping/ssh).
After 10 minutes the system reboots. And 10 minutes is exactly my current setting for dimming the screen to save power and for going into standby when no power is plugged in. But in all cases when the system freezes the power is plugged in.

I hope this is a valuable clue. Since the system is able to reboot, can you write a script or something to collect data? Should I provide any data from the system log?

Revision history for this message
rsoika (ralph-soika) wrote :

I had a second freeze today, this time without the power supply. In this situation the system did not reboot after 10 minutes, so I had to do a hard shutdown.
It seems to me that on my system kernel version 3.8.3 is less stable than version 3.8.2.

Revision history for this message
rsoika (ralph-soika) wrote :

I have now observed the following behavior:
With kernel version 3.8.3 I have had frequent freezes. The screen is garbled and the system reboots automatically.
With version 3.8.2, the system is much more stable. In a freeze (which is very rare) the screen is not garbled, only frozen, and the system reboots after 10 minutes.
For me, version 3.8.2 is much better than the previous versions and than kernel version 3.8.3.

Revision history for this message
Chris Wilson (ickle) wrote :

Gut instinct tells me that the garbage display is a secondary issue that occurs after we try to reset the GPU. It is either the lead-up to the GPU reset (i.e. the hang) or the actual attempt at resetting that is likely to be the cause of the hard hang.

Can you please try i915.reset=0 (with either 3.8.2 or 3.8.3) and see if we can then get any logs from the death?

Revision history for this message
rsoika (ralph-soika) wrote :

ok I am using now the following options:

GRUB_CMDLINE_LINUX="i8042.noloop intel_iommu=off pcie_aspm=off i915.reset=0"

What should I do in case of a freeze to get the log data you need?

Revision history for this message
Chris Wilson (ickle) wrote :

You will likely have to log in remotely (though it is quite possible that a virtual terminal via Ctrl-Alt-F1 will still work) and grab the dmesg/Xorg.0.log/i915_error_state, or you can try running 'apport-collect 1137817'.

Revision history for this message
rsoika (ralph-soika) wrote :

Adding the kernel option i915.reset=0 has brought no improvement.
I still have freezes, now several a day (maybe more often than in the past).
I was not able to switch to a virtual terminal (Ctrl-Alt-F1) or to reach the system remotely from another PC.
The behavior was the same with kernel 3.8.2 and 3.8.3.
Also, the system no longer reboots automatically. It stays frozen.

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in xserver-xorg-video-intel (Ubuntu):
status: New → Confirmed
Revision history for this message
rsoika (ralph-soika) wrote :

My system is still freezing. I am working with 3.9.0-rc4 and using kernel option i915.reset=0.
The system is rebooting after 10 minutes (in most cases). But maybe I have now some new interesting information:

I wrote a simple script and added it to my crontab. The script logs every 3 minutes that everything is ok. I want to check whether the system can log anything during the freeze.
I had a freeze at 11:17. The system rebooted at 11:27.
This is the syslog. You can see that my cron job did not run at 11:15. The last entry was at 11:09 and not 11:15 as expected.
What are the jobs "anacron" and "cracklib" for? Could they be an issue?

Apr 1 11:00:01 ralphs-laptop CRON[3649]: (rsoika) CMD (/home/rsoika/Tools/freeze_monitor.sh)
Apr 1 11:00:01 ralphs-laptop logger: hello ralph it is 11:00:01 and I am still alive :-)
Apr 1 11:03:01 ralphs-laptop CRON[3987]: (rsoika) CMD (/home/rsoika/Tools/freeze_monitor.sh)
Apr 1 11:03:01 ralphs-laptop logger: hello ralph it is 11:03:01 and I am still alive :-)
Apr 1 11:04:54 ralphs-laptop kernel: [ 3691.000385] [Firmware Bug]: battery: (dis)charge rate invalid.
Apr 1 11:04:54 ralphs-laptop anacron[4060]: Anacron 2.3 started on 2013-04-01
Apr 1 11:04:54 ralphs-laptop anacron[4060]: Will run job `cron.daily' in 5 min.
Apr 1 11:04:54 ralphs-laptop anacron[4060]: Will run job `cron.weekly' in 10 min.
Apr 1 11:04:54 ralphs-laptop anacron[4060]: Jobs will be executed sequentially
Apr 1 11:06:01 ralphs-laptop CRON[4171]: (rsoika) CMD (/home/rsoika/Tools/freeze_monitor.sh)
Apr 1 11:06:01 ralphs-laptop logger: hello ralph it is 11:06:01 and I am still alive :-)
Apr 1 11:07:19 ralphs-laptop kernel: [ 3835.699618] CPU0: Package power limit notification (total events = 1)
Apr 1 11:07:19 ralphs-laptop kernel: [ 3835.699622] CPU2: Package power limit notification (total events = 1)
Apr 1 11:07:19 ralphs-laptop kernel: [ 3835.699626] CPU1: Package power limit notification (total events = 1)
Apr 1 11:07:19 ralphs-laptop kernel: [ 3835.699628] CPU3: Package power limit notification (total events = 1)
Apr 1 11:07:19 ralphs-laptop kernel: [ 3835.705797] CPU2: Package power limit normal
Apr 1 11:07:19 ralphs-laptop kernel: [ 3835.705799] CPU0: Package power limit normal
Apr 1 11:07:19 ralphs-laptop kernel: [ 3835.705820] CPU3: Package power limit normal
Apr 1 11:07:19 ralphs-laptop kernel: [ 3835.705821] CPU1: Package power limit normal
Apr 1 11:09:01 ralphs-laptop CRON[4383]: (rsoika) CMD (/home/rsoika/Tools/freeze_monitor.sh)
Apr 1 11:09:01 ralphs-laptop logger: hello ralph it is 11:09:01 and I am still alive :-)
Apr 1 11:09:54 ralphs-laptop anacron[4060]: Job `cron.daily' started
Apr 1 11:09:54 ralphs-laptop anacron[4405]: Updated timestamp for job `cron.daily' to 2013-04-01
Apr 1 11:10:12 ralphs-laptop cracklib: no dictionary update necessary.
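
The heartbeat entries in a log like the one above can also be checked mechanically, which makes it easier to see when logging actually stopped (in this excerpt there is no 11:12 entry either). The sketch below extracts the "still alive" timestamps and reports any spacing larger than the 3-minute cron interval plus some slack. The function names and the slack value are assumptions, and the year has to be supplied because syslog lines do not carry it.

```python
import re
from datetime import datetime, timedelta

# Matches e.g. "Apr 1 11:09:01 ralphs-laptop logger: hello ... still alive :-)"
HEARTBEAT = re.compile(
    r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+logger: .*still alive"
)


def heartbeat_times(lines, year):
    """Return the timestamps of all heartbeat lines, in log order."""
    times = []
    for line in lines:
        match = HEARTBEAT.match(line)
        if match:
            stamp = f"{year} {match.group(1)}"
            times.append(datetime.strptime(stamp, "%Y %b %d %H:%M:%S"))
    return times


def find_gaps(times, interval=timedelta(minutes=3), slack=timedelta(seconds=30)):
    """Return (previous, next) pairs spaced further apart than interval + slack."""
    return [(a, b) for a, b in zip(times, times[1:]) if b - a > interval + slack]
```

Run over the full syslog including the entries written after the reboot, each reported pair brackets one period in which the machine was not logging, which narrows down when the freeze really began.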

Revision history for this message
rsoika (ralph-soika) wrote :

I still get the random freezes. Currently I am using kernel 3.9.0-rc6.

Revision history for this message
rsoika (ralph-soika) wrote :

Would it make sense to go back to 12.04 LTS or do a fresh install with 13.04 now?

Revision history for this message
Chris Wilson (ickle) wrote :

Both have known bugs - I would have said 13.04 had fewer but you are experiencing a pretty severe issue with 13.04. As there is no obvious reason for the freezes, I would not have expected them to have been resolved by now.

Revision history for this message
rsoika (ralph-soika) wrote :

I want to describe the problem again, as I've watched it in the last weeks:
I have freezes with kernel 3.8 and 3.9. It seems that the 3.8.8 is more stable - but the system still freezes randomly. In very rare cases, the system reboots immediately after the freeze (one of 10 freezes). I am using the kernel option i915.reset=0.
I have tested the following kernel versions : 3.8.1, 3.8.2, 3.8.3, 3.8.4, 3.8.8 and 3.9-rc1/rc4/rc5/rc6/rc8.

I can't see any remarkable error messages in the log files and I am unable to collect any data when the system freezes.

Revision history for this message
Chris Wilson (ickle) wrote :

One random freeze could be:

commit 0920a48719f1ceefc909387a64f97563848c7854
Author: Stéphane Marchesin <email address hidden>
Date: Tue Jan 29 19:41:59 2013 -0800

    drm/i915: Increase the RC6p threshold.

    This increases GEN6_RC6p_THRESHOLD from 100000 to 150000. For some
    reason this avoids the gen6_gt_check_fifodbg.isra warnings and
    associated GPU lockups, which makes my ivy bridge machine stable.

That kernel is available in ppa:mainline under drm-intel-nightly.

Revision history for this message
Chris Wilson (ickle) wrote :

But that should have also been tested by i915.i915_enable_rc6=0. :|

Revision history for this message
rsoika (ralph-soika) wrote :

today I found the following error messages in /var/log/dmesg.
maybe helpful ?

[ 2.572983] [drm] Initialized drm 1.1.0 20060810
[ 2.590193] ACPI Warning: 0x0000000000000428-0x000000000000042f SystemIO conflicts with Region \PMIO 1 (20121018/utaddress-251)
[ 2.590201] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
[ 2.590205] ACPI Warning: 0x0000000000000530-0x000000000000053f SystemIO conflicts with Region \GPIO 1 (20121018/utaddress-251)
[ 2.590208] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
[ 2.590210] ACPI Warning: 0x0000000000000500-0x000000000000052f SystemIO conflicts with Region \GPIO 1 (20121018/utaddress-251)
[ 2.590213] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver

Revision history for this message
rsoika (ralph-soika) wrote :

I tested kernel versions 3.8.9, 3.8.10, 3.8.11 with no success.

Attached is a screenshot of a freeze, so you can better picture the problem.
The picture is not completely static but partially flickers.
After 10 minutes, the system reboots. But I still have no way to collect any data.

I am currently using the kernel options: GRUB_CMDLINE_LINUX="i8042.noloop i915.i915_enable_rc6=0"

Revision history for this message
rsoika (ralph-soika) wrote :

I tried kernel version 3.9.2 and official 3.5.0-30 with no success. I have still the random freezes.
Because of this posting: http://www.thinkwiki.org/wiki/Intel_HD_Graphics
I tried also the kernel options:

i915.semaphores=1 i915.i915_enable_fbc=0

but also with no effect.

Revision history for this message
rsoika (ralph-soika) wrote :

I still had no luck. I am now working with the drm-intel-nightly builds. But the situation is always the same: suddenly after 1 up to 5 hours the system freezes. No chance to get any response from the system. After some minutes the system shuts down.

Revision history for this message
rsoika (ralph-soika) wrote :

I now think that the freeze is triggered by a temperature problem.
Here in Germany we currently have a relatively high outside temperature of 30 °C, and the freezes have increased significantly.
Is there a way to verify this?

Revision history for this message
Chris Wilson (ickle) wrote :

There should be a package for monitoring CPU (and system) temperatures. (Or you can search /sys.) If the CPU overheats it will begin to throttle and, if need be, shut itself down before it is damaged. Those events will be recorded in the syslog.
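
As a rough illustration of the "search /sys" route: on kernels that expose the generic thermal interface, each /sys/class/thermal/thermal_zone*/temp file reports a temperature in millidegrees Celsius. The sketch below reads all zones; the function name is made up, and which zones exist on this particular laptop is not known from the thread. A packaged tool such as lm-sensors is the more usual choice.

```python
from pathlib import Path


def read_thermal_zones(root="/sys/class/thermal"):
    """Return {"zone:type": degrees_celsius} for every readable thermal zone."""
    readings = {}
    for zone in sorted(Path(root).glob("thermal_zone*")):
        try:
            millidegrees = int((zone / "temp").read_text().strip())
            kind = (zone / "type").read_text().strip()
        except (OSError, ValueError):
            continue  # zone unreadable or reporting something that is not a number
        readings[f"{zone.name}:{kind}"] = millidegrees / 1000.0
    return readings
```

Logging the result of read_thermal_zones() every minute or so would show whether the temperature climbs before a freeze.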

penalvch (penalvch)
affects: xserver-xorg-video-intel (Ubuntu) → linux (Ubuntu)
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Changed in linux (Ubuntu):
importance: Undecided → Medium
penalvch (penalvch)
tags: added: needs-apport-collect needs-upstream-testing precise regression-potential
tags: added: quantal
penalvch (penalvch)
tags: added: needs-crash-log
tags: added: raring
rsoika (ralph-soika)
tags: added: kernel-bug-exists-upstream kernel-bug-exists-upstream-3.11.0-031100rc7-generic
removed: needs-upstream-testing
penalvch (penalvch)
tags: added: kernel-bug-exists-upstream-v3.11-rc7 needs-upstream-testing
removed: kernel-bug-exists-upstream kernel-bug-exists-upstream-3.11.0-031100rc7-generic
rsoika (ralph-soika)
tags: added: kernel-bug-exists-upstream-v3.11-0
removed: kernel-bug-exists-upstream-v3.11-rc7
Revision history for this message
rsoika (ralph-soika) wrote : Lsusb.txt

apport information

tags: added: apport-collected
description: updated
Revision history for this message
rsoika (ralph-soika) wrote : ProcCpuinfo.txt

apport information

description: updated
Revision history for this message
rsoika (ralph-soika) wrote : AlsaInfo.txt

apport information

Revision history for this message
rsoika (ralph-soika) wrote : BootDmesg.txt

apport information

Revision history for this message
rsoika (ralph-soika) wrote : CRDA.txt

apport information

Revision history for this message
rsoika (ralph-soika) wrote : CurrentDmesg.txt

apport information

Revision history for this message
rsoika (ralph-soika) wrote : Dependencies.txt

apport information

Revision history for this message
rsoika (ralph-soika) wrote : IwConfig.txt

apport information

Revision history for this message
rsoika (ralph-soika) wrote : Lspci.txt

apport information

Revision history for this message
rsoika (ralph-soika) wrote : Lsusb.txt

apport information

Revision history for this message
rsoika (ralph-soika) wrote : ProcCpuinfo.txt

apport information

Revision history for this message
rsoika (ralph-soika) wrote : ProcInterrupts.txt

apport information

Revision history for this message
rsoika (ralph-soika) wrote : ProcModules.txt

apport information

Revision history for this message
rsoika (ralph-soika) wrote : PulseList.txt

apport information

Revision history for this message
rsoika (ralph-soika) wrote : UdevDb.txt

apport information

Revision history for this message
rsoika (ralph-soika) wrote : UdevLog.txt

apport information

Revision history for this message
rsoika (ralph-soika) wrote : WifiSyslog.txt

apport information

Revision history for this message
rsoika (ralph-soika) wrote : Re: [ivb] System freeze after high memory usage

I notice that there also seems to be a problem with my Realtek network card. During kernel updates I see messages complaining about missing firmware.
Could the freezes come from that area?

penalvch (penalvch)
tags: removed: needs-apport-collect
description: updated
penalvch (penalvch)
description: updated
summary: - [ivb] System freeze after high memory usage
+ [ivb] [WORTMANN TERRA MOBILE ULTRABOOK 1450 II] System freeze after high
+ memory usage
penalvch (penalvch)
tags: added: kernel-bug-exists-upstream-v3.11
removed: kernel-bug-exists-upstream-v3.11-0
tags: removed: needs-upstream-testing
Revision history for this message
rsoika (ralph-soika) wrote :

I started with 12.10 and later upgraded to 13.04.
I have also tested 12.04 32-bit in the meantime (see comment #53).
I think the kind of freeze was always the same the whole time.
I am now running Debian 7.1, which shows the same freeze with the default kernel 3.2.
So yesterday I updated to the kernel from the backports - version 3.10-0.bpo.2.

During my excursion to Debian I saw that there is a missing firmware (this was also logged under Ubuntu, but it did not look like a problem to me, so I did not investigate it). Debian complained about the missing firmware when I tried to install a kernel from http://kernel.ubuntu.com/~kernel-ppa/mainline/ , and forced me to uninstall it when I opened Synaptic.
So I was unable to install those kernels successfully.

Now I have finally succeeded with the backport version 3.10-0 by also providing the package firmware-iwlwifi_0.40_all.deb

The system log looks now (for me) cleaner. I see messages like this:

Sep 12 22:40:36 ralpus-ultrabook kernel: [ 1.763834] [drm] Initialized drm 1.1.0 20060810
Sep 12 22:40:36 ralpus-ultrabook kernel: [ 1.768513] cfg80211: Calling CRDA to update world regulatory domain
Sep 12 22:40:36 ralpus-ultrabook kernel: [ 1.777166] Intel(R) Wireless WiFi driver for Linux, in-tree:
Sep 12 22:40:36 ralpus-ultrabook kernel: [ 1.777169] Copyright(c) 2003-2013 Intel Corporation
Sep 12 22:40:36 ralpus-ultrabook kernel: [ 1.781881] [drm] Memory usable by graphics device = 2048M
Sep 12 22:40:36 ralpus-ultrabook kernel: [ 1.784463] iwlwifi 0000:01:00.0: firmware: agent loaded iwlwifi-2030-6.ucode into memory
Sep 12 22:40:36 ralpus-ultrabook kernel: [ 1.784661] iwlwifi 0000:01:00.0: loaded firmware version 18.168.6.1 op_mode iwldvm
Sep 12 22:40:36 ralpus-ultrabook kernel: [ 1.802459] [drm] Supports vblank timestamp caching Rev 1 (10.10.2010).
Sep 12 22:40:36 ralpus-ultrabook kernel: [ 1.802461] [drm] Driver supports precise vblank timestamp query.
....
Sep 12 22:40:36 ralpus-ultrabook kernel: [ 2.611346] i915 0000:00:02.0: fb0: inteldrmfb frame buffer device
Sep 12 22:40:36 ralpus-ultrabook kernel: [ 2.611349] i915 0000:00:02.0: registered panic notifier
Sep 12 22:40:36 ralpus-ultrabook kernel: [ 2.635952] acpi device:49: registered as cooling_device9
Sep 12 22:40:36 ralpus-ultrabook kernel: [ 2.647893] ACPI: Video Device [GFX0] (multi-head: yes rom: no post: no)
Sep 12 22:40:36 ralpus-ultrabook kernel: [ 2.648007] input: Video Bus as /devices/LNXSYSTM:00/device:00/PNP0A08:00/LNXVIDEO:00/input/input8
Sep 12 22:40:36 ralpus-ultrabook kernel: [ 2.648275] [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
Sep 12 22:40:36 ralpus-ultrabook kernel: [ 2.671144] input: HDA Intel PCH HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:1b.0/sound/card0/input9
Sep 12 22:40:36 ralpus-ultrabook kernel: [ 2.671265] input: HDA Intel PCH Headphone as /devices/pci0000:00/0000:00:1b.0/sound/card0/input10
Sep 12 22:40:36 ralpus-ultrabook kernel: [ 2.671381] input: HDA Intel PCH Mic as /devices/pci0000:00/0000:00:1b.0/sound/card0/input11
Sep 12 22:40:36 ralpus-ultrabook kernel: [ 2.971332] Adding 10157052k swap on /dev/sda...


Revision history for this message
penalvch (penalvch) wrote :

For regression testing purposes, could you please test for this in Lucid via http://old-releases.ubuntu.com/releases/lucid/ ?

Revision history for this message
rsoika (ralph-soika) wrote :

I made a short test with 10.04 from a live CD. The system works, but I can't tell you if it would work stably for many hours.

I have reinstalled my system with Debian 7.1 and am running kernel 3.10-0.bpo.2-amd64. Since this upgrade I have had no freeze. But I think I need to monitor it for a longer time.

Revision history for this message
penalvch (penalvch) wrote :

rsoika, of course the Lucid test would need to run 2-3 times as long as it typically takes to reproduce this problem on your newer install, to determine whether this issue is indeed a regression.

Revision history for this message
rsoika (ralph-soika) wrote :

For now I can not do the test with Ubuntu 10.04. I am sorry for that.
But maybe I have an interesting new info:
As I told you, I have been running Debian 7.1 with backport kernel 3.10-0.bpo.2-amd64 for 6 days now. So far I have had no freezes with the new kernel version from the backports!

But today I installed the flash player as described here:
http://www.cyberciti.biz/faq/debian-linux-7-wheezy-install-flash-player/

I need this for the web-conferencing software http://www.spreed.com.
And some minutes after my meeting the system froze again, as it has done several times every day for the last 8 months.

Could the freeze depend on the flash-player??

Revision history for this message
rsoika (ralph-soika) wrote :

when the system freezes I also see now the following log messages which are flooding the log:

Sep 15 12:21:44 ralpus-ultrabook kernel: [ 6332.536072] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
Sep 15 12:22:14 ralpus-ultrabook kernel: [ 6362.621660] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
Sep 15 12:22:44 ralpus-ultrabook kernel: [ 6392.707251] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
Sep 15 12:23:14 ralpus-ultrabook kernel: [ 6422.792837] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
Sep 15 12:23:44 ralpus-ultrabook kernel: [ 6452.878429] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
Sep 15 12:24:14 ralpus-ultrabook kernel: [ 6482.964012] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
Sep 15 12:24:44 ralpus-ultrabook kernel: [ 6513.049602] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
Sep 15 12:25:15 ralpus-ultrabook kernel: [ 6543.135189] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
Sep 15 12:25:45 ralpus-ultrabook kernel: [ 6573.220776] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
Sep 15 12:26:15 ralpus-ultrabook kernel: [ 6603.306365] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
Sep 15 12:26:45 ralpus-ultrabook kernel: [ 6633.391953] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
Sep 15 12:27:15 ralpus-ultrabook kernel: [ 6663.477544] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
Sep 15 12:27:45 ralpus-ultrabook kernel: [ 6693.563128] mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING

Revision history for this message
rsoika (ralph-soika) wrote :

Would it be worth trying to put the i915 on the blacklist?
And if yes - how should I do that?

Revision history for this message
rsoika (ralph-soika) wrote :

It seems that with kernel version 3.12 the problem is nearly fixed.
Since I upgraded to version 3.12-rc7 and then 3.12.0 I have had only 1 freeze during the last two weeks.

I do not know what changed in the kernel in this version. I no longer think that the problem is related to the GPU but rather to the WiFi.
Is this possible? I found this discussion which looks to me similar to my own description:
https://bbs.archlinux.org/viewtopic.php?id=168122

Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired
rsoika (ralph-soika)
Changed in linux (Ubuntu):
status: Expired → New
Revision history for this message
rsoika (ralph-soika) wrote :

Hi,
I just want to report my experience with kernel versions 3.12 and 3.13.
As I already reported, the system has become more and more stable since I started using kernel version 3.12.
With kernel versions 3.12.5 and 3.12.7 at least, I have had no freeze for 4 weeks.
Yesterday I installed kernel version 3.13.0, and as a result I had two freezes in one day!

Maybe this helps to find out what changed in the new kernel version that causes the freezes again.
Now I am back on version 3.12.7.

Revision history for this message
rsoika (ralph-soika) wrote :

freezes back again in kernel version 3.13.0

tags: added: kernel-bug-exists-upstream-v3.13.0 trusty
removed: apport-collected kernel-bug-exists-upstream-v3.11 needs-crash-log precise quantal raring regression-potential
Revision history for this message
Brad Figg (brad-figg) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
tags: added: raring
Revision history for this message
penalvch (penalvch) wrote :

rsoika, just to clarify, given that a kernel call trace/x.org backtrace is yet to be produced, could you please advise whether your problem is correlated with overheating, by monitoring your temperatures via https://help.ubuntu.com/community/SensorInstallHowto ?
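
As one possible way to log temperatures in addition to the linked guide, here is a minimal Python sketch. It assumes the kernel exposes thermal zones under `/sys/class/thermal`, which depends on the hardware and the loaded drivers; the `read_temps` helper name is only illustrative.

```python
from pathlib import Path

def read_temps(base="/sys/class/thermal"):
    """Return {zone_name: degrees_celsius} for every readable thermal zone."""
    temps = {}
    for zone in sorted(Path(base).glob("thermal_zone*")):
        try:
            raw = (zone / "temp").read_text().strip()
            temps[zone.name] = int(raw) / 1000.0  # sysfs reports millidegrees Celsius
        except (OSError, ValueError):
            continue  # zone missing, unreadable, or not reporting a number
    return temps

if __name__ == "__main__":
    for name, celsius in read_temps().items():
        print(f"{name}: {celsius:.1f} C")
```

Running this periodically and appending the output to a file would leave a record of the last temperatures before a freeze.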

tags: added: apport-collected needs-crash-log precise quantal
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Revision history for this message
rsoika (ralph-soika) wrote :

I am using the GNOME System Monitor to watch my system temperature. I don't think it's an overheating problem. In the past (see comment #46) I thought that my problem was related to temperature, but I am sure we can rule this out.
With kernel 3.12.8 I have no freezes. I do a lot of Java development, and for that I run a lot of GlassFish servers with heavy CPU load and high memory consumption. This did not bring down my system - it is very, very stable - which makes me happy :-)

After I installed the new kernel 3.13.0, I only opened a browser window (no Java, no Eclipse, no GlassFish) and the system froze twice in one hour. The system seems to be as unstable again as in all kernel versions I tried before 3.11.
So I uninstalled kernel 3.13.0 and went back to 3.12.8.

Revision history for this message
penalvch (penalvch) wrote :

rsoika, thank you for performing the requested action. Just to clarify:
1) When you note 3.12.8, you are referring to http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.12.8-trusty/ ?
2) When you note 3.13.0 you are referring to http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.13-trusty/ ?
3) It would be best to focus on the point of 3.12.8 onwards, and treat this problem as a regression. Hence, the next step is to fully commit bisect from 3.12.8 to 3.12.9, in order to identify the offending commit. Could you please do this following https://wiki.ubuntu.com/Kernel/KernelBisection ?
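
To illustrate the idea behind bisection (this is not the procedure from the wiki page itself), here is a minimal Python sketch of the binary search that `git bisect` automates. The commit list and the `is_bad` check are hypothetical; in practice each check means building that kernel and running it long enough to see whether the freeze occurs.

```python
def bisect_first_bad(commits, is_bad):
    """Return the first commit for which is_bad() is True.

    Assumes commits[0] is known good, commits[-1] is known bad, and every
    commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # first bad commit is at mid or earlier
        else:
            lo = mid  # first bad commit is after mid
    return commits[hi]

# Hypothetical history of 100 commits in which commit 37 introduced the freeze.
commits = list(range(100))
tested = []

def is_bad(commit):
    tested.append(commit)
    return commit >= 37

print(bisect_first_bad(commits, is_bad), "found after", len(tested), "tests")
```

The number of kernels to test grows only with the logarithm of the number of commits, so a short range between two point releases needs only a handful of builds. Because the freeze is random, a "good" verdict is only trustworthy after a long enough run.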

tags: added: kernel-bug-exists-upstream-3.13
removed: kernel-bug-exists-upstream-v3.13.0
tags: added: needs-bisect
Revision history for this message
rsoika (ralph-soika) wrote :

1) yes
2) yes
3) Kernel bisection sounds really hard to me. I have git experience but no experience in building kernel versions from source :-/

Do you expect that if I install 3.12.9 I will see the same behavior as in 3.13.0? Would it help if I try 3.12.9, or is it necessary to compile a specific kernel version from source now?

Revision history for this message
rsoika (ralph-soika) wrote :

hi,
After all this, I can say with high probability that the freeze no longer exists in kernel version 3.12.8.
But in version 3.12.9 the freeze is back, at a frequency that I saw in versions 3.9 / 3.10.
So maybe it's a starting point for a kernel bisection :-/ (I did not completely understand how to build an Ubuntu kernel from source)

First I want to ask whether you can guess what causes the error. The changelog from 3.12.8 to 3.12.9 is not very long, so maybe we can narrow it down with some kernel options?

Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired
Revision history for this message
rsoika (ralph-soika) wrote :

It looks as if I fixed the problem by changing the memory overclocking setting in the BIOS. It was set to 'AUTO' and I changed it to one of the lowest values.

Revision history for this message
rsoika (ralph-soika) wrote :

To add more information about the BIOS changes:

The setting is named: Memory Frequency Limiter.
Possible values are AUTO, 1067, 1600, 1867 .. up to 2667

I used 1067 to solve the problem. I will test other values over the next weeks.

Displaying first 40 and last 40 comments. View all 109 comments or add a comment.