#680514 xserver-xorg-video-intel: X locks up with EQ overflow

Package:
xserver-xorg-video-intel
Source:
xserver-xorg-video-intel
Description:
X.Org X server -- Intel i8xx, i9xx display driver
Submitter:
John Goerzen
Date:
2014-12-11 19:00:10 UTC
Severity:
important
#680514#5
Date:
2012-07-06 13:28:47 UTC
From:
To:
This laptop had been running squeeze for some time with no issues at
all.  It had also been running wheezy for some time with no issues.

Lately, however, it has taken to locking up every few days.  When it
does so, the mouse cursor can still move, but that's about it.  The
keyboard doesn't work -- not even to switch to a different virtual
terminal.  I can still ssh into the system, but X of course has hung.
This makes X just about unusable.

When it starts, I see these lines in Xorg.0.log:

[314627.653] [mi] EQ overflowing.  Additional events will be discarded until existing events are processed.

followed by...

[314627.684] [mi] These backtraces from mieqEnqueue may point to a culprit higher up the stack.
[314627.684] [mi] mieq is *NOT* the cause.  It is a victim.
[314628.933] [mi] EQ overflow continuing.  100 events have been dropped.
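
For anyone who wants to check a saved log for these messages, here is a
minimal sketch in Python.  The function name and the sample text are
illustrative assumptions; only the "[timestamp] [mi] EQ overflow..." line
format is taken from the log excerpt above.

```python
import re

# Matches Xorg log lines such as:
#   [314627.653] [mi] EQ overflowing.  Additional events will be discarded ...
EQ_LINE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+\[mi\]\s+(EQ overflow.*)$")


def find_eq_overflows(log_text):
    """Return (timestamp_seconds, message) pairs for every EQ overflow line."""
    hits = []
    for line in log_text.splitlines():
        match = EQ_LINE.match(line.strip())
        if match:
            hits.append((float(match.group(1)), match.group(2)))
    return hits


# Sample input copied from the excerpt above (hypothetical file contents).
SAMPLE = """\
[314627.653] [mi] EQ overflowing.  Additional events will be discarded until existing events are processed.
[314627.684] [mi] mieq is *NOT* the cause.  It is a victim.
[314628.933] [mi] EQ overflow continuing.  100 events have been dropped.
"""

for stamp, message in find_eq_overflows(SAMPLE):
    print(stamp, message)
```

To scan a real log, pass the contents of /var/log/Xorg.0.log (or whichever
log file the display manager writes) to find_eq_overflows() instead of
SAMPLE.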

This is on a Lenovo Thinkpad T420s.

I have hoped that various kernel and X upgrades over the last month or
two would have helped, but they have not.  The problem has perhaps
become a little less frequent, but that's it.

Even restarting gdm does not appear to rescue the system in this
situation.  A reboot is always required.



#680514#14
Date:
2012-07-12 10:37:30 UTC
From:
To:
severity 680514 important
tag 680514 moreinfo
kthxbye
This is an old version.
So is this.

And this includes a bunch of "please give me pain" options.

Is the problem reproducible with all of the above fixed?

Cheers,
Julien

#680514#25
Date:
2012-07-12 13:46:06 UTC
From:
To:
found 680514 2:2.19.0-4
thanks

Julien,

I have duplicated this on an up-to-date wheezy install.  I have not
tried without the kernel parameters.  I can try that and see what
happens.  It could be hours between freezes, or days; it's hard to predict.

Those options are all related to power management and are fairly
innocuous from all my reading.  They're clearly documented out there and
seem to be generally recommended on laptops.  Moreover, they had been
working fine in the past.

#680514#30
Date:
2012-07-13 15:10:54 UTC
From:
To:
Julien,

I reproduced this bug on current wheezy without the troublesome boot
parameters.

Linux minerva 3.2.0-3-amd64 #1 SMP Thu Jun 28 09:07:26 UTC 2012 x86_64 GNU/Linux


jgoerzen@minerva:~$ cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-3.2.0-3-amd64 root=/dev/mapper/minerva-root ro quiet pcie_aspm=force i915.lvds_downclock=1
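
A minimal sketch in Python for checking which options a kernel was booted
with.  The helper names and the watch list are assumptions made for
illustration; the two watched parameters are simply the ones visible in the
command line quoted above.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a {name: value} dict.

    Flags without '=' (such as 'ro' or 'quiet') map to None.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


# Hypothetical watch list: the two options seen in the quoted command line.
WATCHED = ("pcie_aspm", "i915.lvds_downclock")


def watched_params(cmdline, watched=WATCHED):
    """Return only the watched parameters that are present on the command line."""
    params = parse_cmdline(cmdline)
    return {name: params[name] for name in watched if name in params}


# The command line quoted above, as one string.
SAMPLE = (
    "BOOT_IMAGE=/boot/vmlinuz-3.2.0-3-amd64 root=/dev/mapper/minerva-root ro "
    "quiet pcie_aspm=force i915.lvds_downclock=1"
)

print(watched_params(SAMPLE))
```

On a live system the input would be the contents of /proc/cmdline.  Run
against the line quoted above, the sketch reports both pcie_aspm=force and
i915.lvds_downclock=1 as present.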

Does this meet your requirements?

thanks,

#680514#35
Date:
2013-03-16 10:39:30 UTC
From:
To:
Hello,

I have started seeing this bug after upgrading kernel to linux 3.2.39-* series
of packages. I have never seen this issue with linux-image-3.2.0-4-amd64
3.2.35-2 (which I have downgraded to now) or earlier.

With 3.2.39-2, X on this Dell Latitude E5420 laptop locks up quite predictably
at least twice a day, which makes it basically unusable. Given that 3.2.39-2 has
already migrated to wheezy, I see this as a potentially big issue.

#680514#40
Date:
2013-03-16 11:19:43 UTC
From:
To:
This bug (680514) was reported against earlier versions, so there's a
good chance yours is different.  Please file a separate bug.  Also
include dmesg from when that happens, and i915_error_state from debugfs
(https://01.org/linuxgraphics/documentation/how-get-gpu-error-state).
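
A rough sketch in Python of how both pieces could be saved from an ssh
session while X is hung.  The path /sys/kernel/debug/dri/0/i915_error_state
is an assumption (debugfs mounted in its usual place, GPU is DRI device 0),
and the function and output file names are made up for illustration.
Reading debugfs normally requires root.

```python
import subprocess
from pathlib import Path

# Assumed location; adjust if debugfs is mounted elsewhere or the GPU is not device 0.
ERROR_STATE = Path("/sys/kernel/debug/dri/0/i915_error_state")


def save_error_state(dest_dir, error_state=ERROR_STATE):
    """Copy the i915 error state into dest_dir; return the new path or None."""
    try:
        data = Path(error_state).read_bytes()
    except OSError:
        return None  # file missing or not readable (e.g. not running as root)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    target = dest / "i915_error_state.txt"
    target.write_bytes(data)
    return target


def save_dmesg(dest_dir):
    """Write the output of the dmesg command into dest_dir; return the path or None."""
    try:
        result = subprocess.run(["dmesg"], capture_output=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        return None  # dmesg not installed or not permitted
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    target = dest / "dmesg.txt"
    target.write_bytes(result.stdout)
    return target
```

Calling save_dmesg("hang-report") and save_error_state("hang-report") right
after a lockup, before restarting X, would leave two files to attach to the
new bug report.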

Cheers,
Julien

#680514#45
Date:
2013-03-27 11:28:37 UTC
From:
To:
Hi,

I've a Dell Latitude E6520

And I can report the very same issue.

It just happened to my laptop.
I logged in from another laptop in my LAN (ssh) and killed X, then
restarted gdm3 service.

I don't know if it helps, but I can attach my current dmesg and the
i915_error_state of my graphics card. If you need the i915_error_state
extracted before recovering (killing X and restarting gdm3), I'll extract
it when the lockup happens again and send it to you.

# uname -a
Linux mastroc3 3.2.0-4-amd64 #1 SMP Debian 3.2.39-2 x86_64 GNU/Linux


Regards
Daniele

#680514#50
Date:
2013-04-01 14:46:42 UTC
From:
To:
Hi,

I think I am having the same problem. I have an HD3000 IGPU on an i5-2500K desktop CPU and I'm running Debian Wheezy with KDE.

The way to reproduce the issue is to switch very quickly between screensavers in the KDE settings: click on the names of the OpenGL screensavers in rapid succession. After a few clicks (5-15) the X server freezes and I'm only able to move the mouse, nothing more.

Regards
Bentallica
 
 

#680514#55
Date:
2013-04-03 16:00:58 UTC
From:
To:
Hi,
 
As I wrote yesterday, I can reproduce this issue by clicking on the names of the OpenGL screensavers in the KDE settings.
When I do that, X freezes and I can only move the mouse. "Xorg.0.log.old" shows the same "[mi] EQ overflow continuing" messages as the files from message #45.
So far, so bad.

Another thing I have found is that this issue does not occur with the old kernel 3.2.35-2-amd64.
So maybe it is related to bug #703276? http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=703276
 
Regards
Bentallica
 
 

#680514#60
Date:
2013-04-11 13:01:03 UTC
From:
To:
I experience the bug anywhere from xserver-xorg-video-intel 2:2.19.0-6
and linux-image-3.2.0-4-amd64 in wheezy to xserver-xorg-video-intel
2:2.20-14-1 and linux-image-3.8-trunk-amd64 (3.8.5-1~experimental.1) in
experimental.

I never experience it on the laptop alone (Lenovo X220), only on my two
external DisplayPort monitors via the docking station (Lenovo ThinkPad
Mini Dock Plus Series 3).

I usually have some symptoms before the lock up happens, when one or
both external monitors just go black. It can go black and then come
back, one or more times, or it can just stay black. It does not power
off and the desktop still behaves as if two monitors are connected. If I
change the monitor arrangement (on/off) in my display properties I can
get the monitor back on. These symptoms can be present without the EQ
overflow error message or any other error message, like in the attached
files.

All issues appear more often when using Gnome3 OpenGL effects (for example,
pressing the Windows key) or when watching Flash videos (vimeo/youtube/..)
in my browser.

#680514#65
Date:
2013-04-14 17:24:42 UTC
From:
To:
Running

3.2.0-4-amd64 #1 SMP Debian 3.2.41-2 x86_64 GNU/Linux,
linux   /boot/vmlinuz-3.2.0-4-amd64
root=UUID=36b94ea6-4bdd-4adc-8637-5aa1cc21a3af ro  quiet
xserver-xorg-video-intel              2:2.19.0-6

Screen frequently hangs. Sometimes switches over to black and then comes
back 'working' again but unaccelerated.

#680514#70
Date:
2013-05-07 10:10:29 UTC
From:
To:
These two bugs, 703276 and 680514 have still not been solved in wheezy
and they are a great problem for a stable debian release...

#680514#75
Date:
2013-05-07 20:24:41 UTC
From:
To:
Send patches?

Cheers,
Julien

#680514#80
Date:
2013-05-08 05:19:56 UTC
From:
To:
Can't code, but I'm doing my share helping debian in other ways.
Anyway, I did not demand anything, but it strikes me as odd that such a
problem, which makes a desktop system pretty useless, was not solved
before releasing wheezy.

George

#680514#85
Date:
2013-05-15 02:20:29 UTC
From:
To:
Hi, guys, how are you?

On February 2 I sent the e-mail below, about a problem I'm having with
Debian on my Lenovo laptop.

As suggested, I upgraded my kernel to 3.7 (the current trunk kernel at
that moment), but it didn't solve the problem.

So I returned to kernel 3.2 (my wifi didn't work on 3.7), and continued
living with the problem. Once a week, sometimes once a day, and
sometimes twice a day, the system freezes and I need to restart manually.
The issue is all the more critical because I use the laptop at work.

But yesterday the system froze again, and I spent 3 hours scanning the
entire /var/log directory, hoping to find any message that could help me
understand the problem.

So I finally found it! My problem is related to this bug[1], though I
found the message "EQ overflowing" in file /var/log/gdm3/:0.log, not in
/var/log/Xorg.0.log.

I'm so glad I found this bug report, which is exactly the problem I'm
facing, and other people are facing this problem too. It means that it
may take time, but a solution will be found someday.

So I'm here to tell you guys that, though I don't have much experience
with Linux, I'm a developer (currently working with C# since 2008,
but spent three years with C before that) and I would be happy to help
you solve this bug :)

Regards,

Will

[1] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=680514

#680514#90
Date:
2013-05-20 13:22:07 UTC
From:
To:
I've just experienced this issue on a Lenovo T420s with Intel GPU, see the
attached Xorg.log for backtraces.

In /var/log/messages I got only this:
May 20 13:50:54 lenovo kernel: [12850.409525] Watchdog[7894]: segfault at 0 ip 00007f5f370b8efa sp 00007f5f2b9e1560 error 6 in chrome[7f5f363d8000+5513000]
May 20 13:51:14 lenovo kernel: [12870.764899] Watchdog[8003]: segfault at 0 ip 00007f7ea42ebefa sp 00007f7e98c14560 error 6 in chrome[7f7ea360b000+5513000]

However I'd like to add that I've also experienced the same problem
already on a completely different machine with an *nVidia* GTX560 card
using the proprietary binary drivers from Debian repos (and yes, the
backtrace looked the same from what I remember, with the difference of the
driver .so). From the bugs reported on a similar issue it seems that it
happens with the Nouveau driver as well, so I have a feeling the problem lies
deeper, in Xorg or the kernel itself, not just in the driver.

A couple of things to add from the moment both machines crashed:
 - both were running the XFCE desktop
 - both had Google Chrome open on a page with Flash content


Some info about the Intel-based Lenovo machine:

# uname -a
Linux lenovo 3.2.0-4-amd64 #1 SMP Debian 3.2.41-2+deb7u2 x86_64 GNU/Linux

# dpkg-query -l|grep xorg
ii  xorg                                  1:7.7+2            amd64  X.Org X Window System
ii  xorg-docs-core                        1:1.6-1            all    Core documentation for the X.org X Window System
ii  xorg-sgml-doctools                    1:1.10-1           all    Common tools for building X.Org SGML documentation
ii  xserver-xorg                          1:7.7+2            amd64  X.Org X server
ii  xserver-xorg-core                     2:1.12.4-6         amd64  Xorg X server - core server
ii  xserver-xorg-input-all                1:7.7+2            amd64  X.Org X server -- input driver metapackage
ii  xserver-xorg-input-evdev              1:2.7.0-1+b1       amd64  X.Org X server -- evdev input driver
ii  xserver-xorg-input-synaptics          1.6.2-2            amd64  Synaptics TouchPad driver for X.Org server
ii  xserver-xorg-input-wacom              0.15.0+20120515-2  amd64  X.Org X server -- Wacom input driver
ii  xserver-xorg-video-apm                1:1.2.3-3          amd64  X.Org X server -- APM display driver
ii  xserver-xorg-video-ark                1:0.7.4-1+b1       amd64  X.Org X server -- ark display driver
ii  xserver-xorg-video-fbdev              1:0.4.2-4+b3       amd64  X.Org X server -- fbdev display driver
ii  xserver-xorg-video-intel              2:2.19.0-6         amd64  X.Org X server -- Intel i8xx, i9xx display driver
rc  xserver-xorg-video-openchrome         1:0.2.906-2        amd64  X.Org X server -- VIA display driver
rc  xserver-xorg-video-radeon             1:6.14.4-8         amd64  X.Org X server -- AMD/ATI Radeon display driver
ii  xserver-xorg-video-vesa               1:2.3.1-1+b1       amd64  X.Org X server -- VESA display driver

#680514#95
Date:
2013-05-20 13:26:43 UTC
From:
To:
One more log from Xorg report script.

#680514#100
Date:
2013-05-21 12:49:20 UTC
From:
To:
Please don't hijack somebody else's unrelated bug.

Thanks,
Julien

#680514#105
Date:
2013-05-22 16:50:29 UTC
From:
To:

It has been reported according to http://www.debian.org/Bugs/Reporting
and is 100% related to the original bugreport
(http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=680514).

Have a nice day.

#680514#110
Date:
2013-05-24 06:12:13 UTC
From:
To:
Dear Maintainer,

I too have found the EQ overflow message in the /var/log/gdm3 directory,
after a lockup this morning. Recently I have experienced the problem quite
frequently (once every other day), normally soon (within 10 minutes) after
booting.  Once that period is over it hardly ever causes a problem.

I too am running a configuration with Dual Monitors. This seems to
have been a factor in other reports to this bug.

When this situation occurs both screens freeze, but I am able to ssh
into the machine from outside and run in a console.

Sometimes I can clear the problem by restarting gdm3, although the
crash today wouldn't allow that and (since I was ssh'ing in from a tablet
computer with a limited screen and couldn't diagnose further) I had to
reboot to clear it.

I have been slowly transitioning from sid, via testing (wheezy) to
stable (wheezy), over the past 6 months or so, and the frequency of
this problem increased about the time wheezy was officially released
as stable.  One of the reasons for the transition was to improve the
resilience of my system, yet while wheezy was still testing, failures
were infrequent (no more than once every 2 weeks).

I have tried to attach the log file to this report, but reportbug
keeps telling me it can't find the file.

#680514#115
Date:
2013-05-29 08:10:05 UTC
From:
To:
Hi,

I was able to see this in dmesg:

[  142.604535] pool[5272]: segfault at 58 ip 00007fe62b230758 sp 00007fe5e7fedf78 error 4 in libpangoft2-1.0.so.0.3000.0[7fe62b21d000+2a000]
[  143.700151] usbcore: registered new interface driver snd-usb-audio
[  383.469423] eclipse[6097] trap divide error ip:7f852764b2d0 sp:7fff75f2a070 error:0 in libgtk-x11-2.0.so.0.2400.18[7f8527522000+42a000]
[  482.970628] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[  482.970633] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state


I attach another error state log.


Can someone give some direction on how to debug the issue? My
desktop freezes at least 3 times a day.

#680514#120
Date:
2013-06-26 12:30:08 UTC
From:
To:
Dear Maintainer,

I have now found a way to consistently (within a matter of seconds) trigger this bug:

Open up a libreoffice spreadsheet to maximized size on my larger monitor
Fill the screen up with a number in each cell (fill in the first cell, then drag right - then drag the selected row down)

Scroll rapidly from left to right (either direction triggers the problem)

NOTE scrolling vertically does not trigger the problem.

NOTE it doesn't appear to trigger the problem when the cells are empty
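
To save the manual fill step, a sheet with a number in every cell could be
generated as a CSV file and opened in LibreOffice Calc.  This is only a
sketch in Python: the function name, the file name and the default grid
size are assumptions, and whether a sheet loaded from CSV triggers the
lockup the same way as a hand-filled one is untested.

```python
import csv


def write_filled_sheet(path, rows=60, cols=40):
    """Write a rows x cols CSV file with a distinct number in every cell."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        for r in range(rows):
            # Cell value is its position counted row by row: 0, 1, 2, ...
            writer.writerow([r * cols + c for c in range(cols)])
```

For example, write_filled_sheet("filled.csv") produces a 60 x 40 grid; the
size should be chosen so that the sheet is wider than the maximized window
and horizontal scrolling is possible.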

#680514#125
Date:
2013-07-02 07:11:08 UTC
From:
To:
Since filing the report that suggests that horizontal scrolling with
LibreOffice calc would almost immediately cause this bug to trigger, I
have two further observations:

1) Doing the same test in gnumeric does not fail - and I tried quite
hard to make it do so

2) I have had two lockups almost immediately after opening a document in
libreoffice writer and attempting vertical scrolling.  I think this is
significant because
a) I have had no other lockups from any other applications
b) This is almost the only time I have opened a document recently.

I presume libreoffice has some specific way it is driving the x-server
which makes it affect the server in this way.

#680514#130
Date:
2013-07-03 02:53:26 UTC
From:
To:
Dear Maintainer,
I'm experiencing this bug every now and then after a wakeup from
standby on my Lenovo t420s. I'm not using two monitors.

Reportbug told me about the version in experimental, but I'm a bit
hesitant to try upgrading to that on my wheezy installation. Would this
be useful to try though?

Thanks for your hard work,
~David

#680514#135
Date:
2013-09-29 21:59:58 UTC
From:
To:
Hi,

I experience this bug on my Asus U36D. I can also confirm Alan Chandler's note
about libreoffice. This bug occurred 3 (of 4) times when I opened a particular
document in libreoffice. The last time, it occurred when I was opening youtube
in the chromium browser; Flash was active and probably in use.

I have upgraded kernel from 3.2.46-1 to 3.2.46-1+deb7u1 shortly before
experiencing this bug, but I'm not sure if this is only a coincidence.

#680514#140
Date:
2014-04-17 22:54:07 UTC
From:
To:
This bug is indeed not only related to intel, but nvidia and ati as
well. With fglrx 14.3 and X.org 1.14-5 I've had the same EQ overflow
error and subsequent crash.

Those affected should try updating to 1.15, or patching xorg-server.

http://forums.gentoo.org/viewtopic-p-7475096.html#7475096

http://cgit.freedesktop.org/xorg/xserver/commit/?id=0492deb8f8238b7782e5a706ec6219d88aa1091d

#680514#145
Date:
2014-12-11 18:56:49 UTC
From:
To:
Hi there,

I have been experiencing this issue for over a month now and I'm running an up
to date jessie system (kernel 3.16.0-4-amd64, xserver-xorg-video-intel
2:2.21.15-2+b2, xserver-xorg-core 2:1.16.1.901-1).

I'm not able to find any trigger for the lock-up; the bug just randomly appears
"out of the blue". Once I also found my laptop "frozen" this way in the
morning; the system had been idle for at least several hours when the lock-up
occurred.

As reported earlier, it is still possible to SSH into the machine; what I think
was not reported is that it is _sometimes_ possible to revive the system by
executing pm-suspend via SSH - the screen goes off for a while, turns
immediately back on and everything works perfectly again. Sometimes, however,
the pm-suspend succeeds and xserver remains frozen even after waking up, no
matter how many times I try executing pm-suspend. In the latter case I also
noticed that it takes more time than usual to finish suspending.

Any idea what I should look for when it happens again? I run into this
bug a few times a week, so I could try to learn more, but the logs do not seem to
reveal anything useful.