#776911 gnome-shell: crashes with "Failed to create texture 2d" after "[drm:i8xx_irq_handler] *ERROR* pipe A underrun"
- Package: xserver-xorg-video-intel
- Source: xserver-xorg-video-intel
- Description: X.Org X server -- Intel i8xx, i9xx display driver
- Submitter: Rafal Pietrak
- Date: 2021-09-22 04:27:51 UTC
- Severity: important
- Tags:
Right after upgrading from wheezy to jessie, I'm not able to log in with the standard gdm3 login panel. It takes my username and password and tries to start a GNOME session, but fails with the "Oops" screen. The only hints I find in the logs are "drm:i9xx_set_fifo_underrun *ERROR* pipe A underrun".
Control: tags 776911 + moreinfo
The "Oops" screen means that something marked as critical to the session
has not started, or has crashed repeatedly. There are only two critical
components in jessie's GNOME: gnome-shell, and gnome-settings-daemon.
Is there any indication in the systemd journal, /var/log/syslog,
or /var/log/Xorg.*.log of what has crashed or why? Please attach
the parts of those logs that are close to the time of the failure to
start GNOME.
Are you using the standard GNOME-Shell-based gdm login screen?
(grey background with a "texture", black bar at the top)
Are you using the standard GNOME Shell login session, or a variant
like Classic or Flashback?
The information gathered by these commands might also be useful
to developers:
reportbug --template gnome-settings-daemon
reportbug --template gnome-shell
reportbug --template xorg
Thanks,
S
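A minimal sketch of one way to pull the relevant parts out of the logs mentioned above (the systemd journal, /var/log/syslog or /var/log/Xorg.*.log, saved to a text file first). The marker strings are error messages quoted elsewhere in this bug log; the function name `relevant_lines` and the `context` parameter are illustrative assumptions, not part of any Debian tool.

```python
# Error strings quoted in this bug log; extend as needed.
MARKERS = ("pipe A underrun", "Cogl-ERROR", "LLVM ERROR")


def relevant_lines(log_text, markers=MARKERS, context=2):
    """Return log lines containing any marker, plus `context` lines either side."""
    lines = log_text.splitlines()
    keep = set()
    for index, line in enumerate(lines):
        if any(marker in line for marker in markers):
            start = max(0, index - context)
            stop = min(len(lines), index + context + 1)
            keep.update(range(start, stop))
    # Preserve the original order of the log.
    return [lines[i] for i in sorted(keep)]
```

For example, `relevant_lines(open("syslog.txt").read())` would return only the matching lines and their neighbours, which is usually small enough to paste into a reply.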
Please reply to the bug, not to me personally, so that other
developers can look at the bug log and see all the relevant
information.
This looks related to <https://bugs.debian.org/775235> and
<https://bugs.debian.org/770130>, on which some debugging has
already been done.
Michael, I see you've found a solution or workaround: is there
anything you'd like Rafal to try?
Probably not necessary now, since the Cogl error was enough to
link this to other bug reports; but for future reference, sending
attachments (with gzip applied if large) to Debian bugs
works fine.
S
control: forcemerge 775235 -1
Try gnome-shell built with llvm-3.4 instead of 3.5.
Best wishes,
Mike
On 22.02.2015 at 03:46, Michael Gilbert wrote:
Where do I find it?
Adding mesa@packages to Cc since I suspect #775235 needs reassigning to
Mesa, perhaps along with its merged bugs #770130, #776911.
Context for Mesa maintainers:
#775235 is that gnome-shell crashes in an i386 VM with its default
choice of emulated CPU, with "LLVM ERROR: Do not know how to split the
result of this operator!". See
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=775235#10 for full
backtrace.
The merged bug #770130 is that gnome-shell crashes on an unknown i386
(probably relatively old, since it has 82830M/MG integrated graphics,
which Wikipedia says is a Pentium III-M chipset) with "Cogl-ERROR **:
Failed to create texture 2d due to size/format constraints". Oddly,
this only happens for non-power-of-two textures in the gnome-shell run
by the actual user, not by the gnome-shell used for the gdm login
prompt (even though both use non-power-of-two textures). Michael
Gilbert merged this with #775235 and #776911 without comment; I'm not
sure of the reasoning for believing that #775235 and #770130 are the
same thing.
The other merged bug #776911 is that gnome-shell crashes on an unknown
i386 with "Cogl-ERROR **: Failed to create texture 2d due to
size/format constraints"; that looks like the same thing as #770130.
gnome-shell doesn't build-depend on any llvm version, so I don't think
rebuilding it against llvm-3.4 is going to make any difference. Do you
mean mesa built with llvm-3.4 instead of 3.5, as seen in
<https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=775235#15>?
Looking at the Ubuntu changelog, it seems that they rebuilt mesa with
llvm-3.5, then reverted to llvm-3.4 because "too many regressions"
(e.g.
https://bugs.launchpad.net/ubuntu/+source/llvm-toolchain-3.5/+bug/1360241),
but then switched to llvm-3.6 for Mesa 10.4 in 15.04/vivid.
Mesa 10.4.2 seems to have been accidentally uploaded to unstable
instead of experimental. I infer from the git repository that 10.3.2
(via t-p-u) is still the version targeted for jessie: is this correct?
Would use of llvm-3.4 for jessie be acceptable to the Mesa maintainers?
According to
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=775235#15 that fixes
at least one of these merged bugs, perhaps all three.
Alternatively, would it be useful information for the Mesa maintainers
if people tried Mesa 10.4.2 on affected systems? #775235 should be easy
to reproduce on an i386 VM on an amd64 host (I assume that's how Steve
did the original failed test), but #770130 and #776911 seem to require
access to real hardware. I have a Pentium IV system which I might be
able to resurrect if that would help, although if the relevant CPU
feature is SSE2, I think that's too new.
S
Hi,
Simon McVittie wrote (26 Feb 2015 08:54:10 GMT) :
Tails is affected by that bug
(https://labs.riseup.net/code/issues/8778); in case it helps in
assessing the severity, comment #16 there (and some further ones,
particularly #19 and #22) has additional data about exactly which vcpu
models and features are affected.
I've rebuilt the mesa package with the patch from
https://freedesktop.org/patch/34445/, and it fixes things for me, at
least with `-cpu qemu32'. I've asked the other Tails developer (Cc'd),
who is also experiencing this bug, to try and reproduce my results.
=> I believe at least one of these bugs should be reassigned to the
mesa package. I'll let the maintainers decide about that and about the
potential RC-ness. If it helps, I can also do a more complete series of
comparative tests with various vcpus.
I suspect that the release team will find that rebuilding mesa with a
different compiler is more invasive and risky, at this stage of the
release cycle, than applying a relatively simple patch. (The good news
is that we might get a release team member's opinion for free while
asking the mesa maintainers :)
I can do that if it helps, although I suspect it's way too late to let
that one migrate to testing.
Cheers,
--
intrigeri
The patch to mesa is definitely an improvement, but some qemu CPU types
still fail. The host system has an AMD FX-6100 CPU.
These did not work without the patch and they're still not working:
* SandyBridge
* Broadwell
* Haswell
* Westmere
* Nehalem
* Penryn
These worked previously and still work after the patch:
* host
* kvm64
* Opteron_G1
* Opteron_G2
* Opteron_G3
* Opteron_G4
* Opteron_G5
* phenom
These failed previously but they work with the patched mesa:
* core2duo
* coreduo
* Conroe
* n270
* qemu32
I'll attach logs later.
Sounds like there are at least two different issues that should be split... Cheers, Julien
...
Those are all rather modern Intel CPUs. It would not surprise me at all
if they had CPU features that are not supported by your host system: AMD
CPUs usually support the latest version of AMD-originated features and a
somewhat older version of Intel-originated features, and vice versa.
You can check via /proc/cpuinfo on the host and the VM. For instance,
here's the Sandy Bridge CPU on my laptop:
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology nonstop_tsc
aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2
ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer
aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow
vnmi flexpriority ept vpid
If your host CPU is missing any of those features, then emulating a
SandyBridge CPU in your VM is probably an invalid configuration.
S
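As a rough sketch of the check described above, the following Python compares the `flags` line of two /proc/cpuinfo dumps (one from the host, one from the CPU you want to emulate) and lists the flags the host lacks. The function names and the shortened flag strings in the demo are hypothetical, for illustration only; they are not real measurements from any machine in this thread. In the real file the flags sit on a single line, unlike the wrapped quote above.

```python
def parse_flags(cpuinfo_text):
    """Return the set of CPU flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_on_host(host_cpuinfo, wanted_cpuinfo):
    """Flags advertised by the wanted CPU that the host does not have, sorted."""
    return sorted(parse_flags(wanted_cpuinfo) - parse_flags(host_cpuinfo))


if __name__ == "__main__":
    # Hypothetical, heavily shortened flag lists (assumption, not real data).
    host = "flags : fpu sse sse2 pni ssse3 sse4_1 sse4_2 avx sse4a"
    wanted = "flags : fpu sse sse2 pni ssse3 sse4_1 sse4_2 avx pcid x2apic"
    print(missing_on_host(host, wanted))
```

A non-empty result suggests that the chosen emulated CPU model advertises features the host cannot provide.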
...
Perhaps. Michael, could you elaborate on your reasoning for believing
that #775235 (reproducible, in a VM, presumably with llvmpipe) is the
same bug as #770130 (on real hardware, seems to be with i915-family
graphics)?
#776911 (on real hardware, appears to be with i915-family graphics)
looks quite a lot like #770130: both involve Intel graphics chips
encountering a "pipe A underrun", and GNOME Shell crashing with
"Cogl-ERROR **: Failed to create texture 2d due to size/format constraints".
S
This is absolutely true of course (I personally use -cpu host), but
IMHO there should be more graceful handling of the situation, falling
back to something that will almost certainly work.
For the sake of completeness, the configurations that fail still fail
with the same error as before:
LLVM ERROR: Do not know how to split the result of this operator!
That's fine, but I don't think it's release-critical. Could you file a
separate bug for that part, please?
(I also think that's a qemu bug, more than an application bug; it
shouldn't let you emulate a CPU that isn't going to work. I don't think
applications should be expected to cope gracefully with an emulated CPU
that claims it can support instructions but does not actually execute
them correctly, which I think is what's going on in this case.)
S
On Tue, 10 Mar 2015 08:16:03 +0000 (UTC) Simon McVittie <smcv@debian.org> wrote:
Agreed, it's definitely not release critical if it works on *real*
hardware now. :) I'll gladly file a new bug for this.
I'd wholeheartedly +1 this if the failure were only on emulated systems
with "impossibly-doomed-to-failure" configurations, but from the
earlier posts in this bug, this problem initially affected real
hardware. Perhaps it's not gnome-shell that should be responsible for
handling the problem more gracefully, but something down the line
should, be it mesa or something else. (gnome-shell should, however,
give a more informative error when things go wrong, but that's for
another bug report, one which probably already exists.)
I'm not saying that there isn't a qemu bug as well. Indeed, if qemu is
expected to virtualize a particular configuration, it should either do
it or fail early with a clear message explaining why it failed: "Your
host CPU is missing the necessary features to be able to emulate the
CPU-type specified".
But as far as *this* bug is concerned, I'm satisfied with the patch to
mesa, since that fixes the problems that I experienced with valid
configurations.
Right, that's a real application problem, in gnome-shell or its
libraries. However, as far as I can see, that was on older CPUs (Pentium
III-M) that neither support nor claim to support the instruction-set
extensions used (required?) by llvmpipe, as opposed to a misconfigured
VM that claims to support extensions that it cannot?
S
Control: unmerge 775235
Control: retitle 775235 gnome-shell: fails to start on i386 when Mesa was built with llvm-3.5
Control: reassign 775235 libgl1-mesa-dri
Control: retitle 770130 gnome-shell: crashes with "Failed to create texture 2d" after "[drm:i8xx_irq_handler] *ERROR* pipe A underrun"
I think #775235 should go to Mesa. There are two potential fixes, both
tested by Bernhard Übelacker: either go back to llvm-3.4:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=775235#15
or apply an unmerged patch:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=775235#32
I can reproduce symptoms similar to these on a slightly newer machine
(Pentium IV) and have filed an xserver-xorg-video-intel bug with full
logs. These two should maybe be merged with that bug, but I'll leave
that to the X maintainers.
This is on hardware about a decade old, so I don't think it should
realistically be RC: I filed my new bug as "normal".
Regards,
S
I am able to reproduce a similar crash on the oldest PC I could find
(#780413). I think it's probably the same thing as #770130, but
not #775235.
Rafal, could you please provide some more information so we can
confirm whether your crash is likely to have the same cause?
* please send the output of "reportbug --template xserver-xorg-core"
to the bug address 776911@bugs.debian.org
(it will be long: you can either attach it compressed or uncompressed,
or include it inline)
* if that output mentions "GPU crash dump saved to /sys/class/drm/card0/error"
like mine did, please capture that to a file by running
"cat /sys/class/drm/card0/error > gpu-crash.txt" as root,
and send gpu-crash.txt too (again, it will be long, and you can
either attach it or send it inline)
Thanks,
S
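The second step above can be sketched in Python as follows. The log phrase "GPU crash dump saved to" and the output name gpu-crash.txt come from the request above; the function names and the regular expression are assumptions made for illustration. Reading the sysfs file needs root, as noted.

```python
import re

# Matches the kernel/X log note and captures the path it names.
CRASH_NOTE = re.compile(r"GPU crash dump saved to (\S+)")


def find_crash_dump_path(log_text):
    """Return the path named in a 'GPU crash dump saved to ...' log line, or None."""
    match = CRASH_NOTE.search(log_text)
    return match.group(1) if match else None


def save_crash_dump(log_text, dest="gpu-crash.txt"):
    """Copy the crash dump named in the log to dest; return dest, or None if not mentioned."""
    source = find_crash_dump_path(log_text)
    if source is None:
        return None
    # Plain read/write rather than a file-copy helper, since sysfs files
    # do not report a meaningful size.
    with open(source, "r", errors="replace") as src:
        data = src.read()
    with open(dest, "w") as out:
        out.write(data)
    return dest
```

If the log never mentions a crash dump, `save_crash_dump` returns None and writes nothing, so only the reportbug output needs sending.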
On 13.03.2015 20:56, Simon McVittie wrote:
attached. There is no "GPU crash ..." that I can see, and
"/sys/class/drm/card0/error" reports "no error state collected".
But I've noticed that the "underrun error" is triggered when I read
"/sys/class/drm/card0/card0-VGA-1/status" (and NOT when I do that with
LVDS; I only use the laptop screen and have nothing connected to the
VGA socket). In case this helps somebody figure it out, I'm attaching
the grep-ed syslog from the time I executed the following commands.
-------------------------------------------------------------
$ cat /sys/class/drm/card0/card0-LVDS-1/status
connected
$ cat /sys/class/drm/card0/card0-VGA-1/status
disconnected
$ ls -la /sys/class/drm/card0/
total 0
drwxr-xr-x 5 root root 0 mar 15 14:31 .
drwxr-xr-x 4 root root 0 mar 15 14:31 ..
drwxr-xr-x 3 root root 0 mar 15 14:31 card0-LVDS-1
drwxr-xr-x 3 root root 0 mar 15 14:31 card0-VGA-1
-r--r--r-- 1 root root 4096 mar 15 14:31 dev
lrwxrwxrwx 1 root root 0 mar 15 14:31 device -> ../../../0000:00:02.0
-rw------- 1 root root 0 mar 15 14:31 error
drwxr-xr-x 2 root root 0 mar 15 14:31 power
lrwxrwxrwx 1 root root 0 mar 15 14:31 subsystem -> ../../../../../class/drm
-rw-r--r-- 1 root root 4096 mar 15 14:31 uevent
$
---------------------------------------------------------
The "[drm:i8xx..." entry shows up when I "cat ...VGA/status" from an
xterm, while the "[drm:i9xx..." entry shows up when I do that in text
mode from a Linux terminal (/dev/tty1). It doesn't show up when I cat
the LVDS status.
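The connector check in the message above can be repeated for every connector at once with a short Python sketch. The directory layout (card0-LVDS-1/status, card0-VGA-1/status) is taken from the listing above; the function name `connector_status` is a hypothetical helper. Note that, per the report, reading the VGA connector's status is itself what triggers the "pipe A underrun" message on the affected machine.

```python
from pathlib import Path


def connector_status(card_dir):
    """Map connector name (e.g. 'card0-LVDS-1') to the text of its status file."""
    statuses = {}
    # Connector directories are named like card0-<TYPE>-<N>; other entries
    # (power, device, ...) do not match the pattern and are skipped.
    for status_file in sorted(Path(card_dir).glob("card*-*/status")):
        statuses[status_file.parent.name] = status_file.read_text().strip()
    return statuses
```

On the reporter's machine, `connector_status("/sys/class/drm/card0")` would be expected to report the LVDS connector as connected and the VGA connector as disconnected, matching the two `cat` commands above.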
Control: reassign 770130 xserver-xorg-video-intel
OK, this still looks similar to #770130 (leaving it merged), but not to
my new bug #780413 on similarly old hardware; so unfortunately my
attempts to reproduce this on the oldest hardware I could find have not
been successful.
I'm reassigning this to the Intel X driver in the hope that someone can
make more sense out of it. I'm fairly sure this bug is in some lower
layer than gnome-shell, and given that it's on approximately 10 year old
hardware (Intel Corporation 82852/855GM Integrated Graphics Device) I'm
not sure that it's release-critical either.
S
Actually, this problem does not occur only on virtual machines and/or older hardware. It also happens on my Core i7 950 with an AMD Radeon R7-260X. It does not, however, happen on my Core i7 Toshiba Z930, which has integrated Intel HD4xxx graphics.