When starting ‘gnome-session’ as some users, a child ‘bash’ process exits with a segmentation fault:

=====
Nov 6 17:25:29 malva wireplumber[16259]: SPA handle 'api.bluez5.enum.dbus' could not be loaded; is it installed?
Nov 6 17:25:29 malva wireplumber[16259]: PipeWire's BlueZ SPA missing or broken. Bluetooth not supported.
Nov 6 17:25:29 malva org.gnome.Shell.desktop[24306]: The XKEYBOARD keymap compiler (xkbcomp) reports:
Nov 6 17:25:29 malva org.gnome.Shell.desktop[24306]: > Warning: Unsupported maximum keycode 708, clipping.
Nov 6 17:25:29 malva org.gnome.Shell.desktop[24306]: > X11 cannot support keycodes above 255.
Nov 6 17:25:29 malva org.gnome.Shell.desktop[24306]: Errors from xkbcomp are not fatal to the X server
Nov 6 17:25:29 malva gsd-media-keys[20189]: Unable to get default source
Nov 6 17:25:33 malva wireplumber[1833]: SPA handle 'api.bluez5.enum.dbus' could not be loaded; is it installed?
Nov 6 17:25:33 malva wireplumber[1833]: PipeWire's BlueZ SPA missing or broken. Bluetooth not supported.
Nov 6 17:25:33 malva gsd-media-keys[20189]: Unable to get default source
Nov 6 17:25:33 malva gsd-media-keys[20189]: Unable to get default sink
Nov 6 17:25:41 malva kernel: [ 3136.022986] bash[24319]: segfault at 7ffc42b7b310 ip 000055ebc8c809fd sp 00007ffc42b7b2d0 error 6 in bash[55ebc8c74000+bb000]
Nov 6 17:25:41 malva kernel: [ 3136.022994] Code: c0 48 8d 9c 24 d0 01 00 00 4c 8d 44 24 40 31 c0 c7 05 7f 36 0f 00 00 00 00 00 4d 89 c7 48 89 dd c7 05 83 36 0f 00 fe ff ff ff <66> 89 44 24 40 c7 44 24 08 00 00 00 00 48 89 1c 24 4c 89 44 24 10
Nov 6 17:25:41 malva gdm3: Gdm: GdmDisplay: Session never registered, failing
=====

The kernel reports a segmentation fault in a ‘bash’ process. This causes ‘gdm3’ to exit with “Session never registered, failing”. The user cannot log in to their Gnome session.
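The kernel "segfault at" line can be decoded a little further with plain shell arithmetic. The following is a minimal sketch, not part of the original report: the helper names are made up for illustration, and the error-code decoding assumes the usual x86 page-fault bit layout (bit 0: page was present, bit 1: write access, bit 2: user mode).

```shell
# Values copied from the kernel "segfault at ..." line in the log.
ip=0x000055ebc8c809fd     # faulting instruction pointer
sp=0x00007ffc42b7b2d0     # stack pointer at the time of the fault
addr=0x7ffc42b7b310       # address whose access faulted
base=0x55ebc8c74000       # start of the bash mapping named in the log

# Print the difference of two addresses in hexadecimal.
hex_diff() {
    printf '0x%x\n' $(( $1 - $2 ))
}

# Decode a page-fault error code (assumed x86 bit layout, see the lead-in).
decode_fault_error() {
    if [ $(( $1 & 2 )) -ne 0 ]; then access="write"; else access="read"; fi
    if [ $(( $1 & 4 )) -ne 0 ]; then mode="user mode"; else mode="kernel mode"; fi
    if [ $(( $1 & 1 )) -ne 0 ]; then cause="protection violation"; else cause="page not present"; fi
    echo "$access, $mode, $cause"
}

echo "ip offset in mapping: $(hex_diff "$ip" "$base")"
echo "fault address - sp:   $(hex_diff "$addr" "$sp")"
echo "error 6:              $(decode_fault_error 6)"
```

For these values the sketch reports an offset of 0xc9fd into the mapping, a fault address 0x40 bytes above the stack pointer, and a user-mode write to a page that is not present. A write that close to the stack pointer is what one would expect if the stack could not grow any further, but that reading is an inference, not something the log states.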
Hello Ben,
I am not involved in packaging bash, but I tried to collect some
more information about this crash.
From the "Code:" line I could locate the crash in
file y.tab.c, line 1744 (function yyparse, the shell parser).
So it may depend on the parameters given to bash and/or the
environment variables.
If possible, you could install systemd-coredump.
After such a crash, "coredumpctl list" should then show
the crashing bash process.
If it is the most recent entry, "coredumpctl gdb" might be able to
show a backtrace of the crash, or even the command-line
parameters.
Kind regards,
Bernhard
benutzer@debian:~$ echo -n "find /b ..., ..., 0x" && \
find /b ..., ..., 0xc0, 0x48, 0x8d, 0x9c, 0x24, 0xd0, 0x01, 0x00, 0x00, 0x4c, 0x8d, 0x44, 0x24, 0x40, 0x31, 0xc0, 0xc7, 0x05, 0x7f, 0x36, 0x0f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x4d, 0x89, 0xc7, 0x48, 0x89, 0xdd, 0xc7, 0x05, 0x83, 0x36, 0x0f, 0x00, 0xfe, 0xff, 0xff, 0xff, 0x66, 0x89, 0x44, 0x24, 0x40, 0xc7, 0x44, 0x24, 0x08, 0x00, 0x00, 0x00, 0x00, 0x48, 0x89, 0x1c, 0x24, 0x4c, 0x89, 0x44, 0x24, 0x10
benutzer@debian:~$ gdb -q
(gdb) set width 0
(gdb) set pagination off
(gdb) directory /home/benutzer/source/bash/orig/bash-5.1
Source directories searched: /home/benutzer/source/bash/orig/bash-5.1:$cdir:$cwd
(gdb) file /bin/bash
Reading symbols from /bin/bash...
Reading symbols from /usr/lib/debug/.build-id/2d/26352932c3d0d33a67fb5714921f907053976c.debug...
(gdb) tb main
Temporary breakpoint 1 at 0x2ee90: file .././shell.c, line 368.
(gdb) run
Starting program: /usr/bin/bash
Temporary breakpoint 1, main (argc=1, argv=0x7fffffffe588, env=0x7fffffffe598) at .././shell.c:368
368 {
(gdb) pipe info target | grep -E ".text$"
0x0000555555582e10 - 0x000055555563c381 is .text
(gdb) find /b 0x0000555555582e10, 0x000055555563c381, 0xc0, 0x48, 0x8d, 0x9c, 0x24, 0xd0, 0x01, 0x00, 0x00, 0x4c, 0x8d, 0x44, 0x24, 0x40, 0x31, 0xc0, 0xc7, 0x05, 0x7f, 0x36, 0x0f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x4d, 0x89, 0xc7, 0x48, 0x89, 0xdd, 0xc7, 0x05, 0x83, 0x36, 0x0f, 0x00, 0xfe, 0xff, 0xff, 0xff, 0x66, 0x89, 0x44, 0x24, 0x40, 0xc7, 0x44, 0x24, 0x08, 0x00, 0x00, 0x00, 0x00, 0x48, 0x89, 0x1c, 0x24, 0x4c, 0x89, 0x44, 0x24, 0x10
0x55555558e9d3 <yyparse+51>
1 pattern found.
(gdb) b * (0x55555558e9d3 + 42)
Breakpoint 2 at 0x55555558e9fd: file y.tab.c, line 1744.
(gdb) info b
Num     Type           Disp Enb Address            What
2       breakpoint     keep y   0x000055555558e9fd in yyparse at y.tab.c:1744
(gdb) list y.tab.c:1744
1739 `--------------------------------------------------------------------*/
1740 yysetstate:
1741 YYDPRINTF ((stderr, "Entering state %d\n", yystate));
1742 YY_ASSERT (0 <= yystate && yystate < YYNSTATES);
1743 YY_IGNORE_USELESS_CAST_BEGIN
1744 *yyssp = YY_CAST (yy_state_t, yystate); <<<<<<<
1745 YY_IGNORE_USELESS_CAST_END
1746 YY_STACK_PRINT (yyss, yyssp);
1747
1748 if (yyss + yystacksize - 1 <= yyssp)
(gdb)
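The two manual steps in the gdb session (building the byte pattern for "find /b" and choosing the "+ 42" offset) can be sketched in shell. This is only an illustration: the helper names are invented here, and START and END stand for the .text bounds that "info target" prints.

```shell
# Bytes copied from the kernel "Code:" line; <66> marks the faulting instruction.
code='c0 48 8d 9c 24 d0 01 00 00 4c 8d 44 24 40 31 c0 c7 05 7f 36 0f 00 00 00 00 00 4d 89 c7 48 89 dd c7 05 83 36 0f 00 fe ff ff ff <66> 89 44 24 40 c7 44 24 08 00 00 00 00 48 89 1c 24 4c 89 44 24 10'

# Turn the byte list into the comma-separated pattern for gdb's "find /b".
code_to_find_pattern() {
    # Drop the <..> marker, then prefix every byte with 0x and join with commas.
    printf '%s\n' "$1" | tr -d '<>' | sed -e 's/ /, 0x/g' -e 's/^/0x/'
}

# Count the bytes before the <..> marker; this is the offset from the
# start of the matched pattern to the faulting instruction.
fault_offset() {
    # Keep only the text before the marker and let the shell split it into words.
    set -- ${1%%<*}
    echo $#
}

echo "find /b START, END, $(code_to_find_pattern "$code")"
echo "offset of faulting byte: $(fault_offset "$code")"
```

The offset comes out as 42, which is the value added to the address of the match when setting the breakpoint.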
Thank you, this is exactly the kind of response I was hoping for: specific steps to diagnose the fault further. I will try these commands and report back.
With ‘systemd-coredump’ installed, I reproduced the behaviour (logging in
using GDM). The session crashes and returns to GDM; a core dump is created.
When I invoke GDB on the core dump, this is the session:
=====
$ coredumpctl gdb
[…]
PID: 45094 (bash)
UID: 1000 (bignose)
GID: 1000 (bignose)
Signal: 11 (SEGV)
Timestamp: Sat 2021-11-06 23:01:32 AEDT (54s ago)
Command Line: -/bin/bash -c $'/usr/bin/gnome-session -l '
Executable: /usr/bin/bash
Control Group: /user.slice/user-1000.slice/session-83.scope
Unit: session-83.scope
Slice: user-1000.slice
Session: 83
Owner UID: 1000 (bignose)
Boot ID: a69a258cb20b42e9aaf40cb7d53fc49e
Machine ID: cdc4569067f74c53bfb3d99a4b35673a
Hostname: malva
Storage: /var/lib/systemd/coredump/core.bash.1000.a69a258cb20b42e9aaf40cb7d53fc49e.45094.1636200092000000.zst (present)
Disk Size: 3.9M
Message: Process 45094 (bash) of user 1000 dumped core.
Found module linux-vdso.so.1 with build-id: 091a444eee04263f7c695be0b5daf3cfefd69e97
Found module libnss_files.so.2 with build-id: d67972b1c26a08eb13fca9f83004e591d646b4f9
Found module ld-linux-x86-64.so.2 with build-id: 6211a5e83642f3c0cb0b1670ee201d1d9d72e05e
Found module libc.so.6 with build-id: 3a69683d31c430fad5cb0fad190a28b9570d5577
Found module libdl.so.2 with build-id: e3eb1a873134b05c621c37b47d8a7d94ca31ea74
Found module libtinfo.so.6 with build-id: 6cdce89a0f924bb8ccef3f2eaa6635897a81c844
Found module bash with build-id: 2d26352932c3d0d33a67fb5714921f907053976c
Stack trace of thread 45094:
#0 0x00007f50286a577d _int_malloc (libc.so.6 + 0x8977d)
#1 0x00007f50286a6734 __GI___libc_malloc (libc.so.6 + 0x8a734)
#2 0x000055c78021b8b0 xmalloc (bash + 0x958b0)
#3 0x000055c780200df2 begin_unwind_frame (bash + 0x7adf2)
#4 0x000055c7802221ca n/a (bash + 0x9c1ca)
#5 0x000055c780222e02 parse_string (bash + 0x9ce02)
#6 0x000055c7801c446e xparse_dolparen (bash + 0x3e46e)
#7 0x000055c7801ebe28 n/a (bash + 0x65e28)
#8 0x000055c7801f6033 n/a (bash + 0x70033)
#9 0x000055c7801f88bc n/a (bash + 0x728bc)
#10 0x000055c7801f23ae n/a (bash + 0x6c3ae)
#11 0x000055c7801f288c n/a (bash + 0x6c88c)
#12 0x000055c7801fc99a expand_words (bash + 0x7699a)
#13 0x000055c7801cf943 execute_command_internal (bash + 0x49943)
#14 0x000055c7801d0bb5 execute_command (bash + 0x4abb5)
#15 0x000055c7801d276b n/a (bash + 0x4c76b)
#16 0x000055c7801cde39 execute_command_internal (bash + 0x47e39)
#17 0x000055c7801d0bb5 execute_command (bash + 0x4abb5)
#18 0x000055c7801d276b n/a (bash + 0x4c76b)
#19 0x000055c7801cde39 execute_command_internal (bash + 0x47e39)
#20 0x000055c7801d0bb5 execute_command (bash + 0x4abb5)
#21 0x000055c7801d276b n/a (bash + 0x4c76b)
#22 0x000055c7801cde39 execute_command_internal (bash + 0x47e39)
#23 0x000055c7801d0bb5 execute_command (bash + 0x4abb5)
#24 0x000055c7801d276b n/a (bash + 0x4c76b)
#25 0x000055c7801cde39 execute_command_internal (bash + 0x47e39)
#26 0x000055c7801d0bb5 execute_command (bash + 0x4abb5)
#27 0x000055c7801d276b n/a (bash + 0x4c76b)
#28 0x000055c7801cde39 execute_command_internal (bash + 0x47e39)
#29 0x000055c7801d0bb5 execute_command (bash + 0x4abb5)
#30 0x000055c7801d276b n/a (bash + 0x4c76b)
#31 0x000055c7801cde39 execute_command_internal (bash + 0x47e39)
#32 0x000055c7801d0bb5 execute_command (bash + 0x4abb5)
#33 0x000055c7801ce181 execute_command_internal (bash + 0x48181)
#34 0x000055c780222bd9 parse_and_execute (bash + 0x9cbd9)
#35 0x000055c780221f96 n/a (bash + 0x9bf96)
#36 0x000055c780222165 source_file (bash + 0x9c165)
#37 0x000055c78022d319 source_builtin (bash + 0xa7319)
#38 0x000055c7801cafe4 n/a (bash + 0x44fe4)
#39 0x000055c7801d075b execute_command_internal (bash + 0x4a75b)
#40 0x000055c7801d0bb5 execute_command (bash + 0x4abb5)
#41 0x000055c7801ce181 execute_command_internal (bash + 0x48181)
#42 0x000055c7801d0bb5 execute_command (bash + 0x4abb5)
#43 0x000055c7801cef91 execute_command_internal (bash + 0x48f91)
#44 0x000055c7801d0bb5 execute_command (bash + 0x4abb5)
#45 0x000055c7801d276b n/a (bash + 0x4c76b)
#46 0x000055c7801cde39 execute_command_internal (bash + 0x47e39)
#47 0x000055c7801d0bb5 execute_command (bash + 0x4abb5)
#48 0x000055c7801ce181 execute_command_internal (bash + 0x48181)
#49 0x000055c780222bd9 parse_and_execute (bash + 0x9cbd9)
#50 0x000055c780221f96 n/a (bash + 0x9bf96)
#51 0x000055c780222165 source_file (bash + 0x9c165)
#52 0x000055c78022d319 source_builtin (bash + 0xa7319)
#53 0x000055c7801cafe4 n/a (bash + 0x44fe4)
#54 0x000055c7801d075b execute_command_internal (bash + 0x4a75b)
#55 0x000055c7801d0bb5 execute_command (bash + 0x4abb5)
#56 0x000055c7801ce181 execute_command_internal (bash + 0x48181)
#57 0x000055c780222bd9 parse_and_execute (bash + 0x9cbd9)
#58 0x000055c780221f96 n/a (bash + 0x9bf96)
#59 0x000055c780222165 source_file (bash + 0x9c165)
#60 0x000055c78022d319 source_builtin (bash + 0xa7319)
#61 0x000055c7801cafe4 n/a (bash + 0x44fe4)
#62 0x000055c7801d075b execute_command_internal (bash + 0x4a75b)
#63 0x000055c780222bd9 parse_and_execute (bash + 0x9cbd9)
GNU gdb (Debian 10.1-2) 10.1.90.20210103-git
[…]
Reading symbols from /usr/bin/bash...
(No debugging symbols found in /usr/bin/bash)
[New LWP 45094]
Core was generated by `-/bin/bash -c /usr/bin/gnome-session -l '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f50286a577d in _int_malloc (av=av@entry=0x7f50287daba0 <main_arena>, bytes=bytes@entry=32) at malloc.c:4148
4148 malloc.c: No such file or directory.
(gdb)
=====
After a lot of narrowing down what in this user's session could cause Bash to segmentation fault, I've found that this makes the difference:

* When the user's home directory contains ‘$HOME/.gnomerc’, the Bash segmentation fault occurs.

  The content of ‘$HOME/.gnomerc’ is:

  =====
  # $HOME/.gnomerc
  # Roaming profile: User specific configuration for GNOME session.
  . ~/.profile
  =====

  which simply “sources” the user's shell profile script. This file (and the ‘$HOME/.profile’) has been present for years with the same content, without incident on previous Gnome or Bash versions.

* When the user's home directory does not contain ‘$HOME/.gnomerc’, the user login works fine, as it did a month ago.

So the Gnome session is (I assume) invoking that script, which in turn sources the ‘$HOME/.profile’ script; and somehow that causes Bash to segfault.

This same ‘$HOME/.profile’ script is large and somewhat sensitive; but it should never cause Bash to crash, and never does cause it to crash when logging in outside Gnome.
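One way to narrow this down further, without going through GDM each time, is to source the file in a fresh Bash with tracing enabled, so that every traced command is labelled with the file and line it came from. The sketch below is an assumption about how one might do that, not something taken from the session above: ‘trace_source’ is a made-up helper, the demonstration uses throwaway stand-ins for ‘$HOME/.gnomerc’ and ‘$HOME/.profile’, and it does not reproduce the environment that GDM sets up. The trace also echoes variable values, so the output of a sensitive profile should be reviewed before it is posted anywhere.

```shell
# trace_source FILE: source FILE in a fresh bash with xtrace enabled,
# prefixing each traced command with the file name and line number.
trace_source() {
    bash -c 'PS4="+ \${BASH_SOURCE}:\${LINENO}: "; set -x; . "$1"' trace "$1" 2>&1
}

# Demonstration on throwaway files (hypothetical stand-ins for the real
# ~/.gnomerc and ~/.profile); pass the real file to trace_source instead.
demo=$(mktemp -d)
printf 'GREETING=hello\n' > "$demo/profile"
printf '. "%s/profile"\n' "$demo" > "$demo/gnomerc"
trace_source "$demo/gnomerc"
rm -rf "$demo"
```

If the same file ends up being sourced again and again, the repeated file names in the trace would make that visible, which would match the nested ‘source_file’ frames in the backtrace.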