#944645 systemd: daemon-reload triggers an assert in manager_unref_uid_internal()

Package:
systemd
Source:
systemd
Description:
system and service manager
Submitter:
Martin-Éric Racine
Date:
2024-04-25 17:54:18 UTC
Severity:
important
Tags:
#944645#5
Date:
2019-11-13 09:34:14 UTC
From:
To:
systemd upgrade broke dbus, which in turn prevents normal operation of APT and upgrade of packages that depend upon dbus messages.

******************
$ sudo dpkg --configure --pending
Setting up udev (243-5) ...
Failed to reload daemon: Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Failed to retrieve unit state: Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
invoke-rc.d: could not determine current runlevel
Failed to reload daemon: Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Failed to retrieve unit state: Yhteys aikakatkaistu
Failed to restart udev.service: Yhteys aikakatkaistu
See system logs and 'systemctl status udev.service' for details.
invoke-rc.d: initscript udev, action "restart" failed.
Failed to get properties: Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
dpkg: error processing package udev (--configure):
 installed udev package post-installation script subprocess returned error exit status 1
dpkg: dependency problems prevent configuration of network-manager:
 network-manager depends on udev; however:
  Package udev is not configured yet.

dpkg: error processing package network-manager (--configure):
 dependency problems - leaving unconfigured
Errors were encountered while processing:
 udev
 network-manager

$ systemctl status udev.service
Failed to get properties: Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
******************

******************
[  152.040009] systemd-journald[208]: Failed to send WATCHDOG=1 notification message: Connection refused
[  236.841091] systemd-journald[208]: Failed to send WATCHDOG=1 notification message: Transport endpoint is not connected
******************

- -- Package-specific info:

- -- System Information:
Debian Release: bullseye/sid
  APT prefers testing
  APT policy: (1111, 'testing'), (1101, 'stable')
Architecture: i386 (i586)

Kernel: Linux 5.2.0-3-686 (SMP w/1 CPU core)
Locale: LANG=fi_FI.UTF-8, LC_CTYPE=fi_FI.UTF-8 (charmap=UTF-8), LANGUAGE=fi:en (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages systemd depends on:
ii  adduser          3.118
ii  libacl1          2.2.53-5
ii  libapparmor1     2.13.3-6
ii  libaudit1        1:2.8.5-2
ii  libblkid1        2.34-0.1
ii  libc6            2.29-3
ii  libcap2          1:2.27-1
ii  libcryptsetup12  2:2.2.1-1
ii  libgcrypt20      1.8.5-3
ii  libgnutls30      3.6.10-4
ii  libgpg-error0    1.36-7
ii  libidn2-0        2.2.0-2
ii  libip4tc2        1.8.3-2
ii  libkmod2         26-3
ii  liblz4-1         1.9.2-1
ii  liblzma5         5.2.4-1+b1
ii  libmount1        2.34-0.1
ii  libpam0g         1.3.1-5
ii  libpcre2-8-0     10.32-5+b1
ii  libseccomp2      2.4.1-2
ii  libselinux1      2.9-2+b2
ii  libsystemd0      243-5
ii  mount            2.34-0.1
ii  util-linux       2.34-0.1

Versions of packages systemd recommends:
ii  dbus            1.12.16-2
ii  libpam-systemd  243-5

Versions of packages systemd suggests:
ii  policykit-1        0.105-26
pn  systemd-container  <none>

Versions of packages systemd is related to:
pn  dracut           <none>
ii  initramfs-tools  0.135
ih  udev             243-5

- -- Configuration Files:
/etc/systemd/timesyncd.conf changed:
[Time]
NTP=0.pool.ntp.org 1.pool.ntp.org


- -- no debconf information
-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEEyJACx3qL7GpObXOQrh+Cd8S017YFAl3Lzg0ACgkQrh+Cd8S0
17Z2qBAAnZTKmEBU6YzG5kENfJ/78iQy7TVVX068QuNff9nG+npzwWhAwIrGqCx5
52WmYlQ9Y8Caq+R4i638n98Bz9KPS4kd2EjdRY71WY3oKCHazisXrLuL/3WndkUl
2zP8hi2a453P1Hd7hwC39B2mGMWAMANIWAP1gPtFxIm/fF8ul6DLrHaoldsk+Tah
RCHGuJQjXVYu1wGvp1vvcWlk+gI8/g0Oi1sgv/7aoyVp7cmntC6O8oONvg3QC/kP
61FLQdKrT0sbGT6Dk176G6gM96rnAt0mbkNAijhYasVOo3cPS3nOMf49ZmzBiZqD
8Jkds6CYi90QrbGB7ai+6uZG+8FNE+zEDsGy/xP/g9gI3mlUr2n7E88+Wsu4sm9Z
ZPsw981qyKfYhrrdznouAl9x8G1wO3+an7VSXx63tm7nEVkcJl5XUgqVsDwuzA0Z
gOc0PKV1QWTnlDRIhFNIMgltRB6SBK1gb1u0Jhl9IUEo6SpCLLkMckRmUhXEtNXp
vvMVLroZnRZ00iSU0HTcnz9l4fZKgds/NrDfaeqys59nhUE5cu7x5A/Z8QDfxUG4
ua5bFMu1Ih7jlRJj4P4L1+wReHtXoxmYoX2Nij9wte1wx+B8sL3Cv2V54smFZO8T
VIFnjS/QwHMNFepScQPqlV1wg9MaaQCtrT2YYmdeev330lWeFHs=
=cQeQ
-----END PGP SIGNATURE-----

#944645#10
Date:
2019-11-13 14:43:58 UTC
From:
To:
Hi

Am 13.11.19 um 10:34 schrieb Martin-Éric Racine:

Is this system upgraded from a fully booted system? chroot? rescue shell?
From which systemd version did you upgrade? How long was the system running?

What's the translation? Unfortunately I don't speak Finish :-)

My guess is that it is not dbus that is broken but that the running
systemd instance does not react to requests (anymore). Could be that it
crashed (a result of that is that it will freeze itself).

Can you provide a dmesg dump and the output of journalctl -alb.

Please mark the time when the upgrade happened.

I suppose your problem should be fixed when you do a forced reboot. But
before you do that, it would be great to gather further information first.

Regards,
Michael

#944645#17
Date:
2019-11-13 15:06:29 UTC
From:
To:
ke 13. marrask. 2019 klo 16.44 Michael Biebl (biebl@debian.org) kirjoitti:

Fully booted system.

242-7 to 243-5

The host had been booted a few minutes earlier to upgrade the packages
to current Testing versions.

It says: connection timed out.

It indeed crashed as soon as it was upgraded.

Attached.

It happened about 10 minutes before I sent the bug report.

/var/log/apt/term.log indicates Log started: 2019-11-13  10:24:05

term.log attached

It wasn't fixed by rebooting. APT still fails to complete the upgrade
for previously failed packages.

Martin-Éric

#944645#22
Date:
2019-11-13 15:09:32 UTC
From:
To:
ke 13. marrask. 2019 klo 17.06 Martin-Éric Racine
(martin-eric.racine@iki.fi) kirjoitti:
core dump files from this morning on hand, if needed.

#944645#27
Date:
2019-11-13 15:27:16 UTC
From:
To:
Am 13.11.19 um 16:09 schrieb Martin-Éric Racine:

...

So systemd reproducibly segfaults if you try to complete the upgrade?

Sure, please attach it to the bug report.
If you manually install systemd-dbgsym, you should get a more useful
coredump.

Regards,
Michael

#944645#32
Date:
2019-11-13 15:37:20 UTC
From:
To:
Am 13.11.19 um 16:06 schrieb Martin-Éric Racine:


The dmesg and journalctl log don't show a crash.
They are apparently after a (forced) reboot.

Could you retrigger the apt update and once you hit the crash, attach
journal/dmesg/coredump

#944645#37
Date:
2019-11-13 15:53:03 UTC
From:
To:
ke 13. marrask. 2019 klo 17.27 Michael Biebl (biebl@debian.org) kirjoitti:

Broadcast message from systemd-journald@geode (Wed 2019-11-13 17:45:00 EET):

systemd[1]: Caught <ABRT>, dumped core as pid 2156.


Message from syslogd@geode at Nov 13 17:45:00 ...
 systemd[1]: Caught <ABRT>, dumped core as pid 2156.

Message from syslogd@geode at Nov 13 17:45:00 ...
 systemd[1]: Caught <ABRT>, dumped core as pid 2156.

Message from syslogd@geode at Nov 13 17:45:00 ...
 systemd[1]: Freezing execution.

Message from syslogd@geode at Nov 13 17:45:00 ...
 systemd[1]: Freezing execution.

Broadcast message from systemd-journald@geode (Wed 2019-11-13 17:45:00 EET):

systemd[1]: Freezing execution.

Failed to reload daemon: Vastapää sulki yhteyden (remote closed conection)
Failed to retrieve unit state: Failed to activate service
'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Failed to restart udev.service: Yhteys aikakatkaistu (connection timed out)
See system logs and 'systemctl status udev.service' for details.
invoke-rc.d: initscript udev, action "restart" failed.
Failed to get properties: Failed to activate service
'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
dpkg: error processing package udev (--configure):
 installed udev package post-installation script subprocess returned
error exit status 1
dpkg: dependency problems prevent configuration of network-manager:
 network-manager depends on udev; however:
  Package udev is not configured yet.

dpkg: error processing package network-manager (--configure):
 dependency problems - leaving unconfigured
Errors were encountered while processing:
 udev
 network-manager
Virhe: Aikakatkaisu saavutettiin (error: timeout reached)
E: Sub-process /usr/bin/dpkg returned an error code (1)

Attached.

#944645#42
Date:
2019-11-13 17:09:03 UTC
From:
To:
ke 13. marrask. 2019 klo 17.37 Michael Biebl (biebl@debian.org) kirjoitti:

Here's what gdb sees with the core dump from PID 716:

Reading symbols from /bin/systemd...
Reading symbols from
/usr/lib/debug/.build-id/88/f695a88e6d77205e02c54057685cdc8eca97e0.debug...
[New LWP 716]
[New LWP 1]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/i386-linux-gnu/libthread_db.so.1".
Core was generated by `/sbin/init noquiet nosplash'.
Program terminated with signal SIGABRT, Aborted.
#0  0xb7f3e851 in __kernel_vsyscall ()
[Current thread is 1 (LWP 716)]
(gdb) bt
#0  0xb7f3e851 in __kernel_vsyscall ()
#1  0xb7d7e1d6 in kill () at ../sysdeps/unix/syscall-template.S:78
#2  0x0053d774 in crash (sig=6) at ../src/core/main.c:212
#3  <signal handler called>
#4  0xb7f3e851 in __kernel_vsyscall ()
#5  0xb7d7dee2 in __libc_signal_restore_set (set=0xbfc8d85c) at
../sysdeps/unix/sysv/linux/internal-signals.h:84
#6  __GI_raise (sig=6) at ../sysdeps/unix/sysv/linux/raise.c:48
#7  0xb7d682bd in __GI_abort () at abort.c:79
#8  0xb7ad57f6 in log_assert_failed_realm (realm=LOG_REALM_SYSTEMD,
text=0x559ba4 "n > 0", file=0x53e338 "src/core/manager.c", line=4282,
    func=0x55b3d8 <__PRETTY_FUNCTION__.21510>
"manager_unref_uid_internal") at ../src/basic/log.c:801
#9  0x004f5144 in manager_unref_uid_internal (m=<optimized out>,
uid_refs=0x22399e0, uid=4294948095, destroy_now=false,
_clean_ipc=0xb7c129f0 <clean_ipc_by_gid>)
    at ../src/core/manager.c:4282
#10 0x004ca8fe in unit_unref_uid_internal (u=u@entry=0x226d0c0,
ref_uid=ref_uid@entry=0x226d2d8, destroy_now=destroy_now@entry=false,
_manager_unref_uid=0x4f5150 <manager_unref_gid>)
    at ../src/core/unit.c:5063
#11 0x004ccdd8 in unit_unref_gid (destroy_now=false, u=0x226d0c0) at
../src/core/unit.c:5166
#12 unit_unref_uid_gid (u=0x226d0c0, destroy_now=false) at
../src/core/unit.c:5166
#13 0x004da200 in unit_free (u=<optimized out>) at ../src/core/unit.c:646
#14 0x00504bd5 in manager_clear_jobs_and_units (m=<optimized out>) at
../src/core/manager.c:1299
#15 manager_clear_jobs_and_units (m=0x2239610) at ../src/core/manager.c:1293
#16 0x0053bbc3 in manager_reload (m=0x2239610) at ../src/core/manager.c:3521
#17 invoke_main_loop (m=0x2239610, saved_rlimit_nofile=<optimized
out>, saved_rlimit_memlock=<optimized out>, ret_reexecute=<optimized
out>, ret_retval=<optimized out>,
    ret_shutdown_verb=<optimized out>, ret_fds=<optimized out>,
ret_switch_root_dir=<optimized out>, ret_switch_root_init=<optimized
out>, ret_error_message=<optimized out>)
    at ../src/core/main.c:1734
#18 0x0048ab3a in main (argc=<optimized out>, argv=<optimized out>) at
../src/core/main.c:2642
(gdb)

Martin-Éric

#944645#47
Date:
2019-11-13 18:52:40 UTC
From:
To:
Thanks a lot for the information so far.

Am 13.11.19 um 16:53 schrieb Martin-Éric Racine:

Could you add a "set -x" to
/var/lib/dpkg/info/udev.postinst to see which command triggers the segfault.
I suspect it's the daemon-reload, if so, it would be interesting if a
simple "systemctl daemon-reload" is sufficient to trigger the crash.

#944645#52
Date:
2019-11-13 19:03:07 UTC
From:
To:
Btw, this reminds me of
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=883877#103

Is this the same host where this is happening (again)?
Is this a Virtualbox guest?

Michael

#944645#57
Date:
2019-11-13 19:43:14 UTC
From:
To:
ke 13. marrask. 2019 klo 20.52 Michael Biebl (biebl@debian.org) kirjoitti:
Setting up udev (243-5) ...
+ update_hwdb
+ systemd-hwdb --usr update
+ addgroup --quiet --system input
+ addgroup --quiet --system kvm
+ addgroup --quiet --system render
+ [ -z 242-7 ]
+ upgrade_fixes configure 242-7
+ dpkg --compare-versions 242-7 lt-nl 239-8
+ chrooted
+ stat -c %d/%i /
+ stat -Lc %d/%i /proc/1/root
+ [ 2049/2 = 2049/2 ]
+ return 1
+ can_start_udevd
+ [ ! -d /sys/class/ ]
+ return 0
+ [ -d /run/systemd/system ]
+ systemctl daemon-reload
Failed to reload daemon: Failed to activate service
'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
+ true
+ invoke-rc.d udev restart
Failed to reload daemon: Failed to activate service
'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Failed to retrieve unit state: Yhteys aikakatkaistu
Failed to restart udev.service: Yhteys aikakatkaistu
See system logs and 'systemctl status udev.service' for details.
invoke-rc.d: initscript udev, action "restart" failed.
Failed to get properties: Yhteys aikakatkaistu
dpkg: error processing package udev (--configure):
 installed udev package post-installation script subprocess returned
error exit status 1
dpkg: dependency problems prevent configuration of network-manager:
 network-manager depends on udev; however:
  Package udev is not configured yet.
dpkg: error processing package network-manager (--configure):
 dependency problems - leaving unconfigured
Errors were encountered while processing:
 udev
 network-manager

Did not produce a core dump, but then again systemd presumably is dead
already at this point.

#944645#62
Date:
2019-11-13 19:48:32 UTC
From:
To:
ke 13. marrask. 2019 klo 21.03 Michael Biebl (biebl@debian.org) kirjoitti:

Same host. This has never been a Virtualbox guest.

#944645#67
Date:
2019-11-13 19:54:14 UTC
From:
To:
Am 13.11.19 um 20:48 schrieb Martin-Éric Racine:

Ok, I'm asking because in the bug report referenced above you talked
about a /etc/ld.so.conf.d/i386-linux-gnu_GL.conf being the source of a
lot of troubles and afaics this file is shipped by virtualbox-guest-x11.

I assume you have since then purged virtualbox-guest-x11 and fixed the
ld.so.conf.d symlinks?
Last time, this seems to have fixed the crash.

#944645#72
Date:
2019-11-13 21:06:04 UTC
From:
To:
ke 13. marrask. 2019 klo 21.54 Michael Biebl (biebl@debian.org) kirjoitti:

$ dpkg -l | grep x11
ii  dbus-x11                             1.12.16-2
  i386         simple interprocess messaging system (X11 deps)
ii  libva-x11-2:i386                     2.5.0-1
  i386         Video Acceleration (VA) API for Linux -- X11 runtime
ii  libx11-6:i386                        2:1.6.8-1
  i386         X11 client-side library
ii  libx11-data                          2:1.6.8-1
  all          X11 client-side library
ii  libx11-protocol-perl                 0.56-7
  all          Perl module for the X Window System Protocol, version
11
ii  libx11-xcb1:i386                     2:1.6.8-1
  i386         Xlib/XCB interface library
ii  libxkbcommon-x11-0:i386              0.9.1-1
  i386         library to create keymaps with the XKB X11 protocol
ii  plymouth-x11                         0.9.4-1.1+b1
  i386         boot animation, logger and I/O multiplexer - X11
renderer
ii  x11-common                           1:7.7+20
  all          X Window System (X.Org) infrastructure
ii  x11-utils                            7.7+4
  i386         X11 utilities
ii  x11-xkb-utils                        7.7+4
  i386         X11 XKB utilities
ii  x11-xserver-utils                    7.7+8
  i386         X server utilities

Going through all the /etc/alternative symlinks is the first thing
that came to mind too. That apparently wasn't it.

You wanted to know if merely doing 'sudo systemctl daemon-reload'
would also cause systemd to crash. I rebooted and tried that. It did.
Coredumps attached.

#944645#77
Date:
2019-11-13 21:16:02 UTC
From:
To:
Ok, at this point I guess it's best to involve upstream.
Would you mind filing an upstream bug report at
https://github.com/systemd/system/issues
mentionting that a daemon-reload triggers an assert in
manager_unref_uid_internal() and attaching your backtrace.

Thanks,
Michael

#944645#82
Date:
2019-11-13 21:38:59 UTC
From:
To:
ke 13. marrask. 2019 klo 23.16 Michael Biebl (biebl@debian.org) kirjoitti:

https://github.com/systemd/systemd/issues/14026

#944645#91
Date:
2021-02-16 06:28:51 UTC
From:
To:
Since Bullseye already entered the freeze but upstream only got around
reacting to the bug report now, my possibilities for getting crashes
by upgrading something that comes with an init script are slim.

I've compiled systemd packages using this command:

$ DEB_BUILD_OPTIONS="nostrip noopt noudeb" debuild -uc -us

They are versioned 247.3-1.1 just to force an upgrade from my personal repo.

Assuming that the issue really is with daemon-reload, this should have
triggered a crash:

$ for n in $(dpkg -S /etc/init.d/* | cut -f 1 -d: | sort | uniq); do
sudo dpkg-reconfigure $n; done
Job for acpid.socket canceled.
alsa-state.service is a disabled or a static unit not running, not starting it.
Reloading AppArmor profiles
update-rc.d: warning: start and stop actions are no longer supported;
falling back to defaults
update-rc.d: warning: start and stop actions are no longer supported;
falling back to defaults
A reboot is required to replace the running dbus-daemon.
Please reboot the system when convenient.
dbus.service is a disabled or a static unit, not starting it.
dbus.socket is a disabled or a static unit, not starting it.
gdm.service is not active, cannot reload.
invoke-rc.d: initscript gdm3, action "reload" failed.
rescue-ssh.target is a disabled or a static unit, not starting it.
pcscd.service is a disabled or a static unit not running, not starting it.
update-initramfs: deferring update (trigger activated)
Processing triggers for initramfs-tools (0.139) ...
update-initramfs: Generating /boot/initrd.img-5.10.0-3-686
saned.socket is a disabled or a static unit not running, not starting it.
insserv: warning: current stop runlevel(s) (1) of script
`smartmontools' overrides LSB defaults (0 1 6).
fstrim.timer is a disabled or a static unit not running, not starting it.
fstrim.service is a disabled or a static unit not running, not starting it.
uuidd.service is a disabled or a static unit not running, not starting it.
update-rc.d: warning: start and stop actions are no longer supported;
falling back to defaults
Job for watchdog.service canceled.
wd_keepalive.service is a disabled or a static unit not running, not
starting it.
$

No crash.

I welcome your thoughts on how to proceed.

Martin-Éric