#991185 systemd: 'systemctl start anacron.service' from emergency mode kills emergency shell

Package:
systemd
Source:
systemd
Description:
system and service manager
Submitter:
Zack Weinberg
Date:
2021-07-21 20:15:05 UTC
Severity:
important
#991185#5
Date:
2021-07-16 19:09:34 UTC
Running ‘systemctl start anacron.service’ from the emergency shell
causes the emergency shell and all processes running inside it to be
killed; you get the “You are in emergency mode” banner again and are
re-prompted for the root password.

I don’t know if this is a general problem with starting services from
emergency mode, or if it’s somehow specific to anacron.

In my case what I actually typed was ‘dpkg -a --configure’, in the
middle of a complex repair operation; dpkg ran anacron.postinst, which
tried to start the service, kaboom.  Everything was killed, including
the dpkg process.  Yes, I should have blocked service activation with
policy-rc.d, but that doesn’t excuse getting kicked out of the shell.
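For readers unfamiliar with the policy-rc.d mechanism mentioned above, here is a minimal sketch, not the submitter's actual setup. It relies on the Debian convention that invoke-rc.d runs /usr/sbin/policy-rc.d if it exists and treats exit status 101 as "action forbidden by policy". The POLICY_RC_D variable and the ./policy-rc.d default path are assumptions made so the sketch can be tried without root.

```shell
# Hypothetical install location for illustration; on a real system the file
# would have to be /usr/sbin/policy-rc.d, which requires root.
POLICY_RC_D="${POLICY_RC_D:-./policy-rc.d}"

# Write a policy script that refuses every requested action.
cat > "$POLICY_RC_D" <<'EOF'
#!/bin/sh
# Exit status 101 tells invoke-rc.d that the action is forbidden by policy,
# so maintainer scripts skip starting or restarting the service.
exit 101
EOF
chmod +x "$POLICY_RC_D"
```

Remove the file again once the repair is finished, otherwise later package installs will keep skipping service starts.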

Recovery was merely tedious, but I can easily imagine this bug
rendering the computer unbootable or worse, which is why I’m filing it
with severity important.

Can we please have some kind of backstop to ensure that systemd
*never* kills the emergency shell, the rescue shell, or anything
started from them?  Not even if a command logically ought to do so,
e.g. ‘reboot’—in those cases, print a message saying that the reboot
or whatever will happen after you exit the shell, and reminding you
how to cancel it.

#991185#10
Date:
2021-07-20 12:00:31 UTC
On 16.07.21 at 21:09, Zack Weinberg wrote:

Can you provide a full journalctl -alb log after this has happened?

It would be good to know why you ended up in emergency mode.

Michael

#991185#15
Date:
2021-07-20 12:22:58 UTC
I don't use persistent journals so I will need to reproduce the
problem again, please stay tuned.

I booted directly into emergency mode for low-level maintenance.

zw

#991185#20
Date:
2021-07-21 12:41:17 UTC
As requested, `journalctl -al -b -1` after booting into emergency mode
and typing `systemctl start anacron.service` as the first command in
the emergency shell.  This *hard locked* the machine, actually a worse
outcome than what I got originally.

zw

#991185#25
Date:
2021-07-21 13:46:47 UTC
So,

I looked into this a bit.

What you are seeing is actually expected or rather easy to explain:

When emergency mode is triggered (either by a boot failure or adding
emergency to the kernel command line), emergency.target is started and
emergency.service as a result of it.

sysinit.target has "Conflicts=emergency.service emergency.target"

So, whenever something triggers the start of sysinit.target, it will in
turn stop emergency.{service,target}

Basically, all units aside from early boot services have an explicit
dependency on sysinit.target (see e.g. systemctl show -p Requires -p
After anacron.service)

So, by starting such a service, you also trigger the start of
sysinit.target, which in turn stops emergency.{service,target}
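The dependency chain described above can be checked on a given system with a small helper. This is only a sketch: lists_unit is a hypothetical function name, and it assumes the usual `systemctl show` output format of one `Property=unit1 unit2 ...` line per property.

```shell
# lists_unit PROPERTY UNIT
# Reads `systemctl show` output (lines such as "Requires=a.target b.slice")
# on stdin and succeeds if UNIT appears in the value of PROPERTY.
lists_unit() {
    # Keep only the requested property, split it into one word per line,
    # then look for an exact, literal match of the unit name.
    grep "^$1=" | tr ' =' '\n\n' | grep -qxF "$2"
}

# Intended use on a running systemd system (left as comments so the sketch
# also loads on machines without systemd):
#   systemctl show -p Requires anacron.service | lists_unit Requires sysinit.target
#   systemctl show -p Conflicts sysinit.target | lists_unit Conflicts emergency.service
```

If both checks succeed, starting the service pulls in sysinit.target, and sysinit.target in turn stops emergency.service, which is the behaviour reported in this bug.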

This was raised a while ago and a fix was committed:
https://github.com/systemd/systemd/pull/6765

Unfortunately, the fix was reverted because it caused a dependency loop:
https://github.com/systemd/systemd/pull/6904

As far as I can see, nothing has happened on this issue since then.


Would you be willing to file an upstream bug report?
Maybe there is a way to solve this without causing a regression.


For the time being, you might consider using single/rescue mode. It
doesn't seem to have such a conflict with sysinit.target.

Regards,
Michael

#991185#30
Date:
2021-07-21 13:48:24 UTC
On 21.07.21 at 15:46, Michael Biebl wrote:

The original bug report for this is
https://github.com/systemd/systemd/issues/6509

Maybe we can just reopen that one.

#991185#35
Date:
2021-07-21 17:19:50 UTC
On 21.07.21 at 15:48, Michael Biebl wrote:


I just did that.
Marking as forwarded accordingly.

#991185#42
Date:
2021-07-21 20:11:55 UTC
...

Thanks for the explanation.

I see that you reopened https://github.com/systemd/systemd/issues/6509
so I commented there instead of filing a brand new report.

I'll keep that in mind, thanks.  I suppose I can always *stop* the
handful of services started by rescue.target if they're interfering
with repairs, instead of avoiding starting them in the first place.

zw