- Package: multipath-tools
- Source: multipath-tools
- Description: maintain multipath block device access
- Submitter: Christian Seiler
- Date: 2015-01-26 14:39:04 UTC
- Severity: important
Dear Maintainer,
tl;dr: systemd + open-iscsi = 90s hang at boot in some cases,
and umountiscsi.sh is not called on shutdown. Attached a
debdiff that fixes that without being too invasive.
Longer explanation: if you have the following configuration:
- Jessie
- systemd as init
- open-iscsi configured to automatically log in to some iSCSI target,
iSCSI disk /dev/sdb is then available
- /etc/fstab containing an entry like
/dev/sdb1 /data ext4 rw,_netdev 0 0
or (when using LVM)
/dev/vg_.../lv_... /data ext4 rw,_netdev 0 0
the system boot will hang for 90s because of systemd's default timeout
when devices are not available.
The reason behind this is that open-iscsi contains the following LSB
headers:
Required-Start: $network $remote_fs
Required-Stop: $network $remote_fs sendsigs
Here, $network maps to network-online.target in systemd, that's fine,
but $remote_fs maps to remote-fs.target in systemd, that is the problem.
This is because
a) systemd treats file systems that couldn't be mounted as hard
failures.
and
b) systemd's logic of mounting all remote filesystems is to mount
all filesystems in /etc/fstab that are marked _netdev (and not
marked noauto)
Therefore, systemd waits for the iSCSI device to appear for 90s before
timing out and proceeding with boot. Only then remote-fs.target is
reached and systemd starts the open-iscsi init script.
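To make point b) concrete, here is a minimal Python sketch of the selection rule described above: pick every /etc/fstab entry that carries _netdev and is not marked noauto. The function name and the sample fstab text are made up for illustration. systemd's real fstab handling is more involved (for example, network filesystem types such as NFS count as remote even without _netdev), so treat this as an approximation only.

```python
def netdev_mounts(fstab_text):
    """Return (device, mount point) pairs for _netdev entries not marked noauto."""
    selected = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete fstab entry
        device, mount_point, _fstype, options = fields[:4]
        opts = set(options.split(","))
        if "_netdev" in opts and "noauto" not in opts:
            selected.append((device, mount_point))
    return selected


# Hypothetical fstab content modelled on the example in this report.
SAMPLE_FSTAB = """\
# <device> <mount point> <type> <options> <dump> <pass>
/dev/sda1  /        ext4  errors=remount-ro   0 1
/dev/sdb1  /data    ext4  rw,_netdev          0 0
/dev/sdc1  /backup  ext4  rw,_netdev,noauto   0 0
"""

print(netdev_mounts(SAMPLE_FSTAB))
```

With the sample above only /data is selected, which is the kind of mount systemd ends up waiting on before remote-fs.target is reached.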
That in turn will then make the devices appear. The init script will
then call a "mount -a -O _netdev" and "swapon -a -e" in its start()
routine, that will then cause the mount points to be activated.
So in the end, the boot is kind-of successful in the sense that
everything kind of works at the end of boot, with the following two caveats:
- there is this needless 90s delay (or whatever other delay the admin
has configured) in waiting on the iSCSI targets
- if I want to use systemd's features to order a specific
service after remote-fs.target to make sure that the remote file
systems I have are mounted, maybe because the service needs the
data on them, then this won't work consistently, because the
file systems will only be mounted after open-iscsi is started,
which will then be in parallel to any services I have ordered
after remote-fs.target, for example:
- exporting a subdirectory of an iSCSI filesystem via NFS; if
nfs-kernel-server gets started too early, this might fail
because the directory that is exported doesn't exist
If I modify the init script to remove $remote_fs from its LSB headers,
then booting works as expected. However, this causes two problems:
1. I assume that $remote_fs is in there because you want to support
NFS-based separate /usr. Removing $remote_fs from LSB headers
would break such a configuration under sysvinit, since the
open-iscsi tools wouldn't be able to be called.
However, systemd in Debian currently doesn't really support a
separate /usr that's not mounted from initrd anyway.
2. Shutting down is racy.
Shutting down is racy because you then have the following constellation:
- systemd tracks services' states. And while bug #732793 does not occur
anymore because invoke-rc.d strips the .sh from umountiscsi.sh, the
call to umountiscsi.sh stop doesn't really do anything, because
systemd already thinks it is stopped, since it was never started.
- OTOH, systemd will tear down remote filesystems on its own. But
because open-iscsi is only ordered after network-online.target then,
tearing down the remote filesystems will be done in parallel (!) to
stopping open-iscsi.
This has the unfortunate effect that it could be the case that the
umount call to the filesystem is made after open-iscsi has been
stopped. This will then cause the kernel to hang trying to umount
the filesystem.
I haven't been able to reproduce this race yet, i.e. I have gotten
lucky so far, in that umount was typically faster on my system than
stopping open-iscsi - BUT I am really not comfortable with having
such a flimsy race in place, especially since umount will sync
stuff to the filesystem and stopping open-iscsi too early could
easily cause severe data loss.
So far for my analysis. How do we proceed from here?
- it is quite clear that you probably don't want to change the sysvinit
logic now, especially so late in the Jessie freeze
- however, this bug w.r.t. systemd should definitely be fixed in my
eyes
Therefore, I suggest that you provide a unit file specifically for
systemd. In order to be as minimally invasive as possible (especially this
late in the freeze), the unit file should ideally call the original init
script.
After Jessie one should consider redoing the entire logic for
systemd-based systems, there are a lot more features of systemd that
one can leverage to make things work better. But to fix this immediate
bug, the changes I mentioned are sufficient.
I have created a debdiff for a test package that changes the following:
- add systemd unit that just calls the init script but has adjusted
dependencies:
- no more After=remote-fs.target
- new Before=remote-fs-pre.target
- add dh-systemd as build-dep and use dh_systemd in debian/rules
- move #DEBHELPER# around in postinst, to make sure package upgrades
don't break the system (dh_systemd_enable code has to come before
the unit is first started, otherwise weird things occur)
- do the equivalent of umountiscsi.sh start so that systemd will
track that service as 'running' - then at shutdown the open-iscsi
init script will be able to call the stop action of that script
I have now tested this under systemd with Jessie, in two different
configurations of Jessie running systemd:
1. root on normal device, separate iSCSI devices mounted
2. root on iSCSI, boot via PXE
In both cases, iSCSI now seems to work as expected. There are a couple
of caveats though:
- as discussed before, non-initrd-mounted separate /usr on NFS
won't work together with this constellation
- unlikely to work well with systemd anyway, regardless of
iSCSI, and I don't think this is something that could be
fixed without a major redesign of the remote-fs*.target
logic across the board
- irrespective of systemd, while looking at it I noticed that
umountiscsi.sh's logic is incomplete, it doesn't try to umount
filesystems on LVM on top of iSCSI, unless they were marked with
_netdev (it only detects direct devices).
OTOH, this has been the case since at least Squeeze, so it can't
be that critical.
- the current design of using umountiscsi.sh doesn't integrate well
with systemd's dependency logic. I don't think this is a huge issue,
as far as I can see, stuff works as well under systemd with my patch
as under sysvinit (except for the /usr-NFS thing), but I do think
that you could make the whole thing a lot more robust if this is
redesigned a bit - but I don't think that is something that should
go to Jessie.
Hello Christian,

Actually, from what I know so far, systemd aggressively backgrounds any
process that is taking time. Only processes that depend on it are put on
hold, again in the background.

I think you may be missing something here. I believe devices marked
_netdev are always backgrounded, at least in sysvinit, and it is highly
unlikely that systemd does not do the same. Have you had any luck
root-causing the 90 sec delay?

I am willing to accept a systemd unit. But it is too late for Jessie
right now. If you have the unit ready and tested, for now, we can put it
into experimental. I would not want to ship something for Jessie now.

Ideally, systemd's logic on handling init scripts should take care of
it. It has worked for other sysvinit scripts so far. And introducing the
systemd unit now in Jessie is late, because it wouldn't have had enough
test cycles.

Can you please elaborate more here? Or perhaps just file a separate bug
report. The current init scripts are designed to support LVM + iSCSI.

I agree. We need to switch to systemd. But I haven't had the time to do
it, and right now, your patch is too late. :-(

This is one reason why I keep telling most Debian (Enterprise) users to
at least keep track of testing. Because they usually end up reporting
bugs too late in the cycle.
Hello Ritesh,
Well, yes, in principle, but the way dependencies are expressed (both by
default and in the current Debian packaging of systemd), you can still
have serialization of things. See below.
First, if you look at sysvinit with LSB dependency-based boot (Squeeze,
Wheezy, Jessie w/ sysvinit-core). Debian does use startpar(8) to
parallelize some aspects of sysvinit boot, but there are a couple of
synchronization points. They are defined in /etc/insserv.conf and the
relevant ones are:
$local_fs
$remote_fs
If you look at the configuration, you will see that $remote_fs is
$local_fs and the mountnfs init script.
Also, there's the fact that all rcS scripts will have completed before any
rc[2-5] scripts are run (the way inittab + rc are set up), so that's an
additional synchronization point.
So if you have an init script with Required-Start: $local_fs, it will be
ordered after all scripts (primarily mountall) that appear for $local_fs
in /etc/insserv.conf, but (according to insserv logic) as early as
otherwise possible.
Same with Required-Start: $remote_fs: it will be ordered after $local_fs
(i.e. after mountall) and also after mountnfs.
So you have the following boot ordering
1. anything in rcS that doesn't require $local_fs
2. $local_fs stuff (i.e. mainly mountall)
3. anything else in rcS that doesn't require $remote_fs
4. $remote_fs stuff (i.e. mainly mountnfs)
5. anything else in rcS
6. anything in rc[2-5]
So if you have Required-Start: $remote_fs in the open-iscsi init script,
you have the following situation:
- early boot services (1) are started
- local file systems are mounted (2)
- some other services started (3)
- tries to mount remote file systems (4)
/etc/init.d/mountnfs calls /etc/network/if-up.d/mountnfs
(or waits until networking has called that dynamically once
the network is up, depending on your configuration)
/etc/network/if-up.d/mountnfs effectively does
mount -a -O _netdev
At this point, open-iscsi is NOT started. So mount will fail for
all mount points on iSCSI devices. However, since mountnfs doesn't
check the exit code of the mount command, it will happily continue
on and pretend everything is fine.
- services ordered after $remote_fs are started, including open-iscsi
open-iscsi calls mount -a -O _netdev itself, which will try to
mount the remaining filesystems again, then succeeding
So nothing is really 'backgrounded', you are just relying on the fact
that mountnfs doesn't really check any exit codes (and that sysvinit
doesn't care if init scripts that your init script depends on were
successful), you just tape over that fact by running mount again.
This in turn means that with sysvinit you have kind of exempted
$remote_fs from being the true synchronization point. This doesn't
really matter that much for sysvinit, because there's a different
synchronization point directly after that (end of rcS execution, start of
rc[2-5] execution), but for systemd that's a different story (see
below). (But note that this COULD break for an early boot service
ordered after $remote_fs that needs the filesystems, it's just that
Jessie by default doesn't ship one.)
Now let's take systemd. systemd has so-called 'targets' which are also
used as synchronization points at boot. The two sysvinit sync points are
mapped as follows:
$local_fs -> local-fs.target
$remote_fs -> remote-fs.target
Additionally, systemd knows a couple of more sync points, namely
local-fs-pre.target
remote-fs-pre.target
However, systemd doesn't really have a sync point for early-boot vs.
runlevel services.
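As a rough illustration of the facility mapping just described, the following Python sketch turns a Required-Start header line into the After= targets a wrapper unit would end up with. It only covers the three facilities mentioned in this report, the function name is invented here, and it is not the actual systemd-sysv-generator code.

```python
# Facility mapping as described in this report; other facilities are left out.
FACILITY_TO_TARGET = {
    "$network": "network-online.target",
    "$local_fs": "local-fs.target",
    "$remote_fs": "remote-fs.target",
}


def after_from_lsb(header_line):
    """Translate a 'Required-Start:' LSB header line into a list of After= targets."""
    name, _, value = header_line.partition(":")
    if name.strip().lstrip("#").strip() != "Required-Start":
        raise ValueError("not a Required-Start header: %r" % header_line)
    # Facilities without a known mapping are silently skipped in this sketch.
    return [FACILITY_TO_TARGET[f] for f in value.split() if f in FACILITY_TO_TARGET]


print("After=" + " ".join(after_from_lsb("# Required-Start: $network $remote_fs")))
```

For the open-iscsi header this yields network-online.target and remote-fs.target, and the second of those is the ordering that causes the trouble.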
The boot sequence with systemd is then as follows (only depicting a part
of it):
early boot services (e.g. udev)
ordered before local-fs-pre.target
|
v
local-fs-pre.target
|
v
mount local file systems
|
v
local-fs.target
|
v
early boot services ordered after local-fs.target
but before remote-fs-pre.target
|
v
remote-fs-pre.target
|
v
mount remote file systems
|
v
remote-fs.target
|
v
the rest
Within each block, everything is of course parallel (barring other
ordering constraints, of course) - even the filesystems are mounted in
parallel.
And obviously, if something doesn't order against any targets shown
here, they will be started immediately (before or in parallel to
local-fs.target) and the targets in the middle won't wait for their
completion.
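The effect of these ordering constraints can be sketched with a small topological sort in Python. This is only a toy model of the chain in the diagram, not how systemd computes its transactions; data.mount is a hypothetical mount unit for an iSCSI-backed /data, and the two variants correspond to the generated ordering (After=remote-fs.target) and the ordering proposed in the original report (Before=remote-fs-pre.target).

```python
from graphlib import TopologicalSorter  # Python 3.9+


def start_order(ordered_after):
    """Return one valid start order for a mapping unit -> units it starts after."""
    return list(TopologicalSorter(ordered_after).static_order())


# Simplified chain from the diagram; data.mount is a hypothetical mount unit
# for an iSCSI-backed /data listed in /etc/fstab with _netdev.
BASE = {
    "local-fs.target": {"local-fs-pre.target"},
    "remote-fs-pre.target": {"local-fs.target"},
    "data.mount": {"remote-fs-pre.target"},
    "remote-fs.target": {"data.mount"},
}

# Generated unit: open-iscsi has After=remote-fs.target.
generated = dict(BASE)
generated["open-iscsi.service"] = {"remote-fs.target"}

# Proposed unit: Before=remote-fs-pre.target, i.e. the target starts after it.
patched = dict(BASE)
patched["open-iscsi.service"] = set()
patched["remote-fs-pre.target"] = BASE["remote-fs-pre.target"] | {"open-iscsi.service"}

for label, graph in (("generated", generated), ("patched", patched)):
    order = start_order(graph)
    first = min("data.mount", "open-iscsi.service", key=order.index)
    print(label, "-> starts first:", first)
```

In the generated variant the mount is attempted before open-iscsi runs, so the device cannot exist yet. In the patched variant open-iscsi comes first.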
On shutdown, the whole thing is done in reverse, with one important
caveat: systemd tracks the state of the system, so it looks at the
dependencies of stuff that's running, so if you start a service manually
without having it enabled at boot, its dependencies will still work
properly. (sysvinit/LSB tries to do that partially by always creating
stop links, even if the service is not enabled.)
Now you have two problems in this setup:
- same thing as with sysvinit: open-iscsi is ordered after
remote-fs.target, so it won't get started until remote-fs.target is
reached
- however, the crucial difference here is that systemd cares whether
stuff has actually worked or not. It doesn't just call
mount -a -O _netdev and hopes for the best, it tries to wait for
the required devices to appear (because they might not appear
synchronously)
-> unfortunately, since open-iscsi won't start before
remote-fs.target, those devices will never appear while
systemd is waiting for them
-> systemd has a default timeout of 90s for devices showing up
so it will wait for 90s for these devices to show up and then
fail
-> only then will systemd consider remote-fs.target reached
(btw. local-fs.target has a setting
OnFailure=emergency.target, so that when it can't mount a
local file system, the boot doesn't even continue, see
Debian bug #743265 for a discussion on this; fortunately
remote-fs.target doesn't have this setting, so boot does
continue in this case)
-> only then will systemd start open-iscsi
-> that will then mount the filesystems again
(which is actually unnecessary with systemd, because as soon
as the devices appear, it will mount the stuff anyway)
-> hence the 90s delay for waiting on devices that will only show
up later
You can actually try this easily (if you have an iSCSI target lying
around ;-)): setup a Jessie box, install open-iscsi, configure it
to automatically log in to your target, put an iSCSI filesystem as
_netdev into /etc/fstab and reboot - voilà: 90s delay. It's very
simple to reproduce, and it ALWAYS happens in that constellation.
With rootfs on iSCSI it should also happen if you log in to
additional targets. (Otherwise, rootfs on iSCSI is not affected.)
- on shutdown, things are also messy, since systemd tries to shut down
stuff much more in parallel than sysvinit does
- open-iscsi is an early-boot ("runlevel S") service, i.e. with
sysvinit those always get stopped after all services of the
current runlevel (e.g. 2) are stopped
- with systemd, it just cares about explicit dependencies, so
it will try to stop open-iscsi as early as possible (since
by default nothing is ordered after it)
-> this has the consequence that stuff that's using remote
filesystems might still be running while open-iscsi is
terminating and it can't unmount them
-> the open-iscsi service will then (try to) logout of the
sessions even though stuff is still active.
-> very, very bad
As I said in the original report, on the test system I've used so far
for Jessie I haven't actually seen this race condition (i.e. shutdown
always worked anyway), since nothing was really using the remote
filesystems on my test box, and it might be the case that it doesn't
always occur, but it will at least sometimes.
I hope this reply can make it a bit clearer as to where the problem lies
and why my diagnosis is correct.
Note that I have spent probably 10-12 hours on this problem, first
trying to figure out what the problem was and then trying to come up
with a solution that changes as little as possible (because of the
freeze) and testing that against a lot of different scenarios:
- I only noticed that I needed to move #DEBHELPER# around because of
testing partial upgrades
- I don't use rootfs in iSCSI myself, so I set up a test system to
check that nothing broke (which the first version I wanted to send
did, so I fixed that before reporting this)
- I rebooted test boxes quite a lot to see if there was any trouble.
systemd's logic of handling it won't take care of it, because it's
already kind-of broken on sysvinit, but a lot of specific details in
sysvinit that systemd doesn't emulate quite that way mitigate that.
The changes required to make systemd support this in the same way as
sysvinit would be far more invasive to the current systemd code base than
fixing a couple of dependencies here.
I'm going to explain how systemd currently handles unit files, because
then it becomes clear why the unit file I have provided is not really
experimental at all.
systemd does not support init scripts directly from PID1 anymore (this
was different in very old versions). systemd's PID1 only understands
systemd unit files. Instead, systemd now has a concept called
'generators', which are small programs (sometimes even scripts) that
are run
- at boot
- every time systemd re-reads its configuration
The job of a generator is to read some aspect of the system
configuration (init scripts, /etc/fstab, /etc/crypttab, ...) and
generate native systemd units from that.
If you boot a systemd Jessie system and look in /run/systemd/generator
and /run/systemd/generator.late, you will see the units that were
generated by these generators. Each line in /etc/fstab becomes a .mount
unit, each sysvinit script becomes a .service file.
Of course, the generator responsible for init scripts doesn't magically
convert a sysvinit file completely into a service file (that's not
really possible to do automatically in the general case), but the
service file it generates just contains the necessary metadata.
Additionally, it sets ExecStart=/etc/init.d/$SCRIPT start and
ExecStop=/etc/init.d/$SCRIPT stop in the service file, so that the
original init script is actually called.
For example, if I take /etc/init.d/kbd, the systemd-sysv-generator will
produce the following service file in
/run/systemd/generator.late/kbd.service:
-----------------------------------------------------------
# Automatically generated by systemd-sysv-generator
[Unit]
SourcePath=/etc/init.d/kbd
Description=LSB: Prepare console
DefaultDependencies=no
Before=sysinit.target
After=remote-fs.target
[Service]
Type=forking
Restart=no
TimeoutSec=0
IgnoreSIGPIPE=no
KillMode=process
GuessMainPID=no
RemainAfterExit=yes
SysVStartPriority=18
ExecStart=/etc/init.d/kbd start
ExecStop=/etc/init.d/kbd stop
-----------------------------------------------------------
So what did I do in order to produce the service file I've attached in
my original report?
- I took the generated service file for the open-iscsi init script
- I removed the comment about automatic generation
- I removed SourcePath (that's mainly for documentation purposes if you
run systemctl status)
- I adjusted the After= and Before= dependencies
- I added a [Install] section to make it possible to enable this unit
Here's a diff for comparison (old is generated, new is my modified
version):
-----------------------------------------------------------
diff -u open-iscsi.service /lib/systemd/system/open-iscsi.service
--- open-iscsi.service 2015-01-18 21:12:16.325286854 +0100
+++ /lib/systemd/system/open-iscsi.service 2015-01-19 19:14:53.000000000 +0100
@@ -1,11 +1,8 @@
-# Automatically generated by systemd-sysv-generator
-
[Unit]
-SourcePath=/etc/init.d/open-iscsi
-Description=LSB: Starts and stops the iSCSI initiator services and logs in to default targets
+Description=iSCSI initiator
DefaultDependencies=no
-Before=sysinit.target shutdown.target
-After=network-online.target remote-fs.target
+Before=sysinit.target shutdown.target remote-fs-pre.target
+After=network-online.target
Wants=network-online.target
Conflicts=shutdown.target
@@ -20,3 +17,6 @@
SysVStartPriority=20
ExecStart=/etc/init.d/open-iscsi start
ExecStop=/etc/init.d/open-iscsi stop
+
+[Install]
+WantedBy=multi-user.target
-----------------------------------------------------------
So it's not like this is really that untested, it's basically the way
systemd handles sysv scripts but just with modified dependencies, to
make sure the unit is started before remote-fs-pre.target and not after
remote-fs.target.
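For anyone who wants to double-check the dependency change without reading the diff by eye, here is a small Python sketch that parses the [Unit] ordering lines of both variants and prints what was added and removed. It uses the standard configparser module, which is only an approximation of systemd's unit file parser (real unit files may repeat keys, which configparser rejects in its default strict mode), and the helper name is made up.

```python
import configparser


def ordering(unit_text):
    """Return the Before=/After= sets from the [Unit] section of a unit file."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names case-sensitive, as systemd does
    parser.read_string(unit_text)
    unit = parser["Unit"]
    return {key: set(unit.get(key, fallback="").split()) for key in ("Before", "After")}


# [Unit] ordering lines copied from the diff above.
GENERATED = """\
[Unit]
Before=sysinit.target shutdown.target
After=network-online.target remote-fs.target
"""
PATCHED = """\
[Unit]
Before=sysinit.target shutdown.target remote-fs-pre.target
After=network-online.target
"""

old, new = ordering(GENERATED), ordering(PATCHED)
for key in ("Before", "After"):
    print(key, "added:", sorted(new[key] - old[key]), "removed:", sorted(old[key] - new[key]))
```

The only ordering differences are the added Before=remote-fs-pre.target and the dropped After=remote-fs.target.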
I'll file a separate bug report for this. I don't think it's very
critical, especially it doesn't do anything wrong if everything is in
/etc/fstab (or you manually mounted with -o _netdev).
making it as little invasive as possible. And while open-iscsi is not
completely unusable with systemd, there are enough problems with the way
the current package interacts with systemd due to subtle differences in
the handling of dependencies and failures that I think this should
really be fixed in Jessie.
As I said in the original report:
Regards,
Christian
Hi again,

Btw, in case it wasn't clear from my first reply here: systemd actively
complains at boot that it's still waiting for some devices to appear
during the 90s, and the devices shown are the devices specified in
/etc/fstab that are on iSCSI.

Regards,
Christian
Thanks Christian. I'm building a setup to verify the same. s3nt fr0m a $martph0ne, excuse typ0s
video just to be sure we are both referring to the same problem.
http://youtu.be/cwcnk00Hwk0

Next, I'll verify your fix. Hopefully by this weekend I'll get it ready.
And then we can ask for an exception from the Release Team.

I am also CCing the systemd maintainers to be sure we are on the right
path.

Dear systemd maintainers: Will appreciate your review on this bug
report. As it stands now, it affects Jessie (Not RC).
Hi Ritesh,

On 2015-01-23 09:35, Ritesh Raj Sarraf wrote:

Yes, that's the same problem: it waits 90s for the devices to appear,
which they can't, because the iSCSI login hasn't happened yet.

Thanks!
Christian
Christian, The patch does not seem to resolve the problem. Can you please verify the same ? http://youtu.be/q4pOQn3C4q0 Ritesh
Hi,

Sorry for top posting but I'm writing this from a phone.

I can see what you mean, but that doesn't happen to me. The first part
of the delay seems fine, as your system appears to take a while to log
in to iSCSI (both bare metal against a hardware RAID and VMs against
another VM w/ LIO I have here are much faster btw., at most ~2s here),
but after the 'reached remote fs (pre)' it should find the devices and
not time out waiting for them.

Is this on LVM (because of /dev/mapper in the output)? If so, did you
configure the VGs in /etc/default/open-iscsi?

What does journalctl -xn say after booting?

BTW I can give you root access to a couple of VMs (together with access
to libvirt to watch them boot if you have virt-manager installed) that
demonstrate the problem and my solution. Just send me an email
(privately) with your SSH pubkey signed with your GPG key in Debian's
keyring.

Thanks a lot for taking the time to investigate this so thoroughly!

Christian

On 25 January 2015 10:32:21 CET, Ritesh Raj Sarraf <rrs@researchut.com> wrote:
My setup too is that of 2 VMs, and the iSCSI target used is LIO.

No. This is not using LVM. That was to be done in the next phase. My
usual tests include iSCSI + Multipath + LVM. But instead, in this case,
I'm using Device Mapper Multipath only, which is a more commonly used
target of the Device Mapper framework.

Unfortunately, I don't have enough time on weekdays. We can think of
doing that next weekend, if time permits then. Meanwhile, if you can
root cause it, I'd be willing to squeeze out some time to test.

Keep in mind that this is just 2 LUNs mapped, with just 2 paths each.
The problem will be more severe for users with a higher number of LUNs
mapped.

I've still kept the systemd folks in the loop, hopefully they may be
able to shed some light.
Control: clone -1 -2
Control: retitle -2 multipath not automounting iscsi devices listed in fstab

I cut out the multipath stack just to see if there is some fix we can
push. So yes, your patch works perfectly in a non-multipath setup. I'll
ask the release team for an exception.

For multipath, I need to figure out some time to root cause it. But that
is beyond the scope of this bug report. Hence, the clone of the bug.
Dear multipath-tools Maintainers,

this bug, which is a clone of #775778, is essentially the same problem
as was discussed in #775778: the ordering of the multipath-tools init
script after $remote_fs (see Should-Start line in LSB headers) causes an
ordering problem, because systemd will wait for all _netdev filesystems
in /etc/fstab before the equivalent of $remote_fs is reached.

This means that systemd will wait for the /dev/mapper/... devices
related to multipath to appear during boot, but at a point where
remote-fs.target (systemd's mapping of $remote_fs) is not yet reached.
This will lead to a 90s timeout (systemd's default timeout when waiting
for devices), only then will systemd continue booting.

Please see the original bug report for details w.r.t. open-iscsi; the
same reasoning applies to multipath-tools.

The same fix that was implemented for open-iscsi in principle also
applies for multipath-tools, i.e. make sure that for systemd systems the
unit is ordered before remote-fs-pre.target.

I don't use multipath-tools myself, but I'll be able to prepare a patch
that fixes this on a minimal level tomorrow, you'll just have to test it
yourself.

Christian
Thanks Christian. I'll wait for your patch.
On 26.01.2015 at 08:47, Ritesh Raj Sarraf wrote:
backend), and I came upon the following issues AFTER I fixed this in the
same way as the open-iscsi package. These issues don't seem to be
related to systemd, but a general problem of the multipath package
(although I didn't test it with sysvinit, so I don't know for sure):
1. open-iscsi init script (which is still called even by the new
systemd service file) does udevadm settle to make sure all device
nodes from logging in to iSCSI have been created, because immediately
after that, it wants to activate LVMs configured on iSCSI.
* On its own, that's not a problem, so if you have bare iSCSI with
or without LVM on top, that works fine.
* But, if you have multipath started and configured, there's
/lib/udev/rules.d/60-multipath-rules with the following entry:
# Coalesce multipath devices before multipathd is running
# (initramfs, early boot)
ACTION=="add|change", SUBSYSTEM=="block",
RUN+="/sbin/multipath -v0 /dev/$name"
The problem here is that multipath -v0 /dev/$name doesn't
complete because multipathd is not started. The problem is that
this rule is not only triggered for the devices first available
at boot, but also for the devices that appear due to iSCSI,
which in this case are even configured. Unfortunately, since
multipathd is not running, this is a new deadlock here.
udev now has a default timeout of 30s, so boot hangs for that
time and after that I get a bunch of log messages about
timeouts.[1]
After that, the system boots fine, udevadm settle completes,
open-iscsi init script continues, and then multipathd is
started, which properly activates the devices, which can then
be mounted.
I don't see anything systemd specific in here, and while I
haven't tried it, I would suspect that the same thing occurs
also with sysvinit.
2. Also, really curious, at shutdown I have the following situation:
multipath-tools does not seem to dismantle (or however that is
called properly) multipath volumes. So now, I have the following
situation:
- due to proper ordering with my fix for the 90s systemd issue,
remote filesystems get unmounted by systemd first, so nothing
is mounted anymore that's on multipath
- /etc/init.d/multipath-tools stop is called
- multipathd exits
- but apparently, /dev/mapper/mp{1,2} (that's how I called my
test devices) still exist
- /etc/init.d/open-iscsi stop is called, that logs out of the
iSCSI session
- later at shutdown, something (I don't know exactly what, since
shutdown is parallel) causes the kernel to try to access all
block devices in the system, making it notice that it can't
really access the multipath devices anymore (which still exist!),
so it complains about it. See [2] for log messages related to
this.
So basically you have two issues:
- 30s delay on boot because udevadm settle (in open-iscsi) waits for
multipath -v0 but that won't complete until multipathd is started,
which won't happen until the open-iscsi script is done (which waits
for udevadm settle) -> timeout
- note that if I comment out the udev rule in question, the
system boots immediately (total boot time only a couple of
seconds, including iSCSI + multipath setup), but obviously
that can't be a complete solution, because you DO want to
pick up multipath devices that were started in early boot
- this appears to be related to or the same as
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=580972
- on shutdown, multipath device mapper devices are not removed and then
something tries to access them in late shutdown phase, when iSCSI is
already gone, which produces weird log messages, which in the default
configuration of Jessie are shown on the screen for a short time
before rebooting (might irritate some people)
- since file systems umount cleanly and open-iscsi does a 'sync'
before logging out of all sessions, I think this is *probably*
only cosmetic
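The first issue is a circular wait, which can be sketched in Python by treating each step as a node that waits for others and looking for a cycle. The step labels are informal names chosen here, and the model ignores the 30s udev timeout that eventually breaks the loop in practice. It is a simplified picture of the description above, not a trace of the actual boot.

```python
from graphlib import CycleError, TopologicalSorter  # Python 3.9+


def find_cycle(waits_for):
    """Return a list of steps forming a circular wait, or None if there is none."""
    try:
        TopologicalSorter(waits_for).prepare()
    except CycleError as exc:
        return exc.args[1]  # graphlib reports the detected cycle here
    return None


# Each step maps to the steps it has to wait for; labels are informal.
BOOT_WAITS = {
    "open-iscsi start": {"udevadm settle"},
    "udevadm settle": {"multipath -v0 (udev rule)"},
    "multipath -v0 (udev rule)": {"multipathd"},
    "multipathd": {"open-iscsi start"},
}

print(find_cycle(BOOT_WAITS))

# Without the udev rule, udevadm settle no longer waits on multipath.
without_rule = dict(BOOT_WAITS)
without_rule["udevadm settle"] = set()
print(find_cycle(without_rule))
```

The second call finds no cycle, which matches the observation that commenting out the udev rule makes the system boot immediately.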
Therefore, my question would be: do you see the same two issues on
sysvinit? If so, I would then attach my patch to fix the
boot/shutdown ordering stuff of multipath-tools just on systemd and then
this bug may be closed, whereas the other stuff is something that I
probably can't really comment on too much because I don't use multipath.
Christian
[1] First:
iscsid[890]: Connection1:0 to [target:
iqn.2003-01.org.linux-iscsi.tkmlx74.x8664:sn.9284f4d8cb0e, portal:
192.168.15.100,3260] through [iface: default] is operational now
exactly 30s later:
systemd-udevd[145]: worker [157]
/devices/platform/host2/session1/target2:0:0/2:0:0:3/block/sdd timeout;
kill it
systemd-udevd[145]: seq 1357
'/devices/platform/host2/session1/target2:0:0/2:0:0:3/block/sdd' killed
systemd-udevd[145]: worker [159]
/devices/platform/host2/session1/target2:0:0/2:0:0:4/block/sde timeout;
kill it
systemd-udevd[145]: seq 1358
'/devices/platform/host2/session1/target2:0:0/2:0:0:4/block/sde' killed
systemd-udevd[145]: worker [157] terminated by signal 9 (Killed)
systemd-udevd[145]: worker [159] terminated by signal 9 (Killed)
(sdd and sde are configured in multipath via their wwid)
[2] First you have the stopping of multipath stuff:
[ 40.593235] systemd[1]: About to execute: /etc/init.d/multipath-tools
stop
[ ... some stuff ...]
[ 40.610542] systemd[1]: Child 3043 (multipath-tools) died
(code=exited, status=0/SUCCESS)
[ ... some stuff ...]
[ 40.647841] systemd[1]: Received SIGCHLD from PID 1081 (multipathd).
[ 40.647867] systemd[1]: Child 1081 (multipathd) died (code=exited,
status=0/SUCCESS)
(so basically, it tells me that the init script was successful)
And later on you've got:
[ 41.628858] device-mapper: multipath: Failing path 8:48.
[ 41.628865] end_request: I/O error, dev dm-5, sector 204672
[ 41.629431] end_request: I/O error, dev dm-5, sector 204784
[ 41.629897] end_request: I/O error, dev dm-5, sector 0
[ 41.630057] end_request: I/O error, dev dm-5, sector 8
[ 41.630466] end_request: I/O error, dev dm-5, sector 0
[ 41.631080] device-mapper: multipath: Failing path 8:64.
[ 41.631084] end_request: I/O error, dev dm-6, sector 204672
[ 41.631525] end_request: I/O error, dev dm-6, sector 204784
[ 41.631843] end_request: I/O error, dev dm-6, sector 0
[ 41.632029] end_request: I/O error, dev dm-6, sector 8
[ 41.632760] end_request: I/O error, dev dm-6, sector 0
[ 41.667776] device-mapper: multipath: Failing path 8:64.
[ 41.667790] Buffer I/O error on device dm-6, logical block 25584
[ 41.668517] device-mapper: multipath: Failing path 8:48.
[ 41.668522] Buffer I/O error on device dm-5, logical block 25584
[ 41.668707] Buffer I/O error on device dm-6, logical block 25584
[ 41.669120] Buffer I/O error on device dm-5, logical block 25584
[ 41.670072] Buffer I/O error on device dm-6, logical block 0
[ 41.670202] Buffer I/O error on device dm-6, logical block 1
[ 41.670322] Buffer I/O error on device dm-6, logical block 2
[ 41.670441] Buffer I/O error on device dm-6, logical block 3
[ 41.671130] Buffer I/O error on device dm-5, logical block 0
[ 41.671258] Buffer I/O error on device dm-5, logical block 1
[ 41.703846] device-mapper: multipath: Failing path 8:64.
[ 41.704643] device-mapper: multipath: Failing path 8:48.
[ 41.740753] device-mapper: multipath: Failing path 8:64.
[ 41.741484] device-mapper: multipath: Failing path 8:48.
[ 41.757937] systemd-udevd[3010]: '/sbin/kpartx -u -p -part /dev/dm-5'
[3427] terminated by signal 15 (Terminated)
The last one is kind of weird, since that comes from a udev rule in
60-kpartx.rules that AFAICT should be run only if a device appears.