#452721 xen-utils-common: "xendomains" does not restore domains in same order as it would start them

Package:
src:xen
Source:
xen
Submitter:
Andy Smith
Date:
2023-07-31 17:15:04 UTC
Severity:
wishlist
Tags:
#452721#5
Date:
2007-11-24 18:37:20 UTC
From:
To:
The "xendomains" init script will start domains according to the order
of config files found in /etc/xen/auto/*.  I use this so that, in the
event of a hard reboot, the more important domains will start first.

Some of these contain essential services like DNS resolvers, slapd and
so on, and since starting xen domains happens in series and can be quite
time consuming, it is rather useful to have them around first before
everything else starts up.

Unfortunately though it seems that when restoring domains from a save
file in /var/lib/xen/save/* the ordering is alphabetical.

It would be great if restoring from savefile could be ordered in the
same way as starting from cold.

#452721#12
Date:
2019-04-02 02:35:08 UTC
From:
To:
found 452721 4.8.5+shim4.10.2+xsa282-1+deb9u11
quit

I'm inclined to suggest #452721 is actually a bit more than merely
wishlist.  The ordering of domain start/restore/stop/save can be
extremely important.  The current behavior of the xendomains init script
is rather simplistic.

I would argue for the use of tagging along the lines of what was
standardized for init scripts.  Domains acting as LDAP/NIS/syslog would
need to start before fileserver domains.  Mailserver domains would start
after nearly all other domains had started.  These would likely be stopped
in /almost/ the reverse order (there could be reasons for
stopping/suspending them in an unrelated order).

Rather more interestingly, one might desire some domains to start in
parallel with some services started by init scripts; yet others to start
near the end of the init process.  Perhaps modify /etc/init.d/xendomains
to be called multiple times during system startup/shutdown?  Maybe there
could be links /etc/rc2.d/S02xendomains:early,
/etc/rc2.d/S03xendomains:middle, and /etc/rc2.d/S04xendomains:late which
start "early", "middle", and "late" domains?

Pretty much it really needs to be redone from the ground up.



Related, but perhaps a distinct issue is what happens when one runs
`/etc/init.d/xendomains reload`.  The current behavior is to pause/stop
all domains and then resume/start all domains.  One use for this is for
when qemu-system-i386 gets security updates.  In such case the desired
behavior is to stop and then start each domain (which results in shorter
downtimes for each domain, even though the whole process takes just as
long).

#452721#35
Date:
2021-09-27 03:07:58 UTC
From:
To:
I'm surprised #452721 is tagged moreinfo since it seems simple, but that
may depend on installation capability.

Note, I am not the original reporter, so I might actually be observing
something distinct.  I doubt this, but I cannot be certain.


The issue is this: a hypervisor machine could have tens or even hundreds of
VMs.  There could be ordering dependencies during startup and shutdown.

Notably there are core services, such as LDAP, DHCP, fileserver and DNS.
Often these need to be up before anything else and they may need to come
up in a particular order.  Most often the LDAP server (which can be a
distinct VM) needs to be up first.  Meanwhile for downtimes, a fileserver
(which can also be a VM) needs to go down last.

During a full downtime when all VMs were fully shut down, this effect
can be achieved by including numbers in the filename.  Say
/etc/xen/auto/0_ldap.cfg, /etc/xen/auto/1_fileserver.cfg,
/etc/xen/auto/9_everything_else.cfg.

If the hypervisor is rebooted and VMs are saved to /var/lib/xen/save,
they will be paused in identifier order, but saved by domain name.  When
scanning /var/lib/xen/save, `xendomains` goes by filename which means VMs
are restored in a distinct (and often problematic) order.


A minimal solution would be for `xendomains` to save VMs in
/var/lib/xen/save as <domId>-<name> and then use `sort -n` during restore.
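
As a rough illustration of the restore side of this idea, the Python
sketch below does the equivalent of `sort -n` over names of the form
<domId>-<name>.  The function name and the sample filenames are made up
for illustration; nothing here is taken from the actual `xendomains`
script.

```python
def restore_order(savefiles):
    """Order save-file names of the form '<domId>-<name>' by numeric domId."""
    def sort_key(filename):
        prefix, _, _ = filename.partition("-")
        # Names without a numeric prefix sort last, in plain name order.
        rank = (0, int(prefix)) if prefix.isdigit() else (1, 0)
        return (rank, filename)
    return sorted(savefiles, key=sort_key)


if __name__ == "__main__":
    # Plain alphabetical order would put 100-web first; numeric order does not.
    print(restore_order(["12-mail", "3-ldap", "100-web"]))
```

Names that lack a numeric prefix are simply placed at the end, which is
an arbitrary choice for the sketch.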

A better approach would be to have an LSB-style header specifying
dependencies to flag VMs which should be saved or shutdown late, and VMs
which should be saved or shutdown early.

A ridiculous overkill solution might be to turn the /etc/xen/*.cfg files
into full init scripts.  This could be done by having a script which
understood domain configuration files well enough to identify the
name/UUID and then start/stop the domain as specified by $1.  Use that
script as the interpreter (#! line), then it could find the configuration
via $0.  Then normal init script handling tools could take care of
ordering.

(geeze, that really does actually seem kind of like a semi-workable
solution despite seeming rather crazy at first)

#452721#40
Date:
2021-09-27 17:13:04 UTC
From:
To:
Hi Elliott,

I also do this to control start up order, though I use a prefix of
NNN-.

The main missing functionality from my point of view is not being
able to control the order of save/shutdown. As you say the script
for saving everything or shutting everything down just does a read
of all existing domids and does the action on them one by one in
increasing order.

I think the "auto" directory is a pretty good and simple interface,
so how about using it for save/shutdown as well? So, instead of just
enumerating all running domids, enumerate all files in
/etc/xen/auto/ in REVERSE order, parsing the name of the domain out
of each one and doing the action on that name. When all files have
been exhausted, THEN do the action on any remaining running domains.

This has the advantages of:

- still working even if administrator does not use ordering in
  /etc/xen/auto. Filename format there does not change from what it
  is now, where ordering is already possible but is optional.

- being quite obvious behaviour - save/shutdown order is reverse of
  start order.
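
To make the proposed order concrete, here is a rough Python sketch.  It
assumes the domain name has already been parsed out of each file in
/etc/xen/auto/ and that the list of running domains is available;
`shutdown_order`, its arguments and the sample names are all made up for
illustration.

```python
def shutdown_order(auto_entries, running):
    """Return the order in which running domains would be saved/shut down.

    auto_entries: (filename, domain_name) pairs for /etc/xen/auto/.
    running:      names of running domains, in whatever order they are
                  enumerated today.
    """
    running_set = set(running)
    ordered = []
    # Start order is assumed to be sorted filename order, so reverse it.
    for _, name in sorted(auto_entries, reverse=True):
        if name in running_set and name not in ordered:
            ordered.append(name)
    # Then act on whatever is still left, in the existing order.
    ordered += [name for name in running if name not in ordered]
    return ordered


if __name__ == "__main__":
    auto = [("010-ldap.cfg", "ldap"), ("020-files.cfg", "files"),
            ("900-web.cfg", "web")]
    print(shutdown_order(auto, ["ldap", "web", "files", "scratch"]))
```

Note that a domain with no link in "auto" ("scratch" in the example) is
handled after the ordered ones in this variant.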

That seems like a good minimal improvement, but if one wanted to
explicitly control save/shutdown order then perhaps the next
enhancement could be an /etc/xen/shutdown/ directory with similar
purpose to the "auto" one? i.e.:

1. Enumerate files in "shutdown" directory in reverse order, getting
   name from each and doing shutdown action on it

2. If there were no files there, instead use "auto" directory for
   this purpose

3. Then do shutdown action on every remaining running domain as
   usual

Again this still results in everything getting a shutdown action if
administrator does not want to do any of this.

It's an open question for me whether step 2 (falling back to
enumerating "auto" directory) only happens when "shutdown" directory
is empty or if it should happen all of the time.

If you had a dom0 with 100 domains on it but only wanted to control
the order of a few of them, without fallback you would need to copy
ALL the links from auto to shutdown and then change their ordering
because otherwise this would shut down the ones you specified and
then do all the rest in domid order like it does right now.

WITH fallback, you'd get the few you wanted to control done in the
order you expect and then you'd get the order from "auto", which is
appealing but does mean it's going to try to shut down again some
that are already shut down. If there is a relatively quick "is a
domain by this name still running?" check then maybe that's
workable.
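
A rough Python sketch of the "always fall back" variant follows.  It
assumes the domain names from each directory have already been read in
sorted filename order, and it models the "is it still running?" check as
a simple list lookup.  `plan_shutdown` and the sample names are
hypothetical.

```python
def plan_shutdown(shutdown_names, auto_names, running):
    """Plan a shutdown: 'shutdown' dir, then 'auto' dir, then the rest.

    shutdown_names, auto_names: domain names in each directory's sorted
    filename order.  running: names of currently running domains.
    """
    still_running = list(running)
    plan = []
    # Tier 1 is the "shutdown" directory, tier 2 the "auto" fallback.
    # Both are walked in reverse order.
    for tier in (shutdown_names, auto_names):
        for name in reversed(tier):
            # Skip anything already handled or not running at all.
            if name in still_running:
                plan.append(name)
                still_running.remove(name)
    # Tier 3: every remaining running domain, as happens today.
    return plan + still_running


if __name__ == "__main__":
    print(plan_shutdown(["web", "files"],
                        ["ldap", "files", "web", "db"],
                        ["ldap", "files", "web", "db", "scratch"]))
```

Because each name is removed from `still_running` once planned, the
fallback pass never tries to shut the same domain down twice.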

If by this you mean it would be good if the "save all" action picked
the filename from the filename in the "auto" directory, to replicate
that directory's ordering, then I agree.

If however you mean the actual Xen domid of the running domain then
I'm not sure what that would buy us. If I had a domain with a
filename of 010-ldap0.cfg it might get started first and have domid
1, but then I reboot it and it has domid 99, I wouldn't want it
saved as /var/lib/xen/save/99-ldap0, I'd still want it saved as
/var/lib/xen/save/010-ldap0.

I don't think that we should be proposing to change the config
language of upstream Xen or diverge from how domains are usually
configured with upstream Xen. I think that we can get a lot of
improvement without modifying the format of the config files and
only by changing how the start and shutdown scripts work.

At the moment domain start and shutdown is serial in nature and can
take a long time. I don't know if there is any scope for improving
that in scripts, or whether it's an upstream conversation, either
way not for this bug. But because of the lengthy process I do have
an interest in starting my important domains first and shutting them
down last.

Presently I am handling this by numbering the links in the auto
directory, and using my own script that saves or shuts things down
in the order I want.

I can see how this could be improved but I'm not sure it's worth
spending a large amount of effort on it and/or coming up with
a complicated solution.

I have multiple dom0s so where I have concerns about an essential
service being unavailable I take steps to make that service
redundant and then I don't have to care so much about whether the
domain for that service is shut down 1st or 100th.

While being able to control ordering of shutdown would be NICE, it
seems like this would be catering to the administrator of a single
dom0 that can't make services redundant. This raises the question
of what are such administrators doing about the risk of their one
dom0 host becoming unavailable and all its domains with it?

I also feel that trying to add dependency logic into the
configuration is stepping into territory best left to actual cluster
management software, that says what order things should start/stop
in, how many copies of them need to run, where they can be allowed
to run for redundancy purposes, etc.

Thanks,
Andy

#452721#45
Date:
2021-09-27 21:16:53 UTC
From:
To:
Seems we're running into the same problems, coming up with the same
first-tier workaround and now we all need a common complete solution.

This though requires something which understands the format of those
files, can retrieve name or uuid, and then resolve that to something
suitable for `xl {save|shutdown}`.  Alternatively this requires
`xl {save|shutdown}` to be able to select the target domain based on the
configuration file (documentation reads like this might be halfway
implemented).

Additionally this needs a tool to identify domains which are NOT listed
in /etc/xen/auto/ then do save/shutdown on them first.

This strikes me (note, I am NOT a Debian maintainer) as likely to involve
too much work for too little gain.  For complex setups this won't be
enough, for simple setups this will be overkill.

Minimal meaning very simple to implement, but very limited.

The idea is domains which start later get higher domain Ids.  As long as
crucial domains rarely get restarted, they will tend to keep low domain
Ids.  This fails when a crucial domain gets restarted late due to some
reason, but this might capture enough low-hanging fruit to be worthwhile.

I'm pretty sure #452721 is tagged "upstream" since the `xendomains`
originates from the Xen project.  If a solution is likely to be pushed
back to the Xen project, then nearly anything is on the table.  Just an
issue of how much time is needed.

What I was suggesting was NOT to modify the configuration format.  The
idea was a program could treat the domain configuration as if it was a
script (get it from argv[0]), then simply implement start/stop (roughly
system(`xl create $0`)).

Ultimately I suspect the domain configuration files need to add an
"init_handler" setting for specifying a program to be used for
start/stop.  Then "init_config" setting for configuring that program.

If this is saved in the runtime configuration (`xl list -l`), then
unhandled domains are readily identified by the lack of this
configuration.

I suspect this crowd are the ones Debian should be catering most to.
Large enough to have some fairly complicated needs, but small enough not
to have a full IT department.  There are also a very large number of
people in this category.

True, though a little bit would help many people.

#452721#50
Date:
2021-09-28 09:45:08 UTC
From:
To:
I'm not familiar with the "auto" directory('s functionality), but I _assume_
that it's a directory which contains Xen domain config files which are
automatically started up at boot time (in alphabetical sequence).
The user can choose to start other VMs if (s)he so chooses.

If that's correct, then I find it more logical to do *everything* in reverse.
The VM that was started first, should be saved/shutdown last and IIUC your
proposal would not do that.
What makes the most sense to me is that the last started VM should be saved/
shutdown first, which would be one of the "remaining running domains". Once all
the "remaining" ones have been saved/shutdown, THEN do the auto ones in
reverse order.

Could the domain ID be used for that?
I haven't studied it 'in detail', but they seem sequential.

But my main point is that I think the proposed sequence should be adjusted.

Cheers,
  Diederik

#452721#55
Date:
2021-09-28 11:41:57 UTC
From:
To:
Hi Diederik,

Yes; you typically symlink for example /etc/xen/foo.cfg to
/etc/xen/auto/100-foo.cfg so as to enforce some order of automatic startup.

Currently shutdown just goes in order of running domain id though.
but in reverse order.

A problem here would be excluding the domains that have a specified
order from the initial round of shutdowns, which is why I suggested
doing it in reverse order by the "auto" directory and THEN shutting
down anything that's left as normal, since that way you don't need
to check anything.

As you've pointed out, this does mean that if you had linked say
/etc/xen/auto/010-important.cfg with the intention that it be
started first and shut down last, you would have to also link in
every other domain in its correct order otherwise the not-mentioned
ones would be shut down after 010-important.

However, I feel like people who use the /etc/xen/auto directory do
already link all or the majority of their domains in there - I
certainly do. I don't find it onerous to say that if you want to
specify shutdown order then you must link all of the domains in
/etc/xen/auto not just some of them.

Otherwise, if you wanted to say that all non-mentioned domains must
be shutdown first then I guess you'd have to parse the list of
domain names from the "auto" directory first, then get the list of
running domains from /usr/lib/xen-common/bin/xen-init-list and
exclude one from the other for the initial round of shutdowns.

I don't like it because it only says how recent a domain was
started relative to others, not any intention about start/stop
order. Shut one down manually (or crash) and start it again and it
gets a new domid higher than all existing.

We agree about reverse order, I think we only disagree about when to
shut down domains that don't have a preference set. I am all for
keeping it simple by saying ordering must be set for all domains
otherwise ordering for remaining ones is not defined.

Cheers,
Andy

#452721#60
Date:
2021-09-28 21:39:49 UTC
From:
To:
Hi Andy,

I understood that. My point was about the sequence of auto vs non-auto

That may involve some scripting/programming, but *I* don't consider that a
valid argument to not do it...

... and I think that's wrong.
The use case I'm imagining is some domains are important or even essential for
the working of other domains and that's why you want/need to start them as
soon as possible with (potentially) a dependence between them, therefore you
specify the correct order in the auto directory.

Let's say you have a special storage domain which provides storage for all the
domU domains. Without that domain running you CANNOT start any other domain,
so you start that domain first in the auto directory.
If you start the shutdown/save-all-domains procedure the storage domain MUST
be the last one to be shutdown, because otherwise you'd be pulling the storage
under live domains, which likely will make them crash. In any case, they will
not be able to shutdown cleanly or save their current state to disk ...
because the disks are gone.

Indeed. I hope that I explained sufficiently why that is wrong or can even be
catastrophic AFAICT.

I think that that is a (too) dangerous assumption.
You could use (say) 3 domains which provide essential services to all other
domains, but after that every user is free to do whatever (s)he wants.

That would make the scenario I described above unworkable or needlessly
complex, so I don't think that's a good/valid solution.

It is a (really) simple heuristic and likely too simple.
But at first glance it seemed (to me) to actually do the right thing.

Indeed.

I like simple too, but I think this actually makes it complex.

I really agree with the 'upstream' tag as not only should it be fixed/adjusted
there, but it also engages a (much) larger audience who think of scenarios we
likely didn't think about.
And they're certainly much more knowledgeable than I am.

Cheers,
  Diederik

#452721#65
Date:
2021-09-28 22:02:46 UTC
From:
To:
Hi Diederik,

Okay, well I am satisfied by the lesser idea of having to specify
order of all domains but if the implementer of the solution isn't
and decides to implement it so not-specified domains shut down first
then that works for me too so have no objection to it.

The idea of the domid controlling/influencing order of shutdown
would not work for me as to me that is not much different to how we
have things now - domains shut down in increasing order of domid. I
can't control it so I just would continue using my own shutdown
script.

Should we move discussion to xen-users@lists.xen.org then?

Cheers,
Andy

#452721#70
Date:
2021-09-28 23:24:58 UTC
From:
To:
Hi Andy,

It was just an idea that popped in my head. All in all I've likely spent less
than a minute thinking about the domid idea.
Don't spend more on it than you already have ;)

I can make a case for both xen-users and xen-devel.
xen-users:
It could be that a solution already exists. I know that Qubes (which uses
Xen) has some dependency mechanism in that if you start vmA which depends on
vmB, then it first starts vmB and then vmA. I don't know if that is a Qubes
'extension' or that they simply use available functionality of Xen.

xen-devel:
If needed functionality doesn't yet exist and needs to be built anew, then
xen-devel is the right place to discuss that.

It could be that the best place to start is xen-users which then may/could
'transition' to xen-devel.

Let's hear others first what they think is the best approach.

Cheers,
  Diederik

#452721#75
Date:
2021-09-29 02:23:35 UTC
From:
To:
It is *definitely* too simple to do a good job; however, this has the
advantages of being a significant improvement and simple enough to be in
service quickly.
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=452721#35

This isn't an adequate solution, but is a distinct improvement.

Could be interesting to learn of what solutions are already out there and
what features are must have.  Most existing solutions likely have
problems.  Some may be GPL-incompatible.  Most are likely very limited.

Perhaps.  Question is how much person-time is available for this?

If a great deal of xen-devel person-time can be devoted to this a very
ambitious solution might be viable.  If only a little bit of xen-devel
person-time is available, the approach would need to be very limited.

#452721#80
Date:
2023-07-31 01:39:03 UTC
From:
To:
Even though there hasn't been any discussion recently, bug #452721 is
very much still of major concern to me.

First issue is how to parse domain configuration files.  Reason being a
foo.cfg file might have the configuration 'name = "bar"'.  This would
also let the script retrieve the UUID if that has been set.

Turns out while Python in domain configuration files isn't supported,
the syntax is still a proper subset of the Python language.  This makes
Python the ideal programming language for a replacement script.  Only
weakness is being able to have full Python syntax in configuration files
might make the task simpler.
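
Assuming the configuration really is restricted to Python-compatible
`key = literal` assignments, the standard `ast` module can read a file
without executing anything in it.  The sketch below is only an
illustration of that approach; `read_domain_config` and the sample
settings are made up.

```python
import ast


def read_domain_config(text):
    """Return top-level 'key = literal' assignments from config text."""
    settings = {}
    for node in ast.parse(text).body:
        if (isinstance(node, ast.Assign) and len(node.targets) == 1
                and isinstance(node.targets[0], ast.Name)):
            try:
                settings[node.targets[0].id] = ast.literal_eval(node.value)
            except ValueError:
                # Not a plain literal; ignore it rather than execute it.
                pass
    return settings


if __name__ == "__main__":
    sample = 'name = "bar"\nmemory = 1024\nvif = ["bridge=xenbr0"]\n'
    print(read_domain_config(sample)["name"])
```

A file that is not valid Python makes `ast.parse` raise `SyntaxError`,
so a real script would need to decide how to handle that case.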

Presently I hope to convince the Xen core to allow full Python in domain
configuration files, but no news on that front so far.  This would mean
/etc/default/xendomains would need to change to match Python syntax.


My thinking for adding to domain configuration files would be something
along these lines:

init = {
	'tool': 'xendomains-ng',
	'version': 0,
	'order': 9,
	'startwait': 60,
	'stopaction': 'save',
}

Mainly a Python dictionary holding key/value pairs.  The thought behind
the 'tool' and 'version' values is to allow some form of compatibility if
such scripts were to become common.

My thinking is 'order' would indicate sequence.  Domains with higher
order get started first (same order would nominally allow parallel
start).  If a domain.cfg file didn't define order then its order is 0.

'startwait' would tell the script to wait that long before starting
subsequent domains.

'stopaction' would allow different actions if the machine was to stop.
The 3 options which come to mind are 'stop' (shutdown), 'save' (save to
specified storage location), and 'migrate'.


If full Python doesn't become available, this might take the format:

init = 'tool=xendomains-ng,version=0,order=9,startwait=60,stopaction=save'

Not needing to parse the string though does make one's life simpler.
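
For a sense of how much extra work the string form would be, a minimal
Python sketch of parsing it is below.  `parse_init` is a made-up helper,
and converting all-digit values to integers is an assumption of the
sketch, not part of the proposal.

```python
def parse_init(value):
    """Parse 'key=value,key=value' into a dict; all-digit values become int."""
    settings = {}
    for item in value.split(","):
        key, sep, val = item.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {item!r}")
        val = val.strip()
        settings[key.strip()] = int(val) if val.isdigit() else val
    return settings


if __name__ == "__main__":
    print(parse_init(
        "tool=xendomains-ng,version=0,order=9,startwait=60,stopaction=save"))
```

This simple split would break if a value ever needed to contain a comma,
which is one reason the dictionary form is easier to live with.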


Other concerns include:
Sometimes you may want to take a distinct action during stop.  Ie if
you're doing restarts for kernel updates, you'll want to override and
have domains reboot.

It may be handier to have distinct options for 'restart'.  Full restarts
can follow proper order, or could simply involve bouncing domains based
on order.  Notably with HVM domains and Qemu updates, you could do:

order 0 down, order 1 down, order 9 down, order 9 up, order 1 up, order 0 up

Or you could do:

order 9 down, order 9 up, order 1 down, order 1 up, order 0 down, order 0 up
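
Both restart styles can be generated from the set of 'order' values.
The Python sketch below assumes higher order starts first and stops
last, as described earlier; the function names are made up for
illustration.

```python
def full_restart(orders):
    """Stop every level (lowest order first), then start them in reverse."""
    levels = sorted(set(orders))
    downs = [(level, "down") for level in levels]
    ups = [(level, "up") for level in reversed(levels)]
    return downs + ups


def rolling_restart(orders):
    """Bounce one level at a time, highest order first."""
    steps = []
    for level in sorted(set(orders), reverse=True):
        steps += [(level, "down"), (level, "up")]
    return steps


if __name__ == "__main__":
    print(full_restart([0, 1, 9]))
    print(rolling_restart([0, 1, 9]))
```

The rolling variant keeps each level's downtime short, at the cost of
lower-order domains briefly running without the higher-order ones.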


I'm basically certain writing a new xendomains script in Python is the
way to go.  Now to get an answer as to whether full Python in domain
configuration files could be reenabled.

#452721#85
Date:
2023-07-31 17:10:34 UTC
From:
To:
Full Python support in domU cfg files has been explicitly removed for
various reasons.
This does not prevent you from "source"-ing the cfg files in your
script(s) if they are proper Python syntax. Or you could simply
parse/regex the values you want.
And as Marek suggested in his answer, you can also put any arbitrary
settings in the comments.

Although ...

The problem with adding this to a domU config file is that it could
cause problems for (live) migrations. The start/stop order is "per
dom0", and may be different on another one.
Imagine two dom0s, one storing the domain files "locally", while the
other uses NFS. Only in the second case the domU should wait for the NFS
server/domain to be available.

To me, the start/stop logic should be in a dom0 config file.

A time-based wait may be useful for when everything goes well, but what
about when there are problems ?
If you want to be sure a domain is up (ie. ready to serve), you would
need to peek at the related "service".
For example, to be sure a DNS domU is up, you would have to try a DNS
request, as a ping or "xl list" would not be enough.
Also, domains in xen/auto are started with a mix of serialization AND
parallelization, as "xl create" returns once the domain has started (ie.
in the Xen point of view, not the user's).
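
As an illustration of "peeking at the service" rather than sleeping for
a fixed time, a Python sketch could poll a check function until it
succeeds or a deadline passes.  The names below are made up, and a plain
TCP connect is used as a stand-in check; a real DNS probe would need to
send an actual query.

```python
import socket
import time


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_until_ready(check, deadline_s=60.0, interval_s=1.0):
    """Call check() repeatedly; True once it succeeds, False on timeout."""
    give_up = time.monotonic() + deadline_s
    while True:
        if check():
            return True
        if time.monotonic() >= give_up:
            return False
        time.sleep(interval_s)
```

For example, `wait_until_ready(lambda: tcp_port_open("192.0.2.53", 53))`
would block until the (hypothetical) address accepts TCP connections on
port 53, or give up after a minute.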

Then, each time you do NOT want to follow the usual action, you'd have
to edit -each- domU cfg file ?

Well, it makes -your- life easier, not the maintainers' one ;)
are imported from other files.
(see for example
https://salsa.debian.org/xen-team/debian-xen/-/blob/master/tools/hotplug/Linux/xendomains.in)

Everything considered, I'm not sure why Xen should provide such
functionality.
I think custom scripts can handle all the various use cases, don't you
think ?
PS: as mentioned by Diederik, the "dependency" logic has been handled
by Qubes for years, and it never made it to Xen (I don't know the
reasons though).

But I agree the shutdown sequence could be adapted to :
1. first shutdown the domains NOT in xen/auto
2. then shutdown the domains in xen/auto, in reverse order

For fine grained start/stop order, maybe having a dom0 config file
handling this could be added, like:

     # START/STOP ORDER
     # domains not in these lists will be started after and stopped
     # before the ones here
     start-order=(list of domU names)
     stop-order=(list of domU names)
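
A rough Python sketch of how a script might apply those two lists is
below.  It takes the lists as already parsed (the file format above is
only a suggestion), and the function names and sample domains are made
up.

```python
def start_sequence(start_order, all_domains):
    """Listed domains first, in list order; unlisted ones afterwards."""
    listed = [d for d in start_order if d in all_domains]
    unlisted = [d for d in all_domains if d not in start_order]
    return listed + unlisted


def stop_sequence(stop_order, running):
    """Unlisted running domains first; listed ones afterwards, in list order."""
    unlisted = [d for d in running if d not in stop_order]
    listed = [d for d in stop_order if d in running]
    return unlisted + listed


if __name__ == "__main__":
    domains = ["web", "ldap", "scratch", "files"]
    print(start_sequence(["ldap", "files"], domains))
    print(stop_sequence(["files", "ldap"], domains))
```

Keeping the two lists separate means the stop order does not have to be
the exact reverse of the start order.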

But then again, this only ensures "domains" start order, not "services
availability" in said domains.