- Package: emacsen-common
- Source: emacsen-common
- Submitter: Russ Allbery
- Date: 2023-01-21 00:03:09 UTC
- Severity: important
The 28.1 version of emacs-lucid fails on startup with a cryptic error message:

% emacs
Cannot find suitable directory for output in ‘comp-native-load-path’.

Running emacs -q allows it to start, but it still reports the same error immediately during startup, and attempting to do something as basic as M-x set-variable (to try to enable debug-on-error) fails with a similar error message:

defalias: Cannot find suitable directory for output in ‘comp-native-load-path’.

emacs -q --no-site-file avoids the error on startup, but M-x set-variable still immediately fails with the same error.
Russ Allbery [2022-08-19 11:42:30] wrote:
This error is signaled by `comp-el-to-eln-filename` and it usually
indicates that that function was unable to find a writable directory in
which to put the auto-generated `.eln` native-compiled files.
I don't know why it fails to find a writable directory. Maybe
`emacs --debug-init` would help
or
emacs -e '(message "%S" comp-native-load-path)'
might help track down the origin of the problem.
This said, Emacs shouldn't become unusable in such a circumstance, so we
should maybe use a patch along the lines of the one below (not sure if
it wraps the relevant invocation of `comp-el-to-eln-filename`, tho).
Stefan
diff --git a/lisp/emacs-lisp/comp.el b/lisp/emacs-lisp/comp.el
index 304ea8cc6c1..5a599f8a0e8 100644
--- a/lisp/emacs-lisp/comp.el
+++ b/lisp/emacs-lisp/comp.el
@@ -3923,10 +3926,13 @@ comp-run-async-workers
                     "`comp-files-queue' should be \".el\" files: %s"
                     source-file)
            when (or native-comp-always-compile
-                    load ; Always compile when the compilation is
-                         ; commanded for late load.
-                    (file-newer-than-file-p
-                     source-file (comp-el-to-eln-filename source-file)))
+                    load ; Always compile when the compilation is
+                         ; commanded for late load.
+                    ;; Skip compilation if `comp-el-to-eln-filename' fails
+                    ;; to find a writable directory.
+                    (with-demoted-errors "Async compilation :%S"
+                      (file-newer-than-file-p
+                       source-file (comp-el-to-eln-filename source-file))))
            do (let* ((expr `((require 'comp)
                              (setq comp-async-compilation t)
                              (setq warning-fill-column most-positive-fixnum)
Stefan Monnier <monnier@iro.umontreal.ca> writes:

Agree Emacs should stay usable and I think the one you've identified should be the right invocation to be wrapped.

Best Regards
Andrea
Stefan Monnier <monnier@iro.umontreal.ca> writes:

Cannot find suitable directory for output in ‘comp-native-load-path’.

% emacs -e '(message "%S" comp-native-load-path)'
Cannot find suitable directory for output in ‘comp-native-load-path’.

If instead I run:

% emacs -q --no-site-file -e '(message "%S" comp-native-load-path)'

then Emacs opens a window and starts, but just produces the error message:

command-line-1: Symbol’s function definition is void: \(message\ \"%S\"\ comp-native-load-path\)

Trying to run the same elisp with M-: in that running Emacs just says:

defalias: Cannot find suitable directory for output in ‘comp-native-load-path’.

Looks like I may need a build with your patch before I can figure out what's going wrong with that variable, since Emacs seems to be so broken that it doesn't know how to introspect itself.

I'm a bit mystified as to why everyone else on Debian isn't seeing this. I would have assumed it must be something in my startup files that is incompatible with the latest release of Emacs, except I thought -q --no-site-file should completely disable loading anything from my local configuration.
I had the same issue. Turned out my personal .emacs.d/eln-cache directory and its folders belonged to root:

$ ls -la $HOME/.emacs.d/eln-cache
total 16
drwxr-xr-x 4 root root  4096 Aug 21 19:32 .
drwx------ 4 adam users 4096 Aug 19 19:43 ..
drwxr-xr-x 2 root root  4096 Aug 21 19:32 28.1-20961986
drwxr-xr-x 2 root root  4096 Aug 19 19:43 28.1-aa5da5cc

chown'ing it to me fixed it.

Thanks,
Adam
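For anyone wanting to check for the same condition, here is a minimal shell sketch that lists entries under an eln-cache directory not owned by the invoking user. The function name `find_foreign_eln` is made up for illustration, and the default path simply mirrors the one in the listing above; adjust it if your cache lives elsewhere.

```shell
# Hypothetical helper (not part of Emacs or Debian): print every entry
# under an eln-cache directory that is NOT owned by the current user.
# Prints nothing if everything is owned by you or the directory is absent.
find_foreign_eln() {
    dir="${1:-$HOME/.emacs.d/eln-cache}"
    # Nothing to report if the cache directory does not exist.
    [ -d "$dir" ] || return 0
    # `! -user UID` selects files whose owner differs from our own uid.
    find "$dir" ! -user "$(id -u)" -print
}
```

If this prints anything (for example the `28.1-*` subdirectories shown above), those entries are the likely candidates for the chown that Adam describes.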
Adam Lackorzynski <adam@os.inf.tu-dresden.de> writes:

I think the dates were when I upgraded Emacs.

Let me take a guess: are you also old-school and use su from a regular user account when installing new packages? HOME gets overridden by su, but LOGNAME and USER do not, and I suspect something in Emacs is deciding where to write files based on USER, and the installation process of emacs-lucid creates the eln-cache directory for some reason.

I'm downgrading the severity of this bug because I suspect the average user using sudo may not run into it (although the maintainer should feel free to raise the severity again if they disagree).

If I'm right, the best place to solve the problem may be in the Debian maintainer scripts, overwriting USER (and possibly LOGNAME, not sure if it matters) to some safe value like root while performing the installation. This may also be necessary to do when installing Emacs add-on packages; I'm not sure.

I've removed the directory with the wrong ownership and upgraded again with USER and LOGNAME set to root, and now everything works fine. (Well, my laptop gets extremely hot the first time I start the new Emacs, but I assume that's expected for the new compilation system.)
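The workaround described above (re-running the upgrade with USER and LOGNAME set to root) could be wrapped roughly as follows. This is only a sketch: the function name `run_with_root_identity` is hypothetical, and whether overriding both variables is actually required is exactly the open question in the message above.

```shell
# Hypothetical wrapper: run a command with USER and LOGNAME forced to
# "root", leaving the rest of the environment (including HOME) untouched.
# Intended to be used from a root shell obtained via plain `su`.
run_with_root_identity() {
    env USER=root LOGNAME=root "$@"
}

# Illustrative usage from a root shell (package name taken from the report):
#   run_with_root_identity apt install emacs-lucid
```

Using `su -` instead of plain `su` resets these variables as part of a full login environment, which would make a wrapper like this unnecessary.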
Russ Allbery <rra@debian.org> writes:
Agreed, nice catch, very glad y'all found this. And not sure it'd be
appropriate in your case, and you may well know, but adding the "-"
argument to su may avoid the issue for you.
Possibly, but I'd assumed the installation process should expect a
"normal" root environment, and if so, "su" without the dash wouldn't
qualify. Otherwise, random user .bashrc changes (that su doesn't reset)
could affect the install.
Yes, I believe that should only happen once per .el file, per user[1],
but it's not cheap.
[1] ...until/unless we decide to ship NATIVE_FULL_AOT packages (all
upstream files precompiled). But if nothing else, that's not ready
for broad use yet.
Rob Browning <rlb@defaultvalue.org> writes:

For what it's worth, this is the first package that I recall having trouble with during installation, and I've been using Debian for quite a while. :) That doesn't mean you're wrong -- you've got a good point and I should get in the habit of using -.

But I do wonder if there's a bug or at least some surprising behavior here in Emacs. Blindly using USER when HOME is set correctly is pretty unusual (and, for whatever it's worth, contrary to the BaseDir specification, not that Emacs is currently following that). Generally HOME should override USER when used to locate the current user's home directory.
Russ Allbery <rra@debian.org> writes:

Oh, I don't know whether I'm right or wrong there either, just describing my defensive behavior :)

Ahh, yeah, speaking more generally, I fully expect we may hit and have to (help) fix a relatively large number of edge cases as a result of the addition of native compilation. Lots of new moving parts, and the arrangement in Debian is likely different from what upstream typically tests in important ways.

(cf. the other problem right now where installs segfault in some cases if you don't already have emacs-el installed. That's likely an upstream bug of some kind.)
Indeed, yes. I'd take the line here that a Debian package should ensure that content generated for global use (which I assume the package postinst is generating) is in a location accessible to everyone.

Adam
Sorry, that should have been:
emacs --eval '(message "%S" comp-native-load-path)'
(which you can then try with `-q` and friends if needed).
My crystal ball suggests maybe your ~/.emacs.d is marked as read-only?
Stefan
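The read-only guess above can be checked from the shell before starting Emacs at all. A minimal sketch follows; the function name `check_emacs_dir` is invented for illustration, and the default path assumes the conventional `~/.emacs.d` location mentioned above.

```shell
# Hypothetical helper: report whether an Emacs user directory exists and
# is writable by the invoking user. Note that when run as root, the -w
# test succeeds regardless of the permission bits.
check_emacs_dir() {
    dir="${1:-$HOME/.emacs.d}"
    if [ ! -e "$dir" ]; then
        echo "missing: $dir"
    elif [ -w "$dir" ]; then
        echo "writable: $dir"
    else
        echo "read-only: $dir"
    fi
}
```

A "read-only" result for `~/.emacs.d` or its `eln-cache` subdirectory would be consistent with the "Cannot find suitable directory" error reported at the top of this thread.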
Stefan Monnier <monnier@iro.umontreal.ca> writes: I don't know if this helps, but in addition to the recent issue we've identified where not having emacs-el installed can cause emacs from the current 28 packages to segfault on startup, people just figured out that if you try to install emacs via "su apt install emacs" (note the lack of a "-" argument to su) from a non-root account, emacs can end up failing for that account because the install ends up creating root-only directories (for I think the eln files, etc.) in ~. I'm not sure whether that's something that we'd expect to support (i.e. apt installs via sudo without -i or su without -), but I wanted to mention it since it's been reported (I think) more than once, and in case it indicates something that emacs might want to change (use of USER vs HOME, etc.). No personal position there right now either way. Thanks
You might want to report it, at least so Emacs maintainers are aware of
the problem. And also maybe the startup code can then report a more
specific error when it encounters a `~/.emacs.d/eln-cache` that belongs
to some other user.
Personally I think running something like Emacs as root with $HOME
pointing to some other user's home directory is a "pilot error", but
then again I'm one of those who didn't notice the infamous "su" change
because I never use `su` without `-` (because I simply don't
understand what is its intended semantics).
Stefan
Stefan Monnier <monnier@iro.umontreal.ca> writes:

To be clear, $HOME was pointing correctly at /root. Emacs ignored $HOME and instead created $USER/.emacs.d/eln-cache, which was surprising. (su without - still changes $HOME but not $USER.)

Or at least I think that's what happened, and the correct thing happened when I overrode $USER (and $LOGNAME) and then did the same thing again with otherwise the same environment variables.
Oh, yes, that's a behavior we have in Emacs which I also find horrible
(and to add insult to injury it takes extra work to implement it; it's
only done when running as root).
I encourage you to file this as a bug.
Stefan
I think what we want in Debian is no native compilation when installing Emacs. There is really no benefit for most people to caching the eln files in ~root, and there are possible bad effects as demonstrated by this bug. Attached is a proposed patch to emacsen-common. I'll reassign this bug there for now.
I don't think `(setq native-comp-speed -1)` does what we need here:
[...]
options of the compiler. The value −1 means disable
native-compilation: functions and files will be only byte-compiled;
however, the ‘*.eln’ files will still be produced, they will just
contain the compiled code in bytecode form. (This can be achieved
[...]
I'm not sure what's the best way to go about disabling automatic native
compilation, but I think `native-comp-deferred-compilation-deny-list`
might work.
Of course, another way to fix this is to actually fix the original
problem where the wrong directory (i.e. `~$USER/.emacs.d` instead of
`$HOME/.emacs.d`) is chosen. This original problem affects other
circumstances where Emacs will write to the `.emacs.d` directory so it
would be good to fix it more generally.
A "quick fix" is to start Emacs with an explicit
`--init-directory` argument.
It's too late to get this fixed for Emacs-28.2, but this really should
be reported as a bug.
Stefan
Stefan Monnier <monnier@iro.umontreal.ca> writes:

I think we probably want both, at least in Debian. Even if that is fixed, emacs would then write into some user's (possibly root's) home directory during package installation. That seems like a bug to me.