#1100003 hx: Syntax highlighting requires building grammar plugins locally

Package:
hx
Source:
hx
Description:
modal CLI text editor Helix
Submitter:
Bastian Venthur
Date:
2026-06-26 05:47:01 UTC
Severity:
normal
#1100003#5
Date:
2025-03-10 07:32:01 UTC
From:
To:
Dear Maintainer,

thank you for packaging Helix. While using it today, I noticed that syntax
highlighting does not work. I tested with Python and Markdown files, none of
them showed any highlighting. In both tests I noticed that the indentation was
off too, so I suspect it has something to do with Treesitter?


Cheers,

Bastian

#1100003#10
Date:
2025-03-10 08:20:48 UTC
From:
To:
Additional Info,

despite the lack of syntax highlighting, hx --health output for both languages
indicates everything is fine:


$ hx --health python
Configured language servers:
  ✘ ruff: 'ruff' not found in $PATH
  ✘ jedi-language-server: 'jedi-language-server' not found in $PATH
  ✓ pylsp: /usr/bin/pylsp
Configured debug adapter: None
Configured formatter: None
Tree-sitter parser: None
Highlight queries: ✓
Textobject queries: ✓
Indent queries: ✓


$ hx --health markdown
Configured language servers:
  ✘ marksman: 'marksman' not found in $PATH
  ✘ markdown-oxide: 'markdown-oxide' not found in $PATH
Configured debug adapter: None
Configured formatter: None
Tree-sitter parser: None
Highlight queries: ✓
Textobject queries: ✘
Indent queries: ✘
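
A minimal sketch of how this state could be detected in a script. It assumes the health output format shown in this report, and only inspects the "Tree-sitter parser:" line, because the query check marks can be green while no parser is built. The function name parser_missing is made up for illustration.

```shell
#!/bin/sh
# Read `hx --health <language>` output on stdin and succeed (exit 0) when
# it reports that no Tree-sitter parser is available.
# Assumption: the line reads exactly "Tree-sitter parser: None", as in the
# output quoted in this report.
parser_missing() {
    grep -q '^Tree-sitter parser: None$'
}

# Usage (requires hx to be installed):
#   hx --health python | parser_missing && echo "python grammar not built"
```

If the health output is coloured or worded differently in other Helix versions, the pattern would need adjusting.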

#1100003#15
Date:
2025-03-10 09:15:05 UTC
From:
To:
Quoting Bastian Venthur (2025-03-10 08:32:01)

Thanks for trying out Helix :-)

I guess you did not notice the README.Debian about grammar plugins not
included with the package?

Suggestions for how to include them are welcome. Installing them all will
slow down Helix, so perhaps a symlink mechanism like that used for
apache2 extensions, with a debconf frontend, would be sensible too.

 - Jonas

#1100003#20
Date:
2025-03-10 09:41:22 UTC
From:
To:
Hi Jonas,

I did indeed not read the README. Thanks for pointing that out, after running
all the steps, it works as expected.

The problem is, that people will probably expect things like syntax
highlighting and proper indent to work out of the box with a modern editor, as
it does with similar packages like neovim, etc. Also, I feel that I have to
update the grammar files every now and then, I'd rather have that happen
automatically via package updates.

I'm not sure if 100+ MB of package data is really that much of an issue, but
one could probably separate the grammar files into a separate package and make
hx recommend it, so people can install hx without them if they really want. But
before going through that hassle, I'd try to package them directly with hx
first and see if people complain at all. People that install hx have probably
different expectations about feature/package size ratio than vim-tiny users ;)


Thanks for packaging hx!

Bastian

#1100003#25
Date:
2025-03-10 10:59:08 UTC
From:
To:
Quoting Bastian Venthur (2025-03-10 10:41:22)

Good.

Now that that's out of the way, let's reuse this bugreport to track the
underlying issue of unusually needing to build grammar plugins locally.

# Sources for grammar plugins are not in Debian

Either hundreds of source packages need to be introduced to Debian, or
the src:hx package would need to carry hundreds of embedded projects.

I think the best approach is to package the sources for the more popular
grammars only, and that those interested in some grammar join the
tree-sitter team and file a bugreport against hx when sources are
available in Debian.

# Grammar plugins are either system-shared or personal

If I recall correctly (it is some time ago I looked into that), grammar
plugins can be stored below /usr - but if a single plugin exists below
~/.config/helix/runtime/grammars then the system-shared plugins are all
ignored.

Ideally we should convince upstream to improve this, but if not then I
am willing to carry a reasonably small patch for Debian. Help making
such a patch is welcome.

# Grammar plugins slow startup of Helix

My main concern with many plugins is not size but speed: As I recall,
I experienced a noticeable slowdown in startup of Helix when many plugins
are loaded.


 - Jonas

#1100003#32
Date:
2025-03-11 08:44:00 UTC
From:
To:
Hi Jonas,

thanks for your reply.

Thanks for the explanation, I understand that these tree-sitter plugins
should be packaged separately so they can be used by other packages such
as neovim. Does that mean that the few existing tree-sitter packages for
C, Lua, Markdown, etc. could already be utilized by hx?

Also, would you be open to make hx depend on a reasonable set of
treesitter plugins by default?

This sounds not right, maybe this has been fixed upstream already. If
not, this should probably be fixed.
all tree-sitter plugins and did not perceive any performance difference.


Thank you for your patience,

Bastian

#1100003#37
Date:
2025-05-14 03:59:09 UTC
From:
To:
Dear Maintainer(s), the control file should have `Recommends: g++ | clang, git`:
- `hx -g fetch` requires `git`
- `hx -g build` requires any C++14 compiler

See:
- https://github.com/helix-editor/helix/blob/f46222ced3ec093dd281beda8a35660749319616/book/src/building-from-source.md?plain=1#L19-L20
- https://github.com/helix-editor/helix/blob/f46222ced3ec093dd281beda8a35660749319616/helix-loader/src/grammar.rs#L86-L188
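
A rough sketch of a pre-flight check for these two tools, for anyone who wants to test before running the fetch and build steps. The helper names first_available and check_grammar_prereqs are hypothetical, and the list of compiler command names (g++, clang++, c++) is an assumption about what is typically installed, not something taken from the Helix sources.

```shell
#!/bin/sh
# Print the first command from the argument list that exists in $PATH;
# return non-zero if none of them does.
first_available() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            return 0
        fi
    done
    return 1
}

# Report which prerequisites for fetching and building grammars are missing.
check_grammar_prereqs() {
    status=0
    first_available git >/dev/null ||
        { echo "missing: git (needed by hx -g fetch)"; status=1; }
    # Assumed compiler command names; adjust to the local toolchain.
    first_available g++ clang++ c++ >/dev/null ||
        { echo "missing: C++ compiler (needed by hx -g build)"; status=1; }
    return $status
}
```

Calling check_grammar_prereqs prints one line per missing tool and returns non-zero if anything is missing.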

#1100003#42
Date:
2025-05-14 05:05:38 UTC
From:
To:
Hi Ricardo,

Quoting Ricardo Fernández Serrata (2025-05-14 05:59:09)

Thanks for reporting this.

The declaration of dependencies is, however, a distinct issue
separate from whether or not grammar plugins should be built locally:

Please file as a separate bugreport, to allow independent tracking.

Thanks,

 - Jonas

#1100003#47
Date:
2025-07-22 10:33:05 UTC
From:
To:
Hello,

I'm one of those who were surprised by syntax highlighting not working
in the default installation, even though hx --health indicates that it
should. (Apologies—I haven’t read Debian’s README either.)

I think that, at the very least, in the default installation, hx
--health shouldn't show check marks for syntax highlighting if it 
doesn't work. I think this can mislead users—especially those who skip
the README.

 > # Sources for grammar plugins are not in Debian
 >
 > Either hundreds of source packages need to be introduced to Debian, or
 > the src:hx package would need to carry hundreds of embedded projects.
 >
 > I think the best approach is to package the sources for the more popular
 > grammars only, and that those interested in some grammar join the
 > tree-sitter team and file a bugreport against hx when sources are
 > available in Debian.

I would also like to see the default installation support some popular
grammars out of the box. Having to install g++ on a server just to
enable syntax highlighting for, say, JavaScript and CSS, doesn't feel
right.  In contrast, the Helix packages on Fedora and Arch include many
pre-compiled grammars and are quite ready for use by most users.

 > # Grammar plugins slow startup of Helix
 >
 > My main concern with many plugins is not size but speed: As I recall,
 > I experienced a noticeable slowdown in startup of Helix when many plugins
 > are loaded.

I'm not sure if this is still the case. At least, on my Fedora desktop,
the distro's Helix package installs over 200 pre-compiled grammars and
it starts instantly for me.

Best Regards,
Ninjoe

#1100003#52
Date:
2025-07-22 11:17:55 UTC
From:
To:
Hi Ninjoe,

Quoting Ninjoe (2025-07-22 12:33:05)

Thanks for admitting up front that your opinions are provided without
reading the README file.  I appreciate your sharing your opinions
regardless.

How --health option behaves is a different issue than the one tracked in
this bugreport.

Debian packaging does not patch Rust code for the --health argument, so
if you want that health argument to behave differently, then please file
a bugreport upstream about that.

I don't know how other distros handle build-time dependencies, but guess
that they either have network access during build or that additional
dependencies have been packaged for those distributions which are not
packaged in Debian. I.e. exactly my point in the text you quoted, so I
fail to see any relevancy in your note above - please do elaborate if
you think I am missing something helpful there.

Could you perhaps share how beefy or not your machine is, and how big
files you tried to open and how fast they opened?  Then I will try to do
the same if/when I get around to loading the hundreds of plugins again.

 - Jonas

#1100003#57
Date:
2025-07-22 12:17:39 UTC
From:
To:
Quoting Ricardo Fernández Serrata (2025-05-14 05:59:09)

Thanks, applied to experimental package now.

Please in future consider filing a separate bugreport for related issues
to not risk it being missed as was about to happen here.

 - Jonas

#1100003#62
Date:
2025-07-22 15:04:18 UTC
From:
To:
Quoting Bastian Venthur (2025-03-11 09:44:00)

There are (as I understand it) no "tree-sitter plugins", but instead
Helix plugins (which are not reusable by other editors, but which
build-depend on libraries often related to the tree-sitter project which
might transitively be shared among multiple editors) and language
servers (which in principle are shared among all editors supporting the
Language Server Protocol, LSP).

Helix plugins with all build-dependencies already in Debian are easier
to package but still need some custom setup which has not been fleshed
out yet: When using upstream build routines, the core code for each of
those plugins is fetched from git repos, which is not permitted on
Debian build daemons.

I think it is more sensible to recommend plugins than depend on them.

Please feel free to test whether it is possible to use a combination of
system-shared and personally built plugins, and if not, initiate a
conversation upstream about introducing such support.

No, I have not found time to verify that yet.

Please consider sharing how beefy your system is, and measuring the
time it takes to load Helix both with an empty document and with a large
document, with and without many plugins enabled.


Kind regards,

 - Jonas

#1100003#67
Date:
2025-09-25 07:45:48 UTC
From:
To:
Package: hx
Version: 25.07.1+~0.3.0+20250717-1
Followup-For: Bug #1100003
X-Debbugs-Cc: venthur@debian.org

I've tried that and can confirm that hx **is** able to use system-shared and
personal grammar plugins.

I've copied some grammars in /usr/lib/hx/runtime/grammars, and some in my
~/.config/helix/runtime/grammars:

$ ls /usr/lib/hx/runtime/grammars/
markdoc.so  markdown_inline.so  markdown.so

$ ls ~/.config/helix/runtime/grammars/
python.so

Then I tried editing a markdown and a python file, both had syntax highlighting
enabled as expected.
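
For anyone repeating this test, a small sketch that lists where a grammar object is present. The two directories are the ones used in this message. The function name grammar_locations is hypothetical, and the script says nothing about which directory Helix prefers when both contain the same grammar.

```shell
#!/bin/sh
# Print the path of <name>.so in every given directory that contains it.
# Arguments: grammar name, then one or more directories to inspect.
grammar_locations() {
    name=$1
    shift
    for dir in "$@"; do
        if [ -f "$dir/$name.so" ]; then
            echo "$dir/$name.so"
        fi
    done
}

# Usage with the directories from this report:
#   grammar_locations python "$HOME/.config/helix/runtime/grammars" \
#       /usr/lib/hx/runtime/grammars
```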


Cheers,

Bastian

#1100003#72
Date:
2025-09-25 08:48:25 UTC
From:
To:
Quoting Bastian Venthur (2025-09-25 09:45:48)

That's great. Now I wonder if perhaps it worked all along, and what I
(mis)remember from past experimentation was not bare object files but
system-shared and user-private config.toml and/or languages.toml file.

In any case, good that it will work to ship system-shared grammar
objects. Now is just left the challenge of orchestrating having them
built without network access, including having all needed dependencies
packaged.

 - Jonas

#1100003#77
Date:
2026-04-19 10:29:01 UTC
From:
To:
I guess the repo has gone private. What repo is used and how do I change it to a different one?

» hx --grammar fetch
Fetching 277 grammars
Username for 'https://github.com': ^C

#1100003#82
Date:
2026-04-23 02:22:33 UTC
From:
To:
Hi Jonas and the Debian Rust Team,

I am using Helix in a strictly air-gapped environment and want to share my
perspective.

Currently, the hx package is nearly unusable offline because it lacks
Tree-sitter grammars. Since I cannot access the internet to run hx
--grammar fetch, I am left without syntax highlighting.

Relying on external network access at runtime contradicts the role of a
package manager in restricted environments. It would be a significant
improvement if Debian could provide the grammar sources (and/or
pre-compiled plugins) as part of the package or as local dependencies. This
would allow hx to be functional out-of-the-box, or at least buildable,
without requiring an internet connection.

Best regards,

Junyong Liang

#1100003#87
Date:
2026-04-23 04:49:49 UTC
From:
To:
Quoting Jan Christoph Uhde (2026-04-19 12:29:01)

Try with a .config/helix/languages.toml file like this:

use-grammars = { only = [ "awk", "bash" ] }

And then grow that list until it fails to compile.

Then you have found the repo that is gone and can instead use a config like this:

use-grammars = { except = [ "foo" ] }

I hope there is a way to instead debug the syntax compiler to reveal
that info more directly, but I haven't spotted it yet.
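
Growing the list by hand gets tedious, so here is a minimal sketch that prints the use-grammars line for a given set of grammar names. The helper name use_grammars_only is made up. It only prints the line and leaves it to the caller to place it at the top of languages.toml.

```shell
#!/bin/sh
# Print a `use-grammars` line restricting Helix to the named grammars.
# Arguments: one or more grammar names.
use_grammars_only() {
    list=""
    for g in "$@"; do
        # Append the quoted name, comma-separated after the first entry.
        list="${list:+$list, }\"$g\""
    done
    printf 'use-grammars = { only = [ %s ] }\n' "$list"
}

# Usage: add one more name per attempt until fetching fails, e.g.
#   use_grammars_only awk bash c
```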

This is of course unfortunate. As a workaround I have now released a
new development snapshot that hopefully has working repos again, but
this problem is bound to happen again.

Help packaging the 100s of syntax repos and the 100s of language
servers is much appreciated.

 - Jonas

#1100003#92
Date:
2026-04-23 04:53:13 UTC
From:
To:
Quoting Smart SangGe (2026-04-23 04:22:33)
handling of plugins in helix ideal.

What is needed now is not more arguments that the current situation is
frustrating, but help improving the situation.

Kind regards,

 - Jonas

#1100003#97
Date:
2026-04-25 06:24:34 UTC
From:
To:
Hi Jonas,

I’ve looked into the documentation and source code of both Helix and
Tree-sitter, and I must admit the situation is indeed challenging.

One observation: the existing tree-sitter-*-src packages in stable appear
to be locked to specific version tags rather than individual commits. This
might cause some friction since Helix often tracks very specific (and
newer) snapshots of grammar repositories.

I’m still interested in improving this. If you have a preferred
strategy—whether it’s grouping popular grammars into a single source
package or pushing for more individual grammar packages—please let me know.
I’d be happy to assist with the packaging work to help move this forward.

Best regards,

Junyong Liang

#1100003#102
Date:
2026-04-25 08:34:25 UTC
From:
To:
Quoting Smart SangGe (2026-04-25 08:24:34)

Thanks for your interest in solving this, SangGe. You are very welcome
to join me in the maintenance of helix.

As I see it, there are several challenges here. I have tried to describe
some of that already in earlier posts to this bugreport, so please read
also earlier posts :-)

I would prefer that helix plugins are packaged as separate source
packages independently from helix itself. Reason for that is that I
would prefer that the potential instability for supporting some fringe
language would not affect the stability of helix itself or of support
for other languages. That would probably require the src:hx package to
provide a binary package hx-dev containing needed source files for such
plugin source packages to build from.

It makes sense to me to bundle plugins together. If stability can be
identified reasonably reliably, then I think the ideal would be to bundle
source packages by stability and binary packages by runtime relations -
e.g. having src:hx-plugins-core providing hx-plugins-shell (for bash
and dotfiles and other common "core" formats) and hx-plugins-python
(for python3 and python-related formats) and hx-plugins-c (for C and
cpp related classic formats) etc., and src:hx-plugins-extra providing
various sets of less common format plugins.

I don't mind bundling Rust crates - please see the source code for how
I already do that for tree-house. And please do ask, if you cannot
understand how that embedding is established using git-buildpackage and
watch file and copyright file.

Kind regards,

 - Jonas

#1100003#107
Date:
2026-05-12 12:28:25 UTC
From:
To:
Reposting previously private discussion to the bug thread.

Hi Jonas,

I’m working on a hx-highlight-core package covering popular languages like
Rust, Python, and Shell. To ensure compliance with Debian’s offline build
requirements and Helix's specific version needs, I’m using the following
two-phase approach:

1. Source Preparation: A script parses languages.toml for exact commit
hashes, fetches the corresponding Tree-sitter sources, and aggregates them
into a single, reproducible .orig.tar.xz tarball.
2. Build Phase: Using the debian/rules, the build compiles these local
sources into .so files using a standard C compiler and installs them to a
global directory, requiring no network access.

This workflow maintains determinism while fulfilling the requirement for
specific grammar snapshots. Does this approach align with your expectations
and Debian packaging standards?
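
To illustrate the source preparation step, here is a rough sketch of extracting the pinned revisions. It assumes that each grammar is declared in languages.toml as a [[grammar]] table with a name line and a single-line source = { git = "...", rev = "..." } entry. The function name list_grammar_sources is hypothetical and this is not the script from my fork.

```shell
#!/bin/sh
# Print "name<TAB>git-url<TAB>rev" for every [[grammar]] entry in the
# given languages.toml file(s).
# Assumption: name and source each sit on one line, and source uses the
# inline-table form with git and rev keys. Entries without a git key
# (e.g. local path sources) are skipped.
list_grammar_sources() {
    awk '
        /^\[\[grammar\]\]/ { in_grammar = 1; name = ""; next }
        /^\[/              { in_grammar = 0 }
        in_grammar && /^name[ \t]*=/ {
            split($0, parts, "\"")
            name = parts[2]
        }
        in_grammar && /^source[ \t]*=/ && /git[ \t]*=/ {
            git = $0
            sub(/.*git[ \t]*=[ \t]*"/, "", git); sub(/".*/, "", git)
            rev = $0
            sub(/.*rev[ \t]*=[ \t]*"/, "", rev); sub(/".*/, "", rev)
            printf "%s\t%s\t%s\n", name, git, rev
        }
    ' "$@"
}
```

A real TOML parser would be more robust. The awk version is only meant to show that the exact commit per grammar can be read from the file without network access.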

Best regards,
Junyong Liang

On Tue, Apr 28, 2026 at 4:44 PM Smart SangGe <liangjunyong06@gmail.com> wrote:

#1100003#112
Date:
2026-05-12 12:29:32 UTC
From:
To:
Reposting previously private discussion to the bug thread.

Hi Jonas,

I have tested and verified the syntax highlighting for Rust and Python, and
I can confirm that it works well in my environment.

I propose that we define the scope for the core package first. Based on
common development needs, I suggest including: C, C++, Python, Rust, Shell,
Make, JSON, TOML, and Markdown. These languages cover the majority of my
current development workflow.

I look forward to hearing your opinion on this proposed list.

Best regards,

Junyong Liang

On Thu, Apr 30, 2026 at 1:54 PM Smart SangGe <liangjunyong06@gmail.com> wrote:

#1100003#117
Date:
2026-05-12 12:31:02 UTC
From:
To:
Reposting previously private discussion to the bug thread.

Hi Jonas,

I have prepared the plugin changes and pushed them to my fork.

Would you prefer to take a look there first, or should I go ahead and open
an MR for review?

Best,
Junyong Liang

On Tue, May 12, 2026 at 11:38 AM Smart SangGe <liangjunyong06@gmail.com> wrote:

#1100003#122
Date:
2026-05-12 12:48:01 UTC
From:
To:
[reposted to bugreport]

Quoting Smart SangGe (2026-04-30 07:54:40)
support for a single language would not be bothered too much with
packages pulled in that are irrelevant for their narrow use case.

I would expect the plugins themselves to not be large and not _depend_
on much else, but it makes sense for the plugin package to _recommend_
relevant LSP daemons, and I expect that to quickly bloat an install.

If e.g. the LSP for Rust pulls in hundreds of MB of recommended
packages, then it makes sense for me that we provide the Python plugin
_without_ lumping it together with the Rust plugin.

Since some of the "bloat" of Rust is Clang, it might however make sense
to lump Clang and Rust plugins together. But again, if Rust-specific
dependencies and recommendations are sizable compared to those for
Clang then it makes sense to split those as well.

Makes sense?

Perhaps start with a total separation: One source package for a set of
popular plugins (where the list you enumerated sounds like a sensible
starting point - we can easily adjust it later), which produces
multiple binary packages, one per language, which themselves are quite
small but recommend potentially heavy stuff. Then if we learn that e.g.
bash and markdown recommendations are both relatively lightweight, we
can consider lumping them together.

If that sounds like too much work for too little gain, then please do
object - after all, you are offering to do the work here, and I
certainly would not want to overload you and have you lose interest in
this :-)

What do you think?

 - Jonas

#1100003#127
Date:
2026-05-12 12:48:23 UTC
From:
To:
[reposted to bugreport]

Quoting Smart SangGe (2026-04-30 12:16:01)

Makes sense.

We can later add metapackages on top of these, if needed. You are right
that there is no need to complicate matters further here.

The core/extra separation makes sense to me, yes.

The versioning makes sense too, but there is one concern (which I don't
think there is an easy answer to and is a general limitation of how we
in Debian stuff multiple upstream packages together): When multiple
upstream packages are lumped together, then we are more likely to miss
XZ-style security flaws on some of them.

You are quite welcome to go ahead with your proposed plan.

Thanks,

 - Jonas

#1100003#132
Date:
2026-05-12 12:50:35 UTC
From:
To:
Quoting Smart SangGe (2026-05-12 14:31:02)

I don't use MRs. Tell me the URL for your git repo, then I am happy to
look at it.

 - Jonas

#1100003#137
Date:
2026-05-12 15:57:22 UTC
From:
To:
Hi,

Here is my repository:
https://salsa.debian.org/Junyong-Liang/hx

My main contributions are:

   - adding a script to prepare the source package
   - updating debian/copyright

However, I am not fully sure whether some parts violate Debian Policy or
common packaging practices. Any feedback would be appreciated.

Thanks.

#1100003#142
Date:
2026-05-12 18:17:12 UTC
From:
To:
Quoting Junyong Liang (2026-05-12 17:57:22)
and not mention the good stuff, so let me emphasize that no matter what
I write below, I am super happy that you are looking into this, and I
appreciate what you have done already. Even if you stopped now and just
left your draft code behind, it would be a great help.

Here are some quick notes, in random order, as I skimmed through your
changes:

Please make atomic git commits. Specifically, add code separately from
updating debian/copyright about the added code, and please don't update
debian/copyright_hints at all - that is most sensibly done just before a
release together with the final commit changing debian/changelog.

If you suppress lintian warnings, then add a comment explaining your
reasoning for doing so.

You have added a copyright notice for an Erlang grammar to hx package.
That looks wrong - and again it would help to understand the intent if
git commits represent semantically atomic changes: Each commit does
some meaningful change to the package (rather than each commit
representing a history of all-at-once moves). I.e. use rebase to
reshape git commits to each make sense.

The script update-highlight-core seems to contain duplicated data -
e.g. CORE_GRAMMARS in debian/rules (and in debian/core-langs.txt too?),
and the get_version function already implemented in dpkg by doing
"include /usr/share/dpkg/pkg-info.mk" in debian/rules.

The debian/*.install files seem auto-generated, and if so would likely
be more sensibly handled in debian/rules by calling dh_install with
appropriate options.

The script update-highlight-core does a *lot* of git clone operations.
That is more efficiently done using myrepos.

The cloned code is placed in Xhighlight-* dirs. I think it is more
sensible to instead place it somewhere below debian/ - e.g. in
debian/vendor/ - which is also much simpler to handle - no custom
tarballs to generate.

A directory ../.grammar-cache is created *outside* the package root.
That's very naughty, and should *not* be done: It should be expected
that package build routines do not interfere with the outside system!

The script update-highlight-core seems over-engineered. I wonder if
better implemented as one or a few shell scripts, executing those and
other commands directly from debian/rules: Since debian/rules is a make
file, it is by design good at executing commands, checking if they
fail and printing how they were called - which it seems the python script
spends a lot of tedious lines implementing.

Hope these comments are helpful. Don't see them as things that are
"wrong" and must be changed, but more things that I would be
uncomfortable leaving as-is if I were to maintain it - partly because I
am fluent in shell and make but not so fluent in python, where you seem
more fluent in python.

 - Jonas

#1100003#147
Date:
2026-05-13 06:10:17 UTC
From:
To:
Thanks a lot for the detailed review and suggestions — they are very
helpful to me.

I also learned quite a few packaging and maintenance practices from your
comments, including tools like myrepos and some of the usual Debian
packaging workflows that I was not yet familiar with.

I will go through the maintenance/update scripts again and
simplify/reorganize them according to your suggestions. I am also fine with
maintaining this in shell/make instead of Python; I do not want my own
tooling preferences to make future maintenance harder for you or other
maintainers.

I will also rework and rebase the git history so the commits become more
semantically atomic and easier to review.

Thanks again for taking the time to explain all this in detail.

#1100003#152
Date:
2026-05-13 06:58:12 UTC
From:
To:
Quoting Junyong Liang (2026-05-13 08:10:17)

Let me just emphasize again: Your work is valuable too! E.g. your
choice of using python is not inherently a bad choice, I just happen
to have grown up at a time where perl was more popular, so I am more
familiar with that. So please do feel free to have opinions, also when
they contradict with mine. What I meant to say about concerns over
python is that if I were to maintain the code *myself* then I would
rewrite it - but I do hope that I am no longer alone: That you would
want to get on board and maintain the hx package with me.

Feel free to ask questions about packaging. There are many ways to do
it, and many ways to get confused. I am involved in a bunch of
packages, and I might be able to point to other packages implementing
tricks or twists that are relevant here. But again, I want to *suggest*
ways to do things, not dictate them - I have opinions, and I am
interested in learning about your opinions too :-)

 - Jonas

#1100003#157
Date:
2026-05-17 16:05:47 UTC
From:
To:
Hi Jonas,

I’ve made another round of updates based on your suggestions for the
maintenance tooling migration. In particular, I’ve moved the setup to
use myrepos, replaced the update script with a shell-based version, and
split the changes into clearer, easier-to-review commits.

The updated version builds successfully on my machine. Could you please
take another look at my fork when you have a chance?

Thanks again for the guidance.

Best,
Junyong Liang

#1100003#162
Date:
2026-05-18 07:46:37 UTC
From:
To:
Quoting Junyong Liang (2026-05-17 18:05:47)

This one feels much easier to read for me. Thanks for doing that
restructuring!

You still lump multiple independent changes together in each git
commit - as a concrete example, I noticed that in the commit updating
debian/copyright you also corrected some structural bugs in the
existing content. I have now cherry-picked those changes and applied
them to the main branch - crediting you :-)

It seems update-highlight-core use tar to remove files. I would use
`find "$src" ... -delete` for that, but maybe I am missing some subtle
tricks there - if so, I recommend adding a comment hinting at the
reason for the choice of tooling there. Simplifying there would also
avoid piping into `-exec sh -c` - I am aware that some patterns are
safe, but even then I worry about accidentally making a clumsy edit
later that turns it unsafe.
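
A minimal sketch of what such a find-based cleanup could look like. Which files to keep is purely an assumption here (everything under src/ plus files named LICENSE*), and the function name prune_grammar_tree is made up. It relies on the -delete, -empty and -mindepth extensions of GNU find.

```shell
#!/bin/sh
# Reduce a grammar checkout to the files assumed to be needed for building.
# Argument: path to the checkout directory.
prune_grammar_tree() {
    src=$1
    # Delete every regular file that is neither below src/ nor a LICENSE file.
    find "$src" -type f ! -path "$src/src/*" ! -name 'LICENSE*' -delete
    # Remove directories left empty by the step above (-delete implies
    # depth-first traversal, so nested empty directories go too).
    find "$src" -mindepth 1 -type d -empty -delete
}
```

No shell is spawned per file, so there is no quoting of file names to get wrong later.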

If I understand correctly, you clone and then drop the .git database.
Assuming that's correctly understood, then a shallow clone should be
adequate - i.e. add `--depth=1` to the clone command.

The .mrconfig file can be more compact by adding a default function:

```
[DEFAULT]
lib=clone () { git clone --quiet --filter=blob:none --depth=1 "$1" "$2"; }

[.work/rust]
checkout = clone https://github.com/tree-sitter/tree-sitter-rust rust
```

You use TAB as field separator. I am a fan of TAB, so I am not gonna
try to dissuade you from doing that, but since many editors have a hard
time even visualizing TAB as a raw character, I think it is best to
limit that: in update-highlight-core you could instead use `IFS=$'\t'`.

The cloned code is more than 1GB, where the code of helix itself is
less than 200MB. I would prefer that this was maintained as a separate
source package, build-depending on hx or if needed on a new hx-devel
providing the subset of sources needed for these new routines to work.

What do you think?

 - Jonas

#1100003#167
Date:
2026-05-18 11:56:44 UTC
From:
To:
Hi Jonas,

Thanks again for the detailed review.

I agree with the earlier points you raised, and I will rework those
parts accordingly.

Regarding the TAB separator: I think `IFS=$'\t'` is bash-specific
syntax. The current shebang is `/bin/sh`, which resolves to `dash` on
Debian systems, so I do not think that form is portable there.

Would something like this make more sense instead?

    tab=$(printf '\t')
    extract_grammars | while IFS="$tab" read -r set name repo rev; do
        ...
    done
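
A self-contained version of that idea, runnable under dash. The extract_grammars function here is only a stand-in that prints one sample line, and the field order (set, name, repo, rev) is taken from the loop above. The URL and revision are placeholder values.

```shell
#!/bin/sh
# Stand-in for the real extract_grammars: emits one TAB-separated record.
# The values are placeholders for illustration only.
extract_grammars() {
    printf 'core\trust\thttps://example.org/tree-sitter-rust\tabc123\n'
}

# A literal TAB obtained portably; command substitution only strips
# trailing newlines, so the TAB survives.
tab=$(printf '\t')

# Split each record on TAB only, so fields may contain spaces.
list_grammars() {
    extract_grammars | while IFS="$tab" read -r set name repo rev; do
        echo "$set/$name at $rev from $repo"
    done
}
```

Setting IFS only for the read command keeps the change local to that one call.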

About the repository size: you are right, and honestly I did not expect
it to grow that large either. After checking more closely, most of the
size comes from generated |parser.c| files.

I did previously experiment with splitting this into an additional
source package, but that introduces a synchronization issue: once hx
itself updates, the highlight data ideally needs to update in lockstep
as well, yet in practice there would inevitably be a delay because the
secondary package depends on the hx update landing first.

Keeping them in the same source package allows synchronized uploads,
which feels more correct to me.

Additionally, upstream Helix developers explicitly describe these
grammars as strongly coupled to the exact Helix revision. In particular,
pascalkuthe wrote here:
https://github.com/helix-editor/helix/discussions/12433

    No we require that the tree sitter grammar matches the exact commit
    so it doesn't make sense to use anything but the exact commit
    specified in the config that the queries were created for (which are
    specific to helix).

    Tree sitter grammars are not stable and not reusable across
    different editors/programs.

Given that, I still tend to view this as a tightly coupled component
rather than a reusable shared asset.

Also, splitting the package would not really solve the total source size
issue itself. Even if users prefer building from source, I think it
would still make sense to split out the extra assets separately if needed.

What do you think about this approach?

Best,
Junyong Liang

#1100003#172
Date:
2026-05-18 12:39:51 UTC
From:
To:
Quoting Junyong Liang (2026-05-18 13:56:44)

Indeed - good catch!

(in fact, I ran shellcheck on that script, then changed the code and
tested that it worked, but forgot to run shellcheck on it again, which
would have revealed my mistake)

Yes, looks good.

Yes, agreed they are tightly coupled. We can ensure that is upheld by
having hx provide a virtual package hx-plugin-abi-$HASH and have the
plugin packages depend on that abi. Examples of doing that, including a
debhelper script to keep the logic of computing the abi at the hx
package, is in src:uwsgi - another example is in src:swi-prolog but I
know less about how that one is implemented (have only been involved in
using it).
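
As a very rough illustration of the idea (not how src:uwsgi computes its abi, which I am not describing here), the virtual package name could be derived from whatever data pins the grammar revisions. The function name plugin_abi_name, the use of sha256sum and the truncation to 12 characters are all assumptions.

```shell
#!/bin/sh
# Print a virtual package name derived from the data on stdin, e.g. the
# languages.toml shipped with hx. Same input gives the same name; any
# change to a pinned revision gives a different one.
plugin_abi_name() {
    # sha256sum prints "<hex digest>  -"; keep the first 12 hex characters.
    hash=$(sha256sum | cut -c1-12)
    printf 'hx-plugin-abi-%s\n' "$hash"
}

# Usage (the path to languages.toml is up to the caller):
#   plugin_abi_name < languages.toml
```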

I see the above not as a separate point but the same point of strong
coupling.

The reason I prefer separation is not to loosen up the coupling, but to
insulate the core package from problems with the plugin package which
in the worst case could cause it to be kicked out of testing during
freeze. I would prefer having hx without plugins stabilize over having
hx kicked due to problems in packaging plugins.
upstream git commits and sometimes cherry-pick some changes. If at each
new upstream import there is a potentially big dump of undocumented
changes, that disrupts the ability to downstream-curate code that is
upstream-curated.

 - Jonas

#1100003#177
Date:
2026-05-24 14:45:28 UTC
From:
To:
Hi Jonas,


I have rebased the series on top of your upstream/debian/latest, so the
structural debian/copyright fixes you already cherry-picked are no
longer part of my copyright update commit.

I also reworked update-highlight-core along the lines you suggested. The
generated .mrconfig now uses a DEFAULT clone helper with --depth=1, the
read loops no longer contain a literal TAB character, and the tar
pipeline plus find -exec sh -c cleanup has been replaced with find ...
-delete.
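
For illustration, here is a minimal POSIX sh sketch of a read loop that
splits on TAB without a literal TAB character in the script. The
extract_grammars stub, its sample records, and list_grammars are
hypothetical stand-ins, not the actual update-highlight-core code:

```sh
#!/bin/sh
# Sketch only: extract_grammars is a hypothetical stub that emits
# TAB-separated records (set, name, repo, rev), one per line.
set -eu

extract_grammars() {
    printf 'core\tpython\thttps://example.org/tree-sitter-python\tabc123\n'
    printf 'extra\tmarkdown\thttps://example.org/tree-sitter-markdown\tdef456\n'
}

# POSIX sh has no $'\t', so obtain the TAB from printf instead of
# embedding a literal TAB character in the script.
tab=$(printf '\t')

list_grammars() {
    extract_grammars | while IFS="$tab" read -r set name repo rev; do
        printf '%s/%s %s@%s\n' "$set" "$name" "$repo" "$rev"
    done
}

list_grammars
```

Because TAB counts as IFS whitespace, consecutive TABs are treated as a
single separator, so this form assumes that no field is empty.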

I have left the source-package split out of this revision for now, since
that seems like a larger packaging decision and I would prefer not to
preempt your preference there without confirmation.

Best,
Junyong Liang

#1100003#182
Date:
2026-06-04 11:00:07 UTC
From:
To:
Hi Jonas,

Sorry, I only just noticed your reply because it was sent via the BTS
and did not reach my mail client as expected.

I think I understand your point now: the motivation for splitting is not
reducing coupling, but isolating maintenance and release risks from the
core hx package.

Let me take a closer look at the uwsgi-style ABI approach you mentioned
and see how a split package would look in practice.

Best regards,
Junyong



On Mon, 18 May 2026 14:39:51 +0200 Jonas Smedegaard <jonas@jones.dk> wrote:
 > Quoting Junyong Liang (2026-05-18 13:56:44)
 > > Regarding the TAB separator: I think |IFS=$'\t'| is bash-specific
 > > syntax. The current shebang is |/bin/sh|, which resolves to |dash| on
 > > Debian systems, so I do not think that form is portable there.
 >
 > Indeed - good catch!
 >
 > (in fact, I ran shellcheck on that script, then changed the code and
 > tested that it worked, but forgot to run shellcheck on it again, which
 > would have revealed my mistake)
 >
 > > Would something like this make more sense instead?
 > >
 > > |tab=$(printf '\t')
 > > extract_grammars | while IFS="$tab" read -r set name repo rev; do ... done|
 >
 > Yes, looks good.
 >
 > > I did previously experiment with splitting this into an additional
 > > source package, but that introduces a synchronization issue: once hx
 > > itself updates, the highlight data ideally needs to update in lockstep
 > > as well, yet in practice there would inevitably be a delay because the
 > > secondary package depends on the hx update landing first.
 > >
 > > Keeping them in the same source package allows synchronized uploads,
 > > which feels more correct to me.
 >
 > Yes, agreed they are tightly coupled. We can ensure that is upheld by
 > having hx provide a virtual package hx-plugin-abi-$HASH and have the
 > plugin packages depend on that abi. Examples of doing that, including a
 > debhelper script to keep the logic of computing the abi at the hx
 > package, is in src:uwsgi - another example is in src:swi-prolog but I
 > know less about how that one is implemented (have only been involved in
 > using it).
 >
 > > Additionally, upstream Helix developers explicitly describe these
 > > grammars as strongly coupled to the exact Helix revision. In
 > > particular, pascalkuthe wrote here:
 > > https://github.com/helix-editor/helix/discussions/12433
 > >
 > > No we require that the tree sitter grammar matches the exact commit
 > > so it doesn't make sense to use anything but the exact commit
 > > specified in the config that the queries were created for (which are
 > > specific to helix).
 > >
 > > Tree sitter grammars are not stable and not reusable across
 > > different editors/programs.
 >
 > I see the above not as a separate point but the same point of strong
 > coupling.
 >
 > The reason I prefer separation is not to loosen up the coupling, but to
 > insulate the core package from problems with the plugin package which
 > in the worst case could cause it to be kicked out of testing during
 > freeze. I would prefer having hx without plugins stabilize over having
 > hx kicked due to problems in packaging plugins.
 >
 > > Given that, I still tend to view this as a tightly coupled component
 > > rather than a reusable shared asset.
 > >

#1100003#187
Date:
2026-06-04 11:05:19 UTC
From:
To:
Quoting Junyong Liang (2026-06-04 13:00:07)

Oh, I am very sorry about that.

Thanks for considering!

 - Jonas

#1100003#192
Date:
2026-06-05 08:19:44 UTC
From:
To:
[ replying via bugreport ]

Quoting Junyong Liang (2026-06-05 08:43:09)

This is definitely an issue independent of the work to package plugins.

Please file a bugreport against hx about it, and if you have a solution
for it then please provide that as a patch.

Then, when fixed, this plugin packaging obviously needs to be rebased.

Thanks a lot,

 - Jonas

#1100003#197
Date:
2026-06-09 11:16:58 UTC
From:
To:
Hi Jonas,

I have tested a split prototype locally.

The |hx| source package now builds |hx-plugin-dev|, which contains
|languages.toml|, |runtime/queries|, and the |hx-plugin-abi| helper.
Using that package as an additional local build dependency, a separate
|hx-grammars| source package can successfully build both
|hx-highlight-core| and |hx-highlight-extra| in sbuild.

The package builds themselves are succeeding. The remaining issues are
primarily Lintian warnings in the prototype |hx-grammars| source
package, mostly because |debian/copyright| still needs to be reduced and
normalized to reflect the split source tree.

I have not created a Salsa repository yet, as I do not want to preempt
your preferred maintenance layout. Would you prefer that I create a
proposed |hx-grammars| repository on Salsa for review, or would you
rather I keep the prototype as a branch or patch series for the time being?

Best regards,
Junyong Liang

#1100003#202
Date:
2026-06-19 13:29:23 UTC
From:
To:
Hi Jonas,

I have rebased my work on top of your latest fix and completed the
source package split.

Now, |hx| source provides a new |hx-plugin-dev| package, and
|hx-grammars| source depends on this package to perform source updates.
After the source update step, the build produces two binary packages:
|hx-highlight-core| and |hx-highlight-extra|.

I have tested the packaging on my machine and the build completes
successfully.

I have also pushed the changes to my Salsa repositories:

  * https://salsa.debian.org/Junyong-Liang/hx
  * https://salsa.debian.org/Junyong-Liang/hx-grammars

Best regards,
Junyong

#1100003#207
Date:
2026-06-20 06:51:40 UTC
From:
To:
Quoting Junyong Liang (2026-06-19 15:29:23)

I have only looked at the changes to the src:hx this time around - i.e.
not looked at src:hx-grammars at all (not because I don't care, but
because I am finishing a bachelor study - last oral exam on June 29th,
and I try to limit how much this exciting distraction steals my attention)

It looks quite good now.

A few remarks, though:

Please rebase to only commit stuff that is actually used at the end.
E.g. currently you introduce a gigantic amount of code and then drop it
again - while that is beneficial to document your thought process, it
is confusing and a waste of disk space for those who need not track
*how* you ended up at the final set of code changes.

The ABI helper script computes a hash every time it is run. Maybe
that's fine, but perhaps it makes better sense to compute it once
during build, since the files involved are statically available at
build time.
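
One way this could look, as a sketch under assumptions rather than the
actual helper: the build step writes a small script with the already
computed hash baked in, so nothing is rehashed when the helper runs. The
bake_helper function, the sample hash, and the helper's output format
are hypothetical:

```sh
#!/bin/sh
# Sketch only: bake_helper and the helper's output format are hypothetical.
set -eu

# Build-time step: generate a helper with the precomputed hash baked in.
# $1: ABI hash computed once during the build, $2: helper path to write
bake_helper() {
    cat > "$2" <<EOF
#!/bin/sh
# Generated at build time; prints the ABI name without rehashing anything.
echo "hx-plugin-abi-$1"
EOF
    chmod +x "$2"
}

outdir=$(mktemp -d)
trap 'rm -rf "$outdir"' EXIT
bake_helper 0123456789ab "$outdir/hx-plugin-abi"

# Run time: the generated helper only prints the stored value.
sh "$outdir/hx-plugin-abi"
```

The trade-off is that the helper reflects the files as they were at
build time, which matches the point that those files are statically
available then.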

You drop lintian overrides, but it seems they are still needed. I have
not actually tested if lintian spews warnings if removed, I just notice
that the mentioned licenses are still listed in the debian/copyright
file.

Kind regards,

 - Jonas

#1100003#212
Date:
2026-06-26 05:44:07 UTC
From:
To:
Hi Jonas,

Thank you very much for your review and suggestions.

I have addressed all of the points you mentioned:

  * rebased the branch to keep only the final relevant commits,
  * moved the ABI hash generation to build time,
  * restored the necessary lintian overrides.

Please take another look when you have time.

Good luck with your final oral exam, and I look forward to hearing your
feedback once you're finished.

Best regards,

Junyong Liang