#1129923 chardet project replaced with LLM output in 7.0.0

#1129923#5
Date:
2026-03-06 03:15:19 UTC
From:
To:
It appears that the original LGPL upstream chardet repository [1] has been wholly replaced by an LLM-generated derivative work labelled with a conflicting, and possibly illegal, MIT license.

Unless the original source can be located and Debian's upstream changed to a legally sound source, this package may need to be removed from the Debian archive for licensing / legal reasons.  While I do not claim to fully understand copyright law in this area, I will mention that the LLM used has, in all likelihood, incorporated parts of my own publicly available GPL and LGPL source code into the codebase in question, and that this in and of itself may be a copyright violation, as I did not license my code under the MIT license.  As many other authors may have an equal copyright claim to the LLM training data set, and, therefore, to its output in this instance, this represents an unwarranted legal risk for Debian and, more importantly, for Debian's user base.

Quite apart from the above concerns, the current code quality is poor, and the current version of chardet fails its own test suites.  It is paramount that no upgrade be made to the Debian version until these issues are settled, potentially in relevant court proceedings regarding the associated copyright law.

[1] https://github.com/chardet/chardet
[2] https://github.com/chardet/chardet/issues/331

#1129923#10
Date:
2026-03-06 03:26:55 UTC
From:
To:
Thinking a bit more about this from a Debian perspective, the main problem is that the upstream source host wiped out all history of the project including the source code of all earlier, non-LLM versions.

I would imagine the 5.x packages can easily remain in the Debian archives, but with the effective loss of upstream code at the referenced URL, the situation is probably closest to an old source tree vanishing from a personal hosting site.  Debian itself will probably have to maintain the 5.x version going forward in some fashion, if the package is important to other archive users.

#1129923#15
Date:
2026-03-06 12:58:27 UTC
From:
To:
chardet 7.0.0 is not in Debian. So adding a possible future upstream
version as the package version where the bug was found will not work
in the BTS.

Well, it's worth discussing, but as a bug there's nothing to do to fix
it.

#1129923#24
Date:
2026-03-06 14:30:13 UTC
From:
To:
Agreed on the procedural aspects.  I wasn't sure where to hold this discussion, and didn't want an inadvertent pull / upgrade of the now-tainted project causing a potential legal problem in the future.  If there is a better place for the discussion, I can move it there instead.

#1129923#29
Date:
2026-03-08 16:54:13 UTC
From:
To:
> causing a potential legal problem in the future.

A version check can be added in debian/rules to avoid inadvertent upgrades,
like:

dpkg --compare-versions $(DEB_VERSION_UPSTREAM) lt 7.0.0
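
For illustration only, here is a minimal sketch of how such a check might sit in debian/rules. It assumes the rules file includes /usr/share/dpkg/pkg-info.mk (which is what defines DEB_VERSION_UPSTREAM) and uses a plain dh sequencer; the override target and the error message are hypothetical, and this is not necessarily what any actual patch for this bug does.

```make
#!/usr/bin/make -f
# Sketch only, not the packaged debian/rules.
# pkg-info.mk (from dpkg-dev) sets DEB_VERSION_UPSTREAM from debian/changelog.
include /usr/share/dpkg/pkg-info.mk

%:
	dh $@

# Hypothetical guard: abort early if the changelog version is 7.0.0 or later.
# dpkg --compare-versions exits 0 when the relation holds, non-zero otherwise.
override_dh_auto_configure:
	dpkg --compare-versions $(DEB_VERSION_UPSTREAM) lt 7.0.0 || \
		{ echo "refusing to build chardet >= 7.0.0, see #1129923" >&2; exit 1; }
	dh_auto_configure
```

With a guard of this kind, importing a 7.x upstream tarball and bumping debian/changelog would make the build fail at the configure step instead of silently producing a package.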

#1129923#34
Date:
2026-03-13 04:16:53 UTC
From:
To:
Control: tags -1 patch

I made a salsa MR for it.

https://salsa.debian.org/python-team/packages/chardet/-/merge_requests/2