#885698 Update and document criteria for inclusion in /usr/share/common-licenses

#885698#5
Date:
2017-12-29 09:49:44 UTC
From:
To:
Package: debian-policy
Severity: important
X-debbugs-cc: apo@debian.org
Control: block 795402 by -1
Control: block 883966 by -1
Control: block 884223 by -1
Control: block 884226 by -1
Control: block 884227 by -1
Control: block 884228 by -1
User: debian-policy@packages.debian.org
Usertags: normative discussion

Hello debian-policy@l.d.o,

Our current criteria for including licenses, as Markus Koschany smartly
puts it in #884228, are "[a]pparently something between gut feeling and
the popularity of our least preferred license in common-licenses."  We
can and should do better than this.

In the air is also the idea that we include licenses in common-licenses
to save disk space on low disk space systems: the license should be
popular enough such that the reduced size of d/copyright files will
outweigh the increased size of base-files.

We should write down our criteria in Policy, in section 12.5 (or
possibly in the Policy Changes Process appendix).  We should probably
say too that the application of the criteria is at the discretion of the
Policy Editors.  Before we can do that, however, we need to consider
whether the criteria need to be updated.

The only point of clear consensus -- at least among the Policy Editors
-- is that short licenses which have more than one popular variant
should never be included because of the risk that packages licensed
under one variant incorrectly refer to a different variant in
common-licenses.  This problem actually exists in the archive because a
BSD variant was included in common-licenses at some point.  We should
include this point in the Policy Manual.

Otherwise, here are some of the arguments on the table:

(1) In a related d-devel thread, someone working with embedded systems
suggested that these days, either a system has enough disk space that
common-licenses isn't relevant, or it has so little disk space that all
of /usr/share/doc must be deleted.  If this is right, disk space
concerns should not decide what goes into common-licenses.  Is it right?

(2) Some people want more licenses in common-licenses because they find
it more convenient.  Convenient processes save our volunteers' time.  We
frequently get requests to expand common-licenses and I suspect that
many of them are motivated by the belief that it would make the
requestor's work more convenient.  If disk space issues aren't relevant
anymore, an increase in convenience might become a dominating criterion
for inclusion.  However, this point has been disputed: better tools
could provide license text formatted suitably for d/copyright, which
would be just as convenient (e.g., in Emacs: `C-u M-!
get-formatted-license GPL-3` would be about as convenient as it gets).
And there surely exist those who find common-licenses makes editing
d/copyright less convenient...
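
As a rough sketch of what a helper like the hypothetical
`get-formatted-license` would have to do, the Python below renders a
license text as a stand-alone License stanza.  The function name and the
sample text are illustrative assumptions; the only rule it relies on is
the copyright-format convention that continuation lines start with a
space and blank lines are written as ` .`.

```python
def format_for_copyright(short_name, license_text):
    """Render a license as a stand-alone License stanza for debian/copyright.

    Hypothetical helper: the name and interface are assumptions made for
    this sketch, not an existing tool.
    """
    lines = [f"License: {short_name}"]
    for raw in license_text.strip().splitlines():
        stripped = raw.rstrip()
        # A truly empty line would end the stanza, so blank lines become " .".
        lines.append(" " + stripped if stripped else " .")
    return "\n".join(lines) + "\n"


# Placeholder text, not a real license.
print(format_for_copyright("Example", "First paragraph.\n\nSecond paragraph.\n"))
```

A tool along these lines would still need a trustworthy source for the
license text itself, which is the part this sketch leaves open.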

I'm not sure how to proceed.  It would be nice to verify (1) with other
people working with embedded systems.  Possibly we should ask on one of
our more specialised mailing lists.  And there are surely other
arguments besides (1) and (2).  We should gather those in this bug.

#884228 has further points of discussion, but I'd ask that we restrict
ourselves in this bug to discussing what the criteria for inclusion
should be.  In particular, let's not discuss the proposal to add all
known DFSG-free licenses to common-licenses.  Whether that proposal is
valid depends on our criteria for inclusion, so let's stick to hashing
out those criteria in this bug.

#885698#22
Date:
2018-10-18 03:03:01 UTC
From:
To:
Blocking my own bug report with this one, which I just noticed.

I submitted bug #910548 previously against the base-files package:
"base-files - please consider adding
/usr/share/common-licenses/Unicode-Data".

I had formatted the copyright and license information for Unicode data
files from the http://unicode.org website to put in the
debian/copyright file in a package that I created this summer.  The
copyright information is more involved than most copyright statements,
so I kept it in what I submitted with the bug report.

I thought if that license was something Debian found useful, there
would be no need for anyone else to duplicate the effort of formatting
that I had gone through once, and so I offered it.  Just the license
in isolation could be formatted like other licenses fairly quickly if
the copyright section is not wanted.  Or the whole thing can be left
out and that bug closed, as you wish.

Thanks,


Paul Hardy

#885698#49
Date:
2023-09-10 03:35:27 UTC
From:
To:
Hello everyone,

I come seeking your opinions.  Please cc 885698@bugs.debian.org on replies
so that we can accumulate this discussion in a Debian Policy bug.

One of the responsibilities of the Policy Editors is to determine which
licenses should be included in /usr/share/common-licenses, and thus do not
have to be reproduced in the copyright file of every package that uses
them.  We have never had clear criteria for this.  We need them, so that
we can advertise a clear and transparent policy for inclusion without
having the conversation from first principles for each new license.

I was the one who made the last few decisions, and I based the decision
largely on the number of binary packages in Debian using the license.
When I was doing this, I set a fairly high threshold (more packages than
the least popular license currently in /usr/share/common-licenses, which
historically has been GFDL-1.3 although it now appears to be MPL-1.1).  No
one was entirely satisfied with that criterion, including me.

I have the following questions:

1. What criteria (besides the obvious one of being a DFSG-free license)
   should we apply when deciding what licenses to include?  Number of
   packages?  Length?  How positive we feel towards the license?  Some
   combination of these things?  Please be specific.

2. If we use number of packages as a criterion, what should the threshold
   be?  I have appended to the bottom of this message the current output
   of my ad-hoc license-count tool run against the current archive so that
   you have a feeling for how many packages use various licenses.

3. If we use number of packages, should that be source packages or binary
   packages?  Source packages represent maintainer effort; binary packages
   represent disk clutter.

4. Should there be a length cutoff for licenses, such that we do not
   include in /usr/share/common-licenses any license shorter than some
   number of lines or bytes?  The justification would be that telling
   people to go look elsewhere for the license has some inherent overhead
   and annoyance when they discover that the license is all of ten lines
   and could have just been included in the copyright file.

5. Should we exclude licenses that contain text that all or most users of
   the license customize when they use it?  For example, the existing
   /usr/share/common-licenses/BSD contains the clause:

      3. Neither the name of the University nor the names of its
         contributors may be used to endorse or promote products derived
         from this software without specific prior written permission.

   which users of this specific license usually change to instead include
   the name of their organization, or their name, or something else.  Full
   disclosure: it will be very hard to convince me that licenses used this
   way should be included in common-licenses, since I believe it is
   technically incorrect to omit a license and point to the
   common-licenses version when the provisions of the common-licenses
   version are different in detail due to naming different people or
   requiring or prohibiting mentioning of different names as endorsements.

Here are various concerns that people have had in this area in the past.
I'm neither indicating agreement nor disagreement with any of these
points, only listing them to provoke thought about some of the things
people have raised before.

* Including long legal texts in debian/copyright, particularly if one
  wants to format them for copyright-format, is tedious and annoying and
  doesn't benefit our users in any significant way, and therefore we
  should include as many licenses as possible in common-licenses to spare
  people that work.

* common-licenses consumes disk space on every installed Debian system of
  any size, and therefore should be kept small to avoid wasting system
  resources.

* Every approved DFSG license should be included in common-licenses so
  that it serves as a repository of licenses the project has approved.

* Including a license in common-licenses implies that the project approves
  of that license, and therefore licenses such as the LaTeX Project Public
  License 1.0, which requires renaming derived works, should not be
  included even though DFSG #4 grudgingly allows for this type of license
  term.

* All licenses explicitly mentioned in the Debian Free Software Guidelines
  should be present in common-licenses (as justification for including the
  BSD license even though the current text is specific to the Regents of
  the University of California).

In order to structure the discussion and prod people into thinking about
the implications, I will make the following straw man proposal.  This is
what I would do if the decision was entirely up to me:

    Licenses will be included in common-licenses if they meet all of the
    following criteria:

    * The license is DFSG-free.
    * Exactly the same license wording is used by all works covered by it.
    * The license applies to at least 100 source packages in Debian.
    * The license text is longer than 25 lines.
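
The four straw-man criteria can be read as a single predicate.  A minimal
Python sketch, assuming a hypothetical `Candidate` record whose fields
someone has already filled in by hand (deciding DFSG-freeness and wording
stability is the hard part and is not automated here):

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record describing one license under consideration."""
    name: str
    dfsg_free: bool
    identical_wording: bool  # same text in every work that uses it
    source_packages: int     # source packages in Debian using it
    text_lines: int          # length of the license text in lines


def meets_straw_man(c, min_packages=100, min_lines=25):
    """Return True only if all four straw-man criteria hold."""
    return (
        c.dfsg_free
        and c.identical_wording
        and c.source_packages >= min_packages  # "at least 100 source packages"
        and c.text_lines > min_lines           # "longer than 25 lines"
    )


# Made-up numbers, purely to exercise the function.
print(meets_straw_man(Candidate("Example-1.0", True, True, 150, 200)))
```

Keeping the two thresholds as parameters makes it easy to try other
cutoffs while the discussion is still open.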

I will attempt to guide and summarize discussion on this topic.  No
decision will be made immediately; I will summarize what I've heard first
and be transparent about what direction I think the discussion is
converging towards (if any).

Finally, as promised, here is the count of source packages in unstable
that use the set of licenses that I taught my script to look for.  This is
likely not accurate; the script uses a bunch of heuristics and guesswork.

AGPL 3                  277
Apache 2.0             5274
Artistic               4187
Artistic 2.0            337
BSD (common-licenses)    42
CC-BY 1.0                 3
CC-BY 2.0                15
CC-BY 2.5                13
CC-BY 3.0               240
CC-BY 4.0               159
CC-BY-SA 1.0              8
CC-BY-SA 2.0             48
CC-BY-SA 2.5             16
CC-BY-SA 3.0            425
CC-BY-SA 4.0            237
CC0-1.0                1069
CDDL                     67
CeCILL                   30
CeCILL-B                 13
CeCILL-C                  9
GFDL (any)              569
GFDL (symlink)           55
GFDL 1.2                289
GFDL 1.3                231
GPL (any)             20006
GPL (symlink)          1331
GPL 1                  4033
GPL 2                 10466
GPL 3                  6783
LGPL (any)             5019
LGPL (symlink)          265
LGPL 2                 3850
LGPL 2.1               2926
LGPL 3                 1526
LaTeX PPL                46
LaTeX PPL (any)          40
LaTeX PPL 1.3c           32
MPL 1.1                 165
MPL 2.0                 361
SIL OFL 1.0              11
SIL OFL 1.1             258
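
For anyone who wants to try different thresholds against this report, a
minimal Python sketch follows.  It assumes only the layout shown above (a
license name, then whitespace, then a count); the function names are made
up for the example, and the sample is a handful of rows copied from the
list.

```python
def parse_counts(report):
    """Parse 'name   count' lines into a dict of name -> int."""
    counts = {}
    for line in report.splitlines():
        if not line.strip():
            continue
        # The count is the last whitespace-separated token on the line.
        name, count = line.rsplit(None, 1)
        counts[name.strip()] = int(count)
    return counts


def at_least(counts, threshold=100):
    """Names whose count meets the threshold, sorted alphabetically."""
    return sorted(name for name, n in counts.items() if n >= threshold)


SAMPLE = """\
AGPL 3                  277
Artistic 2.0            337
BSD (common-licenses)    42
CDDL                     67
CeCILL                   30
MPL 1.1                 165
SIL OFL 1.1             258
"""

print(at_least(parse_counts(SAMPLE)))
# ['AGPL 3', 'Artistic 2.0', 'MPL 1.1', 'SIL OFL 1.1']
```

Since the counts come from heuristics, results near the threshold should
be treated as approximate.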

#885698#54
Date:
2023-09-10 05:14:13 UTC
From:
To:
Quoting Russ Allbery (2023-09-10 05:35:27)

I fully support the above proposed criteria, and appreciate your
initiative to have this conversation.


 - Jonas

#885698#59
Date:
2023-09-10 05:28:27 UTC
From:
To:
I like this. I'd say that even if a license is shorter than 25 lines I'd
appreciate being able to link to it instead of copypasting it.

I like to be able to fill the license field with a value, after checking
that the upstream license didn't diverge from what it looks like. I'd
love to use SPDX IDs there, for example. In an ideal world, I'd like to
autofill debian/copyright with SPDX IDs from upstream metadata. Having a
link to a file comes closer to having a declarative license ID.

In general the fewer bytes I have to maintain in debian/* the happier I
am, and as a matter of personal aesthetics I feel that the fewer bytes we
all have to maintain in debian/* the lower our collective maintenance
burden.


Enrico

#885698#64
Date:
2023-09-10 05:41:48 UTC
From:
To:
Hideki Yamane <henrich@iijmio-mail.jp> writes:

Can we do this legally?  If we can, it certainly has substantial merits,
but I'm not sure that this satisfies the requirement in a lot of licenses
to distribute a copy of the license along with the work.  Some licenses
may allow that to be provided as a URL, but I don't think they all do
(which makes sense since people may receive Debian on physical media and
not have Internet access).

#885698#69
Date:
2023-09-10 09:00:07 UTC
From:
To:
 Hmm, how about providing a license-common package that depends on
 "license-common-list", with the ISO image providing both? Then there
 would be no regression.


 I expect license-common-list data to look like this:

 license-short-name: URL
 GPL-2: file:///usr/share/common-licenses/GPL-2
 Boost-1.0: https://spdx.org/licenses/BSL-1.0.html
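
 A minimal Python sketch of how such a list could be read, assuming one
 "short-name: URL" entry per line exactly as in the two examples above.
 The function names are hypothetical.  Separating file: entries from
 remote ones matters because only the former work without network access.

```python
from urllib.parse import urlparse


def parse_license_list(text):
    """Parse 'short-name: URL' lines into a dict of name -> URL."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first colon only; the URL contains colons too.
        name, _, url = line.partition(":")
        mapping[name.strip()] = url.strip()
    return mapping


def is_local(url):
    """True if the entry points at a file on the installed system."""
    return urlparse(url).scheme == "file"


DATA = """\
GPL-2: file:///usr/share/common-licenses/GPL-2
Boost-1.0: https://spdx.org/licenses/BSL-1.0.html
"""

entries = parse_license_list(DATA)
print(sorted(name for name, url in entries.items() if is_local(url)))
# ['GPL-2']
```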

#885698#74
Date:
2023-09-10 08:49:28 UTC
From:
To:
Me too.
Agreed.

#885698#79
Date:
2023-09-10 09:25:56 UTC
From:
To:
Quoting Hideki Yamane (2023-09-10 11:00:07)

I guess Russ' response above was a concern over using http(s) URIs
towards a non-local resource.

What I have practiced for some years is the following syntax:

Files: foo/bar
Copyright:
  2022  Someone
License: Apache-2.0 or Expat

License: Apache-2.0
Reference: /usr/share/common-licenses/Apache-2.0

License: Expat
 [the full contents of the Expat license]

That syntax introduces a new field "Reference" (our copyright file
format permits new fields, despite lintian complaining about it).
Related discussion is at https://bugs.debian.org/786450
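
To show that the extra field is easy to consume mechanically, here is a
minimal Python sketch that reads the example above and collects the
stand-alone License stanzas carrying a "Reference" field.  The parser is
a deliberately simplified stand-in written for this illustration, not the
python-debian API, and it assumes stanzas are separated by truly empty
lines.

```python
def parse_stanzas(text):
    """Split a copyright file into stanzas of field -> value (simplified)."""
    stanzas = []
    for block in text.strip().split("\n\n"):
        fields, current = {}, None
        for line in block.splitlines():
            if line[:1] in (" ", "\t") and current:
                # Continuation line: append to the field being read.
                fields[current] += "\n" + line.strip()
            else:
                current, _, value = line.partition(":")
                fields[current] = value.strip()
        stanzas.append(fields)
    return stanzas


def license_references(stanzas):
    """Map license short name -> path for stand-alone License stanzas."""
    return {
        s["License"]: s["Reference"]
        for s in stanzas
        if "License" in s and "Reference" in s and "Files" not in s
    }


EXAMPLE = """\
Files: foo/bar
Copyright:
  2022  Someone
License: Apache-2.0 or Expat

License: Apache-2.0
Reference: /usr/share/common-licenses/Apache-2.0

License: Expat
 [the full contents of the Expat license]
"""

print(license_references(parse_stanzas(EXAMPLE)))
# {'Apache-2.0': '/usr/share/common-licenses/Apache-2.0'}
```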


 - Jonas

#885698#84
Date:
2023-09-10 05:34:43 UTC
From:
To:
 How about just pointing to the SPDX license URLs for the full license
 text, and listing the DFSG-free licenses from there? (But yes, we would
 need to align the license short names between DEP-5 and SPDX for that.)

#885698#89
Date:
2023-09-10 12:18:20 UTC
From:
To:
+1, great work and great starting point.

I also agree with Enrico and I'd like lower limits too, but any
progress is good progress on this matter for me.

#885698#94
Date:
2023-09-10 16:00:22 UTC
From:
To:
Jonas Smedegaard <jonas@jones.dk> writes:

I do wonder why we've never done this.  Does anyone know?  common-licenses
is in an essential package so it doesn't require a dependency and is
always present, and we've leaned on that in the past in justifying not
including those licenses in the binary packages themselves, but I'm not
sure why a package dependency wouldn't be legally equivalent.  We allow
symlinking the /usr/share/doc directory in some cases where there is a
dependency, so we don't strictly require every binary package have a
copyright file.

Yes, I think the https URL is an essential part of the first proposal,
since it avoids needing to ship a copy of all of the licenses.  But I'm
dubious that would pass legal muster.

The alternative proposal as I understand it would be to have a
license-common package that includes full copies of all the licenses with
some more relaxed threshold requirement and have packages that use one of
those licenses depend on that package.  (This would obviously require a
maintainer be found for the license-common package.)

This is separate from this particular bug, but I would love to see the
pointer to common-licenses turned into a formal field of this type in the
copyright format, rather than being an ad hoc comment.

#885698#99
Date:
2023-09-10 16:16:07 UTC
From:
To:
Russ Allbery <rra@debian.org> writes:

In the thread so far, there's been a bit of early convergence around my
threshold of 100 packages above.  I want to make sure people realize that
this is a very conservative threshold that would mean saying no to most
new license inclusion requests.

My guess is that with the threshold set at 100, we will probably add
around eight new licenses with the 25 line threshold (AGPL-3,
Artistic-2.0, CC-BY 3.0, CC-BY 4.0, CC-BY-SA 3.0, CC-BY-SA 4.0, and
OFL-1.1, and I'm not sure about some of those because the CC licenses have
variants that would each have to reach the threshold independently; my
current ad hoc script does not distinguish between the variants), and
maybe 10 to 12 total without that threshold (adding Expat, zlib, some of
the BSD licenses).  This would essentially be continuing current practice
except with more transparent and consistent criteria.  It would mean not
including a lot of long legal license texts that people have complained
about having to duplicate, such as the CDDL, CeCILL licenses, probably the
EPL, the Unicode license, etc.

If that's what people want, that's what we'll do; as I said, that's what I
would do if the choice were left entirely up to me.  But I want to make
sure I give the folks who want a much more relaxed standard a chance to
speak up.

#885698#104
Date:
2023-09-10 16:29:36 UTC
From:
To:
Or we could generate DEBIAN/copyright from debian/copyright using data in
license-common-list at build time. So maintainers would not need to manage the copying
themselves.
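
A minimal Python sketch of what such a build-time step might look like.
Everything here is an assumption made for illustration: the function
name, the idea that a stand-alone License stanza with no body marks a
text to be filled in, and the `license_texts` mapping standing in for
whatever license-common-list would provide.

```python
def expand_licenses(copyright_text, license_texts):
    """Fill in the body of stand-alone License stanzas that have none."""
    out = []
    for stanza in copyright_text.strip().split("\n\n"):
        lines = stanza.splitlines()
        first = lines[0]
        if len(lines) == 1 and first.startswith("License:"):
            name = first.split(":", 1)[1].strip()
            if name in license_texts:
                # copyright-format: indent by one space, blank lines as " ."
                body = [
                    " " + l if l.strip() else " ."
                    for l in license_texts[name].strip().splitlines()
                ]
                stanza = "\n".join([first] + body)
        out.append(stanza)
    return "\n\n".join(out) + "\n"


# Placeholder names and text, not real licenses.
source = "Files: *\nCopyright: 2023 Someone\nLicense: Foo-1.0\n\nLicense: Foo-1.0\n"
texts = {"Foo-1.0": "Line one.\n\nLine two.\n"}
print(expand_licenses(source, texts))
```

Stanzas that already carry a body, or whose name is not in the mapping,
pass through unchanged in this sketch.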

Cheers,
Bill

#885698#109
Date:
2023-09-10 16:30:52 UTC
From:
To:
Quoting Russ Allbery (2023-09-10 18:16:07)

Good point.

Another way of reading the responses is that there was some interest in
including even more licenses.

I would also prefer inclusion of more licenses; I simply had the
impression that a) we could do that step by step, and b) my habit of
writing copyright files (and other texts) using [semantic linebreaks]
made me forget that the Expat license is arguably only 3 lines long
(whereas in my style of writing it is 24-25 lines long).

If "include all SPDX licenses" is for some reason (space in minimal
systems?) problematic, then let me propose a threshold of 1000
characters - as that just about covers Expat ;-)
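
To illustrate why a character threshold is less sensitive to formatting
than a line threshold, here is a minimal Python sketch.  The sample is
placeholder prose rather than a real license, and the choice to collapse
whitespace before counting characters is an assumption of the sketch.

```python
import textwrap


def line_count(text, width):
    """Number of lines the text occupies when wrapped at the given width."""
    return len(textwrap.wrap(text, width=width))


def char_count(text):
    """Length of the text with every run of whitespace collapsed to a space."""
    return len(" ".join(text.split()))


# Placeholder prose, not a real license text.
sample = "permission is granted to use this software for any purpose " * 20

narrow = textwrap.fill(sample, width=40)
wide = textwrap.fill(sample, width=100)

# The same words occupy a different number of lines...
print(len(narrow.splitlines()), len(wide.splitlines()))
# ...but the normalized character count is unaffected by wrapping.
print(char_count(narrow) == char_count(wide))
```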


 - Jonas


[semantic linebreaks]: https://sembr.org/

#885698#114
Date:
2023-09-10 19:41:59 UTC
From:
To:
Jeremy Stanley <fungi@yuggoth.org> writes:
of the short licenses because historically I wasn't considering them (with
the exception of common-licenses references to the BSD license, which I
kind of would like to make an RC bug and clean up so that we could remove
the BSD license from common-licenses on the grounds that it's specific to
only the University of California and confuses people).  If we go with any
sort of threshold, the script will need serious improvements.

That was something else I wanted to ask: I've invested all of a couple of
hours in this script, and would be happy to throw it away in favor of
something that tries to do a more proper job of classifying the licenses
referenced in debian/copyright.  Has someone already done this (Jonas,
perhaps)?

#885698#117
Date:
2023-09-10 19:47:36 UTC
From:
To:
Hi,

Quoting Bill Allombert (2023-09-10 18:29:36)

I very much like this idea. The main reason maintainers want more licenses in
/usr/share/common-licenses/ is so that they no longer have humongous
d/copyright files with all license texts copypasted over and over again. If
long texts could be reduced to a reference that gets expanded by a machine, it
would make debian/copyright look much nicer and easier to maintain, while
still shipping the full license text in the binary package.

Does anybody know why such an approach would be a bad idea?

I have zero legal training, so the only potential problem with this approach
that I was able to come up with is that the source package itself would no
longer contain the license text. We would then be shipping code without the
license text, even though its license states that the code may only be
distributed with that text alongside it. So while auto-generating this would
probably create compliant binary packages, it would leave the source package
without the license text. Is that a problem?

Thanks!

cheers, josch

#885698#122
Date:
2023-09-10 19:14:15 UTC
From:
To:
On 2023-09-09 20:35:27 -0700 (-0700), Russ Allbery wrote:
[...]
[...]

I'm surprised, for example, by the absence of the ISC license given
that not only ISC's software but much of that originating from the
OpenBSD ecosystem uses it. My personal software projects also use
the ISC license. Are you aggregating the "License:" field in
copyright files too, or is it really simply a hard-coded list of
matching patterns?

Regardless, this is great work, thanks for kicking off the
reevaluation!

#885698#127
Date:
2023-09-10 20:20:07 UTC
From:
To:
At 2023-09-10T21:47:36+0200, Johannes Schauer Marin Rodrigues wrote:
[...]

...why wouldn't it?  Remember how a source package is defined:

A DSC file, an upstream source archive (maybe more than one in exciting
new source formats I haven't learned), and a compressed diff of Debian
changes.

Debian _source_ packages generally don't chop copyright notices and
license texts out of the upstream distributions, and should not do so
unless those notices/texts are invalid or the material they cover has
been removed.  (Both of these do sometimes happen.)

Even if one worries about theoretical liability due to the existence of
separate files for .dsc, .tar.gz, and .diff.gz, then let us recall that
(1) the DSC is minimal, containing metadata that may not rise to the
threshold of originality required by copyright [in the U.S., anyway];
(2) the upstream archive has the notices and texts that the _original
distributor_ put in it, and as a rule, if permission to distribute the
work exists, it is not incumbent on redistributors to add notices/texts
where the rightsholder themselves neglected to do so; and (3) the
.diff.gz will not be in the business of removing notices/texts except as
contemplated in the previous paragraph (correcting erroneous
notices).[1]

I don't think that is a risk as long as people continue to follow
packaging practices that Debian has applied with little objection from
our upstreams for 25+ years.[2]

I am unable to imagine the mechanism by which that would happen, given
what Russ and Bill proposed.

Regards,
Branden

[1] When repackaging, e.g., to remove non-free material, affected
    content is removed altogether even from the source.  Nothing in
    copyright law can compel you to distribute copyright notices and
    texts that don't apply to work you're not distributing.[3]

[2] I don't know of Debian _ever_ having had a problem, as in receiving
    a cease-and-desist letter or other threat of legal action with what
    one might term an "institutional" copyright holder.  We've certainly
    had our share of nasty emails from cantankerous individual copyright
    holders, often who had their own perverse misreadings of licenses
    drafted by others (hello to the memory of Jörg Schilling).  There
    also was once an upstream who stuck a Trojan horse into the source
    code to try to get Debian's users to stop using versions we
    distributed, but to go directly upstream instead.  Nowadays, that
    seems quaint; you can today Trojan your machine much more
    conveniently with npm(1).

[3] At the same time a few non-free FSF manuals under the GNU FDL
    declaim the GNU _GPL_ text to be an Invariant Section.  Like most of
    the defects of the FDL, I think this is a pointless encumbrance; if
    you distribute GPL'ed software, a copy of its text must come along
    anyway.  The only rationale I can imagine is to mandate, for printed
    copies of the manuals, the inclusion of the GPL's preachy preamble.
    But I digress.

#885698#132
Date:
2023-09-10 20:33:51 UTC
From:
To:
Johannes Schauer Marin Rodrigues <josch@debian.org> writes:

I can think of a few possible problems:

* I'm not sure if we generate binary package copyright files at build time
  right now, and if all of our tooling deals with this.  I had thought
  that we prohibited this, but it looks like it's only a Policy should and
  there isn't a mention of it in the reject FAQ, so I think I was
  remembering the rule for debian/control instead.  Of course, even if
  tools don't support this now, they could always be changed.

* If ftp-master has to review the copyright files of each binary package
  separate from the copyright file of the source package (I think this
  would be an implication of generating the copyright files during build
  time), and the binary copyright files have fully-expanded licenses, that
  sounds like kind of a pain for the ftp-master reviewers.  Maybe we can
  deal with this with better tooling, but someone would need to write
  that.

* If we took this to its logical end point and did this with the GPL as
  well, we would add 20,000 copies of the GPL to the archive and install a
  *lot* of copies on the system.  Admittedly text files are small and
  disks are large, but this still seems a little excessive.  So maybe we
  still need to do something with common-licenses?

#885698#137
Date:
2023-09-10 20:42:04 UTC
From:
To:
* Russ Allbery <rra@debian.org> [2023-09-10 09:16]:

For me, this outcome would already be an improvement over the current
situation and alleviate my biggest pain point (CC licenses).
Still, I'd like to be significantly more relaxed.

I propose the following three criteria must be satisfied for
inclusion in /usr/share/common-licenses:

  * The license is DFSG-free.
  * Exactly the same license wording is used by all works covered by it.
  * The license is in the SPDX list of common licenses (https://spdx.org/licenses/)
    OR
    The license applies to at least 100 source packages in Debian.


I am not committed to the 100 source packages threshold, it is
mostly intended as fallback for a hypothetical future license which
is super popular but for some reason does not make it to the SPDX
list in a timely manner.

One very intentional side effect of my proposal is a nudge towards
using SPDX License Identifiers in d/copyright files.


Cheers
Timo

#885698#142
Date:
2023-09-10 20:57:02 UTC
From:
To:
Quoting Russ Allbery (2023-09-10 21:41:59)

I have so far worked the most on identifying and grouping source data,
putting only little attention (yet - but do dream big...) towards
parsing and processing debian/copyright files e.g. to compare and assess
how well aligned the file is with the content it is supposed to cover.

So if I understand your question correctly and you are not looking for
the output of `licensecheck --list-licenses`, then unfortunately I have
nothing exciting to offer.


 - Jonas

#885698#147
Date:
2023-09-10 21:24:24 UTC
From:
To:
Jonas Smedegaard <jonas@jones.dk> writes:

I think that's mostly correct.  I was wondering what would happen if one
ran licensecheck debian/copyright, but unfortunately it doesn't look like
it does anything useful.  I tried it on one of my packages (remctl) that
has a bunch of different licenses, and it just said:

debian/copyright: MIT License

and apparently ignored all of the other licenses present (FSFAP, FSFFULLR,
ISC, X11, GPL-2.0-or-later with Autoconf-exception-generic, and
GPL-3.0-or-later with Autoconf-exception-generic).  It also doesn't notice
that some of the MIT licenses are variations that contain people's names.

(I still put all the Autoconf build machinery licenses in my
debian/copyright file because of the tooling I use to manage my copyright
file, which I also use upstream.  I probably should change that, but I
need to either switch to licensecheck or rewrite my horrible script.)

Also, presumably it doesn't know about copyright-format since it wouldn't
be expecting that in source files, so it wouldn't know to include licenses
referenced in License stanzas without the license text included.

#885698#152
Date:
2023-09-11 04:45:22 UTC
From:
To:
Quoting Russ Allbery (2023-09-10 23:24:24)

Right.  Licensecheck so far mostly scans for human prose stating "this
has been licensed as..." and "this is the license...", and rarely is
able to recognize "the default license of this project is..." or "that
folder over there is licensed as..." style prose.

That said, there is interest in covering that as well, and also interest
in improving on non-prose forms like "[this is YAML;] Copyright: ..." or
binary forms most commonly embedded in fonts and ICC data in images.

It would be helpful if you (i.e. anyone reading this) filed a bug report
whenever you have a good (as in particularly rich/tricky/peculiar) case
that licensecheck fails to recognize.

Also, I hadn't thought of there being interest in statistics - it should
not be too hard to spit out numbers for variation in licenses or
copyright holders once licensecheck has recognized the information.
Again, if someone has suggestions for formats they'd particularly like
such statistics to be served from licensecheck then please file a
bugreport.

Sorry this isn't helping anything for the topic being discussed.


 - Jonas

#885698#157
Date:
2023-09-12 07:27:12 UTC
From:
To:
Hi,

 One problem is that some software declares that it uses a given license
 (e.g. MIT) but ships a slightly modified version of the license text.
 So the wording in the license list differs from the wording of the
 license actually included in such software.

 It'd be better to find such software and ask upstream to switch to the
 proper license terms, tracking this with tags in the BTS. And it's NOT a
 Debian-specific issue, so it may be better to ask others to join such an
 effort, IMHO.

#885698#162
Date:
2023-09-12 08:47:56 UTC
From:
To:
Quoting Hideki Yamane (2023-09-12 09:27:12)

I can only assume that the proposal for an automated DEBIAN/copyright
file is limited to source files *possible* to automatically process, and
consequently only relates to debian/copyright files written in the
machine-readable format.

The problem you describe about ambiguous MIT-derived licensing cannot,
in my understanding, occur using the machine-readable format - only with
less strictly structured debian/copyright files.

If you mean to say that ambiguous MIT declarations exist in
debian/copyright files written using the machine-readable format, then
please point to an example, as I cannot imagine how that would look.


 - Jonas

#885698#167
Date:
2023-09-12 16:15:27 UTC
From:
To:
Jonas Smedegaard <jonas@jones.dk> writes:
is essentially, but not precisely, the same as Expat.  If we then tell
people that they can omit the text of the license and we'll fill it in
automatically, they'll remove the actual text and we'll fill it in with
the wrong thing.

This is just a bug in handling the debian/copyright file, though.  If we
take this approach, we'll need to be very explicit that you can only use
whatever triggers the automatic inclusion of the license text if your
license text is word-for-word identical.  Otherwise, you'll need to cut
and paste it into the file as always.
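
The word-for-word requirement could be checked mechanically before a
short name is allowed to trigger automatic inclusion. A minimal sketch in
Python, assuming that line wrapping and indentation are the only
differences that may be ignored; the function names are hypothetical and
not part of any existing tool:

```python
def normalize(text: str) -> list[str]:
    """Split a license text into words, ignoring wrapping and indentation."""
    return text.split()


def is_verbatim(candidate: str, reference: str) -> bool:
    """True only if both texts contain exactly the same words in the same order."""
    return normalize(candidate) == normalize(reference)


reference = "Permission is hereby granted, free of charge, to any person"
rewrapped = "Permission is hereby granted,\n  free of charge, to any person"
modified = "Permission is hereby granted, free of charge, to any individual"

print(is_verbatim(rewrapped, reference))  # True
print(is_verbatim(modified, reference))   # False
```

Anything stricter or looser than whitespace normalization would be a
policy decision, not something this sketch settles.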

#885698#172
Date:
2023-09-12 17:21:15 UTC
From:
To:
Quoting Russ Allbery (2023-09-12 18:15:27)

Ah, right.  I see it now.

Strictly speaking it is not (as I was more narrowly focusing on) that
the current debian/copyright spec leaves room for *ambiguity*, but
instead that there is a real risk of making mistakes when replacing with
centrally defined ones (e.g. redefining a local "Expat" from locally
meaning "MIT-ish legalese as stated in this project" to falsely mean
"the MIT-ish legalese that SPDX labels MIT").

If you disagree, then please shout, as then I am still missing your
point here...


 - Jonas

#885698#177
Date:
2023-09-12 17:49:02 UTC
From:
To:
Jonas Smedegaard <jonas@jones.dk> writes:

Right, the existing copyright format defines a few standard labels and
says that you should only use those labels when the license text matches,
but it doesn't stress that "matches" means absolutely word-for-word
identical.  I suspect, although I haven't checked, that we've made at
least a few mistakes where some license text that's basically equivalent
to Expat is labelled as Expat even though the text is not word-for-word
identical.  Given that currently all labels in debian/copyright are
essentially local and the full text is there (except for common-licenses,
where apart from BSD the licenses normally are used verbatim), this is not
currently really a bug.  But we could turn it into a bug quite quickly if
we relied on the license short name to look up the text.

To take an example that I've been trying to get rid of for over a decade,
many of the /usr/share/common-licenses/BSD references currently in the
archive are incorrect.  There are a few cases where the code is literally
copyrighted only by the Regents of the University of California and uses
exactly that license text, but this is not the case for a lot of them.  It
looks like a few people have even tried to say "use common-licenses but
change the name in the license" rather than reproducing the license text,
which I don't believe meets the terms of the license (although it's of
course very unlikely that anyone would sue over it).

A quick code search turns up the following examples, all of which I
believe are wrong:

https://sources.debian.org/src/mrpt/1:2.10.0+ds-3/doc/man-pages/pod/simul-beacons.pod/?hl=35#L35
https://sources.debian.org/src/gridengine/8.1.9+dfsg-11/debian/scripts/init_cluster/?hl=7#L7
https://sources.debian.org/src/rust-hyphenation/0.7.1-1/debian/copyright/?hl=278#L278
https://sources.debian.org/src/nim/1.6.14-1/debian/copyright/?hl=64#L64
https://sources.debian.org/src/yade/2023.02a-2/debian/copyright/?hl=78#L78

An example of one that probably is okay, although ideally we still
wouldn't do this because there are other copyrights in the source:

https://sources.debian.org/src/lpr/1:2008.05.17.3+nmu1/debian/copyright/?hl=15#L15

This problem potentially would happen a lot with the BSD licenses, since
the copyright-format document points to SPDX and SPDX, since it only cares
about labeling legally-equivalent documents, allows the license text to
vary around things like the name of the person you're not supposed to say
endorsed your software while still receiving the same label.

We therefore cannot use solely SPDX as a way of determining whether we can
substitute the text of the license automatically for people, because there
are SPDX labels for a lot of licenses for which we'd need to copy and
paste the exact license text because it varies.  At least if I understand
what our goals would be.

(License texts that have portions that vary between packages they apply to
are a menace and make everything much harder, and I really wish people
would stop using them, but of course the world of software development is
not going to listen to me.)

#885698#182
Date:
2023-09-12 18:50:25 UTC
From:
To:
Note that my proposal makes detecting the discrepancy more visible rather
than less, since you can compare the generated copyright file with
the actual license statement without chasing files.

Also, overengineering aside, the copyright generator could support
parameter substitution to accommodate small discrepancies in licenses.
For example an option to replace in /usr/share/common-licenses/BSD the
line
"Copyright (c) The Regents of the University of California."
by whatever is required when generating DEBIAN/copyright.
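
A rough sketch of what such a substitution might look like, in Python.
The function name is hypothetical, the template below is a two-line
stand-in and not the real content of /usr/share/common-licenses/BSD, and
the replacement holder is made up for illustration:

```python
DEFAULT_HOLDER_LINE = "Copyright (c) The Regents of the University of California."


def substitute_holder(license_text: str, holder_line: str) -> str:
    """Replace the default copyright line with a package-specific one."""
    if DEFAULT_HOLDER_LINE not in license_text:
        # Refuse to guess if the template does not look as expected.
        raise ValueError("template does not contain the expected copyright line")
    return license_text.replace(DEFAULT_HOLDER_LINE, holder_line, 1)


# Stand-in template (assumption: the real file has more text around this line).
template = DEFAULT_HOLDER_LINE + "\nAll rights reserved.\n"
print(substitute_holder(template, "Copyright (c) 2023 Example Author"))
```

As noted earlier in the thread, BSD variants can also differ in the name
used in the endorsement clause, so a single substitution like this would
probably not cover every case.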

Cheers,
Bill

#885698#187
Date:
2023-09-19 10:55:18 UTC
From:
To:
Hopefully I'm not too late and I hope I won't make any ('dumb') mistakes as
I'm not as well-versed in licenses and packaging as other participants.

I think both of these criteria are excellent.

The only reason for not doing so that I've detected is worry about disk space?
If we were talking about several Megabytes (or even larger) then I could see
that point. But license text is max several Kilobytes?

diederik@bagend:/usr/share/doc$ find . -name copyright | wc -l
3759

I suspect I have an enormous amount of duplicate license texts on this system
and replacing those with references to common-licenses will likely reduce the
waste of system resources.
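
One way to put a number on that suspicion is to hash the copyright files
and count the bytes taken by exact duplicates. This is only a sketch
under a strong assumption: it detects copyright files that are identical
as a whole (typically binary packages built from the same source), not
license texts repeated inside otherwise different files. The function
name is hypothetical.

```python
import hashlib
from collections import Counter
from pathlib import Path


def duplicate_copyright_bytes(doc_root: str) -> int:
    """Bytes that would be saved if identical copyright files were stored once."""
    copies = Counter()
    sizes = {}
    for path in Path(doc_root).glob("*/copyright"):
        # Skip symlinked doc directories or files: they are already shared on disk.
        if path.parent.is_symlink() or path.is_symlink() or not path.is_file():
            continue
        data = path.read_bytes()
        digest = hashlib.sha256(data).hexdigest()
        copies[digest] += 1
        sizes[digest] = len(data)
    # Every copy beyond the first one counts as duplicated space.
    return sum((count - 1) * sizes[digest] for digest, count in copies.items())
```

Calling `duplicate_copyright_bytes("/usr/share/doc")` on an installed
system would give a lower bound on the duplication, since partial
overlaps are not counted.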

Optionally the license texts in common-licenses could be gz compressed (gzip
is Priority: required) to reduce disk-space even further.

So I would be in favor of dropping the threshold.

The primary reason I'm in favor of dropping this too is consistency.

This is an important reason why I'd want to have most/all licenses that are
used in Debian included in common-licenses.
It's not only tedious and annoying, but also (because of that) error prone.
And then you run the risk of the included license text not being (word-for-
word) the same.
Getting rid of tedious/annoying/repeating busy work seems like a win for
everyone.

And IMO it's not only not beneficial to our users, but actually provides extra
work. If I want to make sure the license text is indeed the same as my
(hopefully correct) local copy, I'd have to run a `diff` with the included text
in the copyright file. And that applies to every user who'd want to do that.
And also for a prospective (new) maintainer of a package.

I'm a (big) fan of SPDX because it simplifies and clarifies things (a lot IMO)
and makes things more consistent. And I'm a sucker for consistency.

I do think that the license should be provided locally (and its availability
not be dependent on a build step in some other tool).
Having a link to an online version may be a useful extra service, but having a
working internet connection should not be a requirement (IMO).

Cheers,
  Diederik

#885698#192
Date:
2023-10-09 17:31:00 UTC
From:
To:
Hello Russ,

Thank you for working on this.

Something that hasn't been brought up yet is the effects on NEW review.
I would like to expand the idea of the same license wording being used
by all works, to include the additional requirement that there aren't
any very similar licenses that are easily confused with the license.

For, if it's a license with small variations of any kind, including
variations that are not project-specific things like the names of
copyright holders, then NEW review is much easier if all the text is
right there in d/copyright.

I would be in favour of the 25 lines criterion.  The main problem with
manipulating d/copyright is only the really long licenses, IME.

#885698#197
Date:
2024-07-19 23:25:32 UTC
From:
To:
Attention,

I do have a business which l know will be tremendously profitable to both of us, if you will be interested, please get back to me for more details.
Sincerely,
Mr. Edric Reed

#885698#202
Date:
2024-09-30 16:41:03 UTC
From:
To:
Hi Russ and Sean,

thanks for working on this. Just today I worked on a package having
some CC-BY-SA-4.0 licensed content and wasn't too glad at having to copy
the full license. Are there any big blockers for this? Reading the
previous discussion, the technicalities seem to be mostly agreed upon
(unless I missed something?).
I think this would be a big improvement for packagers. Let me know if
you need help finalizing any discussion to make this policy.

best,

Matthias Geiger <werdahias>

#885698#207
Date:
2024-09-30 18:23:41 UTC
From:
To:
I suggested a tool that would copy the full license inside the binary package copyright file
at build time. This seems a more sustainable option.

Cheers,

#885698#212
Date:
2024-10-31 15:00:56 UTC
From:
To:
Hello,

I am Christine Edward, I have a proposal I believe would be of great interest to you. I would appreciate your swift response to enable me to share more details with you.

Best regards,
Ms. Christine Edward.

#885698#217
Date:
2024-10-31 15:00:56 UTC
From:
To:
Hello,

I am Christine Edward, I have a proposal I believe would be of great interest to you. I would appreciate your swift response to enable me to share more details with you.

Best regards,
Ms. Christine Edward.

#885698#224
Date:
2026-04-27 08:02:01 UTC
From:
To:
Hello

As a member of the new DFSG team, I would like to restart the discussion
from September 2023.

The issue at hand is the inclusion of additional license texts in the
base-files package.

In doing so, I would like to continue Russ Allbery’s proposal [0].

Licenses will be included in common-licenses if they meet all of the
following criteria:

     * The license is DFSG-free.
     * Exactly the same license wording is used by all works covered by it.
     * The license applies to at least 100 source packages in Debian.
     * The license text is longer than 25 lines.
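
Read literally, the four criteria amount to a simple predicate. A
minimal sketch in Python; the class and function names are hypothetical,
and the example values are made up for illustration, not measured:

```python
from dataclasses import dataclass


@dataclass
class LicenseCandidate:
    name: str
    dfsg_free: bool
    identical_wording: bool  # same text in every work covered by it
    source_packages: int     # source packages in Debian under this license
    text_lines: int          # length of the license text in lines


def qualifies(lic: LicenseCandidate, min_packages: int = 100, min_lines: int = 25) -> bool:
    """Apply all four criteria; every one of them must hold."""
    return (
        lic.dfsg_free
        and lic.identical_wording
        and lic.source_packages >= min_packages  # "at least 100"
        and lic.text_lines > min_lines           # "longer than 25 lines"
    )


# Hypothetical candidates, not real archive data.
long_stable = LicenseCandidate("Example-Long", True, True, 300, 400)
short_variant = LicenseCandidate("Example-Short", True, False, 5000, 20)

print(qualifies(long_stable))    # True
print(qualifies(short_variant))  # False
```

The hard part in practice is establishing the inputs (in particular
whether the wording really is identical everywhere), not evaluating the
rule.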

It also lists various reasons why it makes sense to include additional
license texts.

* Including long legal texts in debian/copyright, particularly if one
   wants to format them for copyright-format, is tedious and annoying and
   doesn't benefit our users in any significant way, and therefore we
   should include as many licenses as possible in common-licenses to spare
   people that work.

* common-licenses consumes disk space on every installed Debian system of
   any size, and therefore should be kept small to avoid wasting system
   resources.

The above reasons also make it easier for the DFSG team to review the
packages.

Even on our own systems, the licenses in question meet these criteria.

[0] https://lists.debian.org/debian-devel/2023/09/msg00055.html

#885698#229
Date:
2026-04-27 08:08:28 UTC
From:
To:
* Mechtilde Stehmann <mechtilde@debian.org> [260427 10:02]:

Great. Let me state some IMO relevant questions:

1) What was the outcome of the 2023 discussion?

2) If nothing has changed, why?

3) What is the current dataset?

Best,
Chris

#885698#234
Date:
2026-04-27 08:41:39 UTC
From:
To:
Hi all,

Maybe it'd make sense to restrict this to licenses which are also used
in Essential packages, or ones with a high enough priority (like
Important), so that extra disk usage for base-files is less of
a concern.

Bye!

#885698#239
Date:
2026-04-27 09:01:28 UTC
From:
To:
hello

Am 27.04.26 um 10:08 schrieb Chris Hofstaedtler:

They agreed to include more licenses in /usr/share/common-licenses. No
objections were raised.

But nothing was changed.

I guess because nobody wants to do the work. This is my personal opinion.

As a first task, I think these licenses should be added. They fulfill
the criteria Russ posted, including on my local machine.

Artistic-2.0
AGPL-3
BSL-1.0
CC-BY-3.0
CC-BY-4.0
CC-BY-SA-3.0
CC-BY-SA-4.0
OFL-1.1

The Artistic License in /usr/share/common-licenses is version 1.0

Regards

#885698#244
Date:
2026-04-27 12:52:41 UTC
From:
To:
I think that there was a consensus back then and there is still one now.
Do you volunteer to NMU base-files, if the maintainer is not interested
in working on this?
The list of licenses to be added is small enough that this should not be
a concern. But maybe you have different data?

#885698#249
Date:
2026-04-27 13:09:54 UTC
From:
To:
Uh, what? I'm pretty certain Santiago would be happy to update
base-files *once* debian-policy has been updated, but certainly not
before. So instead of unnecessarily throwing shade, perhaps get
debian-policy updated first?

Thanks,
Guillem

#885698#254
Date:
2026-04-27 13:22:53 UTC
From:
To:
Indeed.

Here is a quote from base-files FAQ for those who never bothered to read it:


Q. Why isn't license "foo" included in common-licenses?

A. I delegate such decisions to the policy group. If you want to
propose a new license you should make a policy proposal to modify the
paragraph in policy saying "Packages distributed under the Apache
license (version 2.0), the Artistic license, the GNU GPL (versions 1,
2, or 3), the GNU LGPL (versions 2, 2.1, or 3), and the GNU FDL
(versions 1.2 or 1.3) should refer to the corresponding files under
/usr/share/common-licenses". The way of doing this is explained in the
debian-policy package. As usual, you should always take a look at
already reported bugs against debian-policy before submitting a new
one.


If somebody has a problem with me delegating the decision to the
policy group, they should say so in a clear and non ambiguous way.

Thanks.

#885698#259
Date:
2026-04-27 15:28:57 UTC
From:
To:
Hello,


Am 27.04.26 um 15:22 schrieb Santiago Vila:


You can find the following text under

https://salsa.debian.org/sanvila/base-files/-/blob/master/debian/README

#885698#264
Date:
2026-04-27 15:45:15 UTC
From:
To:
Chris Hofstaedtler <zeha@debian.org> writes:

I dropped the ball. I would very much welcome someone else pushing this
forward, since I am way, way behind on all of my volunteer work.

#885698#269
Date:
2026-04-27 16:48:19 UTC
From:
To:
Do you mean the license text or the license itself ?

For the remaining licenses, I have made a proposal to fill in debian/copyright
at build time from the list of SPDX identifiers.
This is still an option if the NEW team does not reject it.

Cheers,

#885698#274
Date:
2026-04-28 09:20:16 UTC
From:
To:
Santiago Vila [27/Apr  3:22pm +02] wrote:

I'm sorry, I just posted to #1135097 stating the opposite ..

I think that the "at least 100 packages" part of this proposal is too
low a bar.  But perhaps the traditional "deduplicating it would save
disk space on the majority of Debian installations" is too high a bar.

Any thoughts on something in between?

#885698#279
Date:
2026-04-28 09:39:58 UTC
From:
To:
I think that the proposal is fine, since it adds a quite small number of
licenses. If people disagree then please bring measurements.

#885698#284
Date:
2026-04-28 10:08:56 UTC
From:
To:
why do you think so? I think 100 is quite a high bar already.
(assuming we talk source packages.)

#885698#289
Date:
2026-04-28 10:26:18 UTC
From:
To:
Holger Levsen [28/Apr 10:08am GMT] wrote:

Those 100 could easily be packages that most systems don't have
installed -- or, in particular, that systems that are trying to be
really small almost never have installed.

Really we need to hear from the people who are trying to make the
minimal install of Debian small.  That's not me.

#885698#294
Date:
2026-04-28 11:11:30 UTC
From:
To:
* Sean Whitton <spwhitton@spwhitton.name> [260428 12:26]:

I have a mild interest in keeping small installs small, but I'm
certainly not an expert. I've however done some poking.

Looking at the copyright files of packages installed by `mmdebstrap
forky /dev/null` - IOW a set of packages that can be expected that
every 'normal' install of Debian has (excluding container and
embedded usecases which can and will apply hacks) - yields a few
interesting things:

1) libc, sed have "Boost Software License - Version 1.0 - August
17th, 2003" in their copyright files. Adding this to common-licenses
seems a net positive and could IMO be done immediately without any
negative effects.

2) mawk, libunistring5 use CC-BY-SA 3.0
These packages can be uninstalled. However curl depends on
libunistring5, so once your install wants to talk to the Internet it
probably has to stay.

3) nftables uses CC-BY-SA 4.0
This package can be uninstalled, but again once you want network
connectivity, ...

4) AGPLv3 is NOT present

5) Deduplicating copyright files might be a meaningful disk space
saving, if we actually care about disk space savings.
The install per above has:

* 10 binary packages from src:util-linux adding 30KB copyright per binary
* 6 binary packages from src:systemd adding 13KB copyright per binary
* 5 binary packages from src:e2fsprogs adding 20KB copyright per binary
* 4 binary packages from src:pam adding 10KB copyright per binary
* 4 binary packages from src:krb5 adding 63KB copyright per binary
ff.

I haven't done a full calculation but it seems we could save 1MB in
such an install just by deduplicating the copyright files. Someone
else may be interested in running the same analysis on different
install scenarios, say Live ISOs, Desktop installs, etc.

With my src:util-linux maintainer hat on, I'd welcome tooling and a
corresponding policy change towards copyright file deduplication.

And/or compression might also be of interest.

6) Even in this install scenario we still have some packages not using
https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/ :

   debian-archive-keyring
   gcc-16-base
   libcrypt1
   libgcc-s1
   libgssapi-krb5-2
   libk5crypto3
   libkrb5-3
   libkrb5support0
   libstdc++6
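
As a rough check of the estimate in point 5 above, the per-source
figures listed there can be tallied. A small Python sketch, assuming each
binary package carries a full copy of the same copyright file and that
deduplication keeps exactly one copy per source package; only the five
source packages named above are included:

```python
# (source package, binary packages installed, copyright size in KB per binary)
figures = [
    ("util-linux", 10, 30),
    ("systemd", 6, 13),
    ("e2fsprogs", 5, 20),
    ("pam", 4, 10),
    ("krb5", 4, 63),
]

total_kb = sum(count * size for _, count, size in figures)
saved_kb = sum((count - 1) * size for _, count, size in figures)

print(total_kb)  # 770
print(saved_kb)  # 634
```

So these five source packages alone account for roughly 634 KB of
avoidable copies, which is consistent with the 1MB ballpark once the
remaining packages are added.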


Best,
Chris

#885698#299
Date:
2026-04-28 12:15:35 UTC
From:
To:
Could common-licenses directory be compressed?
That would save a lot of space.

common-licenses is 295k on my system,
but the whole of base files (compressed package) is only 73k

Cheers,
Peter

#885698#304
Date:
2026-04-28 12:32:32 UTC
From:
To:
in Policy 12.5). That would save even more space, it's 63 MB in total on
my machine and 2.1 MB on the aforementioned minimal forky.

I've searched for a policy bug and of course there was one: #491055, filed
in 2008, no discussions after 2008, closed in 2017 for inactivity.

#885698#309
Date:
2026-04-28 13:24:58 UTC
From:
To:
[ trimming Cc lines a little bit, I will read replies from the lists ]

Ok, let's make some kind of declaration here:

The base-files package has not had any new license added in a lot of
years, while the rest of the system has continued to grow exponentially,
as usual. Note that I'm using the word "exponentially" here in the pure
mathematical sense, not in the English sense that "it grows too much".

Most licenses are not really large files by today's standards.

To be consistent, compressing licenses would force all references to
be changed to the compressed version, IMO for very little gain.

I'm more concerned about people copying and pasting the same licenses
over and over again into debian/copyright, as pointed out by Mechtilde.

Therefore, please do not worry about the increased size in the
installed size of base-files after this proposal is approved.
I think we definitely can afford it.

Thanks.

#885698#314
Date:
2026-04-28 13:30:29 UTC
From:
To:
The reason is that the Debian policy team expected this could only be settled
by the NEW team and the legal team so there was no point to debate either way.

Cheers,

#885698#319
Date:
2026-04-28 14:10:26 UTC
From:
To:
I fully agree. My tool of choice here is dh_installdocs --link-doc.

E.g. from kmod:

override_dh_installdocs:
         dh_installdocs -pkmod -plibkmod-dev --link-doc=libkmod2
         dh_installdocs -plibkmod2

Beware: enabling this on an existing package also requires using
dpkg-maintscript-helper (debian/*.maintscript) with dir_to_symlink.

#885698#324
Date:
2026-04-28 14:12:14 UTC
From:
To:
Hello,

Am 28.04.26 um 15:24 schrieb Santiago Vila:

ACK

I am preparing a local repository for the potential Merge Request.

All 8 licenses need ~150 KB.

That is less space than they take up on an install ISO, where the text
is shipped with each package.


If there is consent to follow the proposal with Simon's update in
#1135097 I can prepare a Merge Request with the additional license texts.

Thanks for the constructive discussion.

Kind regards

#885698#329
Date:
2026-04-28 16:29:56 UTC
From:
To:
Santiago Vila <sanvila@debian.org> writes:

Yeah, I agree with this position. I think license texts are small even by
the standards of embedded systems these days. Disk space growth has
continued since previous rounds of this discussion, human time is more
valuable than a few extra bytes of disk consumption, and we're talking
about on the order of 1MiB at most (I suspect less than that). Compared to
the size of the Debian base image, this is very small.

Folks who actively work on embedded Debian should of course feel free to
correct me, but my recollection of past discussions is that they had
roughly the same position.

I think even in the worst case scenario of a system with a ton of Debian
chroots, the incremental size here is highly unlikely to be a significant
factor compared to, e.g., normal growth in the size of the utilities in
the base image. And of course the local system administrator can always
rm -r /usr/share/common-licenses if they really want to. (I doubt anything
important uses files there at runtime.)

#885698#334
Date:
2026-04-28 18:15:50 UTC
From:
To:
On Tue, Apr 28, 2026 at 09:29:56AM -0700, Russ Allbery wrote:
[..]

It might be sensible to have policy allow for this, and thus require
that no packages *use* these files during their normal operation
(and also not in maintscripts, etc).

Except maybe for tools explicitly designed to operate on them (say,
license checkers, devscripts). I hope there is pre-established
wording in policy that could be reused for such an exception.

Chris

#885698#339
Date:
2026-04-28 20:10:46 UTC
From:
To:
I was one of these people. I just deleted /usr/share/doc entirely so I
don't think a couple kb more in there would make any difference.
Certainly it wouldn't have made it for me.

#885698#344
Date:
2026-04-29 04:33:44 UTC
From:
To:

We are talking about /usr/share/common-licenses which is not in
/usr/share/doc ;-)

cu Andreas

#885698#349
Date:
2026-04-29 09:38:49 UTC
From:
To:
Are there any objections to using SPDX abbreviations for the file names of
licenses in base-files?

I didn't double-check if that's in the proposal, but I think that's how
it should be done.

If we already deviate from SPDX names, then maybe moving existing files
to SPDX-names, and recommending use of those names, and set up a symlink
would be an improvement.  Or grandfather in them as exceptions, to avoid
unnecessary debian/copyright churn.

It would be nice if SPDX names were mentioned in debian-policy or
base-files/debian/README.source, so we don't forget about this aspect in
the future.  We can always change that policy later on if it turns out
to be a bad idea for some reason (if someone registers FOO`rm -rf /` as
a SPDX license name, perhaps).

/Simon

#885698#354
Date:
2026-04-29 10:03:13 UTC
From:
To:
More generally we could have two packages:
base-files with /usr/share/common-licenses/
and a new package
spdx-license with /usr/share/spdx-licenses/ with all SPDX licenses used by Debian,
and have a tool that builds debian/copyright from spdx-license at build time, so
spdx-license would only be needed when building packages.

Cheers,

#885698#359
Date:
2026-04-29 10:14:57 UTC
From:
To:
Bill Allombert <ballombe@debian.org> writes:

I think that is orthogonal -- but I also think the suggestion is good.

Doesn't 'spdx-licenses' provide this, though?  Maybe not the "tool that
build debian/copyright" part though, but that could be done separately.

https://tracker.debian.org/pkg/spdx-licenses

/Simon

#885698#364
Date:
2026-04-29 17:44:51 UTC
From:
To:
Simon Josefsson <simon@josefsson.org> writes:

In general, I think this is a good idea, but I think it's mostly
meaningful in combination with adopting SPDX license abbreviations across
the board, including in the copyright-format standard. To be clear, I
agree with doing that, but I think it has the most value if it's not done
piecemeal, since ideally the file names in common-licenses should match
the names we use in copyright-format.

(Some symlinking may be required if we have to rename anything; I haven't
checked if that would be the case.)

#885698#369
Date:
2026-04-30 09:35:52 UTC
From:
To:
Chris Hofstaedtler [28/Apr  1:11pm +02] wrote:

Thank you for the feedback.  Seems to me we can prioritise developer
time by adding more licenses to common-licenses, then, with the possible
exception of the AGPL.

#885698#374
Date:
2026-04-30 11:43:09 UTC
From:
To:
Hello all,

Am 30.04.26 um 11:35 schrieb Sean Whitton:

This exception of the AGPL means we will only add 120 KB instead of 150
KB to /usr/share/common-licenses ?


Kind regards

#885698#379
Date:
2026-04-30 11:53:34 UTC
From:
To:
In case my opinion counts:

I think AGPL is common enough and will still save developer time if
added to common-licenses, even if it's not present in the absolutely
minimum Debian system shown by Chris.

Thanks.

#885698#384
Date:
2026-05-01 14:23:27 UTC
From:
To:
This is a different concern that can be solved with better tooling to
generate the copyright file, by automatically including the AGPL when needed.

Cheers,

#885698#389
Date:
2026-05-01 15:05:12 UTC
From:
To:
However, that was never the idea of the original report, and not what I would
like to do.

Does somebody else believe that adding 30k to base-files is too much because
there are no packages in the base system using AGPL?

AFAIK, "common licenses" means just that, common licenses, not "common licenses
in the base system". I believe we would still benefit from adding the AGPL.

Thanks.

#885698#394
Date:
2026-05-02 02:20:30 UTC
From:
To:
Le Fri, May 01, 2026 at 05:05:12PM +0200, Santiago Vila a écrit :

Hi all,

yes, please add the AGPL-3 and the other licenses suggested by Mechtilde
to the common licenses.

For reference, I just ran license-count on coccia after applying the
attached patch.  Here is the output.  By the way, it takes only a few
seconds to run and not 30 minutes as indicated in the source code comments.

AGPL 3                  313
Apache 2.0             7087
Artistic               4270
Artistic 2.0            365
BSD (common-licenses)     3
BSL-1.0                 302
CC-BY 1.0                 3
CC-BY 2.0                16
CC-BY 2.5                11
CC-BY 3.0               256
CC-BY 4.0               249
CC-BY-SA 1.0              9
CC-BY-SA 2.0             46
CC-BY-SA 2.5             19
CC-BY-SA 3.0            461
CC-BY-SA 4.0            352
CC0-1.0                1544
CDDL                     66
CeCILL                   33
CeCILL-B                 16
CeCILL-C                 11
GFDL (any)              588
GFDL (symlink)           53
GFDL 1.2                285
GFDL 1.3                254
GPL (any)             20356
GPL (symlink)           947
GPL 1                  4168
GPL 2                 10658
GPL 3                  7321
LGPL (any)             5310
LGPL (symlink)          192
LGPL 2                 4093
LGPL 2.1               3184
LGPL 3                 1771
LaTeX PPL                52
LaTeX PPL (any)          42
LaTeX PPL 1.3a            1
LaTeX PPL 1.3c           34
MPL 1.1                 178
MPL 2.0                 502
SIL OFL 1.0              10
SIL OFL 1.1             309
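
Output in this shape is easy to check against a numeric threshold. A
short Python sketch, assuming the two-column layout shown above (name,
then count) and using the 100-package threshold from the proposal; the
sample is a handful of rows copied from the table, and whether the counts
correspond exactly to source packages is not verified here:

```python
def parse_counts(report: str) -> dict[str, int]:
    """Parse 'name   count' lines; the count is the last whitespace-separated field."""
    counts = {}
    for line in report.splitlines():
        if not line.strip():
            continue
        name, number = line.rsplit(maxsplit=1)
        counts[name] = int(number)
    return counts


sample = """\
AGPL 3                  313
Artistic 2.0            365
BSL-1.0                 302
CC-BY 2.5                11
CC-BY-SA 4.0            352
SIL OFL 1.0              10
SIL OFL 1.1             309
"""

counts = parse_counts(sample)
above_threshold = sorted(name for name, n in counts.items() if n >= 100)
print(above_threshold)
```

Run periodically, the same filter would show which licenses newly cross
the threshold.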

The AGPL-3 and Artistic-2.0 are among the licenses promoted as
'standard' for R packages (together with GPL-2 GPL-3 LGPL-2 LGPL-2.1
LGPL-3 BSD_2_clause BSD_3_clause and MIT), which I handle a lot
recently.  https://cran.r-project.org/doc/manuals/R-exts.html#Licensing

And while I agree with the opinions expressed here that there seem to be
no objections raised directly by users of systems under space
constraints, please note that adding CC-BY-SA-3.0 will not increase the
size of systems using GRUB, and that CC-BY-SA-4.0 licenses are found on
systems that use nftables.
https://lists.debian.org/debian-policy/2026/01/msg00010.html

It is not uncommon that I find these four licenses when I have to write
new debian/copyright files for r-cran-* and r-bioc-* packages, or when
their upstreams relicense their work.  I would deeply appreciate if they
could be added to the common licenses.

By the way, from the point of view of saving maintainers and reviewers
of new packages the time spent writing, reading and scrolling, maybe the
DFSG, Licensing and New packages team could run license-count regularly
and see which licenses are trending up?   Being proactive would have the
highest impact.

Have a nice week-end,

Charles

#885698#399
Date:
2026-05-04 19:43:35 UTC
From:
To:
* Santiago Vila <sanvila@debian.org> [260501 17:05]:

I *guess* it's gonna be fine.

As always people likely have different ideas how this selection
came to be and what the presence of the files mean. Maybe this
should be spelled out somewhere, if it's not already done.

Chris

#885698#404
Date:
2026-05-29 14:06:36 UTC
From:
To:
Hi.

After a discussion has taken place, I've decided to add the nine
licenses proposed by Mechtilde to base-files.

Including a license in base-files should be considered as a promise
that the licenses will be there indefinitely (i.e. forever in principle,
unless there is a very strong reason not to, but I can't imagine right now
what kind of reason that could be).

I understand that packages under those licenses now "may" refer to the
copies in base-files in an opt-in way, as it's an essential package.

Naturally, I expect this to become a "should" in policy as well in
some not too distant future, but there is not any hurry on my side.

In fact, I think it would be a good thing if there was some kind of
intermediate period during which maintainers receive some warning or
advance notice in a less intrusive way than a bug report (for example,
by way of a lintian warning).

Thanks.

#885698#409
Date:
2026-05-31 11:31:40 UTC
From:
To:
Santiago Vila [29/May  4:06pm +02] wrote:

Thanks.  We can document this in Policy if someone would provide an
updated patch.

It shouldn't be a 'should' yet, indeed.