#910548 base-files - please consider adding /usr/share/common-licenses/Unicode-Data

Package:
debian-policy
Source:
debian-policy
Submitter:
Paul Hardy
Date:
2023-07-02 11:45:02 UTC
Severity:
wishlist
Blocked By:
#885698 Update and document criteria for inclusion in /usr/share/common-licenses (important)

#910548#5
Date:
2018-10-07 23:25:39 UTC
From:
To:
Hello,

I recently formatted the Unicode Data license for the d/copyright file
of a Debian package that I created.  I thought I would offer it to
Debian if you are interested.  You probably do not want the Copyright
stanza, and you might not want the Comment stanza, but I erred on the
side of too much rather than too little.

Unicode data files are used in a number of free software packages,
such as linux-libc-dev and the Linux kernel itself.  Use of Unicode
data in software is likely to continue growing over time.  Thus you
might find this useful.

Thank you,


Paul Hardy

#910548#12
Date:
2018-10-18 07:46:20 UTC
From:
To:
reassign 910548 debian-policy
thanks

Hello. According to /usr/share/doc/base-files/README, the decision to
include a license or not is delegated to the Debian Policy Group, so
I'm reassigning this bug.

Thanks.

#910548#19
Date:
2018-10-18 15:21:35 UTC
From:
To:
Duplication of such data among multiple packages does not seem like a
feature, and certainly not enough duplication to justify a
common-licenses entry. I would hope that most such uses could pull in
these data files from a common package.

#910548#24
Date:
2018-10-19 05:54:31 UTC
From:
To:
Josh,

I understand your intention, but it's not that straightforward.  The
data that I saw in Debian packages I looked through used various
pieces of property data from various files from the Unicode Consortium
within pre-built arrays also containing other data, though I didn't
look through all packages that used Unicode data by any means.

In my case, I used Unicode code point descriptions in the comment
fields of lex patterns (flex on Debian) in my beta2uni program (part
of my unibetacode package), which converts Beta Code to Unicode.  Here
are a few such lines of code:

\*\/[Aa] print_pattern (yytext, 0x0386);  /* GREEK CAPITAL LETTER ALPHA WITH TONOS    */
\*\/[Ee] print_pattern (yytext, 0x0388);  /* GREEK CAPITAL LETTER EPSILON WITH TONOS  */
\*\/[Hh] print_pattern (yytext, 0x0389);  /* GREEK CAPITAL LETTER ETA WITH TONOS      */
\*\/[Ii] print_pattern (yytext, 0x038A);  /* GREEK CAPITAL LETTER IOTA WITH TONOS     */
\*\/[Oo] print_pattern (yytext, 0x038C);  /* GREEK CAPITAL LETTER OMICRON WITH TONOS  */
\*\/[Uu] print_pattern (yytext, 0x038E);  /* GREEK CAPITAL LETTER UPSILON WITH TONOS  */
\*\/[Ww] print_pattern (yytext, 0x038F);  /* GREEK CAPITAL LETTER OMEGA WITH TONOS    */
etc.

I used the utf8gen program (another package that I wrote and then
debianized) to create those lines of code, typing in the regular
expressions myself by hand after utf8gen did the monotonous work of
printing everything to the right of those patterns on each line for me
from data that I had pre-extracted from a Unicode data file.
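
To illustrate the kind of generation step described above, here is a
minimal sketch in Python.  It is not utf8gen's actual implementation:
it assumes Python's standard unicodedata module as the source of
character names, and the function name pattern_action is hypothetical.
It prints only the part to the right of the regular expression, which
would still be typed by hand.

```python
import unicodedata


def pattern_action(code_point: int, width: int = 40) -> str:
    """Build the action and trailing comment for one lex rule.

    The regular expression on the left of the rule is deliberately not
    generated here. `width` pads the name so the closing comment
    markers line up (an assumed layout choice).
    """
    # unicodedata.name() raises ValueError for unnamed code points.
    name = unicodedata.name(chr(code_point))
    return f"print_pattern (yytext, 0x{code_point:04X});  /* {name:<{width}} */"


if __name__ == "__main__":
    # A few of the Greek code points from the excerpt above.
    for cp in (0x0386, 0x0388, 0x0389, 0x038A, 0x038C, 0x038E, 0x038F):
        print(pattern_action(cp))
```

Each printed line can then be prefixed with the hand-written pattern
for that code point, with the character name visible on the same line.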

I had to have the Unicode names in front of me to type the correct
regular expression for each code point.  The way I did that also will
help me or anyone else debug the program in the future.

Were I to attempt to pull such comment strings from another package on
the fly, I would have to write a program that knew which lines in my
source code needed those comment strings, fetch them from said
external package, and create a new source code file for lex/flex
before building the final program.  Apart from the most obvious
immediate inconveniences of doing that, two others come to mind:

1) I could not then produce the source file in final form without
running on a distro such as Debian that implemented a packaging
scheme, or providing complicated build instructions for an end user
(most likely a student of ancient Greek who would not have deep
knowledge of building software packages).  As implemented, my
unibetacode package builds and installs on many distros just the way
it is, including on non-GNU/Linux systems thanks to the modern miracle
of GNU Autotools.

2) I would have to perform such a partial build just to read the
comments that I intended for debugging (and I would have had to resort
to an external table while typing in the generating regular
expressions rather than having them conveniently on the same line of
code).

There would also be the impracticality of telling such groups as the
Linux kernel developers and other upstream teams that they must switch
to using the Unicode package that Debian provides for their future
builds.


OTOH, packaging the Unicode data files could be useful for other,
unrelated purposes.  Of course, such a package would be one more
instance of the need for the Unicode Consortium's license and
(lengthy) copyright information in yet one more package's
debian/copyright file. :-)

Yet that still doesn't answer the question of whether or not Debian
would find such a common file of Unicode license & copyright terms
useful...but the text is there if Debian makes that decision.  If not,
at least I took the time to make it available.

Thanks,


Paul Hardy

#910548#29
Date:
2018-10-19 06:22:26 UTC
From:
To:
Paul Hardy <unifoundry@gmail.com> writes:
in some of these packages may be of a different nature, but it might be
worth pointing out that data such as what you show above is probably not
copyrightable under at least US law (and I think that exception is fairly
common internationally).  It therefore probably isn't meaningfully under
any license, although including the Unicode Data License in the package
doesn't hurt.

Disclaimer: I'm not a lawyer, let alone a copyright lawyer.

#910548#34
Date:
2018-10-19 07:19:10 UTC
From:
To:
That was my thinking.  Plus it would have been acutely embarrassing to
run afoul of the Unicode Consortium's license terms, even if
unintentional, after they so kindly gave me a lifetime membership
gratis[1]. :-)

Me neither, so I erred on the side of caution.

Thanks,


Paul Hardy

[1] http://unicode.org/consortium/members.html

#910548#39
Date:
2019-01-26 16:47:52 UTC
From:
To:
Unicode's new version for 2019 is attached, with data files in
http://www.unicode.org/ivd/data/ explicitly mentioned as covered under
the license.  The source text is at
http://www.unicode.org/copyright.html.

Thanks,


Paul Hardy

#910548#44
Date:
2019-01-31 06:47:36 UTC
From:
To:
According to my wrangling of codesearch.debian.net, unicode.org gets mentioned
in over 1,000 packages and it's mostly about this data.  I think that's enough
to merit inclusion in common-licenses.

Scott K

#910548#51
Date:
2021-04-01 23:14:22 UTC
From:
To:
Paul Hardy <unifoundry@gmail.com> writes:

Hi Paul,

It looks like you included the entire statement from the web site, which I
think is intended to cover the whole web site.  As near as I can tell, the
files that Debian is packaging are the ones referenced by this stanza:

    4. Further specifications of rights and restrictions pertaining to the
       use of the particular set of data files known as the "Unicode
       Character Database" can be found in the License.

and therefore appear to only be covered by this license instead:

https://www.unicode.org/license.html

The full license that you formatted includes a bunch of other clauses like
choice of venue and unilateral license changes that I don't think are
intended to cover the things that we're packaging.  I think we should
therefore consider incorporating only the above text instead?

Scott Kitterman <debian@kitterman.com> writes:

Could you provide more detail on your search?  I searched for:

    path:debian/copyright DOWNLOADING, INSTALLING, COPYING OR OTHERWISE
    USING UNICODE INC

and only found 26 packages.  I'm not sure that's enough to warrant
inclusion in common-licenses.
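
For anyone wanting to repeat that count against locally unpacked
sources, the check can be sketched as below.  This is only an
illustration, not how codesearch.debian.net works: the function names
and the sample package names are hypothetical, and whitespace is
normalized because debian/copyright files wrap and indent the license
text.

```python
import re

# Phrase taken from the search quoted above.
LICENSE_PHRASE = "DOWNLOADING, INSTALLING, COPYING OR OTHERWISE USING UNICODE INC"


def normalize(text: str) -> str:
    """Collapse runs of whitespace and upper-case the text for matching."""
    return re.sub(r"\s+", " ", text).strip().upper()


def packages_quoting_license(copyright_files: dict[str, str]) -> list[str]:
    """Return the sorted package names whose copyright text has the phrase."""
    needle = normalize(LICENSE_PHRASE)
    return sorted(
        name
        for name, text in copyright_files.items()
        if needle in normalize(text)
    )


if __name__ == "__main__":
    # Hypothetical sample data standing in for real debian/copyright files.
    sample = {
        "example-a": "License: Unicode\n DOWNLOADING, INSTALLING, COPYING OR"
                     " OTHERWISE\n USING UNICODE INC.'S DATA FILES ...",
        "example-b": "License: GPL-2+\n This program is free software ...",
    }
    print(packages_quoting_license(sample))
```

A looser search (for example, any mention of unicode.org) would match
far more packages than this exact phrase does.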
