- Package:
- debian-policy
- Source:
- debian-policy
- Submitter:
- Paul Hardy
- Date:
- 2023-07-02 11:45:02 UTC
- Severity:
- wishlist
- Blocked By:
- 885698: Update and document criteria for inclusion in /usr/share/common-licenses (important)
Hello,

I recently formatted the Unicode Data license for the d/copyright file of a Debian package that I created. I thought I would offer it to Debian if you are interested. You probably do not want the Copyright stanza, and you might not want the Comment stanza, but I erred on the side of too much rather than too little.

Unicode data files are used in a number of free software packages, such as linux-libc-dev and the Linux kernel itself. Use of Unicode data in software is likely to continue growing over time. Thus you might find this useful.

Thank you,
Paul Hardy
reassign 910548 debian-policy
thanks

Hello.

According to /usr/share/doc/base-files/README, the decision to include a license or not is delegated to the Debian Policy Group, so I'm reassigning this bug.

Thanks.
Duplication of such data among multiple packages does not seem like a feature, and certainly not enough duplication to justify a common-licenses entry. I would hope that most such uses could pull in these data files from a common package.
Josh, I understand your intention, but it's not that straightforward. The data that I saw in Debian packages I looked through used various pieces of property data from various files from the Unicode Consortium within pre-built arrays also containing other data, though I didn't look through all packages that used Unicode data by any means.

In my case, I used Unicode code point descriptions in the comment fields of lex patterns (flex on Debian) in my beta2uni program (part of my unibetacode package), which converts Beta Code to Unicode. Here are a few such lines of code:

    \*\/[Aa] print_pattern (yytext, 0x0386); /* GREEK CAPITAL LETTER ALPHA WITH TONOS */
    \*\/[Ee] print_pattern (yytext, 0x0388); /* GREEK CAPITAL LETTER EPSILON WITH TONOS */
    \*\/[Hh] print_pattern (yytext, 0x0389); /* GREEK CAPITAL LETTER ETA WITH TONOS */
    \*\/[Ii] print_pattern (yytext, 0x038A); /* GREEK CAPITAL LETTER IOTA WITH TONOS */
    \*\/[Oo] print_pattern (yytext, 0x038C); /* GREEK CAPITAL LETTER OMICRON WITH TONOS */
    \*\/[Uu] print_pattern (yytext, 0x038E); /* GREEK CAPITAL LETTER UPSILON WITH TONOS */
    \*\/[Ww] print_pattern (yytext, 0x038F); /* GREEK CAPITAL LETTER OMEGA WITH TONOS */

etc.

I used the utf8gen program (another package that I wrote and then debianized) to create those lines of code, typing in the regular expressions myself by hand after utf8gen did the monotonous work of printing everything to the right of those patterns on each line for me from data that I had pre-extracted from a Unicode data file. I had to have the Unicode names in front of me to type the correct regular expression for each code point. The way I did that also will help me or anyone else debug the program in the future.
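For illustration only: the following is a minimal Python sketch of the kind of generation step described above. It is not utf8gen and does not reproduce its interface. It assumes Python's standard unicodedata module as the source of character names, rather than a pre-extracted Unicode data file, and the helper name action_line is hypothetical. The regular expression on the left of each rule would still be typed by hand.

```python
import unicodedata


def action_line(code_point: int) -> str:
    """Build the action and comment that follow a lex pattern.

    Hypothetical helper: the character name comes from Python's bundled
    Unicode database, not from a Unicode Consortium data file.
    """
    name = unicodedata.name(chr(code_point))
    return f"print_pattern (yytext, 0x{code_point:04X}); /* {name} */"


if __name__ == "__main__":
    # Code points taken from the lex rules quoted in the message above.
    for cp in (0x0386, 0x0388, 0x0389, 0x038A, 0x038C, 0x038E, 0x038F):
        print(action_line(cp))
```

Each printed line corresponds to the part of a rule to the right of the pattern, so the character name ends up on the same line as the pattern it documents.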
Were I to attempt to pull such comment strings from another package on the fly, I would have to write a program that knew which lines in my source code needed those comment strings, fetch them from said external package, and create a new source code file for lex/flex before building the final program. Apart from the most obvious immediate inconveniences of doing that, two others come to mind:

1) I could not then produce the source file in final form without running on a distro such as Debian that implemented a packaging scheme, or providing complicated build instructions for an end user (most likely a student of ancient Greek who would not have deep knowledge of building software packages). As implemented, my unibetacode package builds and installs on many distros just the way it is, including on non-GNU/Linux systems thanks to the modern miracle of GNU Autotools.

2) I would have to perform such a partial build just to read the comments that I intended for debugging (and I would have had to resort to an external table while typing in the generating regular expressions rather than having them conveniently on the same line of code).

There would also be the impracticality of telling such groups as the Linux kernel developers and other upstream teams that they must switch to using the Unicode package that Debian provides for their future builds.

OTOH, packaging the Unicode data files could be useful for other, unrelated purposes. Of course, such a package would be one more instance of the need for the Unicode Consortium's license and (lengthy) copyright information in yet one more package's debian/copyright file. :-)

Yet that still doesn't answer the question of whether or not Debian would find such a common file of Unicode license & copyright terms useful...but the text is there if Debian makes that decision. If not, at least I took the time to make it available.

Thanks,
Paul Hardy
Paul Hardy <unifoundry@gmail.com> writes:

in some of these packages may be of a different nature, but it might be worth pointing out that data such as what you show above is probably not copyrightable under at least US law (and I think that exception is fairly common internationally). It therefore probably isn't meaningfully under any license, although including the Unicode Data License in the package doesn't hurt.

Disclaimer: I'm not a lawyer, let alone a copyright lawyer.
That was my thinking. Plus it would have been acutely embarrassing to run afoul of the Unicode Consortium's license terms, even if unintentional, after they so kindly gave me a lifetime membership gratis[1]. :-)

Me neither, so I erred on the side of caution.

Thanks,
Paul Hardy

[1] http://unicode.org/consortium/members.html
Unicode's new version for 2019 is attached, with data files in http://www.unicode.org/ivd/data/ explicitly mentioned as covered under the license. The source text is at http://www.unicode.org/copyright.html. Thanks, Paul Hardy
According to my wrangling of codesearch.debian.net, unicode.org gets mentioned in over 1,000 packages and it's mostly about this data. I think that's enough to merit inclusion in common-licenses. Scott K
Paul Hardy <unifoundry@gmail.com> writes:
Hi Paul,
It looks like you included the entire statement from the web site, which I
think is intended to cover the whole web site. As near as I can tell, the
files that Debian is packaging are the ones referenced by this stanza:
4. Further specifications of rights and restrictions pertaining to the
use of the particular set of data files known as the "Unicode
Character Database" can be found in the License.
and therefore appear to only be covered by this license instead:
https://www.unicode.org/license.html
The full license that you formatted includes a bunch of other clauses like
choice of venue and unilateral license changes that I don't think are
intended to cover the things that we're packaging. I think we should
therefore consider incorporating only the above text instead?
Scott Kitterman <debian@kitterman.com> writes:
Could you provide more detail on your search? I searched for:
path:debian/copyright DOWNLOADING, INSTALLING, COPYING OR OTHERWISE
USING UNICODE INC
and only found 26 packages. I'm not sure that's enough to warrant
inclusion in common-licenses.
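
For illustration only: a count like the one above could be cross-checked locally with a minimal Python sketch such as the following. It does not use codesearch.debian.net. It assumes a directory (here called root, a hypothetical location) that holds one unpacked source tree per package, each with a debian/copyright file, and the function name packages_with_phrase is likewise hypothetical. Whitespace is normalized first because the phrase may wrap across lines in a copyright file.

```python
from pathlib import Path

# Phrase taken from the search quoted above.
PHRASE = "DOWNLOADING, INSTALLING, COPYING OR OTHERWISE USING UNICODE INC"


def packages_with_phrase(root: Path, phrase: str = PHRASE) -> list[str]:
    """Return names of packages whose debian/copyright contains phrase.

    Assumes the layout root/<package>/debian/copyright (an assumption for
    this sketch, not something the bug report specifies).
    """
    needle = " ".join(phrase.split())
    hits = []
    for copyright_file in sorted(root.glob("*/debian/copyright")):
        text = copyright_file.read_text(encoding="utf-8", errors="replace")
        # Collapse runs of whitespace so line wrapping does not hide a match.
        if needle in " ".join(text.split()):
            hits.append(copyright_file.parent.parent.name)
    return hits
```

Calling len(packages_with_phrase(Path("unpacked-sources"))) would then give a package count for whatever set of sources was unpacked; the directory name is only an example.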