- Package:
- licensecheck
- Source:
- licensecheck
- Submitter:
- Jan Hauke Rahm
- Date:
- 2025-04-08 11:39:02 UTC
- Severity:
- wishlist
- Tags:
- Blocked By:
-
Bug Title 828948 3
licensecheck: file parsing: extract strings from binary files with binwalk or hachoir wishlist stable testing unstable about 1 year ago
960694 4
licensecheck: detect and skip binary files by default wishlist stable testing unstable about 1 year ago
828941 5
licensecheck: file parsing: extract strings from binary files with binwalk or hachoir wishlist stable testing unstable about 1 year ago
960695 4
licensecheck: add usage option, to cover more than current "only look for source code" wishlist stable testing unstable about 1 year ago
837973 4
licensecheck: file parsing: extract metadata from fonts normal stable testing unstable about 1 year ago
526701 5
licensecheck: options: recursive scan should check all files (--check=".*" by default) normal stable testing unstable about 3 years ago
545193 5
licensecheck: options: recursive scan should check all files (--recursive should imply --check=".*") normal stable testing unstable about 3 years ago
Dear devscripts devel team, since there is a proposal to change debian/copyright files to a machine-interpretable format (described at [1]), I'd like licensecheck to be able to autogenerate such a file. This would of course not prevent the maintainer from checking the license and copyright information himself, but it could do the groundwork for him. If something like that is already planned, I'm sorry for this bug. Thanks for your work! Hauke [1] http://wiki.debian.org/Proposals/CopyrightFormat
Dear devscripts developers, I'm not really a perl hacker, but I had a try at adding this feature to licensecheck myself. It is *NOT* ready and does *NOT* do what it should. It just adds a possible way to output license information as requested by the copyright proposal. I tried to comment every meaningful line I've added so that you can clearly understand what I did. Furthermore, I tried not to touch anything you wrote, so I don't think that I broke anything. Although I added an option "createfile", my patch does not create a new copyright file. It just prints whatever could be put into such a file. I guess doing the file handling is not the most difficult part of the job. :) I'd like to hear your comments on my patch. If you feel it is not worth working on, just tell me. I can stop at any time. Cheers, Hauke
Hi, I've come up with a reworked patch [attached] which might help. I've intentionally removed the grouping of similarly licensed/copyrighted files to make it clear that it is a skeleton and must be checked. Also, I've added only the license keywords currently present in DEP5 http://dep.debian.net/deps/dep5/ thanks, filippo -- Filippo Giunchedi - http://esaurito.net - 0x6B79D401 It's not that I'm afraid to die, I just don't want to be there when it happens. -- Woody Allen
Hi, quite some time has passed without any news in this bug report. I'm trying again with a new patch. It uses parts of the previously submitted patches, but differs in some aspects:
* I wrote it first, and then checked the BTS for patches - my bad ;-).
* I re-added the file grouping according to license+copyright similarity, as IMVHO the removal of this feature defeats the whole purpose of automating this information retrieval.
* It parses all licenses (known to licensecheck), irrespective of whether they are mentioned in DEP5, as otherwise the previous patch might have missed some licenses completely. Non-standardized (as seen from DEP5) license abbreviations are still prefixed by "other", for clarity.
It works for me with a real-world example, without breaking current functionality. Of course, an arbitrary amount of further polishing is possible, but I think it would already be worth adding in its current state, as it at least helps a bit in the creation of a machine-readable debian/copyright file. Even with this, there is of course enough work left for the maintainer to get her/his debian/copyright file into shape. Thanks for your consideration! I am of course very open to any kind of comments! Best Regards, Jan
While discussing the topic of using cdbs on pkg-multimedia, I found that
there are some interesting bits in cdbs that might have a better home in
the devscripts package. Among others is the licensecheck2dep5 script,
which can be found attached. It is driven by these rules in
1/rules/utils.mk, which can be trivially translated to shell or perl:
,----[ excerpt from 1/rules/utils.mk.in ]
| debian/stamp-copyright-check:
| @set -e; if [ ! -f debian/copyright_hints ]; then \
| echo; \
| echo '$(if $(DEB_COPYRIGHT_CHECK_STRICT),ERROR,WARNING): copyright-check disabled - touch debian/copyright_hints to enable.'; \
| echo; \
| $(if $(DEB_COPYRIGHT_CHECK_STRICT),exit 1,:); \
| elif [ licensecheck = $(DEB_COPYRIGHT_CHECK_SCRIPT) ] && ! which licensecheck > /dev/null; then \
| echo; \
| echo '$(if $(DEB_COPYRIGHT_CHECK_STRICT),ERROR,WARNING): copyright-check disabled - licensecheck (from devscripts package) is missing.'; \
| echo; \
| $(if $(DEB_COPYRIGHT_CHECK_STRICT),exit 1,:); \
| elif [ licensecheck = $(DEB_COPYRIGHT_CHECK_SCRIPT) ] && ! licensecheck --help | grep -qv -- --copyright; then \
| echo; \
| echo '$(if $(DEB_COPYRIGHT_CHECK_STRICT),ERROR,WARNING): copyright-check disabled - licensecheck (from devscripts package) seems older than needed 2.10.7.'; \
| echo; \
| $(if $(DEB_COPYRIGHT_CHECK_STRICT),exit 1,:); \
| else \
| echo; \
| echo 'Scanning upstream source for new/changed copyright notices...'; \
| echo; \
| echo "$(DEB_COPYRIGHT_CHECK_INVOKE) | $(_cdbs_scripts_path)/licensecheck2dep5 > debian/copyright_newhints"; \
| export LC_ALL=C; \
| $(DEB_COPYRIGHT_CHECK_INVOKE) | $(_cdbs_scripts_path)/licensecheck2dep5 > debian/copyright_newhints; \
| echo "`grep -c ^Files: debian/copyright_hints` combinations of copyright and licensing found."; \
| newstrings=`diff -a -u debian/copyright_hints debian/copyright_newhints | sed '1,2d' | egrep -a '^\+' - | sed 's/^\+//'`; \
| if [ -n "$$newstrings" ]; then \
| echo "$(if $(DEB_COPYRIGHT_CHECK_STRICT),ERROR,WARNING): The following (and possibly more) new or changed notices discovered:"; \
| echo; \
| echo "$$newstrings" \
| | perl -ne '/^.{0,60}$$/ or s/^(.{0,60})\b.*$$/$$1…/;s/[^[:print:][:space:]…]//g;$$_ ne $$prev and (($$prev) = $$_) and print' \
| | sort -m \
| | head -n 200; \
| echo; \
| echo "To fix the situation please do the following:"; \
| echo " 1) Fully compare debian/copyright_hints with debian/copyright_newhints"; \
| echo " 2) Update debian/copyright as needed"; \
| echo " 3) Replace debian/copyright_hints with debian/copyright_newhints"; \
| $(if $(DEB_COPYRIGHT_CHECK_STRICT),exit 1,:); \
| else \
| echo 'No new copyright notices found - assuming no news is good news...'; \
| fi; \
| rm -f debian/copyright_newhints; \
| fi
| touch $@
`----
Please consider merging this into the devscripts package.
I also note that an alternate implementation has been posted to this bug,
which I haven't examined yet. This shows that there is (some) common
interest in this functionality.
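The central step of the rule quoted above, diffing the stored debian/copyright_hints against freshly generated hints and reporting the added lines, might be sketched in Python roughly as follows. This is only an illustration under assumptions: the function name `new_notices` is hypothetical, the inputs are passed as strings rather than read from files, and the truncation, de-duplication and sorting done by the perl/sort steps of the original rule are left out.

```python
import difflib


def new_notices(old_hints: str, new_hints: str, limit: int = 200) -> list[str]:
    """Return lines that show up as additions when diffing old against new hints.

    Hypothetical helper: mirrors the `diff -u | sed '1,2d' | egrep '^\\+'`
    part of the quoted rule, capped at `limit` lines like `head -n 200`.
    """
    diff = difflib.unified_diff(
        old_hints.splitlines(), new_hints.splitlines(), lineterm=""
    )
    added = []
    for index, line in enumerate(diff):
        if index < 2:
            continue  # skip the '---' / '+++' header lines, like sed '1,2d'
        if line.startswith("+"):
            added.append(line[1:])  # drop the leading '+', like sed 's/^\+//'
    return added[:limit]


if __name__ == "__main__":
    old = "Files: a.c\nCopyright: 2008, A\nLicense: GPL-2+\n"
    new = "Files: a.c\nCopyright: 2008, A\n 2010, B\nLicense: GPL-2+\n"
    for notice in new_notices(old, new):
        print(notice)
```

An empty result corresponds to the "no news is good news" branch of the rule; a non-empty one corresponds to the warning (or error, in strict mode) branch.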
Hi, I developed the CDBS licensecheck2dep5 wrapper mentioned by Reinhard, and would like to integrate it, merge it with the other patches in this bugreport, and develop licensecheck further. I have requested membership of the devscripts team, and am in the process of subscribing to the mailinglist now. Hope you are fine letting me work on this. And thanks to Reinhard for pushing me out of my cave :-P - Jonas
Wonderful, thanks! Extra helping hands with devscripts are always welcome, especially if you plan to work not only on licensecheck but also to help out with some other devscripts :-P I'm no admin of the project though, so I cannot directly give you commit access ... Cheers.
Hi Jonas, our DPL is thankful and happy about your work and would welcome your helping hands with devscripts although he is no project admin and thus could not grant you commit access ... but he forgot to CC you in his reply to #472199 (possibly because he assumed you did already subscribe to the list). I'm just a devscripts user, but it looks like the standalone perl script licensecheck2dep5 Jonas wrote for CDBS could be added to devscripts in its current state if there would be a manpage for it. There are two ways I would prefer to adding another script to /usr/bin: * Placing this script in /usr/share/devscripts and add an option to licensecheck that would filter its output through licensecheck2dep5. * Moving the script into a sub of licensecheck and adding an appropriate option to licensecheck. Regards Carsten
Hi Carsten (and Zack), Indeed I did not receive the email from Zack (I blame my own mailinglist subscription fumbling for that, however - will try again after sending off this message). My intended plan for licensecheck2dep5 (currently part of cdbs) is not to ship it as-is with devscripts, but rather to do as you propose as the second option above: integrate it with licensecheck itself, controlled by a commandline option. I notice that someone silently(?) added me to the devscripts team already, so I take that as an approval of my proposal to work on this - even more official than a go from our DPL ;-) Thanks for pushing me! Please repeat if I seem too slow progressing :-) - Jonas
I was manually generating dep5 copyright file for ruby-sigar and thought it would be nice to be able to auto-generate it (almost all files have different copyright terms, though most of them is apache 2.0). I was going to file a wishlist bug, but found this. I would like to know if it is planned for a new devscripts version. Thanks Praveen
Hi Jonas, I've just discovered /usr/lib/cdbs/licensecheck2dep5, thanks to a reply from Vincent Cheng [1]. [1] https://lists.debian.org/debian-mentors/2012/08/msg00367.html As soon as I noticed that licensecheck2dep5 is part of the cdbs package, I wondered why it was not part of devscripts (which looks like the most obvious package where this tool should be shipped). Even better, I think it should be integrated as a possible output format for licensecheck itself... Then I found out that I am not the only one with this opinion. In bug #472199 (I am Cc:ing its e-mail address), a number of different people asked for this feature to be implemented in licensecheck and you stepped in, expressing the intention to integrate this capability into licensecheck. This sounds really great! Thanks a lot! Unfortunately there seems to be no visible progress since August 2010... Is there any well-hidden progress, by chance?!? I am afraid that my Perl knowledge is just a smattering and it is rusty, too... As a consequence, I will probably not be able to help much with the implementation. But I may help with some testing and feedback, if you are interested. Please let me know. Thanks for your time!
Hi Francesco, Unfortunately the only progress is in my head. :-/ I have been granted write access to devscripts and have followed its mailinglist since back then, but have not yet taken time to actually do this work. I am still interested in this, and expect to make time for it in the upcoming months. I have some ideas on how to support both very strict and (as most others than myself seem interested in) more relaxed scanning. Your nudging me like this is helpful for me to bump up the priority. So thank you for that! I shall make noise on the bugreport when reaching something concrete to test, so please subscribe to that if you wanna help out. I also kinda expect others that might wanna dive in to tackle this to make noise at the bugreport. That way hopefully we avoid wasting time on multiple concurrent efforts (in case I continue to not come up with something). - Jonas
On Wed, 29 Aug 2012 10:24:19 +0200 Jonas Smedegaard wrote: [...] [...] That's definitely well-hidden! ;-) I hope you have a good backup strategy for your head... ;-D Good, so you are ready to start... That's great news, indeed! If I understand you correctly, I think the most important mode is the "strict" one: each file has to be scanned and classified as accurately as possible. You're welcome. Or, at least, I hope I managed to do so: I am not used to subscribing to bug reports, this is the first time I try... Thanks for your time! Looking forward to having something to test. Bye.
[...] Hi again, Jonas! Any news? I've just discovered bug #597861, which seems to be an interesting read for someone (like you) willing to integrate licensecheck2dep5 into licensecheck. Especially the part [1] where a fork of licensecheck2dep5 is talked about... [1] http://bugs.debian.org/597861#31 Please consider joining forces with other people willing to address this same issue, if possible. Looking forward to receiving some (good) news from you. Bye.
This functionality is now provided by 'cme update dpkg-copyright' with cme and libconfig-model-dpkg-perl. See https://ddumont.wordpress.com/2015/04/05/improving-creation-of-debian-copyright-file/ Note that licensecheck provides one entry per scanned file. On the other hand, 'cme' coalesces similar entries into the same DEP-5 Files entry. This greatly reduces the size of the generated file. All the best
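To illustrate the difference between one entry per file and coalesced entries, here is a minimal Python sketch. It is not how cme or licensecheck implement this; the function name `coalesce`, the input triples and the sample names are all hypothetical, and real tools handle many more cases (wildcards, directory patterns, copyright year ranges).

```python
from collections import defaultdict


def coalesce(entries):
    """Group (path, copyright, license) triples into DEP-5-style Files stanzas.

    Hypothetical helper: files sharing the same copyright and license
    end up in one stanza instead of one stanza per file.
    """
    groups = defaultdict(list)
    for path, copyright_, license_ in entries:
        groups[(copyright_, license_)].append(path)

    stanzas = []
    for (copyright_, license_), paths in sorted(groups.items()):
        # Continuation lines of a multi-line field start with a space.
        files = "\n ".join(sorted(paths))
        stanzas.append(
            f"Files: {files}\nCopyright: {copyright_}\nLicense: {license_}"
        )
    return "\n\n".join(stanzas)


if __name__ == "__main__":
    sample = [
        ("src/a.c", "2008, Jane Doe", "GPL-2+"),
        ("src/b.c", "2008, Jane Doe", "GPL-2+"),
        ("lib/x.c", "2010, John Roe", "Expat"),
    ]
    print(coalesce(sample))
```

With three input files and two distinct copyright/license pairs, the sketch emits two stanzas rather than three, which is the size reduction described above.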
Thoughts below. First of all, I think backwards compatibility should be maintained, so any change should be an option. I was only asking about the translation of the license tags. But then, if you translate the license tags, it would cost nothing to go to a crude, bloated DEP-5 format. But since I think we agree that we do not wish to attempt to make licensecheck produce "good" DEP-5, it might be best to avoid moving it in that direction. Yes, this would prompt people to raise wishlist bugs for consolidation. Slippery slope. Yes. Still, I can't make my mind up. Done! I am slightly puzzled why https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=519080 is a wishlist bug and not a full-blown bug.
Dear Maintainer, currently the following methods exist for creating a DEP-5 compliant copyright file:
  debmake -c
  cme check dpkg-copyright
  licensecheck -r --merge-licenses --deb-machine .
Thus this wishlist bug can probably be closed. Cheers, Lars
Quoting Lars Kruse (2018-05-22 02:17:49) Let's instead repurpose this bugreport to track making _ideal_ DEP5 output with licensecheck. What we have now is this: licensecheck -r --merge-licenses --deb-machine --lines 0 . The plan is to integrate these scripts: cdbs:/usr/lib/cdbs/license-miner cdbs:/usr/lib/cdbs/licensecheck2dep5 When that's done, we will likely have different opinions on what is ideal - how aggressively (and thus slow) licensecheck should scan, and how the DEP-5 output should be laid out (e.g. if same-licensed files by different authors should be grouped or not). Hopefully we can identify some broad patterns that groups of us can agree on being ideal, and licensecheck can offer such "profiles" under a --usage option. - Jonas