#839925 metaphlan2-data: postinst deletes shipped file: /var/lib/metaphlan2-data/markers.fasta
- Package: metaphlan2-data
- Source: metaphlan2-data
- Submitter: Andreas Beckmann
- Date: 2024-02-07 17:15:07 UTC
- Severity: important
- Tags:
Hi,

during a test with piuparts I noticed your package removes files that it
has shipped.

  28m45.5s ERROR: FAIL: debsums reports modifications inside the chroot:
  debsums: missing file /var/lib/metaphlan2-data/markers.fasta (from metaphlan2-data package)

(If I run it manually and don't generate the stuff in postinst, the file
stays installed).

A gut feeling says that I would rather expect the shipped file in
/usr/share and the generated files in /var/lib ...

  Wrote 304203219 bytes to primary EBWT file: /usr/share/metaphlan2/db_v20/mpa_v20_m200.rev.1.bt2
  Wrote 177889404 bytes to secondary EBWT file: /usr/share/metaphlan2/db_v20/mpa_v20_m200.rev.2.bt2

cheers,
Andreas
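
To illustrate why the QA run fails, here is a minimal Python sketch of the kind of check that debsums performs: every file listed in the package's md5sums list must still exist with the recorded checksum. This is not debsums itself; the function name `check_md5sums` is hypothetical, and the sketch assumes the usual md5sums layout of one `<md5 hex digest>  <path relative to the root>` entry per line.

```python
import hashlib
import os
import tempfile


def check_md5sums(md5sums_text, root):
    """Return (missing, modified) lists of paths, relative to root.

    Hypothetical helper: mimics the idea behind a debsums run,
    not its actual implementation.
    """
    missing, modified = [], []
    for line in md5sums_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split on the first run of whitespace only, so paths
        # containing spaces stay intact.
        digest, rel_path = line.split(None, 1)
        full_path = os.path.join(root, rel_path)
        if not os.path.isfile(full_path):
            # This is the case reported above: a shipped file that
            # the postinst removed after generating its outputs.
            missing.append(rel_path)
            continue
        with open(full_path, "rb") as fh:
            actual = hashlib.md5(fh.read()).hexdigest()
        if actual != digest:
            modified.append(rel_path)
    return missing, modified
```

A file that is listed but deleted after installation ends up in the `missing` list, which corresponds to the "missing file" line in the piuparts log.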
Hi Andreas,
I need to admit that removing the file from the user's hard disk is
intended: its only purpose is to create the resulting files, and it is
not needed afterwards. Upstream actually ships the results; to save
bandwidth, the Debian package uses the smaller (and editable text)
fasta format instead. This compromise was discussed on debian-devel.
It was not discussed whether it is OK to remove the intermediate format
afterwards. Can you imagine a solution which does not bloat the user's
hard disk with unused files and does not raise an alarm in Debian's QA
tools?
Kind regards
Andreas.
That's an interesting use case. Feel free to downgrade the severity.

Guillem, do you have any suggestions on how to solve this? In an
abstract view, the package uses a custom compression format and a custom
decompressor for (some of) the files it ships.

Andreas
Hi!

<https://wiki.debian.org/Teams/Dpkg/FAQ#Q:_Can_dpkg_handle_volatile_files.3F>

Having played a bit with the generated files, they do not seem to
compress very well, so I don't see any other option. This is in the end
a matter of a trade-off between downloaded data and computation time on
each and every system. Personally I'd favor a bigger file and less time
spent on each and every installed system, because the data will end up
occupying that much space on disk anyway, it's not something downloaded
often (I'd assume), and being an arch:all package it is shared by all
arches.

If you are still going to favor rebuilding at install time, a couple of
possible slight improvements might be to exclude the removed files from
the md5sums files generated at package build time, but this will
probably still trigger QA tooling alarms. And try to get a more accurate
package installed size by setting the Extra-Size substvar to compensate
for the difference (man deb-substvars).

Thanks,
Guillem
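
As a rough illustration of the Extra-Size suggestion, the following Python sketch turns byte counts of generated files into a substvar line. It assumes that Extra-Size is expressed in KiB like Installed-Size and is added on top of it, as described in deb-substvars. The helper names `extra_size_kib` and `substvar_line` are hypothetical, and the two byte counts are only the ones quoted in the original report; the real package may generate further index files, and the size of the removed markers.fasta is not given in this thread.

```python
def extra_size_kib(generated_bytes, removed_bytes=0):
    """Net extra disk usage in KiB, rounded up and never negative.

    generated_bytes: sizes of the files created at install time.
    removed_bytes:   total size of shipped files deleted afterwards.
    """
    net = max(sum(generated_bytes) - removed_bytes, 0)
    return -(-net // 1024)  # ceiling division


def substvar_line(kib):
    """Format the value as a line for a substvars file."""
    return f"Extra-Size={kib}"


if __name__ == "__main__":
    # Byte counts taken from the bowtie2-build output in the report.
    sizes = [304203219, 177889404]
    print(substvar_line(extra_size_kib(sizes)))
```

Whether this value is best written to the package's substvars file or passed on the dpkg-gencontrol command line is a packaging choice the thread does not settle.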
Set wontfix since there is no real solution to this bug, thus cleaning up the list of bugs a bit.