#975684 jdupes: Building packages with jdupes fails reproducibility tests

Package:
jdupes
Source:
jdupes
Description:
identify and delete or link duplicate files
Submitter:
Carlos Henrique Lima Melara
Date:
2022-03-16 20:36:01 UTC
Severity:
important
#975684#5
Date:
2020-11-25 02:18:49 UTC
From:
To:
Dear Maintainer, hello.

I use jdupes in the build process of a package I maintain [1]. It's an icon
theme and lots of png files are equal so jdupes is used to symlink them. The
package uses salsa CI and previous versions (before October) passed reprotest
(reproducibility test) on salsa [2] and on debian infrastructure [3]. But this
week I started to rearrange some things on the package and noticed that it
was failing salsa CI reprotest [6]. Then I also checked the Debian tests and
a strange thing showed up [3]: only one test failed after October.

Snippet of debian/rules:

 override_dh_auto_install:
	dh_auto_install
	# Removes duplicate files using softlinks
	jdupes -lr debian/paper-icon-theme/usr/share/icons

I then went to check both logs. The first, from salsa CI, indicated that the
.deb files were different (I will attach the whole debdiff to this report). As
an example, you can see a small part of it below:

  Files in second .deb but not in first
  -rw-r--r--  root/root   /usr/share/icons/Paper/16x16/apps/cmake-setup.png
  lrwxrwxrwx  root/root   /usr/share/icons/Paper/16x16/apps/CMakeSetup.png -> cmake-setup.png
  lrwxrwxrwx  root/root   /usr/share/icons/Paper/16x16/apps/cmake.png -> cmake-setup.png
  Files in first .deb but not in second
  -rw-r--r--  root/root   /usr/share/icons/Paper/16x16/apps/cmake.png
  lrwxrwxrwx  root/root   /usr/share/icons/Paper/16x16/apps/CMakeSetup.png -> cmake.png
  lrwxrwxrwx  root/root   /usr/share/icons/Paper/16x16/apps/cmake-setup.png -> cmake.png

The same story is seen on Debian infrastructure test (in the one that failed)
[4].

So I decided to test whether jdupes was the source of the problem (this
happened before I had viewed the diffs) by creating a branch on salsa to build
without jdupes. It went perfectly [5], but it doubled the .deb size :( .

The next step was trying to find some way out of this. I tried the --order
parameter with 'time' and 'name', unsuccessfully.

The release of 1.19.0 in Debian was on 2020-10-13 so I suspect that the
irreproducibility was introduced in this version. Although some Debian
tests passed even with this new version of jdupes [3] I wasn't able to
replicate this behaviour on the salsa CI reprotest test, they all failed.

I'm not exactly sure if reproducibility is a focus of jdupes, but it seems
to be a growing concern for the Debian community and I would be very happy
if jdupes were to be reproducibility-friendly.

Sorry for the long report but it might be useful for other maintainers
using jdupes too and hopefully also for fixing this issue.

Regards,
Charles

[1] https://tracker.debian.org/pkg/paper-icon-theme
[2] https://salsa.debian.org/debian/paper-icon-theme/-/jobs/997903
[3] https://tests.reproducible-builds.org/debian/history/paper-icon-theme.html
[4] https://tests.reproducible-builds.org/debian/rb-pkg/unstable/armhf/diffoscope-results/paper-icon-theme.html
[5] https://salsa.debian.org/debian/paper-icon-theme/-/commit/6c3787ce6d3f7dcedf1c55d4b2518e3421230e44/pipelines?ref=reprod_build
[6] https://salsa.debian.org/debian/paper-icon-theme/-/jobs/1185645

#975684#10
Date:
2020-11-25 17:20:45 UTC
From:
To:
Hi Charles and Jody!

Charles, thanks a lot for your detailed report.

Jody, can you help us with the issue described below? jdupes is very
important for Debian because some packages use it to remove duplicate
files when building .deb binaries.

Thanks a lot in advance.

Cheers,

Eriberto

On Tue, 24 Nov 2020 at 23:21, Carlos Henrique Lima Melara
<charlesmelara@outlook.com> wrote:

#975684#24
Date:
2020-11-25 17:46:53 UTC
From:
To:
The files are arriving during recursion in a different order, so the
primary link target is also being chosen in a different order.

What version was in use and working properly before the switch to v1.19.0?
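
The effect described above can be illustrated with a minimal sketch. It is not jdupes's actual code: `plan_links`, the `digest_of` mapping and the two arrival orders are hypothetical stand-ins, and the only assumption is that the first file seen in a duplicate set becomes the link target. The file names come from the debdiff in the original report.

```python
from collections import defaultdict


def plan_links(arrival_order, digest_of):
    """Map each duplicate to its link target.

    Assumption for illustration only: the first path seen for a given
    content digest is kept as the real file, later ones become symlinks.
    """
    groups = defaultdict(list)
    for path in arrival_order:
        groups[digest_of[path]].append(path)

    plan = {}
    for paths in groups.values():
        target, *dupes = paths
        for dupe in dupes:
            plan[dupe] = target
    return plan


# Three identical icons (same hypothetical digest).
digest_of = {
    "cmake.png": "aaa",
    "CMakeSetup.png": "aaa",
    "cmake-setup.png": "aaa",
}

# Two hypothetical directory orders, one per build.
first_build = ["cmake.png", "CMakeSetup.png", "cmake-setup.png"]
second_build = ["cmake-setup.png", "cmake.png", "CMakeSetup.png"]

plan_a = plan_links(first_build, digest_of)
plan_b = plan_links(second_build, digest_of)

# Sorting the input first removes the dependence on arrival order.
stable_a = plan_links(sorted(first_build), digest_of)
stable_b = plan_links(sorted(second_build), digest_of)
```

With the unsorted inputs the two plans pick different targets (cmake.png versus cmake-setup.png), which matches the shape of the debdiff; with sorted inputs both builds produce the same plan.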

#975684#34
Date:
2020-11-25 17:55:47 UTC
From:
To:
Windows but potentially affecting any OS). Redoing the currently
recursive routines will also give me a chance to set up some pre-sorting
to stabilize this behavior and set up some proper unit tests in the
program source.

I've added this to the jdupes GitHub issue tracker here:
https://github.com/jbruchon/jdupes/issues/152

#975684#44
Date:
2020-11-25 22:23:49 UTC
From:
To:
Hi, Jody and Eriberto.

Thanks for the quick reply.

The tests conducted before 2020-10-13 were using jdupes 1.18.2 (all
tests passed). I only introduced the package in September (after 1.18.2
was released) so I don't know if this was a problem before that release.

Cheers,
Charles

#975684#51
Date:
2022-03-16 20:05:10 UTC
From:
To:
Hi, Jody and Eriberto.

Recently I received a patch [1] to make paper reproducible. The email
contained relevant information about the reproducibility problem with
regard to filesystem operations. Therefore, I'm forwarding
2 emails to you in the hope that they can be of assistance.

Regards,
Charles

[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1005955

----- Forwarded message from Chris Lamb <lamby@debian.org> -----

Date: Thu, 17 Feb 2022 17:47:08 -0800
From: Chris Lamb <lamby@debian.org>
To: submit@bugs.debian.org
Subject: Bug#1005955: paper-icon-theme: please make the build reproducible
X-Mailer: MessagingEngine.com Webmail Interface
User-Agent: Cyrus-JMAP/3.5.0-alpha0-4778-g14fba9972e-fm-20220217.001-g14fba997

Source: paper-icon-theme
Version: 1.5.0+git20200312.aa3e8af-3
Severity: wishlist
Tags: patch
User: reproducible-builds@lists.alioth.debian.org
Usertags: filesystem
X-Debbugs-Cc: reproducible-bugs@lists.alioth.debian.org

Hi,

Whilst working on the Reproducible Builds effort [0] we noticed that
paper-icon-theme could not be built reproducibly.

This is caused by jdupes iterating over its arguments using the
filesystem ordering, instead of using their filenames.

A patch is attached that sorts the input prior to passing it to jdupes
(using find, sort and xargs), but it may be more sensible that jdupes
does this itself. Indeed, the jdupes manpage implies that it should do
this, but I leave this up to your (almost certainly more informed)
judgement.

 [0] https://reproducible-builds.org/

Regards,

--- a/debian/rules	2022-02-17 17:21:20.786897478 -0800
--- b/debian/rules	2022-02-17 17:40:21.992308727 -0800
@@ -7,4 +7,4 @@
 override_dh_auto_install:
 	dh_auto_install
 	# Remove duplicate files using softlinks
-	jdupes -lr debian/paper-icon-theme/usr/share/icons
+	find -type f -print0 debian/paper-icon-theme/usr/share/icons | sort -z | xargs -0r jdupes -Ol

----- End forwarded message -----

----- Forwarded message from Chris Lamb <lamby@debian.org> -----

Date: Tue, 08 Mar 2022 12:49:02 -0000
From: Chris Lamb <lamby@debian.org>
To: Carlos Henrique Lima Melara <charlesmelara@outlook.com>
Cc: 1005955-quiet@bugs.debian.org
User-Agent: Cyrus-JMAP/3.5.0-alpha0-4778-g14fba9972e-fm-20220217.001-g14fba997
Subject: Re: paper-icon-theme: please make the build reproducible

Hi Carlos,

ordering issue, it's unsurprising that you could not easily reproduce
it locally. Unless you use disorderfs or similar, most file systems
will happen to return directory entries in the same order if asked
multiple times in a row; the underlying problem is that this is not
defined and/or deterministic.

Best wishes,

----- End forwarded message -----

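
As a rough illustration of the point made in the forwarded messages (directory order is not defined, so it has to be imposed by the caller), the following sketch walks a tree and sorts the entries of each directory by name before using them. It is a hypothetical helper, not part of jdupes or of the attached patch; `sorted_files` and the temporary tree are assumptions made for the example. Note that it sorts per directory, which gives a stable order but not necessarily the exact order that sorting full paths would give.

```python
import os
import tempfile


def sorted_files(root):
    """Return every file under root in a stable, name-sorted order."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Sorting dirnames in place controls the order in which
        # os.walk descends into subdirectories (top-down mode).
        dirnames.sort()
        for name in sorted(filenames):
            found.append(os.path.join(dirpath, name))
    return found


# Build a small throwaway tree with the icon names from the report.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "apps"))
    for name in ("cmake.png", "CMakeSetup.png", "cmake-setup.png"):
        with open(os.path.join(root, "apps", name), "wb") as fh:
            fh.write(b"same bytes")

    # Paths relative to root, independent of what the filesystem returned.
    listing = [os.path.relpath(p, root) for p in sorted_files(root)]
```

Feeding a list built this way to a deduplication step means the first file of each duplicate set no longer depends on the order the filesystem happens to return.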
#975684#56
Date:
2022-03-16 20:07:48 UTC
From:
To:
I'll get on this issue as soon as possible. It may be next week, but I'll work on it by then.
#975684#61
Date:
2022-03-16 20:31:54 UTC
From:
To:
On Wed, 16 Mar 2022 at 17:15, Jody Bruchon
<jody@jodybruchon.com> wrote:

Thanks a lot Jody. When done, I will test with paper-icon-theme,
before sending it to Debian. If all is right, I think I will reassign
the bug to jdupes package and close the bug.

Cheers,

Eriberto