#1105019 sbuild: source.changes includes binary build info

#1105019#5
Date:
2025-05-10 01:39:25 UTC
From:
To:
I reproduced this on a fresh bookworm install.

Steps to reproduce:

1. configure backports repo
2. install git-buildpackage: apt-get install git-buildpackage
3. install sbuild: apt-get install -t bookworm-backports sbuild
4. create an appropriate sbuild environment:
   mkdir -p ~/.cache/sbuild
   mmdebstrap --verbose --mode=unshare --architecture="$(dpkg --print-architecture)" --variant=apt --hook-dir=/usr/share/mmdebstrap/hooks/maybe-merged-usr bookworm ~/.cache/sbuild/bookworm-$(dpkg --print-architecture).tar.zst /etc/apt/sources.list
   /bin/echo -e '$chroot_mode="unshare";\n$clean_source=0;\n1;' > ~/.sbuildrc
5. clone an arbitrary package from Salsa: git clone https://salsa.debian.org/debian/krb5
6. Execute the build:
   cd krb5
   git branch upstream origin/upstream
   git branch pristine-tar origin/pristine-tar
   git checkout -b bookworm origin/bookworm
   gbp buildpackage --git-builder='sbuild -d bookworm --no-run-lintian --source-only-changes' --git-debian-branch=bookworm
7. Observe the presence of the buildinfo in the resulting source.changes:
   grep buildinfo ../*_source.changes


-- System Information:
Debian Release: 12.10
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable-security'), (500, 'stable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 6.1.0-34-amd64 (SMP w/16 CPU threads; PREEMPT)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages sbuild depends on:
ii  adduser         3.134
ii  libsbuild-perl  0.88.4~bpo12+2
ii  perl            5.36.0-7+deb12u2

Versions of packages sbuild recommends:
pn  autopkgtest  <none>
ii  debootstrap  1.0.128+nmu2+deb12u2
ii  iproute2     6.1.0-3
ii  mmdebstrap   1.3.5-7
ii  schroot      1.6.13-3+b2
ii  uidmap       1:4.13+dfsg1-1+b1

Versions of packages sbuild suggests:
ii  e2fsprogs  1.47.0-2
ii  kmod       30+20221128-1
ii  wget       1.21.3-1+deb12u1

-- no debconf information

#1105019#8
Date:
2025-05-10 07:29:41 UTC
From:
To:
Hi Roberto,

Quoting Roberto C. Sanchez (2025-05-10 03:39:25)

thank you for your very detailed bug report. It contains everything I wanted to
know and I appreciate that you performed the steps from a fresh install. I can
verify your findings from within a vanilla debvm of Bookworm as well.

I'll leave some remarks in your reproducer instructions but they do not change
the issue you report.

Note that with sbuild from backports, you no longer need to manually create
the build chroot. If you did not create one, then sbuild will create one for
you automatically, so the chroot creation step above is superfluous.

Some even smaller remarks:

 - you don't need to pass "dpkg --print-architecture" because that's the
   default for --architecture anyways
 - why do you create a --variant=apt chroot as a buildd chroot? It will work
   but you might want to create a --variant=buildd chroot so that you do not
   install all the build-essential packages over and over again
 - you can drop maybe-merged-usr with trixie
 - there is probably no gain in compressing your tarball -- it just wastes
   CPU every time you create it and every time you unpack it

You can avoid tracking each of the branches by using:

 debcheckout --git-track='*' krb5

Another great replacement for a raw 'git clone' is:

 gbp clone https://salsa.debian.org/debian/krb5

Which can also have --add-upstream-vcs but none of this is relevant here,
because:

You can reproduce the bug even without gbp and without downloading the source
first. Here is a smaller reproducer:

sbuild -d bookworm --no-run-lintian --source-only-changes --extra-repository "deb-src http://deb.debian.org/debian bookworm main" hello

This is something that dpkg-genchanges does. Sbuild runs this:

    dpkg-genchanges --build=source

And that will include a reference to the buildinfo if before running this
command, the package was already built. If you run "debian/rules clean" before
running above command, then the resulting .changes file will *not* contain a
reference to the buildinfo.

If you think that this is a bug, you should re-assign this to dpkg.

Thanks!

cheers, josch

#1105019#13
Date:
2025-05-10 12:22:00 UTC
From:
To:
Hi Roberto (2025.05.09_21:39:25_-0400)

The issue that triggered this
(https://salsa.debian.org/freexian-team/debusine/-/issues/884) was the
presence of a *binary* buildinfo (not named *_source.buildinfo) in the
source upload.

Maybe this is just from the early days, before it got that name?

Stefano

#1105019#18
Date:
2025-05-12 08:38:08 UTC
From:
To:
Control: reassign -1 dpkg-dev
Control: found -1 1.21.22

Thanks, doing so as I ran your reproducer and I confirm that the binary
buildinfo is there.

$ grep buildinfo hello_2.10-3_source.changes
 807907dc2f97b2a089e6b05e697eaf157f57767b 5813 hello_2.10-3_amd64.buildinfo
 15285c44b8509fb01454f419171170f9e6468c866df5e59e65036bc9cd35062c 5813 hello_2.10-3_amd64.buildinfo
 7fd5124a2508e44589040f769a152da1 5813 devel optional hello_2.10-3_amd64.buildinfo

Guillem, the issue reported here is that running "dpkg-genchanges
--build=source" in a freshly built tree will include the _<arch>.buildinfo
from the source+binary build run just before, whereas a source-only build
will properly generate a .changes that references a _source.buildinfo.

This introduces a difference in what's generated between users of "sbuild --source-only-changes" and "sbuild --source --no-arch-any --no-arch-all" (or plain dpkg-buildpackage -S).
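
A quick way to tell the two cases apart is to list the buildinfo entries of a .changes file by kind. This is only a minimal sketch: the function name classify_buildinfo is made up for illustration, and it assumes that anything not ending in _source.buildinfo is a binary buildinfo, as with the file names shown above.

```shell
# List the distinct .buildinfo files referenced by a .changes file and
# label each one as a source or a binary (per-architecture) buildinfo.
# Assumption: every checksum/Files entry ends with the file name.
classify_buildinfo() {
    changes=$1
    awk '$NF ~ /\.buildinfo$/ { print $NF }' "$changes" | sort -u |
    while IFS= read -r name; do
        case $name in
            *_source.buildinfo) printf 'source %s\n' "$name" ;;
            *)                  printf 'binary %s\n' "$name" ;;
        esac
    done
}
```

Called as "classify_buildinfo hello_2.10-3_source.changes", this should report a single binary entry for the .changes quoted above.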

Cheers,

#1105019#29
Date:
2025-05-13 00:28:57 UTC
From:
To:
Hi!

With the .buildinfo support introduction, one current requirement is
that any .changes file includes at least one .buildinfo file (so
there's currently no filtering based on build type, nor any counting
to not break on potentially old tooling).

From the grep above, I'm assuming there's no reference to a
*_source.buildinfo file in debian/files, so that means one will not
get included in the generated .changes file (as I think would be
expected if one performs a source-only or source+binary build?).
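
The assumption about debian/files can be checked directly in the build tree. The following is a rough sketch with hypothetical helper names; it assumes the usual debian/files layout where each line starts with the file name, followed by section and priority.

```shell
# Print the .buildinfo files registered in a debian/files list.
# Assumption: the file name is the first whitespace-separated field.
registered_buildinfo() {
    awk '$1 ~ /\.buildinfo$/ { print $1 }' "${1:-debian/files}"
}

# Succeed only if a *_source.buildinfo is registered there.
has_source_buildinfo() {
    registered_buildinfo "${1:-debian/files}" | grep -q '_source\.buildinfo$'
}
```

Run from the top of the unpacked source after the build, "has_source_buildinfo || echo missing" would flag the situation described here.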

I guess what I see here is potentially a problem with how the build is
being driven by sbuild or whatever else is driving it, where there's
at least a missing source build phase (dpkg-buildpackage --build=source),
or parts of it (dpkg-genbuildinfo --build=source).

But then it could be argued that there's potentially a problem with
dpkg-genchanges where it should track how many .buildinfo files are being
distributed, and probably warn (or error out?) if none are, and then if
it has seen a .buildinfo matching the current --build mode, then ignore
other .buildinfo files not matching it. Although this would break
source-only uploads performed as full builds, which was added explicitly
to support that use case when source-only uploads support got added in
Debian. For example I routinely prepare all my uploads with:

  $ dpkg-buildpackage --changes-option=-S

Because I want the artifacts I built to be recorded as part of the
.buildinfo. But if what you really want is a pure source-only upload,
then I think that's what you should be asking your build driver, the
equivalent of:

  $ dpkg-buildpackage --build=full
  $ dpkg-buildpackage --build=source

So adding such filtering would break that use case above. I'd need to
think how to support such filtering, but right now I'm not seeing it.

(Unless I've completely misunderstood the bug report, as I don't think
I've ever really used sbuild, where my main interactions with it are
via code reading and to support it as part of dpkg. :)

Thanks,
Guillem

#1105019#34
Date:
2025-05-13 06:33:49 UTC
From:
To:
can't we change this requirement? .buildinfo files for _source.changes
don't make sense, so we shouldn't create nor distribute them.

#1105019#39
Date:
2025-05-13 10:02:54 UTC
From:
To:
Hi!

(I think we have discussed this in the past. :)

If someone uses dpkg-buildpackage, then build dependencies need to be
satisfied (even for source-only builds), where code gets executed from
the package itself (clean targets etc), so this is also a build that
generates an upload represented in a .changes file. Those can also
affect source package generation, so I still think it does make sense
that they generate a .buildinfo file. I also think reproducible source
packages are an important thing that we already have (at least tooling
wise), which I'd rather not regress support on.

Thanks,
Guillem

#1105019#44
Date:
2025-05-13 11:10:25 UTC
From:
To:
hi Guillem,

indeed! :)

point taken. (not sure whether I previously saw it this way, thus I'd rather
say so now.)

actually we don't have reproducible source packages and last time we looked
(which arguably is 10 years ago) it didn't seem feasible *and* we didn't
see a compelling reason to have them either.

why do you think they are important?

#1105019#49
Date:
2025-05-13 12:24:38 UTC
From:
To:
Hi!

We have had reproducible source packages (barring OpenPGP signatures in
the .dsc files) since pretty much the same time dpkg-deb gained support
for reproducible binary packages. See these commits I found (I don't
recall whether there's been need for anything else more recently):

https://git.dpkg.org/cgit/dpkg/dpkg.git/commit/?id=d959233560317459336d39197f515c2042472762
https://git.dpkg.org/cgit/dpkg/dpkg.git/commit/?id=66a12fb8b22f13bb89dd59bf13db2fb939d3de87
https://git.dpkg.org/cgit/dpkg/dpkg.git/commit/?id=6c32c76ba20b641e14fc1533cecb3ca674850a90

For QA alone this seems important (test suites for example), but in a
security context, to me this seems like a rather important part TBH,
the foundation on which binary package reproducibility is sitting. More
so in scenarios such as the xz attack for example. Reviewing diffoscope
differences is very helpful, but in the end we need to review and modify
the sources, from which the binaries get derived. :)

Thanks,
Guillem

#1105019#54
Date:
2025-05-13 12:58:30 UTC
From:
To:
have you actually tried that?

obviously I agree that being able to reproduce the content would be nice,
however in our tests years ago, not even that was possible, let alone
bit by bit (thus including timestamps).

I guess someone would need to actually investigate some hundred packages
today, to see how things are really today.

#1105019#59
Date:
2025-05-14 08:56:41 UTC
From:
To:
Hi!

Sure, I'd like to assume at the time this got implemented :), and also
as part of every dpkg release:

https://git.dpkg.org/cgit/dpkg/dpkg.git/tree/build-aux/gen-release#n147

Also ISTM that reproducibility of source packages is easier to prove
(at least from the toolchain PoV), than for binary packages, because
most of the generation is driven by the toolchain itself (as seen from
the commits I referenced in dpkg). The only variable and/or potentially
problematic part is the «debian/rules clean» and whether it has side
effects that could affect that generation.

A current test could be something like:

  ,---
  $ apt source dpkg
  $ sq verify --cleartext dpkg_1.22.18.dsc | head -n-1 > dpkg-orig.dsc
  $ cd dpkg-1.22.18
  $ dpkg-buildpackage -us -uc -S
  $ cd ..
  $ diff -u dpkg-orig.dsc dpkg_1.22.18.dsc && echo reproduced source
  reproduced source
  `---

If you recall the specifics, I'd be curious to hear them!

Perhaps my statements were sloppy though. When I said reproducible, I
meant that the toolchain can produce them, assuming the source package
itself does not get in the way via «debian/rules clean». I didn't mean
we have 100% coverage on the Debian archive for example, where as you
point out we (well someone :) would need to practically check whether
that's the case. My assumption is that most would do, but I think it's
realistic to expect that we might find a number of packages where
«debian/rules clean» affects the source generation.

I think whether we can reproduce the same source after a full build
(so the equivalent of a twice in a row build) might perhaps be more
challenging (and I'd expect less reproducibility there), but for a
single download source + full build, we are only concerned about the
«clean» target, as the source generation is performed as the first
thing.

OTOH, I think the current reproducible infra has probably all the
data, and it might just be a matter of checking whether the unsigned
*.dsc (from build-a and build-b) match? :)
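
Such a comparison of unsigned .dsc files could be sketched as follows. The helper names are hypothetical, and the awk script only strips the OpenPGP cleartext armor; it does not verify the signature the way "sq verify" does in the test above.

```shell
# Print the body of a .dsc without its OpenPGP cleartext armor.
# NOTE: this strips the signature, it does NOT verify it.
# Unsigned .dsc files are passed through unchanged.
unsigned_dsc() {
    awk '
        /^-----BEGIN PGP SIGNED MESSAGE-----$/ { in_header = 1; next }
        in_header { if ($0 == "") in_header = 0; next }
        /^-----BEGIN PGP SIGNATURE-----$/ { exit }
        { sub(/^- /, ""); print }   # undo dash-escaping
    ' "$1"
}

# Succeed if two .dsc files have the same content once signatures are
# ignored (trailing blank lines are dropped by the command substitution).
same_source() {
    a=$(unsigned_dsc "$1") && b=$(unsigned_dsc "$2") && [ "$a" = "$b" ]
}
```

With the build-a and build-b artifacts at hand, "same_source build-a/foo.dsc build-b/foo.dsc" would answer the question per package; the paths here are placeholders.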

Thanks,
Guillem

#1105019#64
Date:
2025-05-16 12:00:37 UTC
From:
To:
hi,

oh nice!

I've just checked devscripts and developers-reference, and much to my
surprise their source packages indeed built bit by bit identical:

$ diffoscope p1/developers-reference_13.19_source.changes p2/developers-reference_13.19_source.changes
--- p1/developers-reference_13.19_source.changes
+++ p2/developers-reference_13.19_source.changes
├── Files
│ @@ -1,4 +1,4 @@
│
│   6c2a48c479ecd9d4710b64549f8ef44a 1644 doc optional developers-reference_13.19.dsc
│   283e1516834500ab48daf62c74714af2 575920 doc optional developers-reference_13.19.tar.xz
│ - 3afde36f59e56164068ad521f11bc60a 6057 doc optional developers-reference_13.19_source.buildinfo
│ + e3d438ba597ef522c68b9a730a7b32d4 6057 doc optional developers-reference_13.19_source.buildinfo
├── developers-reference_13.19_source.buildinfo
│ ├── Build-Date
│ │ @@ -1 +1 @@
│ │ -Fri, 16 May 2025 11:54:47 +0000
│ │ +Fri, 16 May 2025 11:55:12 +0000
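
Since Build-Date is the only field that differs here, the same check can be scripted without diffoscope. This is a minimal sketch with a made-up function name, assuming unsigned .buildinfo files where Build-Date sits on a single line.

```shell
# Succeed if two .buildinfo files are identical once the Build-Date
# field is ignored.
same_except_build_date() {
    a=$(grep -v '^Build-Date:' "$1") &&
    b=$(grep -v '^Build-Date:' "$2") &&
    [ "$a" = "$b" ]
}
```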

yes, me too, but that's not how source packages are built for real. :)

indeed

yes, patches welcome! (I have more than enough on my plate, so I doubt
I'll dive into *this* rabbit hole in this decade. If you are interested
to do that on the r-b infra I'll be happy to help.)

#1105019#71
Date:
2025-07-01 17:42:52 UTC
From:
To:
Hi Roberto and others,

It seems that the bug report consensus is that you see the inclusion of
the .buildinfo file in the _source.changes as a bug.  What you see as a
bug, I see as a feature. I may perform a source-only upload and still
convey that I reproducibly built the package.

Can you or someone else elaborate on why you see the inclusion of the
.buildinfo file as a problem?

If there was some functional change removing the .buildinfo from a
source upload, I'd be inclined to report the reverse bug in the absence
of such reasoning.

On the flip side, the inclusion of the buildinfo does pose downstream
problems. We may now be faced with multiple .buildinfo files with the
same filename and different content. As such, we cannot just store them
next to each other, but that's how mergechanges and reprepro operate,
which leads to problems in their use. Debusine also encountered the issue
with mergechanges on its own.
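
The collision condition can be stated as a small check. This is only an illustrative sketch with a hypothetical function name; it is not how mergechanges or reprepro detect the problem.

```shell
# Succeed when two .buildinfo paths share a file name but differ in
# content, i.e. they cannot be stored side by side in one directory.
buildinfo_collides() {
    [ "${1##*/}" = "${2##*/}" ] && ! cmp -s "$1" "$2"
}
```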

Originally, reproducible builds would include a granular timestamp in
the filename and you may see evidence for that in e.g.
https://wiki.debian.org/ReproducibleBuilds/BuildinfoFiles#Example. That
timestamp was replaced with package_version_architecture in the dpkg
implementation. Depending on why others see the inclusion of a binary
.buildinfo as a problem, maybe using an unreproducible name would solve
some of those problems?

I note that we may influence the default filename:

dpkg-genbuildinfo -O../othername.buildinfo
dpkg-buildpackage --buildinfo-file=../othername.buildinfo
sbuild --debbuildopt=--buildinfo-file=../othername.buildinfo

In principle, sbuild could decide to choose a default that differs from
dpkg's. Not sure we want that.
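
To illustrate what a non-default name might look like, here is a sketch that appends a UTC timestamp to the usual package_version_architecture stem. The naming scheme and the helper unique_buildinfo_name are assumptions made up for this example, not something dpkg or sbuild implement.

```shell
# Build a buildinfo file name that is unique per build by appending a
# UTC timestamp. Arguments: package, version, architecture and, mainly
# for testing, an optional fixed timestamp.
unique_buildinfo_name() {
    stamp=${4:-$(date -u +%Y%m%dT%H%M%SZ)}
    printf '%s_%s_%s_%s.buildinfo\n' "$1" "$2" "$3" "$stamp"
}
```

The result could then be passed through the options listed above, for example as sbuild --debbuildopt=--buildinfo-file="../$(unique_buildinfo_name hello 2.10-3 amd64)".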

I asked Guillem for references to the choice in filenames and he kindly
provided
 * https://lists.debian.org/debian-dpkg/2016/11/msg00056.html
   Guillem argues that it would be difficult discovering the right
   buildinfo file and that there could be several of them from earlier
   builds.
 * https://git.dpkg.org/cgit/dpkg/dpkg.git/commit/?id=d5005e4576bcf9b341e83cfb8647d5f96438642f
   The commit argues that there is no point in using unique filenames.

We now do have reasons to choose unique buildinfo filenames (i.e. there
exist tools that assume them to be unique).

For now, I'd like to understand why the inclusion of the buildinfo is
seen as a problem and how others see the idea of changing the buildinfo
filename as a means to avoid collisions.

Helmut