#871806 uscan: (dpkg, git-buildpackage) accept/mangle/store signed git tags in cases where upstream does not publish detached sigs on tarballs
- Package: devscripts
- Source: devscripts
- Description: scripts to make the life of a Debian Package maintainer easier
- Submitter: Daniel Kahn Gillmor
- Date: 2025-08-23 18:27:01 UTC
- Severity: wishlist
Hi devscripts, dpkg, git-buildpackage, pristine-tar folks--
It's awesome to see the progress made on tracking upstream cryptographic
signatures via uscan and dpkg in debian. This provides a dataset for
cryptographic provenance that can be useful for auditing.
We're handling detached OpenPGP signatures for tarballs at the moment,
but not all upstreams provide this particular form of cryptographic
assurance.
Some upstreams do provide cryptographic signatures, but only in git
tags.
I'm not sure exactly how to do this, but what i'd like to see is a way
for us to record and make use of signed git tags in the same way.
I'm opening this bug report in the hopes of starting discussion about
how to best do it.
The text below assumes the following:
* upstream maintains their sources in a git repository.
* upstream either doesn't produce tarballs at all, or they produce
tarballs from their git repository using something like "git
archive". For the sake of the argument here, let's assume that they
don't produce tarballs at all.
* upstream does indicate releases in the form of OpenPGP-signed git
tags.
* The releases offered by upstream correspond to the "upstream
versions" that are uploaded into debian.
Here's an extremely rough and inefficient approach (which i haven't
implemented, as this is in brainstorming phase). I've probably even got
some of the terminology wrong, or the dataflows backward:
* we document how we generate a debian "upstream tarball" from a git
tag. for example, we put this in debian/upstream/vcs-gen-tarball:
git archive --format=tar --prefix=${projname}-${version}/ ${tagname} | gzip -9n
* make a shallow clone of the git archive at the tag, including the
tag. (i've confirmed that a signed git tag in a shallow repo does
validate correctly).
git clone --bare --depth 1 -b ${tagname} \
file:///path/to/upstream/${projname}.git ${projname}-${version}.git
* create an archive of the shallow clone, combined with the command to
generate the tarball (we can call this a "gtsig")
rm -rf ${projname}-${version}.git/hooks
cp debian/upstream/vcs-gen-tarball ./${projname}-${version}.git
tar cz ./${projname}-${version}.git > ./${projname}-${version}.gtsig
* write a simple tool to verify an orig.tar.gz against a signing key
and a gtsig, by extracting the "shallow clone" of the git repository,
verifying git tag -v, using git-archive, and then comparing the
results.
Some of the outstanding concerns:
* what if there is non-DFSG-free data in the upstream git repo? We
want to make sure we avoid shipping it to our mirrors. that's why i
was leaning toward the "shallow clone", but if there are other
techniques, i'd be curious to hear them.
* the .gtsig will be quite large -- roughly the same size as the
orig.tar.gz. Is it possible to make it smaller by just storing the
"delta" needed to recreate the shallow clone from the orig.tar.gz?
Or is it possible (though dirty) to ship the .gtsig itself as the
orig.tar.gz? that smells like trouble, because you couldn't
reconstruct the sources without having git available.
* the .gtsig itself will show "verified" but it could contain some data
that isn't actually covered by the tag. Upon verification, how do we
make sure it's clean? (i note that OpenPGP signature files also have
covert channels where they can carry unsigned material, so this might
not be introducing a new bug in general).
* is "git archive" guaranteed to produce deterministic output?
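On the last question, a cheap local probe is to archive the same tag twice and compare digests. This is only a sketch: `archive_digest` and `archive_is_stable` are hypothetical helper names, and matching digests on one machine do not show that other git versions, or a compressor, would produce the same bytes.

```shell
#!/bin/sh
# Hypothetical helpers, not part of uscan or dpkg.

# Print the SHA-256 of the uncompressed tar stream for a tag.
# $1 = path to a git repository, $2 = tag name, $3 = directory prefix
archive_digest() {
    tmp=$(mktemp) || return 1
    if git -C "$1" archive --format=tar --prefix="$3/" "$2" > "$tmp"; then
        sha256sum < "$tmp" | cut -d' ' -f1
        rc=0
    else
        rc=1
    fi
    rm -f "$tmp"
    return "$rc"
}

# Succeed only if two consecutive runs give the same digest.
archive_is_stable() {
    first=$(archive_digest "$1" "$2" "$3") || return 1
    second=$(archive_digest "$1" "$2" "$3") || return 1
    [ -n "$first" ] && [ "$first" = "$second" ]
}
```

For example, `archive_is_stable ~/src/foo v3.44 foo-3.44 && echo stable` would report whether two local runs agree.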
What do folks think? I'm sure i'm not the first person to think about
this, but i don't know whether there is any existing work done on it
either. Pointers, thoughts, discussion welcome.
Hi,
I was just talking about similar things on debian-policy@lists.debian.org and in Bug#811565.
Also, git tag and git HEAD packaging support is currently in progress. This is already a part of uscan. It needs a bit more refinement.
The question is, we need to make a local full clone if git is served by a dumb HTTP server. (GitHub is smart :-) Shallow clone or git archive does not work if the git server is dumb.
Is this simple? Please show me a working example as a shell log.
The gtsig is from upstream on the git repo. So check it there, I think. (Not on the tar.)
If we do non-DFSG-free tarball generation, the maintainer needs to sign the tarball.
I am not quite sure what you mean here.
Anyway, let me get the direct support of git archive first. If it works, we worry about the signature. If you have a particular archive in mind, let us know.
Osamu
thanks for the pointer! 811565 seems to be about repos without tags,
and i'm interested in repos with signed tags, so i think the use cases are
slightly different, though the tools might be the same.
very interesting, though i confess to being a bit sad about not being
able to unpack the sourcecode of a package without having git itself
installed. can you give me pointers to that?
cool, it'd be even better if this was a standardized, set command so
that at most the packager supplied a few constrained parameters and that
was it.
In my experience, if the goal is to create a minimal "shallow clone
snapshot" of the git archive, it's quite often the case that i as the
packager already have a full clone of the upstream repo i'm working on.
So if i'm working in a local copy /home/dkg/src/foo/foo and i've just
seen upstream's tag v3.44, i might do the following to create a "shallow
clone snapshot":
git clone --depth=1 --bare -b v3.44 file:///home/dkg/src/foo/foo foo-3.44.git
This is the equivalent of an orig.tar.gz (i think you can use the above
git archive command against it), but it has the advantage of potentially
also including the signed tag.
I did some experimenting with the "hddemux" package, which is a package
with a very small source tarball, which i'm upstream on, and which is
hosted on collab-maint and on gitlab and on 0xacab.
I made the .gtsig of hddemux 0.3 with:
git clone https://gitlab.com/dkg/hddemux
git clone --depth=1 --bare -b hddemux-0.3 file://$(pwd)/hddemux hddemux-0.3.git
tar cz hddemux-0.3.git > hddemux-0.3.gtsig
rm -rf hddemux hddemux-0.3.git
I was thinking of something like this:
tar xz < hddemux-0.3.gtsig
( cd hddemux-0.3.git && \
git tag -v hddemux-0.3 && \
git archive --format=tar --prefix=hddemux-0.3/ hddemux-0.3 | sha256sum )
rm -rf hddemux-0.3.git/
and testing whether (a) the tag verified, and (b) the output of
sha256sum matches against:
zcat < hddemux_0.3.orig.tar.gz | sha256sum -
(above, i'm testing the digest of the uncompressed tarball. testing the
compressed version would also be pretty easy)
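The two checks above could be wrapped into one function along these lines. This is a minimal sketch, not an existing tool: `verify_gtsig` is a hypothetical name, and it assumes the gtsig unpacks to `<prefix>.git` and that upstream's key is already in the gpg keyring.

```shell
#!/bin/sh
# Hypothetical helper, not an existing tool.
# $1 = gtsig file, $2 = tag name, $3 = prefix (e.g. hddemux-0.3), $4 = orig.tar.gz
verify_gtsig() {
    [ -r "$1" ] && [ -r "$4" ] || return 1
    work=$(mktemp -d) || return 1
    status=1
    # (a) unpack the shallow clone and require the tag signature to verify
    if tar -xzf "$1" -C "$work" &&
       git -C "$work/$3.git" tag -v "$2" >/dev/null 2>&1; then
        # (b) the regenerated tar stream must match the uncompressed orig tarball
        from_tag=$(git -C "$work/$3.git" archive --format=tar --prefix="$3/" "$2" |
                   sha256sum | cut -d' ' -f1)
        from_orig=$(gzip -dc < "$4" | sha256sum | cut -d' ' -f1)
        [ "$from_tag" = "$from_orig" ] && status=0
    fi
    rm -rf "$work"
    return "$status"
}
```

Usage would look like `verify_gtsig hddemux-0.3.gtsig hddemux-0.3 hddemux-0.3 hddemux_0.3.orig.tar.gz`. Note that git reads the unpacked repository's own `config` while running `git tag -v`, so a real tool would probably need to neutralize that file before trusting the result.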
my goal here is to be able to verify (and store and track) the upstream
maintainer's signatures.
Many upstream maintainers can be convinced to make the HEAD of their
repo DFSG-free -- they understand that some weird jpg some previous
developer committed a few years ago isn't really kosher.
but that doesn't mean that their history is DFSG-free. My concern was
about avoiding shipping non-DFSG-free history, when the tag we want to
ship is itself DFSG-free.
the gtsig contains a snapshot of all the files in the git repo. it's
not a free-floating signature, it's equivalent to a tarball with an
embedded signature and a bunch of dangling pointers to previous git
history.
The attached hddemux-0.3.gtsig is 42000 octets. hddemux_0.3.orig.tar.gz
is 31329 octets. Does that make sense?
Repo with unsigned tag is already somewhat working. The trick to read the commit date of HEAD has good overlap with reading a signed tag and verifying it against the file content hush.
It's my local activity ...
Yah, but the tool needs to be general and the logic needs to be simple for maintenance. uscan is already very complicated...
I see, full clone first ...
You just tar up everything???? I thought you were going to check the signature of the tag against the content...
Is the hush of the given tar exactly the same as the one signed in the tag ... I guess I need to RTFM. Compression depends on compression parameters, which may be different for different archs.
Are you sure you tested that your idea works???
I am not familiar with this, but I thought I would use a command like the following to check the signature.
$ git verify-tag ...
Can you point me to any resource which indicates that your way of verifying a tag via tar is the right way... Please educate me on this.
Many ... yes, All ... no. uscan is a tool. Why limit its capability?
Why send the entire history? I feel strange ....
At least creating such a big file to verify its current content defeats the purpose of making a cryptographic signature. When I heard you talking about signatures, I thought you were talking about creating a file with a signature on the tagged contents. But here you are creating something different.
To be honest, my knowledge is weak on this. But I am starting to feel neither you nor I have a concrete idea of what needs to be done.
Osamu
I think it's totally fair for uscan to implement a feature that only
works for a subset of users, as long as that subset is "people who do
things in a sensible and/or common way".
For instance, this is already the case for the upstream pgpsig feature,
which is excellent. We're handling the case for upstreams who follow
that practice.
I'm proposing that we do something for debian developers with upstreams
who ship signed tags, when those DDs have a full clone of the upstream
git repo. This is both sensible and common :)
yes, that's correct.
sorry, i assumed that we understood that the git tag was already
correctly verified in the process of creating this .gtsig.
You can verify the tag with either "git tag -v" or "git verify-tag",
yes?
Anyway, it's in the verification stage that we really care about
validating the signature on the tag. And i did include the tag
verification ("git tag -v") there:
do you mean s/hush/hash/ ? i don't think i understand the question.
yes, the digest of the uncompressed tarball matched the digest of the
output of "git archive". I did test that with hddemux 0.3 at least :)
i'm not verifying the tag via tar.
i'm verifying the tag, and then checking that the tarball generated from
the tag matches the orig.tar . This is (intended to be) equivalent to
verifying a cryptographic signature over a tarball, since (almost) the
same data is being signed.
Because we need something functional for a common use case? All good
tools are limited, that's what makes them good tools. they're good at
doing a particular thing. I don't understand why you're raising this
question here -- i'm pretty sure you don't want debian to ship
non-dfsg-free data in the archive, right?
That's why i'm proposing basing this on a shallow clone -- so that we
don't need to send the entire history. But to verify a git tag, you
need at least to know the shallow clone, or else you can't verify the
tag.
I agree -- the .gtsig is actually the equivalent to [orig.tar.gz +
orig.tar.gz.asc] (plus it contains a few dangling pointers to the git
history). As constructed above, it's *not* the same thing as the
orig.tar.gz.asc itself.
Maybe a voice/video chat would help to explain this. I think both of us
have at least half the picture. perhaps we could put our ideas together
and come up with something that solves some problems?
Feel free to mail me and suggest a time that would work for you to chat!
Hi! It seems to me like you are perhaps trying to reimplement dpkg source format «3.0 (git)» (described in man dpkg-source)? :) Thanks, Guillem
Hi Guillem-- Thanks for that pointer, it does seem similar. I was hoping that we could produce an actual orig.tar.gz (so that the rest of the tools could use it as they have traditionally) and then some extra thing outside of the orig.tar.gz that, combined with the tarball, could be used to recreate the .git/ repository well enough to be able to (a) recreate the tarball, and (b) cryptographically verify the tag. this would solve my use case (being able to record and ship upstream's cryptographic signatures, when upstream "releases" with signed tags) without requiring the rest of the debian infrastructure to cope with git bundles as "orig.tar.gz-equivalent" blobs. But if there's a plan for "3.0 (git)" to become acceptable in debian, then it does seem like that might be the simplest way to move forward. i'll play around with git-bundle to try to understand it better. from a scan of dpkg-source(1) and the various git manpages that i'm used to reading, i don't understand what .gitshallow or .git/shallow are supposed to do. Does it get shipped alongside the .git? does .git/shallow have meaning for other tools that i should be aware of?
Hi!
Right, the problem with this is, as you've found out, that it duplicates the contents, which can be a substantial amount even with a very shallow repo.
Ah, sorry if my comment misled you in that way. I'm not aware of any such plans; in fact adding support for this format got significant pushback at the time, and there's even been a removal request (#720598) by its original author. :)
The .gitshallow file is an extension to be able to transport the metadata within the source. The .git/shallow seems to be just the metadata to track the state of the shallow clone, but there's not much documentation even in the upstream git.git repo about it.
Thanks,
Guillem
One of the pristine-tar maintainers here. Daniel's ideas made me think a lot about this stuff recently.
I've just found https://github.com/cgwalters/git-evtag: it does not solve the problem at hand, but the idea of solving the problem "upstream", i.e., in git, seems reasonable to me.
So let's assume that git-archive can produce a reproducible, uncompressed tarball, given a particular githash. Why not ask interested upstream developers to do something like this:
git tag -s TAGNAME -m "$(git archive --format tar HEAD | sha512sum)"
The tag proves:
(1) the history in the git repository, as always
(2) but also that a tar generated from this tag should have a particular sha512 hash
You can see how this works end-to-end: if we want to take a particular git tag and release it in Debian, we just generate the tarball and extract the associated tag as a crypto-proof.
Such tagging may be prohibitive for every commit, though, since it's rather expensive to compute (or not: I just ran the above command in a fresh clone of the linux kernel source and it took 9s with fs caches, and interestingly the same with caches dropped, weird). But it should be totally fine at least for "release tags".
The cool thing is that it could be upstreamed in git, as a flag to git-tag, or at least provided as an extension, such as git-atag (aka git-archive-tag, you get the idea).
What do you think?
Tomasz
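A minimal sketch of how a tag made this way might be checked on the receiving side. `check_archive_tag` is a hypothetical name, and the sketch assumes the first word of the tag message is the hex digest printed by sha512sum. It only compares digests, so `git tag -v` would still be needed to check the signature itself.

```shell
#!/bin/sh
# Hypothetical helper for the "digest in the tag message" idea.
# It does not check the signature; "git tag -v" is still needed for that.
# $1 = path to a git repository, $2 = tag name
check_archive_tag() {
    # Drop the tag object's header (everything up to the first blank line),
    # then take the first word of the message.
    expected=$(git -C "$1" cat-file tag "$2" | sed '1,/^$/d' |
               awk 'NR == 1 { print $1 }')
    # Recompute the digest of the uncompressed archive for the same tag.
    actual=$(git -C "$1" archive --format=tar "$2" | sha512sum | cut -d' ' -f1)
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}
```

A lightweight tag has no tag object, so `git cat-file tag` fails and the check is refused rather than passed.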
i'm reluctant to have the tag message be a bare sha512 hash (that could
mean just about anything!), but i do like the basic idea. maybe it
needs a bit more cryptographic structure, though.
What about just encouraging developers to store a signature for the
uncompressed tarball as a git note with:
git archive --format tar $TAGNAME | gpg --armor --detach-sign | git notes add -F - $TAGNAME
This is conveniently verified with:
gpg --verify <(git notes show $TAGNAME) <(git archive --format tar $TAGNAME)
I'm not sure how well notes transport across multiple git repos, though,
i haven't tried.
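On transport: notes live under their own ref (`refs/notes/commits` by default), which a plain clone, fetch or push does not carry along, so they need an explicit refspec. A minimal sketch with hypothetical helper names `push_notes` and `fetch_notes`:

```shell
#!/bin/sh
# Hypothetical helpers; "refs/notes/commits" is git's default notes ref.

# Publish local notes to a remote.
# $1 = path to a git repository, $2 = remote name
push_notes() {
    git -C "$1" push "$2" refs/notes/commits
}

# Fetch notes from a remote into the local default notes ref.
# $1 = path to a git repository, $2 = remote name
fetch_notes() {
    git -C "$1" fetch "$2" refs/notes/commits:refs/notes/commits
}
```

Upstream would run something like `push_notes . origin` after adding the note, and a packager would run `fetch_notes . origin` before `git notes show`.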
Or, stuff the signature itself in the git tag message while making the
tag in the first place:
(echo "Tagging $PROJECTNAME $TAGNAME" && \
git archive --format tar "$COMMIT" | gpg --armor --detach-sign ) | \
git tag -F - "$TAGNAME" "$COMMIT"
Though i'm not actually sure how to verify that one unless you *also*
sign the tag itself, which starts to get pretty meta. Any suggestions?
or, maybe there's something that could be added to a tag, like an
"archive signature" property? or just a second signature that lives
after the first one? I'm not exactly sure how to do that.
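One possible way to check a signature stored in the tag message is to pull the armored block out of the tag object and verify it against a fresh archive. This is a sketch under assumptions: the helper names are hypothetical, and the message is assumed to embed one armored detached signature made over `git archive --format tar` of the tagged commit. Since the tar format stores the commit ID in a pax header when git archive is given a commit or tag, the signed bytes are tied to that commit.

```shell
#!/bin/sh
# Hypothetical helpers, not an existing git feature.

# Print the first armored signature block found on stdin.
extract_pgp_signature() {
    awk '/^-----BEGIN PGP SIGNATURE-----$/ { on = 1 }
         on { print }
         on && /^-----END PGP SIGNATURE-----$/ { exit }'
}

# Verify the embedded signature against the archive of the tag.
# $1 = path to a git repository, $2 = tag name
verify_embedded_archive_sig() {
    sig=$(mktemp) || return 1
    git -C "$1" cat-file tag "$2" | extract_pgp_signature > "$sig"
    if [ -s "$sig" ]; then
        # "-" tells gpg to read the signed data from stdin
        git -C "$1" archive --format=tar "$2" | gpg --verify "$sig" -
        rc=$?
    else
        rc=1
    fi
    rm -f "$sig"
    return "$rc"
}
```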
Yes, i like this idea. If there were One Standard Way™ to do it, and it
was just an additional flag to ask people to add to their "git tag"
commands, then it would make it really easy to pull "upstream tarball
signatures" out of projects that don't release tarballs any more, just
git repositories. For folks using gpg-agent in its standard
configuration, it shouldn't cause them any extra hassle either, since
the passphrase for the first signature will be cached and re-used for
the subsequent signature.
Who would you talk to about getting something like that included into
git upstream?
(Sorry for missing In-Reply-To: and References:) Hi all, I was just pointed to this bug, which seems quite similar to #827065. Should these be merged? Also, as I wrote in #827065: I'm highly interested in getting this into uscan, and I would like to take care of this. But: I need a mentor, as my Perl knowledge is quite shitty. Cheers, Georg
Hello Daniel, I just implemented a git-tag-signature-verify feature [1] to fix #827065: you just add "pgpmode=gittag" in opts. I think it fixes this issue too. If you agree, I'll merge it. Regards, Xavier
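For illustration, a debian/watch using this option might look like what the sketch below prints. `print_gittag_watch` is a hypothetical helper, the URL and the tag pattern are placeholders, and combining `pgpmode=gittag` with uscan's `mode=git` is an assumption here.

```shell
#!/bin/sh
# Hypothetical helper: print a minimal debian/watch that uses git mode
# with signed-tag verification. URL and tag pattern are placeholders.
# $1 = upstream git URL
print_gittag_watch() {
    cat <<EOF
version=4
opts="mode=git, pgpmode=gittag" \\
$1 refs/tags/v([\\d\\.]+) debian
EOF
}
```

For example, `print_gittag_watch https://example.org/foo.git > debian/watch`.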
Hi Xavier-- Thanks for this! I finally got around to testing out your changes, and i really like them. I'll be adopting this on all of my packages where upstream prefers signed git tags as a release mechanism. I've opened https://salsa.debian.org/debian/devscripts/merge_requests/82 to clean up the git tag verification a little bit more :) The one thing that's missing to close #871806 is the extraction of a git tag that can be shipped with (and verified against) debian source tarballs, though. We currently do ship .asc files that correspond to signatures over the tarball. Do you see a way that we could ship something that would let a verification happen from just what we ship in debian based on signatures extracted from the git tag?
On 13/11/2018 at 20:39, Daniel Kahn Gillmor wrote:
Hi Daniel,
Thanks ;-) Note that if there is a "git origin" that points to the upstream repo, uscan will use it instead of creating a temporary "git clone". It is more efficient, I think.
Thanks, I copied the old uscan calls since the challenge was non-regression after rewriting. Time to improve now!
Unfortunately no. Git signatures are not linked to the archive itself but only to the tag. I don't see any way to link an extracted archive to this sort of .asc.
Cheers,
Xavier
hm, too bad. i use "upstream" as the name of my upstream git repos, since as often as not "origin" (where the thing was originally cloned from) is the debian packaging. :P Maybe i should change my practices there though if everyone else does it differently… happy to help! it made me notice and report #913665 as well, if you're poking around in there :) sorry i didn't have the time to write and test the fixes! right, i'm thinking of some kind of hack similar to a pristine-tar delta file, but which could maybe contain enough of a shallow clone to let the signature verify. i don't know whether that's possible though, i haven't actually thought through the data structures or git's hashing tree. all the best,
On 14/11/2018 at 00:31, Daniel Kahn Gillmor wrote:
In this case, salsa only looks for the URL (without looking at the name), so if one points to the same URL as the one given in debian/watch, salsa uses it.
There seem to be 3 bug reports about checking upstream release tag signatures:
Bug#839866: import-orig: please make --upstream-vcs-tag=... verify tag signatures
Bug#871806: uscan: (dpkg, git-buildpackage) accept/mangle/store signed git tags in cases where upstream does not publish detached sigs on tarballs
Bug#980927: import-ref: Check tag signatures