#887831 jigdo-file: Jigdo .template file and resulting ISO are only verified by MD5

Package:
jigdo-file
Source:
jigdo
Description:
Download Debian CD/DVD/USB images from any Debian mirror
Submitter:
"Thomas Schmitt"
Date:
2019-11-11 18:27:05 UTC
Severity:
normal
#887831#5
Date:
2018-01-20 12:06:06 UTC
From:
To:
Dear Maintainer,

as described in
https://lists.debian.org/debian-cd/2018/01/msg00021.html
jigdo-file verifies the .template file and the resulting ISO image only
by MD5 checksums which stem from the .jigdo file.

This is good enough for recognizing the packages which shall be grafted
into the emerging ISO and for detecting transport errors. But as soon as
the .jigdo files are verifiable by the *SUMS and *SUMS.sign, the MD5s
will be a weak part of the verification chain.

Three programs from two packages are involved:

- jigdo-lite from package jigdo-file verifies the .template file by line
  "Template-MD5Sum" from the .jigdo file. The MD5 computation for the
  downloaded .template file is done by program jigdo-file.

- jigdo-file command "verify" reads the ISO image MD5 from the end of the
  .template file and compares it with the MD5 calculated from the image file.
  There are better image checksums in the .jigdo file.

- libjte from package jigit produces the .jigdo and .template files for
  most Debian ISOs. It could well compute better checksums for .template
  and put them into the .jigdo file.
  Changing the .template format seems more tricky. I am not aware of any
  description of its format. It would have to be deduced from the code of
  jigdo-file or libjte.
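
The template check described in the list above could be sketched as follows.
This is only an illustration, not the code of jigdo-lite or jigdo-file. It
assumes that the .jigdo text carries a line "Template-MD5Sum=<value>" and
that the value is the MD5 digest in URL-safe base64 without "=" padding.
All function names are hypothetical. Passing another algorithm name to
file_digest shows how a stronger checksum could be computed the same way.

```python
import base64
import hashlib
from typing import Optional


def jigdo_b64(digest: bytes) -> str:
    """Encode a raw digest as URL-safe base64 without padding (assumed jigdo style)."""
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


def template_md5_from_jigdo(jigdo_text: str) -> Optional[str]:
    """Return the value of the first Template-MD5Sum line, or None if absent."""
    for line in jigdo_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "Template-MD5Sum":
            return value.strip()
    return None


def file_digest(path: str, algorithm: str = "md5", chunk_size: int = 1 << 20) -> bytes:
    """Hash a file in chunks so that large templates need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.digest()


def template_matches(jigdo_text: str, template_path: str) -> bool:
    """Compare the MD5 of the template file with the value from the .jigdo text."""
    expected = template_md5_from_jigdo(jigdo_text)
    return expected is not None and jigdo_b64(file_digest(template_path)) == expected
```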


Have a nice day :)

Thomas

#887831#10
Date:
2019-10-22 16:39:47 UTC
From:
To:
Apparently, the MD5 sums are only kept in the packages list because of
jigdo.  The SHA256 digests are what any modern package management system
should be using.

Furthermore, we could shave off about 13% of the size of the compressed
Packages file if we removed the MD5sum lines:

$ grep -v ^MD5sum < /var/lib/apt/lists/ftp.debian.org_debian_dists_sid_main_binary-amd64_Packages | gzip -9 | wc -c
9541056
$ cat < /var/lib/apt/lists/ftp.debian.org_debian_dists_sid_main_binary-amd64_Packages | gzip -9 | wc -c
10913735
$ echo $(( 100 - 100 * 9541056 / 10913735 ))
13

This is not an insignificant amount of bandwidth for every debian user.

I would even posit that temporarily breaking jigdo would be better than
keeping this additional bandwidth cost in play.

#887831#15
Date:
2019-10-22 17:15:51 UTC
From:
To:
Hi,

Daniel Kahn Gillmor wrote:

To my knowledge, jigdo is the only way to get full DVD sets or any BD sized
installation ISO from
https://cdimage.debian.org/cdimage/release/current/amd64/

bt-* seems to have what iso-* has. Biggest is the 5.3 GB
debian-edu-10.1.0-amd64-BD-1.iso which would fit on a DVD+R DL, too.
Only the first three DVDs are offered by iso-* and bt-*.


Have a nice day :)

Thomas

#887831#20
Date:
2019-10-22 19:14:41 UTC
From:
To:
I understand what you're saying (though i haven't used optical media to
install debian in over a decade).

If jigdo would use the SHA256sum entries instead of the MD5 entries when
it is doing ISO assembly, then everyone could still fetch full DVD sets
or BD sized installation ISOs, without every single debian user
expending the extra 13% bandwidth on fetching the Packages file when
they "apt-get update".

AFAICT, jigdo's last maintenance release (debian version) was nearly two
years ago.

The last upstream release (0.7.3) was produced in 2006.

Its homepage is http://atterer.org/jigdo/devel.html , which was last
updated in 2010, and points to a CVS server i cannot access.

Its debian packaging doesn't have a Vcs-* field (though i've just
created https://salsa.debian.org/debian/jigdo and uploaded what i could
get from "gbp import-dscs --pristine-tar --debsnap" to it).

This looks like it might be a moribund software project, and it should
not be holding back the rest of the project from making other
improvements.  Making the Packages file smaller is a concrete advantage
for every debian user.

Do you have any suggestions to offer to make jigdo work using a modern
cryptographic digest?

#887831#25
Date:
2019-10-22 19:29:54 UTC
From:
To:
Control: severity 887831 grave

Yes. In fact, right now, I can't think of any use case for Jigdo. It's
been totally superseded by bittorrent, which is standardized, widely
available and much more popular, with multiple client implementations.

Fedora stopped shipping their releases with Jigdo in 2011, according to
wikipedia:

https://en.wikipedia.org/wiki/Jigdo

WP also says development has stopped since 2006.

If the ISO image generation is broken, it should be fixed. I don't see
why we should depend on jigdo for anything anymore.

In the meantime, I think it's perfectly acceptable to remove MD5sums
from the archive, at the cost of breaking jigdo.

Or, to put it another way, it's completely unacceptable that jigdo uses
MD5 to authenticate checksums, and if it keeps doing so, we shouldn't
ship Debian with it. That is a release-critical bug, severity "grave"
with justification "introduces a security hole allowing access to the
accounts of users who use the package".

A.

#887831#32
Date:
2019-10-22 20:37:58 UTC
From:
To:
Hi,

Daniel Kahn Gillmor wrote:

I am kind of the second-last jigdo expert, but not at all with .deb
entrails.
Are you sure that Debian package management is involved other than
maybe with generating the input file for xorrisofs option -md5-list ?

In the .jigdo file, which controls the package download operations
of jigdo-lite, the MD5 is a key which connects the package file path
with a matchable descriptor entry in the .template file bearing the
same MD5.
A gunzipped .jigdo file bears for example
  FexKzYyIVG2rRb1UjUKj8Q=Debian:pool/contrib/b/biomaj-watcher/biomaj-watcher_1.2.2-4_all.deb
which is the MD5 as base64, "=Debian:" representing the individual part
of the mirror URL chosen at download time, and "pool/.../...deb" to depict
the invariant package path part on the mirror server.
The matching descriptor entry in .template bears the same MD5 and by
its position marks the place where to patch the .deb file into the .iso.
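
A line like the one above can be taken apart as sketched below. This is an
illustration only. It assumes that the checksum is URL-safe base64 without
"=" padding, which fits the 22 characters for 16 bytes of MD5. The names
PartEntry and parse_part_line are hypothetical and not part of jigdo.

```python
import base64
from typing import NamedTuple


class PartEntry(NamedTuple):
    md5_hex: str  # the MD5 key, shown as hex for comparison with md5sum output
    label: str    # e.g. "Debian", replaced by a mirror URL at download time
    path: str     # invariant package path on the mirror


def parse_part_line(line: str) -> PartEntry:
    """Split '<base64 md5>=<Label>:<path>' into its three components."""
    checksum, _, location = line.strip().partition("=")
    label, _, path = location.partition(":")
    # Restore the padding which the encoding is assumed to omit.
    padded = checksum + "=" * (-len(checksum) % 4)
    digest = base64.urlsafe_b64decode(padded)
    if len(digest) != 16:
        raise ValueError("not a 128 bit MD5 value: %r" % checksum)
    return PartEntry(digest.hex(), label, path)


entry = parse_part_line(
    "FexKzYyIVG2rRb1UjUKj8Q=Debian:"
    "pool/contrib/b/biomaj-watcher/biomaj-watcher_1.2.2-4_all.deb"
)
print(entry.md5_hex, entry.label, entry.path)
```

The hex form could then be compared with the output of md5sum on the
downloaded .deb file.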

Maybe Steve McIntyre can say more about how the -md5-list file gets created
before xorrisofs is run.

Steve seems busy with other stuff.

This one is dead. At that time, the .jigdo and .template files were
generated from existing .iso images by matching the submitted MD5 list
against block sequences in the ISO.
Steve then taught genisoimage how to produce .jigdo and .template on
the fly while producing the .iso image.
Before xorriso could take over the job, George Danchev and i extracted
Steve's jigdo code into a library named libjte which is then used by
xorriso to produce the desired companion files of the .iso.

For restoring .iso from jigdo, only jigdo-lite from package jigdo-file
is left. Because there is no supported tool for Mac or MS-Windows,
i began to describe a jigdo download procedure via a Debian Live ISO:
https://wiki.debian.org/JigdoOnLive

Main open questions are about how to get a Debian Live connected to the
internet if there is non-free firmware needed, and how to access the
foreign OS'es filesystems for writing the .jigdo, .template, and .iso
files. (I am neither sysadmin nor MS/Mac user.)

We would have to team up with Steve to fix the remaining moderate
security concerns about the jigdo download process.

There are no security concerns about the matching of .template block
ranges with package paths, because no man-in-the-middle can alter
this mapping, once .jigdo and .template files are verified.
MD5 with its 128 bits should be very safe against false identifications
if the file count in a .jigdo file stays well below 2^30.

The resolution of bug #887830 fixed the most dangerous security gap of
using a totally untrusted .jigdo file and a then only MD5-checked
.template file. A cautious user can now verify both files before running
jigdo-lite. (jigdo-lite will not download again if it finds the files
already in the current work directory.)

This bug here, #887831, only tries to bring the internal checks of
jigdo-lite on the downloaded .template and resulting .iso to the security
standard which is recommended but not enforced for download of .jigdo
or direct download of .iso.

Steve once announced that he would publish straightforward instructions
for the verification steps from SHA512SUMS.sign to SHA512SUMS and then to
possibly .jigdo and always .iso.
I hope he still knows where the draft for this is ... :))


Have a nice day :)

Thomas

#887831#37
Date:
2019-10-22 21:03:49 UTC
From:
To:
Hi,

Antoine Beaupré wrote:

This is really overdone.

See jigdo as a peculiar way of downloading the ISO with a MD5 check
where e.g. wget has none at all.
And as said, for now jigdo seems indispensable for the fat ISO sets.

My bug report does not say that ISO production is broken or that jigdo
is the reason for any of the checksums in the package management.
I doubt both theories.

I agree to this plan, if you afterwards verify that debian-cd still can
produce a pair of .jigdo and .template which jigdo-lite then can use
to create the identical ISO by help of a package mirror.

I place my bet on no problems, but i may be wrong.

It does so for cross-table key matching, where MD5 suffices by all means
of hash table theory.

It does so for verifying internally what can be verified externally by
the best means which Debian offers for its ISOs. I advise to do the
external check of .jigdo and .template before the run of jigdo-lite and
the external check of .iso afterwards.

There is bug #887837 where i propose to add a reminder message at the end
of the jigdo-lite run.

Debian could really use an end-user comprehensible description of the
credible verification from GPG to SHA512 to ISO. This is completely
independent of jigdo and applies to all download methods for ISOs.


Have a nice day :)

Thomas

#887831#42
Date:
2019-10-22 21:10:19 UTC
From:
To:
Control: severity -1 normal
That is entirely unjustified afaict, and doesn't help your case.

Cheers,
Julien

#887831#49
Date:
2019-10-22 21:37:02 UTC
From:
To:
I mean "broken" as in "DVDs are not available as normal ISOs over HTTP
or bittorrent but only jigdo".

jigdo is the reason MD5sums are still in the Packages files, according
to ftp-masters. It's not a theory I just came up with just for kicks.

In my experience, jigdo never worked, so I don't expect I will be able
to do this after the removal, nor before. I have long given up on doing
anything with jigdo.

[...]

To quote wikipedia:

I would consider using MD5 in any software a serious engineering
mistake, in any case. It might still be useful as a hash table
component, but I would suspect it is still a mistake.

[...]

It's really unfortunate that this bug has been downgraded. I was hoping
to take this as an opportunity to remove jigdo from our workflows, but I
guess we will need to tackle this (namely that jigdo is completely
abandoned and broken) head on, in separate bug reports.

The problem is that *every* bug report (e.g. #772110) that tries to
document serious issues about jigdo gets downgraded to "normal", saying
"this is not a problem".

I think this is an unfair way to treat your users. Sure, it will keep
jigdo in Debian forever, but it will give a false sense of security (in
case of this bug) and reliability (in the case of #772110), which will
hurt Debian's adoption.

Is anyone still seriously thinking that jigdo is a reliable and useful
way to download Debian nowadays?

A.

#887831#54
Date:
2019-10-22 21:52:07 UTC
From:
To:
I'm unaware of the meaning of "cross-table key matching", but it's known
to be relatively easy to find collisions in MD5.

If the adversary can convince any DD to upload an obviously harmless
package of the adversary's choice into the archive, then the adversary
can also craft another package with a matching MD5sum.

As a DD, i don't want my signature authorizing a specific upload to be
used to distribute some other file to our users.

You probably don't mean it this way, but this sounds like it will make what
should be the software author's problem into the user's problem.  i
think we should be more user-friendly than that.  I've followed up on
that bug report separately.

I agree that this kind of separate documentation would be great to have,
and is independent of this bug report.

Thanks for your attention to (and efforts on behalf of) jigdo,

#887831#59
Date:
2019-10-22 23:14:37 UTC
From:
To:
just said "entirely unjustified" without any sort of explanation. Not
sure I understand what case you're bringing forward.

I guess I should be thankful: without your intervention, I might have
considered continuing to monitor that bug report and maybe get involved
in trying to fix that problem in the long run, across the project. But
now you've destroyed the last shred of motivation I had to ever work
again on jigdo.

So thanks, in a bizarre, unfortunately-so-typical-in-Debian, way.

A.

#887831#64
Date:
2019-10-22 23:28:42 UTC
From:
To:
Hi,

Antoine Beaupré wrote:

Afaik, that's because jigdo uses the package mirror servers as source of
the bulk of its downloads. Those servers experience the same as with a
generously sized installation of a Debian system.

I really wonder where this dependency should come from. Is there any
public background info to motivate this claim ?

... one only needs to wish:
Ansgar <ansgar@43-1.org> writes in 942893@bugs.debian.org and on
debian-cd@lists.debian.org:

So it's about laziness of debian-cd. That's an implementation detail.
One should probably talk to Steve McIntyre directly.

Why that ? debian-cd has copies of the .deb files to pack them up in the
ISO. There is no jigdo production without seeing the packages.
md5sum is not really slow, nowadays.

It is used as opaque identifier, but this identifier is at some occasions
also interpreted as MD5 which has to match the package file.


Antoine Beaupré wrote:

I tested the whole procedure shown in
https://wiki.debian.org/JigdoOnLive
with the non-standard situation to have an old Desktop with the filesystem
of an inconscious Debian 6 host of the Debian Live system.

Since i assume that you have a Debian system running, be invited to
join in at

https://wiki.debian.org/JigdoOnLive#If_needed.2C_work_around_a_shortcoming_of_older_jigdo-lite

in order to download e.g.
https://cdimage.debian.org/cdimage/release/current/amd64/jigdo-dvd/debian-10.1.0-amd64-DVD-4.jigdo
https://cdimage.debian.org/cdimage/release/current/amd64/jigdo-dvd/debian-10.1.0-amd64-DVD-4.template

Verify them properly and run jigdo-lite with URL
https://cdimage.debian.org/cdimage/release/current/amd64/jigdo-dvd/debian-10.1.0-amd64-DVD-4.jigdo

Finally verify the emerged debian-10.1.0-amd64-DVD-4.iso by the proposed
means.

Yes. But jigdo-lite does not use MD5 for cryptographical purposes except
for an initial check of .template and a final check of .iso.
Both are outdated. But both get re-assured by GPG and SHA512 if you
follow the procedure which is outlined above.

But the diagnosis by Steve McIntyre tells why:

jigdo-lite even noticed the problem before the overall check of the .iso
could be run to finally report a damaged ISO, too.

Now what would you do if a directly downloaded ISO turns out to be
damaged ? I'd advise download from a different source. In case of jigdo
this is mainly a different package mirror.


Daniel Kahn Gillmor wrote:

You have one table with keys and values and another table with keys and
values. Lines in both tables, where the keys are the same, are considered
to be in relation. Like in a data base.
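
As a sketch of this key matching: one table maps MD5 keys to package paths,
as the .jigdo file does, and the other lists positions with MD5 keys, as the
.template descriptors do. The data and the name match_parts are made up for
illustration. Real .template entries carry more information than shown here.

```python
def match_parts(jigdo_parts, template_entries):
    """Relate template positions to package paths through their common MD5 key.

    jigdo_parts:      dict mapping MD5 key -> package path
    template_entries: list of (offset_in_iso, MD5 key)
    Returns (matched, missing), where matched holds (offset, path) pairs and
    missing holds the template entries without a partner in the .jigdo table.
    """
    matched, missing = [], []
    for offset, md5 in template_entries:
        path = jigdo_parts.get(md5)
        if path is None:
            missing.append((offset, md5))
        else:
            matched.append((offset, path))
    return matched, missing


# Hypothetical keys and paths, shortened for readability.
parts = {"aaa": "pool/main/a/a_1.0_all.deb", "bbb": "pool/main/b/b_2.0_all.deb"}
entries = [(2048, "bbb"), (4096, "aaa"), (8192, "ccc")]
matched, missing = match_parts(parts, entries)
print(matched)
print(missing)
```

The MD5 acts purely as a lookup key here. Whether the file behind a path
really has that MD5 is a separate check.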

It is suspected that it is possible to construct byte strings which
produce a desired particular MD5 value.
But when matching the lines of two tables by a key you have to consider
hash table theory and especially the birthday paradox. A short excursion
on wikipedia lets me estimate that the chance for a MD5 collision among
1 billion .deb files is about 1 - exp(-1e-20).
Regrettably i found no calculator which would not say 0 as result.
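
A calculator says 0 because 1 - exp(-x) is rounded away in floating point
arithmetic when x is tiny. The function math.expm1 avoids this. The sketch
below uses the usual birthday approximation p = 1 - exp(-n(n-1)/(2 * 2^bits))
and assumes uniformly random hash values, so it covers chance collisions
only and not a deliberate attack. The function name is hypothetical.

```python
import math


def birthday_collision_probability(n_items: int, hash_bits: int) -> float:
    """Approximate chance that at least two of n_items random hashes collide."""
    pairs = n_items * (n_items - 1) / 2
    # -expm1(-x) computes 1 - exp(-x) without losing precision for tiny x.
    return -math.expm1(-pairs / 2.0 ** hash_bits)


p = birthday_collision_probability(10**9, 128)
print("chance of an accidental MD5 collision among 1e9 files: %.3g" % p)
```

Under this approximation the result is of the order of 1e-21, which is in
the same negligible range as the estimate above.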

The cryptographic check is to be done on .jigdo and .template before
the run of jigdo-lite, and on .iso afterwards.

What is a security risk is that you need much contact with debian-cd and
its environment in order to know how to do it. The info is scattered on
the Debian web appearance.

Steve. You should now face your critics. I did what i could as lowly user
of Debian and disorganized upstream of xorriso.


Have a nice day :)

Thomas

#887831#69
Date:
2019-10-23 02:07:38 UTC
From:
To:
I think you're describing a preimage attack.  I was talking about a
collision attack, which is significantly easier to perform than a
preimage attack.  For MD5, it is not "suspected" to be a problem, it is
closer to "can be done in less than a second on scavenged hardware".

For more details, see https://eprint.iacr.org/2013/170.pdf

Sorry, i'm not following this argument.  We're not talking about random
chance -- we're talking about adversarial attack.

If "cryptographic check" means "OpenPGP signature verification" then i
agree that MD5 isn't relevant here.  But i don't think that jigdo
actually does that check, does it?

If "cryptographic check" refers to verification of the MD5sum, then it's
a mistake to use MD5 in 2019.

If the idea is that MD5 is used for speed, full SHA256 is indeed a bit
slower than MD5 ("openssl speed md5 sha256" suggests to me that SHA256
operations take roughly twice as long as MD5 operations).  But unless
you're on a blazing fast Internet connection, the delay of downloading
is likely much much larger than the computational cost of sha256.  (and
if you're on a blazing fast Internet connection, you probably don't need
jigdo anyway)
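
The speed comparison can also be tried without openssl. The following is a
rough sketch with Python's hashlib. The buffer size and round count are
arbitrary choices, and the numbers depend on the machine and on the crypto
library behind hashlib, so no particular ratio is claimed here.

```python
import hashlib
import time


def time_hash(algorithm: str, data: bytes, rounds: int = 5) -> float:
    """Return the best wall-clock time in seconds for hashing data once."""
    best = float("inf")
    for _ in range(rounds):
        start = time.perf_counter()
        hashlib.new(algorithm, data).digest()
        best = min(best, time.perf_counter() - start)
    return best


def compare(size: int = 8 * 1024 * 1024) -> dict:
    """Time MD5 and SHA256 over the same in-memory buffer of size bytes."""
    data = bytes(size)
    return {name: time_hash(name, data) for name in ("md5", "sha256")}


if __name__ == "__main__":
    for name, seconds in compare().items():
        print("%-7s %.4f s for 8 MiB" % (name, seconds))
```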

I don't think you're "lowly" at all, Thomas!  And i'm not a "critic" of
Steve's.  This discussion isn't meant to be personal in any way.

I really appreciate the work you've done (and continue to do) on
xorriso, and i appreciate the work Steve has done (and continues to do)
across the entire Debian project :)

But I'm concerned that jigdo's lack of maintenance has negative effects
on the rest of the debian ecosystem, and i'd really like to get that
cleaned up one way or another.

If there are a lot of active users of jigdo, then there needs to be
comparably active maintenance.  If there aren't a lot of users (or if
other techniques for mirroring optical media, like bittorrent, are
better-maintained), then maybe it's time to retire jigdo and let people
use their limited energies on other projects.

Regards,

#887831#74
Date:
2019-10-23 09:03:39 UTC
From:
To:
Hi,

Daniel Kahn Gillmor wrote:

The MD5s in .jigdo and .template are not intended to counter an attack.
They serve as keys to create a relation between items of both files, and
they serve as transport check (where other protocols have things like CRC32,
easier to understand but by far not as good as MD5).

Believe me that i know what is currently considered safe and what is not.
(Harder is to convince myself that allegedly-safe is really safe.)

My advice for protecting against counterfeit ISOs is to apply the
verification chain of SHA512SUMS.sign and SHA512SUMS, which is regrettably
not yet documented as a whole by Debian, but scattered at
https://www.debian.org/CD/faq/#verify
https://www.debian.org/CD/verify
and still not giving all info needed about interpreting gpg output.
I try to propose a complete verification procedure in
https://wiki.debian.org/JigdoOnLive#Verify_the_Debian_Live_download
and repeat it in more sparse form after each download step
https://wiki.debian.org/JigdoOnLive#If_needed.2C_work_around_a_shortcoming_of_older_jigdo-lite
https://wiki.debian.org/JigdoOnLive#Verify_the_downloaded_ISOs
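
The second link of that chain, from SHA512SUMS to the downloaded file, could
be sketched like this. The sketch assumes that SHA512SUMS has the usual
sha512sum layout of "<hex digest>  <file name>" per line. It is only
meaningful after SHA512SUMS itself was verified against SHA512SUMS.sign,
e.g. by "gpg --verify SHA512SUMS.sign SHA512SUMS" with a trusted key. The
function names are hypothetical.

```python
import hashlib


def parse_sums(text: str) -> dict:
    """Map file names to lower-case hex digests from sha512sum style lines."""
    sums = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            digest, name = parts
            # sha512sum marks binary mode by a leading '*' on the file name.
            sums[name.strip().lstrip("*")] = digest.lower()
    return sums


def sha512_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA512 of a file in chunks, suitable for multi-GB ISOs."""
    h = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_against_sums(path: str, name: str, sums_text: str) -> bool:
    """True if the file at path has the digest listed for name in sums_text."""
    expected = parse_sums(sums_text).get(name)
    return expected is not None and sha512_of(path) == expected
```

The same function would serve for .jigdo, .template, and .iso files, as far
as they are listed in the sums file.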

No offense taken. I am happy with being part of the bread slices around
the Debian ISO production sandwich.

Of course, i do not perceive your criticism towards jigdo as personal
towards me or Steve McIntyre. It is just that the problem you cope with is
in the sausage-and-salad layers of ISO production. And that is Steve's
realm.
(For example how to obtain in
https://sources.debian.org/src/debian-cd/3.1.26/tools/grab_md5/
 the path of the .deb file in order to compute the MD5 by own means
 without relying on package management information.)

I propose to change grab_md5 so that it does not expect MD5s in package
management information but rather computes them by md5sum.
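
Computing the MD5s by own means could look roughly like the sketch below.
It is not the code of grab_md5, and the name md5_records is hypothetical.
It walks a directory tree, hashes every .deb file, and yields MD5, size, and
relative path. How these records would have to be formatted for the
-md5-list file is deliberately left open here.

```python
import hashlib
import os


def md5_records(root: str, chunk_size: int = 1 << 20):
    """Yield (md5_hex, size_in_bytes, path_relative_to_root) for each .deb file."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order reproducible
        for name in sorted(filenames):
            if not name.endswith(".deb"):
                continue
            full = os.path.join(dirpath, name)
            h = hashlib.md5()
            with open(full, "rb") as f:
                for chunk in iter(lambda: f.read(chunk_size), b""):
                    h.update(chunk)
            yield h.hexdigest(), os.path.getsize(full), os.path.relpath(full, root)
```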

This would enable a solution to bug #942893 without creating the need
for a format change in .jigdo and .template, and without the need for
testing for subtle regressions.


Have a nice day :)

Thomas

#887831#79
Date:
2019-11-11 18:23:54 UTC
From:
To:
[ Sent to multiple people and Debian bugs - please respect the
  reply-to and follow up on the debian-cd list if you have
  replies/comments. ]

Hi folks!

For a while we've been working to move away from using MD5 in various
parts of Debian, and jigdo is one of the last few things that's still
using it now. We've had a few bugs raised about this (#887837 and
#887831) and quite some discussion recently. I've been hacking on
jigdo and jigit to add support for a new v2 jigdo format which
switches from using md5 for internal checksumming to using sha256
instead, and I'm just about done now. I have a few remaining things to
do next, that I'd like to ask for some help with (please!) - see
further down. Prompt responses would be appreciated if possible.

jigdo
=====

I've extended jigdo to support both formats (old and new). Building a
new jigdo/template pair requires the user to specify which format they
want, while creating/verifying an image will auto-detect the format
from the input data. I think that is clearly the best
design here.

I'm *most* worried about updating the various clients that people may
have in the field, using jigdo-lite/jigdo-mirror to make ISO images
from the jigdo data that we release with Debian, so that was my first
priority. I'm *not* aware of anybody actually using jigdo-file itself
to create new jigdo/template pairs these days, but I've done the right
thing anyway and added support for sha256 here too.

I've forked from Richard's last 0.7.3 release, and put it into my own git
server at

https://git.einval.com/cgi-bin/gitweb.cgi?p=jigdo.git;a=shortlog;h=refs/heads/upstream

along with the various fixes that we already had in Debian since that
release.

I've built and tested binaries locally with both jigdo formats,
including on Windows. All looks good here. \o/

jigit/libjte
============

I've also updated and extended my own jigit/libjte code to work with
both formats, and I'm about to release those. The changes are not too
big, and the external API for libjte is *very* close to what we had
before. I've already updated a local copy of xorriso to use it, and
the changes are tiny! \o/

genisoimage
===========

I am *not* planning to update my code in genisoimage to use the new
jigdo v2 format. We don't use genisoimage at all in the Debian images
team any more, having moved to xorriso instead. The only reason to
even think about updating genisoimage would be for powerpc
images. While the debian-ports people are still supporting powerpc and
periodically releasing new unofficial CD/DVD images for it, I don't
think jigdo is needed here. *If you think differently*, let me know...

Publishing the new format
=========================

debian-cd and some of our backend setup on our cdimage sites will need
some minor updates to support the new sha256 format as well, but
that's not urgent yet. We must *not* switch to publishing the new v2
jigdo format for a while (I'm thinking maybe 12 months?), to give
people the chance to update their clients. I also don't want to leave
this *too* long, as the Debian ftpmaster team and others would like to
ditch md5 soon.

We'll need to make noise about this, and update web pages etc. to
mention the change. New links to new tools, etc.

Richard
=======

With your blessing, I'd like to release my new code as jigdo version
0.8.0. If you're ok with that, could you please update your jigdo web
pages to mention this? I'll add a page at

https://www.einval.com/~steve/software/jigdo/

that you can link to. I'll add some docs, and links back to you (of
course!) and download links for Windows binaries etc. So far I've left
the creator information in newer jigdo files pointing at your site as
you're the inventor, but I'm also happy to change that if you'd like
and reduce your web traffic - just let me know please! :-)

Mattias
=======

You're the person normally working with people using jigdo tools to
mirror Debian's CD/DVD releases. We'll need to ask people to update
all their tools to enable using the new v2 format. What systems are
they normally using? I'm guessing a mix of Debian systems of various
versions, plus maybe a few other OSes? I'm happy to do Debian
backports builds of the new tool versions to help support people, but
I don't know:

 (a) what else might need to be supported
 (b) what timescale these people would be happy with for updates

Obviously, we don't want to be pushing new format versions until the
mirror network is ready to take them. But we want that to be as soon
as practically possible.

Thomas
======

You've done an awesome job with xorriso and the libjte integration!
It's been really easy to drop in my new libjte code and have xorriso
generate the new format. I've got a simple diff right now that I'm
just cleaning up and will send you shortly.