#1118385 texlive-binaries: corner-case PDF image nondeterminism

Package:
texlive-binaries
Source:
texlive-binaries
Description:
Binaries for TeX Live
Submitter:
Aaron M. Ucko
Date:
2026-05-19 20:17:03 UTC
Severity:
normal
#1118385#5
Date:
2025-10-19 02:57:32 UTC
From:
To:
I have found that when a PNG image's row width does not correspond to a
whole number of bytes, the low-order bits of the last byte in each
uncompressed row can fluctuate between pdflatex runs, at least when
libpng can perform SIMD-assisted decompression, yielding
nondeterministic output even when arranging to supply predetermined
timestamps.  (Moreover, this fluctuation can affect the length of the
resulting compressed stream, slightly shifting the file position of
subsequent content.)

In particular, such nondeterminism occurs when including images with
4-bit color maps and odd row lengths, such as FLTK's valuators.png [1].
I tried to put together a minimal example, but the bits in question
came out all zero, presumably because there hadn't yet been enough
memory churn.  At any rate, I suspect it would help for the PNG-reading
code to prezero the last byte of the row buffer, at least in this
scenario.
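
As a rough illustration of the arithmetic involved (not pdfTeX's or libpng's actual code), the sketch below computes how many bytes a packed PNG row occupies and how many low-order bits of its last byte carry no pixel data, then clears those bits. The function names and the 15-pixel width are made up for the example.

```python
def row_layout(width_px: int, bit_depth: int) -> tuple[int, int]:
    """Return (bytes per packed row, unused low-order bits in the last byte)."""
    total_bits = width_px * bit_depth
    row_bytes = (total_bits + 7) // 8
    pad_bits = row_bytes * 8 - total_bits
    return row_bytes, pad_bits


def clear_trailing_bits(row: bytearray, width_px: int, bit_depth: int) -> None:
    """Zero the unused low-order bits of the last byte of a packed row, in place."""
    row_bytes, pad_bits = row_layout(width_px, bit_depth)
    if pad_bits:
        row[row_bytes - 1] &= (0xFF << pad_bits) & 0xFF


# Hypothetical 15-pixel-wide row at 4 bits per pixel: 60 bits, so 8 bytes
# with 4 unused bits at the end of the last byte.
row = bytearray(b"\x12\x34\x56\x78\x9a\xbc\xde\xf7")
clear_trailing_bits(row, 15, 4)
print(row_layout(15, 4), row.hex())  # prints: (8, 4) 123456789abcdef0
```

With an even width at 4 bits per pixel, the pad count is zero and the last byte is left alone, which matches the observation that only odd row lengths are affected.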

Could you please take a look?

Thanks!

#1118385#10
Date:
2025-10-19 03:11:02 UTC
From:
To:
"Aaron M. Ucko" <ucko@debian.org> writes:

[1] https://salsa.debian.org/fltk-team/fltk1.4/-/blob/main/documentation/src/valuators.png?ref_type=heads

#1118385#15
Date:
2025-10-19 10:56:23 UTC
From:
To:
On 10/19/25 04:57, Aaron M. Ucko wrote:

Hi Aaron,

Just two questions, which came to my mind just after reading the report:

1. Could it be an issue in libpng* too?
2. How did you learn of the issue? Did you notice that fltk* fails to build
reproducibly (FTBR) and then find the root cause?

Hilmar

#1118385#20
Date:
2025-10-19 14:22:32 UTC
From:
To:
Hilmar Preuße <hille42@web.de> writes:

Hi, Hilmar.  Thank you very much for the quick reply!

Perhaps, though its documentation makes no specific promises on the
matter AFAICT and few (if any) other consumers are likely to care.

Yes.

#1118385#27
Date:
2026-04-28 17:53:38 UTC
From:
To:
Am 19.10.2025 um 04:57 schrieb Aaron M. Ucko:

Hello Aaron,

TL 2026 has entered the unstable archive and will (hopefully) end up in
Debian testing. Does it make things better?

Hilmar

#1118385#32
Date:
2026-05-03 02:18:03 UTC
From:
To:
"Preuße, Hilmar" <hille42@web.de> writes:

No such luck.  Thanks for checking!

#1118385#37
Date:
2026-05-10 18:45:00 UTC
From:
To:
Am 03.05.2026 um 04:18 schrieb Aaron M. Ucko:

Hello Aaron,

Just one difference noticed: when comparing the build logs [1] for the
two compared builds, one notices that the topmost cmake calls differ by
the option "-DBUILD_TESTING:BOOL=OFF". Could this have an impact?

Hilmar

[1]
https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/fltk1.4.html

#1118385#42
Date:
2026-05-10 21:45:33 UTC
From:
To:
Am 10.05.2026 um 20:45 schrieb Preuße, Hilmar:

Hello Aaron,

No, that does not seem to be the case. Sorry!

Hilmar

#1118385#47
Date:
2026-05-11 01:40:56 UTC
From:
To:
"Preuße, Hilmar" <hille42@web.de> writes:

No, that difference looks irrelevant; please bear in mind that fltk1.3
runs afoul of this bug too but uses the traditional Autotools-based
build system, using CMake only to populate /usr/lib/fltk for the sake of
CMake-based consumers.

Fair question, though!

#1118385#52
Date:
2026-05-11 07:09:01 UTC
From:
To:
Am 19.10.2025 um 04:57 schrieb Aaron M. Ucko:

Hello,

At least I'm able to reproduce the issue by running pdflatex 10 times
over the same set of input files and getting 10 files having the same
size but different checksum.

The input files are of course not minimal. As next step I'll contact
pdfTeX upstream. They should be able to clarify whether the code generating
that varying content is located in the pdfTeX source code or in one
of the libraries it is linked with.

Hilmar

hille@rasppi3:~/devel/TeXLive/Upgrade_Test/arm64-sid/home/hille/1118385
$ ls -l
total 109240
drwxr-xr-x 2 hille hille    36864 May 11 08:55 latex
-rw-rw-r-- 1 hille hille 11178538 May 11 08:55 refman_10.pdf
-rw-rw-r-- 1 hille hille 11178538 May 11 08:18 refman_1.pdf
-rw-rw-r-- 1 hille hille 11178538 May 11 08:22 refman_2.pdf
-rw-rw-r-- 1 hille hille 11178538 May 11 08:26 refman_3.pdf
-rw-rw-r-- 1 hille hille 11178538 May 11 08:31 refman_4.pdf
-rw-rw-r-- 1 hille hille 11178538 May 11 08:35 refman_5.pdf
-rw-rw-r-- 1 hille hille 11178538 May 11 08:39 refman_6.pdf
-rw-rw-r-- 1 hille hille 11178538 May 11 08:43 refman_7.pdf
-rw-rw-r-- 1 hille hille 11178538 May 11 08:47 refman_8.pdf
-rw-rw-r-- 1 hille hille 11178538 May 11 08:51 refman_9.pdf
hille@rasppi3:~/devel/TeXLive/Upgrade_Test/arm64-sid/home/hille/1118385
$ md5sum refman_*
fad47cf92f41fcee69bcaba215fc54fb  refman_10.pdf
70e2b4061750d4f70089a1571d59bef1  refman_1.pdf
5b897de682ef50f5ac423b60fdedd9b7  refman_2.pdf
c4862f7dd9aba1a41c66577e6cc4908f  refman_3.pdf
2b987c9a15fafe4d83e0e0745d2cd197  refman_4.pdf
a8a28203556c3fd7b261d8a3532b198e  refman_5.pdf
0dc8adff9c21b34633786cf993edd2f3  refman_6.pdf
e3c4783783b132c8b6888d12144d6a58  refman_7.pdf
9c5bc607de841902487556fb4c080746  refman_8.pdf
a50e84bb0d679d778bd6d21980714ebd  refman_9.pdf
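
The check above (same size, different checksums) could be scripted along the following lines. This is only a sketch: the helper names are invented here, and SHA-256 is used in place of md5sum purely as an illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_by_digest(paths):
    """Map each SHA-256 digest to the list of files that have that content."""
    groups = defaultdict(list)
    for path in paths:
        digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
        groups[digest].append(str(path))
    return dict(groups)


def is_reproducible(paths) -> bool:
    """True when every file in paths has byte-identical content."""
    return len(group_by_digest(paths)) <= 1


# Example use, assuming the refman_*.pdf files from the listing are in the
# current directory:
#     print(is_reproducible(sorted(Path(".").glob("refman_*.pdf"))))
```

Ten distinct digests for ten runs, as in the listing, would make group_by_digest return ten single-file groups.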

#1118385#57
Date:
2026-05-11 21:58:36 UTC
From:
To:
Control: forwarded -1 https://tug.org/pipermail/tex-k/2026-May/004330.html

Forwarded for now.

H.

#1118385#64
Date:
2026-05-12 02:37:32 UTC
From:
To:
Hilmar Preusse <hille42@web.de> writes:

Thanks!

As I recall, my pre-filing investigation found that libpng leaves any
trailing portion of the row's final byte unspecified.  It would of
course be possible to tighten that policy up, but there's normally no
need to, even from a reproducibility perspective, because such bits are
typically of no interest to consumers anyway.  As such, it falls on
pdfTeX and anything else that takes a similar approach to clear those
bits or otherwise populate them consistently.  In principle, doing so
after reading each row is safest, but observed behavior suggests that
clearing them up front should suffice.

#1118385#69
Date:
2026-05-12 13:46:01 UTC
From:
To:
Hi

Let me disagree strongly.

If it is not "in the interest to consumers", why should pdftex clean up
after a mess that libpng made?

If you (== Debian) want reproducibility, then it is **your** job to
clean up libpng artifacts, not downstream.

So this is a libpng problem, not pdftex problem.

Best regards

Norbert

#1118385#74
Date:
2026-05-12 21:38:30 UTC
From:
To:
Am 12.05.2026 um 04:37 schrieb Aaron M. Ucko:

Hello,

As far as I understood there are some (random) bytes inserted, which
remain invisible to the consumer but are nevertheless there and make the
file building not fully reproducible, correct? For the consumers of
course there is no need, but there is some kind of need to make file
creation deterministic.
This could be communicated to the piece of code generating the
randomness. If I understand correctly it has to be found out, where that
code is located. If you are sure it is libpng, we should at least clone
that bug for evaluation.

H.

#1118385#79
Date:
2026-05-12 21:38:32 UTC
From:
To:
Hi again Hille,

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1118385

Towards the (current) end of that bug it seems suspicion is falling on
libpng. I guess I will cc the bug here.

    The submitter told that the issue occurs only for some kind of pictures,

Sounds reasonable.

    would like to have clarified in the first step if the characters which
    differ are generated by code in the pdfTeX binary, or if we have to
    blame one of the libraries pdfTeX is linked with.

I don't have the brain to be able to look at that hex dump and have any
real clue as to where it came from. It could be part of a png image with
those "unspecified" bytes as mentioned in the bug report (/FlateDecode
.. stream ...). Or maybe it's not. I don't know.

If it is due to libpng, though, I think I agree with Norbert's last
comment that it should be fixed in libpng, not in every program that
uses libpng.  (I definitely won't be able to do anything about that.)

    The testbed I have currently is 25 MB, i.e. far from being
    minimal. I can provide it, but only on request.

Well, if you send it (or a url to it), I can at least see if the bug
reproduces for me. I see the OP in the bug report was using x86_64, so
it's probably not system-specific.

As a workaround for the reproducible doc: what occurs to me is that
perhaps doing a(n insignificant) transformation on the particular
image(s) will avoid the bug, wherever/whatever it is, and make the doc
reproducible. --best, karl.

#1118385#84
Date:
2026-05-12 23:49:13 UTC
From:
To:
difference.  With Hilmar's example files, the command line

     diff -sca refman_1.pdf refman_2.pdf

gave

*** refman_1.pdf	Mon May 11 18:40:01 2026
--- refman_2.pdf	Mon May 11 18:40:05 2026
***************
*** 84287,84293 ****
   /W [1 3 1]
   /Root 111760 0 R
   /Info 111761 0 R
! /ID [<BAC5ECC86743A24DB489859FD6D0C0AF> <BAC5ECC86743A24DB489859FD6D0C0AF>]
   /Length 260088
   /Filter /FlateDecode
   >>
--- 84287,84293 ----
   /W [1 3 1]
   /Root 111760 0 R
   /Info 111761 0 R
! /ID [<BA7ACF0ECED2C16FCB4BF449BF00000B> <BA7ACF0ECED2C16FCB4BF449BF00000B>]
   /Length 260088
   /Filter /FlateDecode
   >>

The difference is solely in the /ID line whose contents form a file identifier
(according to the PDF standard).

2. I don't think this has anything to do with libpng, since I was able to
reproduce the same problem on a trivial file, hello.tex, without included graphics:

    \documentclass{article}
    \begin{document}
    Hello
    \end{document}

I used the following script to test whether the pdf file changes on successive
compilations:

     for i in 1 2; do
         pdflatex -interaction=batchmode "\pdfinfo{/CreationDate(D:20260429034606Z)/ModDate(D:20260429034606Z)}\input{hello.tex}"
         mv hello.pdf hello${i}.pdf
         sleep 1
     done
     diff -sca hello1.pdf hello2.pdf

The result was differences that were of the same form as for Hilmar's example.

BUT: If I commented out the sleep 1 line, most of the time diff reports
identical files.  (This is all on TeXLive 2026 on macOS.)

I infer that the contents of the /ID line are obtained from some value of a
relevant time to a resolution perhaps of 1 sec.  The definition of "File
Identifiers" in the PDF standard indicates that the relevant time is the
current time.  This behavior is not impacted by the setting of the dates/times
in the \pdfinfo command.
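
To keep a time-dependent /ID from masking other differences, one option is to blank it out before comparing. The sketch below assumes the /ID entry appears in the uncompressed form shown in the diff above; the helper names are invented for the example.

```python
import re

# Matches an entry such as "/ID [<BAC5...> <BAC5...>]".
_ID_ENTRY = re.compile(rb"/ID\s*\[\s*<[0-9A-Fa-f]*>\s*<[0-9A-Fa-f]*>\s*\]")


def _blank(match):
    text = match.group(0)
    # Keep the "/ID" key itself (its "D" is also a hex digit) and zero the rest.
    return text[:3] + re.sub(rb"[0-9A-Fa-f]", b"0", text[3:])


def mask_file_id(pdf_bytes: bytes) -> bytes:
    """Replace the hex digits of every /ID entry with zeros, keeping the length."""
    return _ID_ENTRY.sub(_blank, pdf_bytes)


def same_apart_from_id(left: bytes, right: bytes) -> bool:
    """True when two PDFs differ at most in their /ID entries."""
    return mask_file_id(left) == mask_file_id(right)
```

Two files for which same_apart_from_id returns True differ only in the file identifier; a False result points to a difference elsewhere, such as in an image stream.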

Best,
John Collins

#1118385#89
Date:
2026-05-13 02:36:11 UTC
From:
To:
"Preuße, Hilmar" <hille42@web.de> writes:

My understanding is that the relevant sequence of events is as follows:

- pdfTeX's write_png_palette passes libpng a heap-allocated buffer whose
  contents are already nondeterministic for some reason.  (I'm not clear
  on how that comes about; I'd been wondering if it could be a side
  effect of ASLR, but running under setarch -R doesn't help, and I've
  confirmed that setarch -R *does* stabilize ldd output on the host in
  question.)
- libpng correctly populates the bits that hold actual pixel data and
  leaves any trailing bits exactly as is.  It could of course clear them
  instead, but that would add (a little) overhead for what is typically
  no benefit.
- pdfTeX writes the row buffer out as is, modulo compression.
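
The sequence above can be mimicked with a toy model. This is not libpng or pdfTeX code: write_pixels is a made-up stand-in for a decoder that touches only the bits carrying pixel data, and the fixed byte patterns stand in for stale heap content.

```python
def write_pixels(buffer: bytearray, pixels, bit_depth: int) -> None:
    """Pack pixels into buffer (first pixel in the high-order bits), touching only
    the bits that carry pixel data. bit_depth must be 1, 2, 4, or 8."""
    value_mask = (1 << bit_depth) - 1
    for i, pixel in enumerate(pixels):
        byte_index, bit_offset = divmod(i * bit_depth, 8)
        shift = 8 - bit_depth - bit_offset
        mask = value_mask << shift
        buffer[byte_index] = (buffer[byte_index] & ~mask & 0xFF) | ((pixel << shift) & mask)


def decoded_row(pixels, bit_depth: int, stale: bytes) -> bytes:
    """Decode into a buffer whose previous contents are given by stale."""
    buffer = bytearray(stale)
    write_pixels(buffer, pixels, bit_depth)
    return bytes(buffer)


pixels = [1, 2, 3]  # three 4-bit pixels: 12 bits, so 2 bytes with 4 unused bits
run_a = decoded_row(pixels, 4, b"\xaa\xaa")   # one pattern of leftover heap bytes
run_b = decoded_row(pixels, 4, b"\x55\x55")   # a different pattern
prezeroed = decoded_row(pixels, 4, bytes(2))  # buffer cleared up front
print(run_a.hex(), run_b.hex(), prezeroed.hex())  # prints: 123a 1235 1230
```

The pixel bits agree in all three rows; only the low nibble of the last byte follows whatever was in the buffer beforehand, which is why clearing the buffer up front gives a stable result in this model.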

#1118385#94
Date:
2026-05-13 04:25:43 UTC
From:
To:
export SOURCE_DATE_EPOCH=`perl -e "print time();"`
export FORCE_SOURCE_DATE=1
for i in 1 2; do
pdflatex -interaction=batchmode "\pdfinfo{/CreationDate(D:20260429034606Z)/ModDate(D:20260429034606Z)}\input{hello.tex}"
mv hello.pdf hello${i}.pdf
sleep 2
done

I obtain the same hello1.pdf and hello2.pdf:
cmp hello1.pdf hello2.pdf
echo $?
0

On the other hand, with

export SOURCE_DATE_EPOCH=`perl -e "print time();"`
export FORCE_SOURCE_DATE=1
for i in 1 2; do
pdflatex -interaction=batchmode "\pdfinfo{/CreationDate(D:20260429034606Z)/ModDate(D:20260429034606Z)}\input{valuators.tex}"
mv valuators.pdf valuators${i}.pdf
sleep 2
done

%valuators.tex
\documentclass{article}
\usepackage{graphicx}
\begin{document}
Hello.

\includegraphics[width=8cm]{valuators.png}
\end{document}

where valuators.png is obtained from
https://salsa.debian.org/fltk-team/fltk1.4/-/blob/main/documentation/src/valuators.png?ref_type=heads

I obtain different valuators1.pdf and valuators2.pdf:
cmp valuators1.pdf valuators2.pdf
valuators1.pdf valuators2.pdf differ: char 421, line 26
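
For anyone without cmp at hand, the location it reports can be approximated as follows. This is a sketch with an invented function name; it mirrors cmp's convention of 1-based byte and line numbers.

```python
def first_difference(left: bytes, right: bytes):
    """Return (1-based byte number, 1-based line number) of the first difference,
    or None if the inputs are identical."""
    for index, (x, y) in enumerate(zip(left, right)):
        if x != y:
            # The line number is one more than the newlines seen before this byte.
            return index + 1, left.count(b"\n", 0, index) + 1
    if len(left) != len(right):
        # One input is a prefix of the other (cmp itself reports EOF here).
        index = min(len(left), len(right))
        return index + 1, left.count(b"\n", 0, index) + 1
    return None


# Example use with the files from the thread:
#     from pathlib import Path
#     print(first_difference(Path("valuators1.pdf").read_bytes(),
#                            Path("valuators2.pdf").read_bytes()))
```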

Thanks,
Akira

#1118385#99
Date:
2026-05-13 14:21:58 UTC
From:
To:
I agree completely.  That also means that the posted example of
non-reproducibility was irrelevant; that example was my starting point, i.e.,
the files refman_1.pdf and refman_2.pdf.

Sorry for the noise.

John

#1118385#104
Date:
2026-05-13 19:44:15 UTC
From:
To:
Am 13.05.2026 um 06:25 schrieb Akira Kakuto:

Hello Akira,

Many thanks for pointing that out! Until now I was just happy that I could
reproduce the issue at all and had not yet found the time to create a
minimal example... or, better said, I did not try because the OP said it
would not be that easy. ;-)

Hilmar

#1118385#109
Date:
2026-05-13 20:54:06 UTC
From:
To:
    \includegraphics[width=8cm]{valuators.png}

Thanks much, Akira.

So ... can anyone try to find a patch for libpng? Maybe AI can help :) :(,
if anyone knows how to ask it.

At least, it doesn't make sense to me to change pdftex + luatex +
dvipdfmx + non-TeX ... --thanks, karl.

#1118385#114
Date:
2026-05-14 22:09:47 UTC
From:
To:
Am 13.05.2026 um 22:54 schrieb Karl Berry:

Hello Karl,

My intention was not necessarily to get a fix here, but rather a first
statement *if* the bug(?) is in pdfTeX or in libpng. Maybe I contacted
the wrong mailing list, and should have better talked to the pdfTeX
mailing list.

I'll happily hand over the issue to the libpng maintainers; however, I
could use some help with what needs to be changed. Is Aaron's description in
[1] correct?

Hilmar

[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1118385#89

#1118385#119
Date:
2026-05-16 15:27:13 UTC
From:
To:
Am 15.05.2026 um 12:49 schrieb Norbert Preining:

Hello,

I'll assign that issue to libpng for now. The patch does not seem to
solve the issue for me, but this is up to the libpng (upstream) maintainers.

I rebuilt the Debian libpng package with the patch in place and
installed it into my test chroot, the pdftex binary is linked with the
lib in that package. The subsequent pdflatex runs still generate PDF
files with different checksums.

Hilmar

#1118385#124
Date:
2026-05-16 18:06:22 UTC
From:
To:
I don't claim to understand libpng, but I see that it goes out of its way
not to overwrite the final bits. This is the commit that introduced the
change 15 years ago:
https://github.com/pnggroup/libpng/commit/fb5b3ac013b3ea6bf7f32871be75132380abbccd
They even added a test to make sure that the bits are not overwritten.

Are you positive that the patch has no other side effects? I couldn't
find the issue that prompted the change, but the behaviour is so
intentional...

Vincenzo

On Sat, 16 May 2026 at 18:30, Norbert Preining <norbert@preining.info> wrote:

#1118385#129
Date:
2026-05-17 13:52:04 UTC
From:
To:
Did you also set SOURCE_DATE_EPOCH, e.g., like Akira did:

   export SOURCE_DATE_EPOCH=`perl -e "print time();"`
   export FORCE_SOURCE_DATE=1

Best regards,
John

#1118385#134
Date:
2026-05-17 21:34:11 UTC
From:
To:
Am 16.05.2026 um 19:29 schrieb Norbert Preining:

Hello Norbert,
all, i.e. with the unpatched libpng?
As Aaron wrote in his initial post he was unable to create a minimal
example, meanwhile it was quite easy as described by Akira. For me that
minimal example worked, but I suspect this is not the case everywhere.

Hilmar

#1118385#139
Date:
2026-05-17 21:42:04 UTC
From:
To:
Am 17.05.2026 um 15:52 schrieb John Collins:

Hello John,
For me it was sufficient to overwrite the \pdfinfo at the command line
of pdflatex, as Norbert did:

\pdfinfo{/CreationDate(D:20260429034606Z)/ModDate(D:20260429034606Z)}

That way one gets PDF files that do not differ in the time stamps,
only in the byte block that this bug report is about.

Hilmar

#1118385#144
Date:
2026-05-18 20:46:24 UTC
From:
To:
Am 18.05.2026 um 09:20 schrieb Norbert Preining:

Hello Norbert,

;-)

Yes, correct. I should have posted my "diff -a" earlier. I had seen
only the difference in the ID and believed that was the
thing we were looking for. Sorry for the confusion!

Hilmar

#1118385#149
Date:
2026-05-18 21:04:30 UTC
From:
To:
Am 18.05.2026 um 09:43 schrieb Norbert Preining:

Hello Norbert,

Glad to see you are still engaged.

[ Much stuff regarding Memory allocation deleted ]

Many thanks for your engagement and more insight into the story. Great idea
to use that MALLOC_PERTURB_ to generate random heap content to trigger
the issue. I'm now really able to reproduce the issue and can confirm
that your proposed patch at least solves this specific issue.
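
The exact commands used were trimmed from this message, so the following is only a guess at how such a run might be driven. It relies on glibc's MALLOC_PERTURB_ environment variable, which fills newly allocated and freed memory with a byte pattern derived from the given value; the helper names are invented for the example.

```python
import os
import subprocess


def perturbed_env(value: int, base=None) -> dict:
    """Copy of the environment with glibc's MALLOC_PERTURB_ set to value."""
    if not 1 <= value <= 255:
        # 0 disables the feature; this sketch restricts itself to one byte.
        raise ValueError("use a value from 1 to 255")
    env = dict(os.environ if base is None else base)
    env["MALLOC_PERTURB_"] = str(value)
    return env


def run_perturbed(command, value: int) -> int:
    """Run command with the given heap fill value and return its exit status."""
    return subprocess.run(command, env=perturbed_env(value), check=False).returncode


# Example use (valuators.tex is the test file from earlier in the thread; two
# different values should give two different fill patterns):
#     run_perturbed(["pdflatex", "-interaction=batchmode", "valuators.tex"], 1)
#     run_perturbed(["pdflatex", "-interaction=batchmode", "valuators.tex"], 254)
```

Renaming the PDF between the two runs and comparing the results would then show whether the output depends on leftover heap content.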

Many thanks for that! I've subscribed to that discussion and will follow it.

Hilmar

#1118385#154
Date:
2026-05-18 21:28:43 UTC
From:
To:
Am 19.10.2025 um 04:57 schrieb Aaron M. Ucko:

Hi Aaron,

Not sure if you are subscribed to that bug in the meantime and follow
the discussion. Meanwhile it seems clear that we are looking at a design
flaw in the libpng library. Norbert has forwarded the issue description
+ reproduction instructions to [1]. I'm subscribed to the discussion and
will follow it.

Hilmar

[1] https://github.com/pnggroup/libpng/discussions/864

#1118385#159
Date:
2026-05-19 20:14:41 UTC
From:
To:
Control: forwarded -1 https://github.com/pnggroup/libpng/discussions/864
Fix forwarded address.