#791362 perl: build timezone affects LOCALTIME_{MIN,MAX}

Package:
src:perl
Source:
perl
Submitter:
Niko Tyni
Date:
2020-11-16 16:12:04 UTC
Severity:
minor
Tags:
#791362#5
Date:
2015-01-02 13:38:37 UTC
From:
To:
Hi!

While working on the “reproducible builds” effort [1], we have noticed
that perl could not be built reproducibly.

The attached patches will fix that with our current experimental
framework. I hope the description of each patch is enough to understand
their purpose.

 [1]: https://wiki.debian.org/ReproducibleBuilds

#791362#10
Date:
2015-01-09 19:40:15 UTC
From:
To:
Thanks, this is awesome! I only had a quick look so just a couple of
notes and questions for now.

Is this because of the date header in manpages? Setting the POD_MAN_DATE
environment variable could/should suffice for that, I think. See
debian/patches/fixes/pod_man_reproducible_date.diff

I expect this needs to be made configurable for upstream to accept
it. Also, it might be safer to replace __DATE__ and __TIME__ with
some placeholders rather than dropping them, at least until this is
upstreamed. There might well be some crazy things parsing 'perl -V'
output or something like that which could choke if the lines are left
out altogether.


and later

I assume the first is the correct one.

Not sure the 'touch' part belongs in gen-patchlevel, which currently
just prints to STDOUT. But I can see it would be nice to pick up the
mtime while reading the patches anyway. I wonder if we could/should
use the changelog date instead, though. The whole thing of writing
$patchlevel_date into perlbug to see how old this perl is feels weird...

#791362#15
Date:
2015-01-09 20:05:28 UTC
From:
To:
Niko Tyni:

This is needed to have reproducible mtimes in data.tar and control.tar.
This is done right before calling dpkg-source.

I went ahead with removing the values because there were already
#ifdefs. But maybe the value of cf_time should be passed through `-D` or
something similar. I'm not sure what the best way is.

Oops. The latter is the one to pick. 'D' is incompatible with 'u'.

I believe this is a matter of taste. :)

Thanks for having a look,

#791362#20
Date:
2015-01-22 20:14:35 UTC
From:
To:
Ah, right. Sorry about that.

A few more notes:

- the build system also embeds information about the build host, at
  least the kernel version and hostname. Those need to be stripped too.
  From 'perl -V':

    osname=linux, osvers=3.16.0-4-amd64, archname=x86_64-linux-gnu-thread-multi
    uname='linux estella 3.16.0-4-amd64 #1 smp debian 3.16.7-ckt2-1 (2014-12-08) x86_64 gnulinux '

  I assume varying uname et al. isn't actively tested yet?

- I would expect some of the generated manual pages to embed the build
  date, at least for patched modules like Net::SMTP. Are builds from
  different days compared currently and/or are you setting POD_MAN_DATE
  externally? (see #759405)

- I don't think 0003-Allow-cf_time-to-be-set-externally is needed,
  as config.over can override cf_time without it AFAICS.

Sorry I'm a bit slow with this... :)

#791362#25
Date:
2015-01-23 10:20:44 UTC
From:
To:
Hi Niko,
[...]

no, not yet. and varying hostname is only tested since last week, so most
packages have not yet been tested for that. (but will be in a few months.)

are there ways to "properly" fake uname or do I really need to setup something
in qemu to test^wsimulate builds under different kernels?


cheers,
	Holger

#791362#30
Date:
2015-01-23 10:47:27 UTC
From:
To:
A quick search indicates that there's no separate namespace for other
uname(2) information than the host name and domain name.  This suggests
that something like http://www.bstern.org/libuname/ is needed. I'm not aware
of anything in Debian already that does that. Time for an RFP maybe :)

#791362#35
Date:
2015-01-23 11:03:25 UTC
From:
To:
Hi Niko,

it builds fine but doesn't work:

jenkins@jenkins:~/u/libuname-1.0.0$ make
gcc -Wall -Werror -O2 -fPIC   -c -o libuname.o libuname.c
if [ "`uname -s`" = "SunOS" ]; then \
                ld -G -dy -z text -Qn -o libuname.so libuname.o; \
        else \
                ld -shared -fPIC -o libuname.so libuname.o; \
        fi
jenkins@jenkins:~/u/libuname-1.0.0$ LD_PRELOAD=$PWD/libuname.so LIBUNAME='Linux;bar;2.6.15;#1;Mon Feb 37 22:33:44 UTC 2006;i686;unknown' uname -a
uname: symbol lookup error: /var/lib/jenkins/u/libuname-1.0.0/libuname.so: undefined symbol: dlsym


cheers,
	Holger

#791362#40
Date:
2015-01-23 15:45:12 UTC
From:
To:
This is resolved by the attached patch.

#791362#45
Date:
2015-05-04 12:28:04 UTC
From:
To:
Hi!

Here's an update after rebasing my patches on 5.20.2-4.

Niko Tyni:

We do now test it by calling `linux64 --uname-2.6`. It will make the
version look like 2.6.56-4. And indeed, this is an issue.

The kernel version shows in Config.pm (`osvers`), Config_heavy.pl
(`osvers`).

The full uname is shown in Config_heavy.pl (in a comment, and in
`myuname`), in CORE/config.h (in a comment, in `OSVERS`), and in the
binaries.

I'm not sure what's the best answer here. Always use 2.6.42? As in
Debian we can't really know which version of the kernel the package is
going to be used with, it should stay compatible with older kernels as
much as possible.


Another issue that surfaced now that we are doing timezone variations is
that LOCALTIME_MIN and LOCALTIME_MAX get different values depending on
the value of the TZ environment variable.

This shows in CORE/config.h, in Config_heavy.pl, and in the binaries.

If I read it right, `sLOCALTIME_min` and `sLOCALTIME_max` can be
overridden from `Configure`.

The minimum I had on my amd64 system is with TZ=UTC-24, -62167305600.
The maximum is with TZ=UTC and is 67768036191590399.

It feels like a bug to have something that can be configured through an
environment variable on a running system affect what gets encoded in the
binary.
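
The effect of TZ on local time can be illustrated in isolation. The sketch below is not what Configure runs; it only renders one fixed epoch value under a few POSIX TZ strings, which is the same dependency that shifts the probed limits. It assumes GNU date (`-d @SECONDS` is an extension), and `show_local` is a hypothetical helper.

```sh
#!/bin/sh
# Sketch: render one fixed epoch value under different POSIX TZ strings.
# Assumption: GNU date, since "-d @SECONDS" is not in POSIX.
show_local() {
    # $1: TZ value, $2: seconds since the epoch
    TZ="$1" date -d "@$2" '+%Y-%m-%d %H:%M:%S'
}

# In POSIX TZ syntax a negative offset means east of UTC, so UTC-24 is
# a full day ahead of UTC0 and UTC+12 is twelve hours behind it.
for tz in UTC0 UTC-24 UTC+12; do
    printf '%-7s %s\n' "$tz" "$(show_local "$tz" 0)"
done
```

Any limit found by probing localtime() at build time moves by the same offset, which is why the recorded minimum and maximum differ between build timezones.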

#791362#50
Date:
2015-06-01 21:51:54 UTC
From:
To:
Hello,

Thanks for the update! I noticed that you didn't include your
rebased patches as attachments, however.

We've now uploaded perl 5.22.0~rc2-2 to experimental, and that will
be a good base on which to forward patches upstream, so if you were
able to do one more rebasing that'd be excellent.

Cheers,
Dominic.

#791362#55
Date:
2015-07-03 20:16:46 UTC
From:
To:
clone 774422 -1
retitle -1 perl: build timezone affects LOCALTIME_{MIN,MAX}
severity -1 normal
thanks

Thanks. I had a look at this and will try to get a reproducible 5.22
package into experimental soonish. It looks like the only thing that
needs upstream source changes (as opposed to configuration) is the
__DATE__/__TIME__ stuff. I understand the 'ar D' patch isn't necessary
anymore since binutils was changed.

I'll discuss at least the __DATE__ part upstream, but I think disabling
it at this phase should be good enough.
maybe we shouldn't care about those at this point.

I suspect the uname (stored as $Config{myuname}) doesn't matter much:
codesearch.debian.net only finds libcrypt-openssl-x509-perl using it
(and even that should probably use $^O instead, which gives the runtime
OS name instead of the build time one.)

As for osvers, which has many more hits, I think it should be good enough
to hardcode a version that approximates a ~current Debian stable kernel.

My current candidate for an override in config.debian is this monstrosity:

myhostname=localhost
case "$osname" in
  linux)
      osvers=3.16.0
      osdesc="#1 smp debian $osvers"
      os=gnulinux
      ;;
  gnu)
      osvers=0.6
      osdesc="gnu-mach"
      os=gnu
      ;;
  gnukfreebsd)
      osvers=9.0
      osdesc="#0"
      os=gnukfreebsd
      ;;
esac
if [ -n "$osdesc" ]; then
  machine_uname=$(uname -m | tr '[A-Z]' '[a-z]' | sed -e "s,['/],,g")
  myuname="$osname $myhostname $osvers $osdesc $machine_uname $os "
fi

which probably is too much work for little gain.

Not sure if "leaking" uname -m output is appropriate, but making
that constant between architectures doesn't feel right either.
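
For reference, the tr/sed pipeline used for machine_uname above only lowercases the string and drops single quotes and slashes. A small sketch on fixed inputs, so it does not depend on the build host; `normalize_machine` is a hypothetical wrapper and the second input is made up.

```sh
#!/bin/sh
# Sketch: the same tr/sed pipeline as in the override, applied to fixed
# strings instead of live "uname -m" output.
normalize_machine() {
    # $1: raw machine string; prints it lowercased with ' and / removed
    printf '%s\n' "$1" | tr '[A-Z]' '[a-z]' | sed -e "s,['/],,g"
}

normalize_machine 'X86_64'
normalize_machine "Power Macintosh/ppc's"
```

The architecture name itself still passes through, so the result stays different per architecture while being stable across hosts of the same architecture.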

This feels like a bug to me too, and should be handled separately.
I'm cloning this and will export TZ=UTC in debian/rules, at least
for now.

#791362#70
Date:
2019-10-23 19:40:00 UTC
From:
To:
Control: found -1 5.30.0-8

The TZ=UTC part was accidentally dropped in the build system debhelper
conversion for 5.30 packaging. This resulted in a reproducibility
regression that Holger pointed out to me on IRC (thanks!).

I'll re-instate TZ=UTC in 5.30.0-9 or so, but clearly the underlying
issue remains.

#791362#77
Date:
2019-10-28 11:28:32 UTC
From:
To:
Hi!

Just noticed this change from the changelog. :) UTC is not really a
proper timezone specification, the format requires an offset, so here
it would be UTC0 (see «man timezone»).

Thanks,
Guillem

#791362#82
Date:
2019-10-29 19:00:15 UTC
From:
To:
Oh! Thanks for the note. This is probably a very common misconception.
I think the reproducible builds docs have advised setting TZ=UTC in
the past, and I see https://reproducible-builds.org/docs/timezones/
mentions it currently.

Also, codesearch.debian.net reports 95 packages matching TZ=UTC
but only two match TZ=UTC[0-9]. Time for a mass bug filing? :)
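
A rough local equivalent of that search, for checking a single file such as a debian/rules. This is only a sketch: the pattern is an illustration and not the query used on codesearch, and `find_bare_utc` is a hypothetical helper.

```sh
#!/bin/sh
# Sketch: list lines that set TZ=UTC without a numeric offset.
# The pattern is illustrative; it does not handle quoted values.
find_bare_utc() {
    # $1: file to scan; prints "lineno:text" for each matching line
    grep -nE 'TZ=UTC([^0-9+-]|$)' "$1"
}

# Example: scan debian/rules if run from an unpacked source package.
if [ -f debian/rules ]; then
    find_bare_utc debian/rules
fi
```

Lines using `TZ=UTC0` or an explicit offset such as `TZ=UTC-24` are not reported.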

#791362#87
Date:
2020-11-14 17:27:26 UTC
From:
To:
Hi Niko,

I'm struggling to see the practical problem with having the timezone
vary LOCALTIME_{MIN,MAX} (other than reproducibility, which AIUI has
already been addressed). I don't agree with the starting point that
an environment variable shouldn't be able to influence the contents
of the binary (this is clearly a very common and necessary pattern).

Could you elaborate on your reasoning for keeping this bug open?

Thanks
Dominic

#791362#94
Date:
2020-11-16 16:08:46 UTC
From:
To:
Control: submitter -1 !
Control: severity -1 minor
Control: tag -1 upstream

I'm not aware of any practical problems here. I suspect nothing
uses $Config{sLOCALTIME_max} et al.

Reproducibility has been addressed in a Debian-specific way.  Ideally,
it would be fixed upstream so that the build result would be reproducible
regardless of the build timezone (which we are currently overriding.)

I think it depends on the environment variable and its main purpose.
Something like BUILD_BZIP2 does and should influence the result, that's
what it's there for. But what's the use for encoding the local timezone
into the binaries? Binaries can be copied between hosts in different time
zones (our buildd results certainly are), users connect to hosts from
different time zones, and even hosts (think laptops) can move between
time zones.

I don't really mind closing this, it's just a minor detail and I obviously
haven't got around to doing anything about it so far. But I do think
the current TZ=UTC solution is more a workaround than a fix.

I'm updating the metadata at least, feel free to close if you're not
convinced :)