#794583 ocaml: Allow setting arbitrary RNG seed in ocamlopt

#794583#5
Date:
2015-08-04 16:41:19 UTC
From:
To:
While working on the “reproducible builds” effort [1], we have noticed
that ocamlopt relies on temporary files whose names are generated
randomly and are part of the output files' symbols.

Therefore, we need a way to make these names deterministic. For instance,
reading an environment variable in the main function of ocamlopt
(driver/optmain.ml) and calling “Random.init 0” if it is set would be
perfect.
Using OCAMLPARAM (driver/compenv.ml) would work as well.
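
A minimal sketch of the environment-variable idea, for illustration only.
The variable name OCAML_REPRODUCIBLE_SEED and both helper functions are
hypothetical and are not part of ocamlopt. Random.init is the standard
library call that seeds the global generator, so this would only help if
the temporary names are actually drawn from that generator.

```ocaml
(* Hypothetical variable name; ocamlopt does not read this today. *)
let seed_var = "OCAML_REPRODUCIBLE_SEED"

(* Parse the seed from an environment lookup function.
   Returns None if the variable is unset or not an integer. *)
let seed_of_env (getenv : string -> string option) : int option =
  match getenv seed_var with
  | None -> None
  | Some s -> int_of_string_opt s

(* Seed the global generator deterministically when a seed is given,
   otherwise fall back to a system-dependent seed.
   Returns true when the deterministic path was taken. *)
let init_rng (getenv : string -> string option) : bool =
  match seed_of_env getenv with
  | Some seed -> Random.init seed; true
  | None -> Random.self_init (); false
```

In a driver this would be called once at startup as
"init_rng Sys.getenv_opt", before any temporary file is created.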

 [1]: https://wiki.debian.org/ReproducibleBuilds

Regards,
Valentin

#794583#10
Date:
2015-08-09 19:46:32 UTC
From:
To:
I looked into this a few months ago but my OCaml is very rusty (it would
require a change in at least 2 places).

My thoughts are that we can -- and should -- achieve the end result by
making the calculation deterministic in all cases, i.e. moving away from
using an RNG altogether for this and basing the filename on its
contents.
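
As a rough sketch of what a content-based name could look like (the
function name and the prefix/suffix parameters are made up for this
example, and this is not how the compiler names its files today):

```ocaml
(* Derive a file name from the file's contents instead of an RNG.
   Digest.string computes an MD5 digest; Digest.to_hex renders it as
   32 hexadecimal characters. *)
let content_based_name ~prefix ~suffix (contents : string) : string =
  prefix ^ Digest.to_hex (Digest.string contents) ^ suffix
```

Identical contents always give the same name. This sketch does not
address whether such names are unique enough, nor whether predictable
names are safe to create in a shared /tmp.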

There may also be a solution where we simply don't include the filename
in the .a or overwrite it with some deterministic one (as described above)
even if a random *temporary* file is used during the build. After all,
they are files under /tmp/ which never exist in the binary package.


Regards,

#794583#15
Date:
2015-08-10 10:17:28 UTC
From:
To:
On 04/08/2015 18:41, Valentin Lorentz wrote:

ocamlc relies on temporary files as well. In general, whatever is done
for ocamlopt should be done for ocamlc as well.

The generation of temporary file names uses its own RNG (cf.
stdlib/filename.ml), that would be the place to change its initialization.
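
To illustrate what changing the initialization of a private generator
could mean, here is a small sketch. It is not the code in
stdlib/filename.ml; make_prng, temp_name and the optional seed are
assumptions made for this example.

```ocaml
(* Build a private generator: fixed seed if one is supplied,
   system-dependent seed otherwise. *)
let make_prng (seed : int option) : Random.State.t =
  match seed with
  | Some s -> Random.State.make [| s |]
  | None -> Random.State.make_self_init ()

(* Produce a name of the form <prefix><6 hex digits><suffix> from the
   private generator, leaving the global Random state untouched. *)
let temp_name (prng : Random.State.t) ~prefix ~suffix : string =
  let rnd = Random.State.bits prng land 0xFFFFFF in
  Printf.sprintf "%s%06x%s" prefix rnd suffix
```

With a fixed seed the sequence of names is the same on every run, which
is exactly the property that raises the /tmp predictability concern.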

I am a bit worried about having predictable names for files that end up
in /tmp, though. For building packages, it should be fine I guess. I have
to think more about it.

Cheers,

#794583#20
Date:
2015-08-10 10:25:53 UTC
From:
To:
On 09/08/2015 21:46, Chris Lamb wrote:

You mean, generate the file using the temporary name then rename it into
something that uses its hash?

After experimenting, it turns out the filename is stored in .o files
(and the final executable), and I couldn't find a way to prevent it
(except compiling with gcc -x c -c - < file.c). Is there a way to
remove/rename the file name that appears in a .o file?

Cheers,

#794583#25
Date:
2015-08-10 14:03:49 UTC
From:
To:
Right, that's the whole problem :)

Well, something like that. We need to be a little clever, otherwise we
introduce one of those tempfile security issues during build. We
probably can't just store an entirely deterministic name (e.g.
"always-the-same.o") as they probably need to be unique.

Well, we can strip it. I posted a diff to dh_ocaml here (as well as some
other details):

http://anonscm.debian.org/cgit/reproducible/notes.git/tree/issues.yml#n506

.. and as it mentions we could even do this in dh_strip. Again, I simply
don't know enough about the OCaml toolchain to know whether it would be
fine to strip these out, particularly with regard to debugging.


Regards,

#794583#30
Date:
2015-08-25 14:00:32 UTC
From:
To:
On 04/08/2015 18:41, Valentin Lorentz wrote:

See #795784, #796336 and #786913.

I don't agree with this conclusion; I'd rather use a way to not record
those random names in the first place, or set them to some sane value
without messing with file names.

GCC also uses temporary files whose names are generated randomly (this
can be seen with the -v option). But it arranges for these random names
to not appear in compiled objects.

For example, with assembly files, the name of the "source" file (which
is then recorded in compiled objects) can be given with a ".file"
directive. gcc adds them to its assembly files. And adding such
directives to assembly files generated by ocamlopt solves #795784 and
#796336.
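
A sketch of what emitting such a directive could look like from OCaml.
The function name is invented for this example and this is not the
compiler's actual emitter; the quoting relies on OCaml's %S format,
which only approximates assembler string syntax.

```ocaml
(* Append a ".file" directive naming the logical source file, so the
   assembler records that name rather than the temporary file's name. *)
let emit_file_directive (buf : Buffer.t) (source_name : string) : unit =
  Buffer.add_string buf (Printf.sprintf "\t.file\t%S\n" source_name)
```

The directive would go at the top of the generated assembly, before any
other output.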

For #786913, temporary files are C files. Ideally, a ".file" counterpart
should exist for C files (I thought of "#line" cpp directives, but they
don't work... maybe we should make them work?). However, I've found a
way to tell gcc to not record the file name, using stdin: gcc -x c -c -o
foo.o - < foo.c. I would very much prefer a ".file"-like directive, though.

I am not thrilled by this proposition. Filename.temp_file is the
equivalent of mkstemp, and mkstemp doesn't have this "feature".


Cheers,

#794583#37
Date:
2015-11-07 14:40:32 UTC
From:
To:
Hi,

FYI:


cheers,
	Holger
---------- Forwarded Message ----------
Subject: Re: OCaml and reproducible builds
Date: Saturday, 7 November 2015
From: Hannes

Holger,

Gabriel wrote a patch
(http://caml.inria.fr/mantis/file_download.php?file_id=1543&type=bug)
which prevents /tmp/ocamlppXXXX from ending up in Location.input_file,
and thus in the resulting binary.

Could you please try this patch on your reproducible infrastructure and
report results here? If it works, the patches should be upstreamed into
OCaml [together with your other reproducible patches].

Thanks (to both Gabriel for his quick hacking, and to Holger for pushing
me to finally start a conversation about reproducible builds),

hannes
-------------------------------------------------------
#794583#42
Date:
2015-11-07 22:09:49 UTC
From:
To:
Hi,

AFAIK, that patch was tested and broke "ocamldoc -pp", which caused
FTBFS of mlpost (at least).

See
http://anonscm.debian.org/cgit/pkg-ocaml-maint/packages/ocaml.git/commit/?id=0f93e9ee91dfa37f2e6209c306fe8f2dbc46e540

Regards,

#794583#47
Date:
2015-12-13 19:05:36 UTC
From:
To:
Hello,

from the OCaml trenches:
- gasche has pulled the OCamldoc patch upstream (see
https://github.com/ocaml/ocaml/pull/321)
- xavier has pushed a patch which emits a .file "" directive at the
beginning of each assembly file
(https://github.com/ocaml/ocaml/commit/eef84c432a4fcecc83f02d81b347cf819c69df9f)
(discussion of why setting Location.input_file to the original file
breaks is in http://caml.inria.fr/mantis/view.php?id=7037#c15059)


It would be great to test xavier's patch in your setup, but it does not
apply to the 4.02.3 release (due to missing files).  Please find attached
a patch which applies cleanly to 4.02.3 (a merge of glondu@ and xavier's
work).


hannes