While working on the “reproducible builds” effort [1], we noticed that ocamlopt relies on temporary files whose names are generated randomly and end up in the output files' symbols. Therefore, we need a way to make these names deterministic. For instance, reading an environment variable in the main function of ocamlopt (driver/optmain.ml) and calling “Random.init 0” if it is set would be perfect. Using OCAMLPARAM (driver/compenv.ml) would work as well.

[1]: https://wiki.debian.org/ReproducibleBuilds

Regards,
Valentin
I looked into this a few months ago but my OCaml is very rusty (it would require a change in at least 2 places). My thoughts are that we can -- and should -- achieve the end result by making the calculation deterministic in all cases, i.e. moving away from using an RNG altogether for this and basing the filename on its contents.

There may also be a solution where we simply don't include the filename in the .a, or overwrite it with some deterministic one (as described above), even if a random *temporary* file is used during the build. After all, they are files under /tmp/ which never exist in the binary package.

Regards,
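To make the "name based on contents" idea concrete, here is a minimal sketch in OCaml. The function name content_based_name and its labels are made up for illustration and are not part of the compiler; the only library calls are Digest.string and Digest.to_hex from the standard library (MD5). A name built this way is predictable, so the /tmp security concern raised elsewhere in this thread still applies.

```ocaml
(* Hypothetical helper, for illustration only: derive a file name from
   the file's contents instead of from a random number generator. *)
let content_based_name ~prefix ~suffix contents =
  (* Digest.string computes an MD5 digest; to_hex gives 32 hex characters. *)
  prefix ^ Digest.to_hex (Digest.string contents) ^ suffix
```

Two builds that generate the same intermediate file would then get the same name, whatever the state of any RNG.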
On 04/08/2015 18:41, Valentin Lorentz wrote:

ocamlc relies on temporary files as well. In general, whatever is done for ocamlopt should be done for ocamlc as well.

The generation of temporary file names uses its own RNG (cf. stdlib/filename.ml); that would be the place to change its initialization.

I am a bit worried about having predictable names for files that end up in /tmp, though. For building packages, it should be fine, I guess. I have to think more about it.

Cheers,
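As a rough sketch of what changing that initialization could look like: the code below is not the actual content of stdlib/filename.ml, only an approximation of its approach (a private PRNG state and a few random bits formatted in hexadecimal). The environment variable name OCAML_DETERMINISTIC_TMPNAMES is hypothetical and does not exist in OCaml.

```ocaml
(* Hypothetical variable name; not an existing OCaml setting. *)
let env_var = "OCAML_DETERMINISTIC_TMPNAMES"

(* Private PRNG state: fixed seed when the variable is set,
   self-seeded (unpredictable) otherwise. *)
let make_prng () =
  match Sys.getenv_opt env_var with
  | Some _ -> Random.State.make [| 0 |]
  | None -> Random.State.make_self_init ()

(* Build a candidate name from 24 random bits, printed as 6 hex digits. *)
let temp_name prng prefix suffix =
  let rnd = Random.State.bits prng land 0xFFFFFF in
  Printf.sprintf "%s%06x%s" prefix rnd suffix
```

With a fixed seed, the sequence of generated names is the same on every run, which is exactly the predictability that raises the /tmp concern above.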
On 09/08/2015 21:46, Chris Lamb wrote:

You mean, generate the file using the temporary name, then rename it to something that uses its hash?

After experimenting, it turns out the filename is stored in .o files (and the final executable), and I couldn't find a way to prevent it (except compiling with gcc -x c -c - < file.c).

Is there a way to remove/rename the file name that appears in a .o file?

Cheers,
Right, that's the whole problem :)

Well, something like that. We need to be a little clever, otherwise we introduce one of those tempfile security issues during the build. We probably can't just store an entirely deterministic name (e.g. "always-the-same.o") as they probably need to be unique.

Well, we can strip it. I posted a diff to dh_ocaml here (as well as some other details):

http://anonscm.debian.org/cgit/reproducible/notes.git/tree/issues.yml#n506

and, as it mentions, we could even do this in dh_strip. Again, I simply don't know enough about the OCaml toolchain to know whether it would be fine to strip these out, particularly with regard to debugging.

Regards,
On 04/08/2015 18:41, Valentin Lorentz wrote:

See #795784, #796336 and #786913.

I don't agree with this conclusion; I'd rather find a way to not record those random names in the first place, or set them to some sane value without messing with file names.

GCC also uses temporary files whose names are generated randomly (this can be seen with the -v option). But it arranges for these random names to not appear in compiled objects. For example, with assembly files, the name of the "source" file (which is then recorded in compiled objects) can be given with a ".file" directive. gcc adds them to its assembly files. And adding such directives to assembly files generated by ocamlopt solves #795784 and #796336.

For #786913, temporary files are C files. Ideally, a ".file" counterpart should exist for C files (I thought of "#line" cpp directives, but they don't work... maybe we should make them work?). However, I've found a way to tell gcc to not record the file name, using stdin: gcc -x c -c -o foo.o - < foo.c. I would very much prefer a ".file"-like directive, though.

I am not thrilled by this proposal. Filename.temp_file is the equivalent of mkstemp, and mkstemp doesn't have this "feature".

Cheers,
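For illustration, a minimal sketch of the ".file" approach, written as a standalone OCaml function rather than as an actual change to ocamlopt's emitter. The name with_file_directive is hypothetical, and the quoting relies on the assumption that OCaml's %S escaping is acceptable to the assembler for ordinary file names.

```ocaml
(* Hypothetical helper, for illustration only: prepend a .file directive
   so that the assembler records [source_name] rather than the random
   name of the temporary assembly file. *)
let with_file_directive ~source_name asm =
  (* %S prints the string between double quotes, with OCaml escaping. *)
  Printf.sprintf "\t.file\t%S\n%s" source_name asm
```

For example, with_file_directive ~source_name:"foo.ml" applied to the generated assembly text puts a line of the form .file "foo.ml" at the very top of the output.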
Hi,

FYI:

cheers,
Holger

---------- Forwarded Message ----------
Subject: Re: OCaml and reproducible builds
Date: Saturday, 7 November 2015
From: Hannes

Holger,

Gabriel wrote a patch (http://caml.inria.fr/mantis/file_download.php?file_id=1543&type=bug) which prevents /tmp/ocamlppXXXX from ending up in Location.input_file, and thus in the resulting binary.

Could you please try this patch on your reproducible infrastructure and report results here? If it works, the patch should be upstreamed into OCaml [together with your other reproducible patches].

Thanks (to both Gabriel for his quick hacking, and to Holger for pushing me to finally start a conversation about reproducible builds),

hannes
-------------------------------------------------------
Hi, AFAIK, that patch was tested and broke "ocamldoc -pp", which caused FTBFS of mlpost (at least). See http://anonscm.debian.org/cgit/pkg-ocaml-maint/packages/ocaml.git/commit/?id=0f93e9ee91dfa37f2e6209c306fe8f2dbc46e540 Regards,
Hello,

from the OCaml trenches:

- gasche has pulled the OCamldoc patch upstream (see https://github.com/ocaml/ocaml/pull/321)
- xavier has pushed a patch which emits a .file "" directive at the beginning of each assembly file (https://github.com/ocaml/ocaml/commit/eef84c432a4fcecc83f02d81b347cf819c69df9f)

(A discussion of why setting Location.input_file to the original file breaks is in http://caml.inria.fr/mantis/view.php?id=7037#c15059)

It would be great to test xavier's patch in your setup, but it does not apply to the 4.02.3 release (due to missing files). Please find attached a patch which applies cleanly to 4.02.3 (a merge of glondu@ and xavier's work).

hannes