Bug 521515: Do not rely on commit date for reproducible builds

As mentioned in
https://devblogs.microsoft.com/oldnewthing/20180103-00/?p=97705,
Microsoft has stopped using _IMAGE_FILE_HEADER.TimeDateStamp as a
time stamp and instead fills it with a hash of the source files to
make the build result predictable.

Change-Id: I4f4a7b9557330e4c478ef7fb25653144c5b2d4ad
Signed-off-by: Torbjörn Svensson <azoff@svenskalinuxforeningen.se>
Torbjörn Svensson 2020-08-21 15:33:46 +02:00 committed by Jonah Graham
parent 2f05a6348e
commit 07b50ba2a2
8 changed files with 99 additions and 49 deletions

@ -122,27 +122,16 @@ The `native` property can be one of the following:
- `linux.x86_64` - uses local tools and builds only linux.x86_64 libraries
- `linux.ppc64le` - uses local tools and builds only linux.ppc64le libraries
- `docker` - uses CDT's docker releng images to do the native builds for all platforms
- `all` - uses local tools to do the native builds for all platforms
- `all` - uses local tools to do the native builds for all platforms
Therefore to build all the natives using docker do `mvn process-resources -Dnative=docker`.
Therefore to build all the natives using docker add `-Dnative=docker` to your maven command line (e.g. `mvn verify -Dnative=docker`).
However, the challenge is that dll files on Windows have a timestamp in them. To have reproducible builds, we need to have a reproducible timestamp. Therefore we use the commit time of the commit to derive a timestamp (we use the `SOURCE_DATE_EPOCH` environment variable to achieve this, see the [Makefile](native/org.eclipse.cdt.native.serial/native_src/Makefile) for more info). Because we want to keep the DLL checked in so that contributors don't need to rebuild it all the time, we need a way to check in the dll with the same commit time. To do this we use GIT_COMMITTER_DATE. So, after editing and committing your change, you need to rebuild one last time with the commit date and then commit it without changing the commit date again using:
To build only the native libraries `mvn process-resources` can be used on the individual bundles with the simrel target platform, e.g.:
1. Edit and commit change
2. Set DIR to the name of the directory you are working on, e.g. `DIR=native/org.eclipse.cdt.native.serial`
3. `mvn process-resources -DuseSimrelRepo -Dnative=docker -f $DIR`
4. `git add -- $DIR`
5. `GIT_COMMITTER_DATE=$(git log -1 --pretty=format:%cI -- $DIR) git commit --amend --reuse-message=HEAD`
- Serial library: `mvn process-resources -Dnative=docker -DuseSimrelRepo -f native/org.eclipse.cdt.native.serial`
- Core library: `mvn process-resources -Dnative=docker -DuseSimrelRepo -f core/org.eclipse.cdt.core.native`
The example for the core native bundle is:
1. `DIR=core/org.eclipse.cdt.core.native`
2. `mvn process-resources -DuseSimrelRepo -Dnative=docker -f $DIR`
3. `git add -- core/org.eclipse.cdt.core.win32.x86_64/os/win32/x86_64`
4. `GIT_COMMITTER_DATE=$(git log -1 --pretty=format:%cI -- $DIR) git commit --amend --reuse-message=HEAD`
As a CDT contributor if you are having an issue recreating the above flow, please reach out on cdt-dev mailing list or in the bug/gerrit you submit. A CDT committer can help ensure the native libraries are correctly rebuilt.
However, the challenge is that dll files on Windows have a timestamp in them. To have reproducible builds, we need to have a reproducible timestamp. As [Microsoft](https://devblogs.microsoft.com/oldnewthing/20180103-00/?p=97705) has moved away from using a timestamp to rather use a hash of the source files as the value, we therefore hash the source files used by the library and the header files for the Java API and use that as the value.
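To check that two builds of a DLL really carry the same value, the field can be read straight from the file. The following is a minimal sketch, not part of this change: the helper names `read_pe_timestamp` and `same_timestamp` are hypothetical, and the offsets assume the standard PE layout (the PE signature offset stored at 0x3C, followed by the COFF file header).

```python
import struct


def read_pe_timestamp(image: bytes) -> int:
    """Return IMAGE_FILE_HEADER.TimeDateStamp from the bytes of a PE file."""
    if image[:2] != b"MZ":
        raise ValueError("not an MZ/PE image")
    # The 4-byte little-endian value at 0x3C is the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # After the 4-byte signature come Machine (2 bytes) and
    # NumberOfSections (2 bytes), then the 4-byte TimeDateStamp.
    (timestamp,) = struct.unpack_from("<I", image, pe_offset + 8)
    return timestamp


def same_timestamp(path_a: str, path_b: str) -> bool:
    """Compare the TimeDateStamp fields of two PE files on disk."""
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        return read_pe_timestamp(a.read()) == read_pe_timestamp(b.read())
```

Calling `same_timestamp` on the checked-in DLL and a freshly rebuilt one would be one way to confirm that the header value did not change.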
An additional tip is to set the following in `.gitconfig` to allow you to diff `.dll` files. This will show the timestamp of the DLL in the diff as part of the DLL headers.

@ -17,6 +17,8 @@ ifeq ($(JAVA_HOME),)
$(error JAVA_HOME not set in environment)
endif
REPRODUCIBLE_BUILD_WRAPPER := $(shell git rev-parse --show-toplevel)/releng/scripts/reproducible_build_wrapper.py
OS_DIR_WIN32_X86_64 := ../../org.eclipse.cdt.core.win32.x86_64/os/win32/x86_64
OS_DIR_LINUX_X86_64 := ../../org.eclipse.cdt.core.linux.x86_64/os/linux/x86_64
OS_DIR_LINUX_AARCH64 := ../../org.eclipse.cdt.core.linux.aarch64/os/linux/aarch64
@ -65,31 +67,30 @@ rebuild: clean all
# Windows x86_64
# Windows DLLs have a build timestamp in them. This makes it impossible to have reproducible builds.
# However, x86_64-w64-mingw32-ld on Debian/Ubuntu has a patch that overrides the current date
# using the SOURCE_DATE_EPOCH environment variable. Therefore we set SOURCE_DATE_EPOCH to a
# consistent timestamp that can be reproduced. We base it off of the commit timestamp of the
# most recent git commit in this directory.
# using the SOURCE_DATE_EPOCH environment variable. Call REPRODUCIBLE_BUILD_WRAPPER to make sure the
# same binary is produced for the same source each time.
$(OS_DIR_WIN32_X86_64)/starter.exe: win/starter.c
mkdir -p $(dir $@) && \
SOURCE_DATE_EPOCH=$(shell git log -1 --pretty=format:%ct -- .) \
$(REPRODUCIBLE_BUILD_WRAPPER) \
x86_64-w64-mingw32-gcc -o $@ -Iinclude -I"$(JAVA_HOME)/include" -I"$(JAVA_HOME)/include/win32" \
-DUNICODE \
win/starter.c \
$^ \
-lpsapi
$(OS_DIR_WIN32_X86_64)/spawner.dll: win/iostream.c win/raise.c win/spawner.c win/Win32ProcessEx.c
mkdir -p $(dir $@) && \
SOURCE_DATE_EPOCH=$(shell git log -1 --pretty=format:%ct -- .) \
$(REPRODUCIBLE_BUILD_WRAPPER) \
x86_64-w64-mingw32-gcc -o $@ -Iinclude -I"$(JAVA_HOME)/include" -I"$(JAVA_HOME)/include/win32" \
-DUNICODE \
win/iostream.c win/raise.c win/spawner.c win/Win32ProcessEx.c \
$^ \
-Wl,--kill-at --shared
$(OS_DIR_WIN32_X86_64)/pty.dll: win/pty.cpp win/pty_dllmain.cpp
mkdir -p $(dir $@) && \
SOURCE_DATE_EPOCH=$(shell git log -1 --pretty=format:%ct -- .) \
$(REPRODUCIBLE_BUILD_WRAPPER) \
x86_64-w64-mingw32-g++ -o $@ -Iinclude -Iwin/include -I"$(JAVA_HOME)/include" -I"$(JAVA_HOME)/include/win32" \
-DUNICODE \
win/pty.cpp win/pty_dllmain.cpp \
$^ \
-Wl,--kill-at --shared -L$(OS_DIR_WIN32_X86_64) -lwinpty -static-libstdc++ -static-libgcc
# Linux x86_64
@ -97,14 +98,14 @@ $(OS_DIR_LINUX_X86_64)/libspawner.so: unix/spawner.c unix/io.c unix/exec_unix.c
mkdir -p $(dir $@) && \
gcc -m64 -o $@ -Wl,-soname,$(notdir $@) -Iinclude -I$(JAVA_HOME)/include -I$(JAVA_HOME)/include/linux -fpic \
-D_REENTRANT -D_GNU_SOURCE \
unix/spawner.c unix/io.c unix/exec_unix.c unix/exec_pty.c unix/openpty.c unix/pfind.c \
$^ \
-shared -lc
$(OS_DIR_LINUX_X86_64)/libpty.so: unix/openpty.c unix/pty.c unix/ptyio.c
mkdir -p $(dir $@) && \
gcc -m64 -o $@ -Wl,-soname,$(notdir $@) -Iinclude -I$(JAVA_HOME)/include -I$(JAVA_HOME)/include/linux -fpic \
-D_REENTRANT -D_GNU_SOURCE \
unix/openpty.c unix/pty.c unix/ptyio.c \
$^ \
-shared -lc
# Linux aarch64
@ -112,14 +113,14 @@ $(OS_DIR_LINUX_AARCH64)/libspawner.so: unix/spawner.c unix/io.c unix/exec_unix.c
mkdir -p $(dir $@) && \
aarch64-linux-gnu-gcc -o $@ -Wl,-soname,$(notdir $@) -Iinclude -I$(JAVA_HOME)/include -I$(JAVA_HOME)/include/linux -fpic \
-D_REENTRANT -D_GNU_SOURCE \
unix/spawner.c unix/io.c unix/exec_unix.c unix/exec_pty.c unix/openpty.c unix/pfind.c \
$^ \
-shared -lc
$(OS_DIR_LINUX_AARCH64)/libpty.so: unix/openpty.c unix/pty.c unix/ptyio.c
mkdir -p $(dir $@) && \
aarch64-linux-gnu-gcc -o $@ -Wl,-soname,$(notdir $@) -Iinclude -I$(JAVA_HOME)/include -I$(JAVA_HOME)/include/linux -fpic \
-D_REENTRANT -D_GNU_SOURCE \
unix/openpty.c unix/pty.c unix/ptyio.c \
$^ \
-shared -lc
# Linux ppc64le
@ -127,14 +128,14 @@ $(OS_DIR_LINUX_PPC64LE)/libspawner.so: unix/spawner.c unix/io.c unix/exec_unix.c
mkdir -p $(dir $@) && \
gcc -m64 -o $@ -Wl,-soname,$(notdir $@) -Iinclude -I$(JAVA_HOME)/include -I$(JAVA_HOME)/include/linux -fpic \
-D_REENTRANT -D_GNU_SOURCE \
unix/spawner.c unix/io.c unix/exec_unix.c unix/exec_pty.c unix/openpty.c unix/pfind.c \
$^ \
-shared -lc
$(OS_DIR_LINUX_PPC64LE)/libpty.so: unix/openpty.c unix/pty.c unix/ptyio.c
mkdir -p $(dir $@) && \
gcc -m64 -o $@ -Wl,-soname,$(notdir $@) -Iinclude -I$(JAVA_HOME)/include -I$(JAVA_HOME)/include/linux -fpic \
-D_REENTRANT -D_GNU_SOURCE \
unix/openpty.c unix/pty.c unix/ptyio.c \
$^ \
-shared -lc
# macos x86_64
@ -142,14 +143,14 @@ $(OS_DIR_MACOS_X86_64)/libspawner.jnilib: unix/spawner.c unix/io.c unix/exec_uni
mkdir -p $(dir $@) && \
x86_64-apple-darwin17-clang -o $@ -arch x86_64 -Iinclude -I$(JAVA_HOME)/include -I$(JAVA_HOME)/include/darwin -fPIC \
-D_REENTRANT \
unix/spawner.c unix/io.c unix/exec_unix.c unix/exec_pty.c unix/openpty.c unix/pfind.c \
$^ \
-dynamiclib -lc -framework JavaVM
$(OS_DIR_MACOS_X86_64)/libpty.jnilib: unix/openpty.c unix/pty.c unix/ptyio.c
mkdir -p $(dir $@) && \
x86_64-apple-darwin17-clang -o $@ -arch x86_64 -Iinclude -I$(JAVA_HOME)/include -I$(JAVA_HOME)/include/darwin -fPIC \
-D_REENTRANT \
unix/openpty.c unix/pty.c unix/ptyio.c \
$^ \
-dynamiclib -lc -framework JavaVM
# macos x86
@ -157,12 +158,12 @@ $(OS_DIR_MACOS_X86)/libspawner.jnilib: unix/spawner.c unix/io.c unix/exec_unix.c
mkdir -p $(dir $@) && \
x86_64-apple-darwin17-clang -o $@ -arch i386 -Iinclude -I$(JAVA_HOME)/include -I$(JAVA_HOME)/include/darwin -fPIC \
-D_REENTRANT \
unix/spawner.c unix/io.c unix/exec_unix.c unix/exec_pty.c unix/openpty.c unix/pfind.c \
$^ \
-dynamiclib -lc -framework JavaVM
$(OS_DIR_MACOS_X86)/libpty.jnilib: unix/openpty.c unix/pty.c unix/ptyio.c
mkdir -p $(dir $@) && \
x86_64-apple-darwin17-clang -o $@ -arch i386 -Iinclude -I$(JAVA_HOME)/include -I$(JAVA_HOME)/include/darwin -fPIC \
-D_REENTRANT \
unix/openpty.c unix/pty.c unix/ptyio.c \
$^ \
-dynamiclib -lc -framework JavaVM

@ -17,6 +17,8 @@ ifeq ($(JAVA_HOME),)
$(error Please define JAVA_HOME)
endif
REPRODUCIBLE_BUILD_WRAPPER := $(shell git rev-parse --show-toplevel)/releng/scripts/reproducible_build_wrapper.py
OS_DIR = ../os
CFLAGS += -fPIC -D_REENTRANT
@ -47,25 +49,25 @@ rebuild: clean all
# Windows DLLs have a build timestamp in them. This makes it impossible to have reproducible builds.
# However, x86_64-w64-mingw32-ld on Debian/Ubuntu has a patch that overrides the current date
# using the SOURCE_DATE_EPOCH environment variable. Therefore we set SOURCE_DATE_EPOCH to a
# consistent timestamp that can be reproduced. We base it off of the commit timestamp of the
# most recent git commit in this directory.
# using the SOURCE_DATE_EPOCH environment variable. Call REPRODUCIBLE_BUILD_WRAPPER to make sure the
# same binary is produced for the same source each time.
$(OS_DIR)/win32/x86_64/serial.dll: serial.c
mkdir -p $(dir $@)
SOURCE_DATE_EPOCH=$(shell git log -1 --pretty=format:%ct -- .) x86_64-w64-mingw32-gcc -Iinclude -I"$(JAVA_HOME)/include" -I"$(JAVA_HOME)/include/win32" -shared -o $@ serial.c
mkdir -p $(dir $@) && \
$(REPRODUCIBLE_BUILD_WRAPPER) \
x86_64-w64-mingw32-gcc -Iinclude -I"$(JAVA_HOME)/include" -I"$(JAVA_HOME)/include/win32" -shared -o $@ $^
$(OS_DIR)/linux/x86_64/libserial.so: serial.c
mkdir -p $(dir $@)
gcc -m64 $(CFLAGS) -Iinclude -I$(JAVA_HOME)/include -I$(JAVA_HOME)/include/linux $(LDFLAGS) -shared -o $@ serial.c
mkdir -p $(dir $@) && \
gcc -m64 $(CFLAGS) -Iinclude -I$(JAVA_HOME)/include -I$(JAVA_HOME)/include/linux $(LDFLAGS) -shared -o $@ $^
$(OS_DIR)/linux/aarch64/libserial.so: serial.c
mkdir -p $(dir $@)
aarch64-linux-gnu-gcc $(CFLAGS) -Iinclude -I$(JAVA_HOME)/include -I$(JAVA_HOME)/include/linux $(LDFLAGS) -shared -o $@ serial.c
mkdir -p $(dir $@) && \
aarch64-linux-gnu-gcc $(CFLAGS) -Iinclude -I$(JAVA_HOME)/include -I$(JAVA_HOME)/include/linux $(LDFLAGS) -shared -o $@ $^
$(OS_DIR)/linux/ppc64le/libserial.so: serial.c
mkdir -p $(dir $@)
gcc -m64 -mcpu=power8 $(CFLAGS) -Iinclude -I$(JAVA_HOME)/include -I$(JAVA_HOME)/include/linux $(LDFLAGS) -shared -o $@ serial.c
mkdir -p $(dir $@) && \
gcc -m64 -mcpu=power8 $(CFLAGS) -Iinclude -I$(JAVA_HOME)/include -I$(JAVA_HOME)/include/linux $(LDFLAGS) -shared -o $@ $^
$(OS_DIR)/macosx/x86_64/libserial.jnilib: serial.c
mkdir -p $(dir $@)
x86_64-apple-darwin17-clang $(CFLAGS) -Iinclude -I$(JAVA_HOME)/include -I$(JAVA_HOME)/include/darwin $(LDFLAGS) -dynamiclib -o $@ serial.c
mkdir -p $(dir $@) && \
x86_64-apple-darwin17-clang $(CFLAGS) -Iinclude -I$(JAVA_HOME)/include -I$(JAVA_HOME)/include/darwin $(LDFLAGS) -dynamiclib -o $@ $^

@ -0,0 +1,58 @@
#!/usr/bin/env python3
import sys
import os
import hashlib
import subprocess
LONG_MAX = (1 << 64) - 1
DEBUG = True
def usage(msg=None):
    if msg:
        print(msg)
    print("Usage: {0} <gcc/g++ command>".format(sys.argv[0]))
    sys.exit(1)

def debug(s):
    if DEBUG:
        print("{} {}".format(sys.argv[0], s))

compiler_command = sys.argv[1:]
if len(compiler_command) == 0:
    usage()
# Hash all the source files and traverse any directories recursively
sha1 = hashlib.sha1()
# Hash the build command too
sha1.update(" ".join(compiler_command).encode())
debug("Compiler command hashed: {}".format(sha1.hexdigest()))
preprocess_command = [*compiler_command, "-E"]
# Remove any output file (needs to write to stdout)
try:
    index = compiler_command.index("-o")
    del preprocess_command[index:index+2]
except ValueError:
    usage("Missing output compiler flag")
# Preprocess the source file(s)
debug("Preprocess cmd: {}".format(preprocess_command))
data = subprocess.check_output(preprocess_command)
# Hash the content
sha1.update(data)
debug("Content hashed: {}".format(sha1.hexdigest()))
# Set the SOURCE_DATE_EPOCH environment variable to the hash value
os.environ["SOURCE_DATE_EPOCH"] = str(int(sha1.hexdigest(), base=16) % LONG_MAX)
debug("SOURCE_DATE_EPOCH: {}".format(os.environ["SOURCE_DATE_EPOCH"]))
# Run the compiler with the environment variable set
subprocess.run(compiler_command)
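
The script above calls `subprocess.run` without checking the result, so as written it exits with status 0 even when the compiler fails. A possible variant that forwards the compiler's exit status to `make` is sketched below. It is only an illustration, not part of this commit: the function names `derive_epoch` and `run_with_epoch` are hypothetical, and the hashing simply mirrors the steps of the script (build command first, then the preprocessed output).

```python
import hashlib
import os
import subprocess
import sys


def derive_epoch(compiler_command, preprocessed, modulus=(1 << 64) - 1):
    """Derive a deterministic SOURCE_DATE_EPOCH value from command and sources."""
    sha1 = hashlib.sha1()
    # Hash the build command, then the preprocessed source, as the script does.
    sha1.update(" ".join(compiler_command).encode())
    sha1.update(preprocessed)
    return int(sha1.hexdigest(), base=16) % modulus


def run_with_epoch(compiler_command, epoch):
    """Run the command with SOURCE_DATE_EPOCH set and return its exit status."""
    # Pass a copy of the environment instead of mutating os.environ.
    env = dict(os.environ, SOURCE_DATE_EPOCH=str(epoch))
    return subprocess.run(compiler_command, env=env).returncode
```

A wrapper built on these helpers could end with `sys.exit(run_with_epoch(cmd, derive_epoch(cmd, data)))`, so that a failing compile also fails the Makefile recipe.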