Add reproducible builds support in OHCL-Linux-Kernel#115
namancse wants to merge 4 commits into product/hcl-main/6.12
saurabh-sengar
left a comment
Can we upstream the Linux kernel changes in this PR?
Are reproducible builds not already supported by the Linux kernel today?
Ref: https://docs.kernel.org/kbuild/reproducible-builds.html
2ac6268 to
6d4613e
6d4613e to
aaa90d1
Pull request overview
Adds Nix-based tooling and pipeline script refactors to support reproducible OHCL-Linux-Kernel builds, aligning local and CI build paths/metadata normalization.
Changes:
- Introduces a Nix flake (flake.nix/flake.lock) to provide a pinned, reproducible kernel build environment.
- Adds local helper scripts (Microsoft/nix-setup.sh, Microsoft/nix-build.sh, Microsoft/nix-clean.sh) to run builds in a fixed-path sandbox with reproducibility env vars.
- Updates Microsoft/build-hcl-kernel.sh and adds Microsoft/build-hcl-kernel-pipeline.sh to normalize debug paths/versioning and reduce non-determinism (e.g., debuglink CRC).
Reviewed changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 14 comments.
| File | Description |
|---|---|
| flake.nix | Defines Nix devShell/package/app for reproducible kernel build environment |
| flake.lock | Pins Nix inputs for reproducibility |
| Microsoft/nix-setup.sh | Bootstraps Nix + enables flakes for local reproducible workflow |
| Microsoft/nix-clean.sh | Cleans build artifacts for the Nix-based workflow |
| Microsoft/nix-build.sh | Runs local reproducible builds in a fixed build path and syncs outputs back |
| Microsoft/build-hcl-kernel.sh | Adds reproducibility flags (debug prefix map, LOCALVERSION, debuglink behavior) and cross-arch handling tweaks |
| Microsoft/build-hcl-kernel-pipeline.sh | New pipeline build script with optional Nix-based reproducible mode and artifact packaging |
flake.nix
Outdated
    echo ""

    # Export reproducible environment variables
    # SOURCE_DATE_EPOCH is set by nix-build.sh before entering this shell
This comment says SOURCE_DATE_EPOCH is set by nix-build.sh before entering this shell, but nix-build.sh re-execs into nix develop and only sets SOURCE_DATE_EPOCH afterwards. Update the wording (or set SOURCE_DATE_EPOCH prior to nix develop) to avoid misleading users.
Suggested change:
    - # SOURCE_DATE_EPOCH is set by nix-build.sh before entering this shell
    + # SOURCE_DATE_EPOCH is typically set by nix-build.sh (based on the git commit)
    # Clean and create fixed build path
    rm -rf "${FIXED_BUILD_PATH}"
    mkdir -p "${FIXED_BUILD_PATH}"
setup_fixed_build_path deletes ${FIXED_BUILD_PATH} via rm -rf without validating the path. Add a guard to ensure it’s a safe, expected temp directory (and refuse to remove /, empty, or unexpectedly short paths) to avoid accidental data loss when FIXED_BUILD_PATH is overridden.
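One possible shape for the guard requested here, as a minimal sketch rather than the PR's actual fix. The names `ALLOWED_BUILD_PREFIX`, `is_safe_build_path`, and `safe_remove_build_path` are hypothetical, and the `/tmp/ohcl-kernel-build` prefix is taken from the paths mentioned elsewhere in this review. The sketch does not resolve symlinks, so a stricter version might canonicalize the path first.

```shell
#!/usr/bin/env bash

# Hypothetical allowlist prefix; the real value depends on the build scripts.
ALLOWED_BUILD_PREFIX="${ALLOWED_BUILD_PREFIX:-/tmp/ohcl-kernel-build}"

# Return 0 only when the path looks safe to delete recursively.
is_safe_build_path() {
    local path="${1:-}"
    # Refuse empty paths and the filesystem root.
    [[ -n "$path" && "$path" != "/" ]] || return 1
    # Refuse relative paths.
    [[ "$path" == /* ]] || return 1
    # Refuse anything containing a ".." component.
    case "/$path/" in */../*) return 1 ;; esac
    # Accept only the allowlisted directory itself or something below it.
    [[ "$path" == "$ALLOWED_BUILD_PREFIX" || "$path" == "$ALLOWED_BUILD_PREFIX"/* ]]
}

# Remove the directory only after the guard passes.
safe_remove_build_path() {
    local path="${1:-}"
    if ! is_safe_build_path "$path"; then
        echo "Refusing to remove unsafe path: '${path}'" >&2
        return 1
    fi
    rm -rf -- "$path"
}
```

With a helper like this, both the setup step and the EXIT trap could call `safe_remove_build_path "${FIXED_BUILD_PATH:-}"` instead of invoking `rm -rf` directly, so an overridden `FIXED_BUILD_PATH` outside the allowlist is rejected in one place.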
    cleanup_fixed_build_path() {
        if [[ -n "$REPRODUCIBLE_BUILD" ]] && [[ -n "${FIXED_BUILD_PATH:-}" ]] && [[ -d "${FIXED_BUILD_PATH:-}" ]]; then
            echo ">>> Cleaning up temporary build directory: ${FIXED_BUILD_PATH}"
            rm -rf "${FIXED_BUILD_PATH}"
        fi
The EXIT trap deletes ${FIXED_BUILD_PATH} via rm -rf without validating the path. Add a safety check (expected prefix/allowlist) so overriding FIXED_BUILD_PATH can’t accidentally remove an unrelated directory.
flake.nix
Outdated
    SOURCE_DATE_EPOCH
    LANG
    LC_ALL
    KBUILD_BUILD_TIMESTAMP
    KBUILD_BUILD_USER
    KBUILD_BUILD_HOST
    KBUILD_BUILD_VERSION
    TZ;
reproducibleEnv does not define SOURCE_DATE_EPOCH or KBUILD_BUILD_TIMESTAMP, but the derivation tries to inherit them. This will fail flake evaluation/build. Define these attributes (or remove them from the inherit list and compute SOURCE_DATE_EPOCH in Nix, e.g., from self.lastModified/src.lastModified).
Suggested change:
    - SOURCE_DATE_EPOCH
    - LANG
    - LC_ALL
    - KBUILD_BUILD_TIMESTAMP
    - KBUILD_BUILD_USER
    - KBUILD_BUILD_HOST
    - KBUILD_BUILD_VERSION
    - TZ;
    + LANG
    + LC_ALL
    + KBUILD_BUILD_USER
    + KBUILD_BUILD_HOST
    + KBUILD_BUILD_VERSION
    + TZ;
    + # Compute reproducible timestamps within the derivation
    + SOURCE_DATE_EPOCH = builtins.toString self.lastModified;
    + KBUILD_BUILD_TIMESTAMP = SOURCE_DATE_EPOCH;
    esac

    # Build configuration
    BUILD_OUTPUT="${BUILD_OUTPUT:-${KERNEL_ROOT}/build}"
BUILD_OUTPUT defaults to ${KERNEL_ROOT}/build (i.e., /tmp/ohcl-kernel-build/src/build), but build-hcl-kernel.sh writes to a sibling build dir ($LINUX_SRC/../build → /tmp/ohcl-kernel-build/build). As a result, the clean command won’t remove the actual build artifacts. Align BUILD_OUTPUT with the build script’s output location (e.g., ${FIXED_BUILD_PATH}/build) or plumb the path into build-hcl-kernel.sh.
Suggested change:
    - BUILD_OUTPUT="${BUILD_OUTPUT:-${KERNEL_ROOT}/build}"
    + BUILD_OUTPUT="${BUILD_OUTPUT:-${FIXED_BUILD_PATH}/build}"
    make_args+=("ARCH=$KERNEL_ARCH")
    make_args+=("LOCALVERSION=")
    make_args+=("CC=$COMPILER")
LOCALVERSION= is forced for all builds, while build-hcl-kernel.sh only sets it in reproducible mode. If non-reproducible pipeline builds should preserve the default + suffix/version string behavior, make this conditional on --reproducible to avoid changing kernel release naming.
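A minimal sketch of making the flag conditional, assuming the pipeline script tracks reproducible mode in a variable that is non-empty when `--reproducible` was passed. The function name `build_make_args` and its parameters are illustrative and do not appear in the PR.

```shell
#!/usr/bin/env bash

# Build the make argument list into the global array make_args.
# Usage: build_make_args <arch> <compiler> <reproducible: non-empty or empty>
# (function name and parameter order are hypothetical)
build_make_args() {
    local arch="$1" compiler="$2" reproducible="${3:-}"
    make_args=("ARCH=${arch}" "CC=${compiler}")
    if [[ -n "$reproducible" ]]; then
        # Per the commit message, an empty LOCALVERSION prevents the '+'
        # suffix in the version string; apply it only in reproducible mode.
        make_args+=("LOCALVERSION=")
    fi
}
```

Called as `build_make_args "$KERNEL_ARCH" "$COMPILER" "$REPRODUCIBLE_BUILD"`, this leaves the kernel release naming of non-reproducible pipeline builds unchanged.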
    # Reproducible build environment
    # Always use timestamp of the top commit for SOURCE_DATE_EPOCH
    # Override any value set by Nix itself
    export SOURCE_DATE_EPOCH="$(git log -1 --pretty=%ct)"
SOURCE_DATE_EPOCH is computed via git log before cd-ing into the kernel repo. If the script is invoked from outside a git worktree, this will fail (and with set -e will abort). Use git -C "$KERNEL_ROOT_ORIGINAL" log -1 --pretty=%ct (or cd into KERNEL_ROOT_ORIGINAL first) to make it robust.
Suggested change:
    - export SOURCE_DATE_EPOCH="$(git log -1 --pretty=%ct)"
    + export SOURCE_DATE_EPOCH="$(git -C "${KERNEL_ROOT_ORIGINAL}" log -1 --pretty=%ct)"
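Building on the suggestion above, a small wrapper could also surface a clear error when the directory is not a git worktree. This is a sketch only; `commit_epoch` is a hypothetical helper name. One detail worth noting is that `export VAR="$(cmd)"` hides the exit status of `cmd`, so the sketch assigns first and exports afterwards.

```shell
#!/usr/bin/env bash

# Print the committer timestamp of HEAD for the given repository directory.
# Fails with a message instead of depending on the caller's working directory.
commit_epoch() {
    local repo="${1:?usage: commit_epoch <repo-dir>}"
    local epoch
    if ! epoch="$(git -C "$repo" log -1 --pretty=%ct 2>/dev/null)" || [[ -z "$epoch" ]]; then
        echo "error: cannot read commit timestamp from '${repo}'" >&2
        return 1
    fi
    printf '%s\n' "$epoch"
}

# Illustrative call site (variable name taken from the suggestion above):
#   SOURCE_DATE_EPOCH="$(commit_epoch "$KERNEL_ROOT_ORIGINAL")" || exit 1
#   export SOURCE_DATE_EPOCH
```

The same helper would apply to the pipeline script by passing its source directory instead.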
Microsoft/nix-build.sh
Outdated
    # Build configuration
    BUILD_OUTPUT="${BUILD_OUTPUT:-${KERNEL_ROOT}/build}"
    MAKE_JOBS="${MAKE_JOBS:-$(nproc)}"
MAKE_JOBS is computed but never used (the build uses build-hcl-kernel.sh, which hardcodes nproc). Either wire MAKE_JOBS through to the underlying make invocation or drop it to avoid misleading configuration knobs.
Suggested change:
    - MAKE_JOBS="${MAKE_JOBS:-$(nproc)}"
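If the knob is kept rather than dropped, the alternative mentioned in the comment is to wire it through to make. A rough sketch, assuming the underlying script could accept a job count from the environment; `resolve_make_jobs` is a hypothetical helper, not something in the PR.

```shell
#!/usr/bin/env bash

# Resolve the parallel job count: honor MAKE_JOBS when it is a positive
# integer, otherwise fall back to the CPU count (or 1 if nproc is missing).
resolve_make_jobs() {
    local jobs="${MAKE_JOBS:-}"
    if [[ "$jobs" =~ ^[1-9][0-9]*$ ]]; then
        printf '%s\n' "$jobs"
    else
        nproc 2>/dev/null || echo 1
    fi
}

# Illustrative call site; the real make invocation lives in build-hcl-kernel.sh:
#   make -j"$(resolve_make_jobs)" "${make_args[@]}"
```

Validating the value keeps a typo such as `MAKE_JOBS=abc` from reaching `make -j` as a malformed argument.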
    cleanup_build_path() {
        if [ -n "${FIXED_BUILD_PATH}" ] && [ -d "${FIXED_BUILD_PATH}" ]; then
            log_info "Cleaning up temporary build directory: ${FIXED_BUILD_PATH}"
            rm -rf "${FIXED_BUILD_PATH}"
        fi
cleanup_build_path unconditionally runs rm -rf on whatever FIXED_BUILD_PATH is set to. A user could accidentally point this at a non-temporary directory and lose data. Add a safety guard (e.g., require it to match an expected prefix like /tmp/ohcl-kernel-build and refuse to delete /, empty, or very short paths).
    if [[ -n "$REPRODUCIBLE_BUILD" ]]; then
        # Always use timestamp of the top commit for SOURCE_DATE_EPOCH
        # Override any pre-set value to ensure consistency
        export SOURCE_DATE_EPOCH="$(git log -1 --pretty=%ct)"
SOURCE_DATE_EPOCH is computed with git log before the script cds into SOURCE_DIR (the cd happens later). If the pipeline’s working directory isn’t the kernel repo, this will fail. Use git -C "$SOURCE_DIR" log -1 --pretty=%ct (or cd "$SOURCE_DIR" before calling git).
Suggested change:
    - export SOURCE_DATE_EPOCH="$(git log -1 --pretty=%ct)"
    + export SOURCE_DATE_EPOCH="$(git -C "$SOURCE_DIR" log -1 --pretty=%ct)"
Add NixOS flake configuration and helper scripts for reproducible kernel builds.

Files added:
- flake.nix: Nix environment with pinned toolchain (GCC 13.2.0, binutils, etc.)
- flake.lock: Locked package versions for reproducibility
- Microsoft/nix-setup.sh: One-time Nix installation helper
- Microsoft/nix-clean.sh: Build artifact cleanup
- .gitignore: Add Nix-related entries

This establishes the foundation for bit-reproducible kernel builds across different machines by providing a hermetic build environment with pinned dependencies.

Signed-off-by: Naman Jain <namjain@linux.microsoft.com>

Add nix-build.sh that orchestrates reproducible kernel builds using the Nix environment established in the previous commit.

Features:
- Pure Nix environment with --ignore-environment flag
- Fixed build paths for reproducible absolute path embeddings
- Reproducible environment variables:
  - SOURCE_DATE_EPOCH= timestamp of top git commit embedded
  - KBUILD_BUILD_USER=builder
  - KBUILD_BUILD_HOST=nixos
  - KBUILD_BUILD_VERSION=1
- Copies source to fixed path to ensure identical embedded paths
- Invokes build-hcl-kernel.sh within the controlled environment
- Copies artifacts back to original location
- Cleanup on exit

Usage:
  ./Microsoft/nix-build.sh x64        # Build x64 kernel
  ./Microsoft/nix-build.sh arm64      # Build arm64 kernel
  ./Microsoft/nix-build.sh x64 cvm    # Build x64 cvm kernel
  ./Microsoft/nix-build.sh arm64 cvm  # Build arm64 cvm kernel

Signed-off-by: Naman Jain <namjain@linux.microsoft.com>

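For illustration, the environment listed in this commit message could be exported roughly as follows. This is a sketch, not the contents of nix-build.sh: the function name `export_reproducible_env` is hypothetical, the `KBUILD_BUILD_TIMESTAMP` derivation and the `LANG`/`LC_ALL`/`TZ` values are assumptions (the review only shows that those variables are inherited in flake.nix), and the `date -d "@..."` form requires GNU date.

```shell
#!/usr/bin/env bash

# Export the reproducible-build environment for a given commit epoch.
# Usage: export_reproducible_env <epoch-seconds>
export_reproducible_env() {
    local epoch="${1:?usage: export_reproducible_env <epoch-seconds>}"
    # Values below are the ones listed in the commit message.
    export SOURCE_DATE_EPOCH="$epoch"
    export KBUILD_BUILD_USER="builder"
    export KBUILD_BUILD_HOST="nixos"
    export KBUILD_BUILD_VERSION="1"
    # Assumption: derive the kbuild timestamp from the same epoch (GNU date).
    KBUILD_BUILD_TIMESTAMP="$(date -u -d "@${epoch}" '+%Y-%m-%d %H:%M:%S UTC')"
    export KBUILD_BUILD_TIMESTAMP
    # Assumption: pin locale and timezone so tool output does not vary by host.
    export LANG="C" LC_ALL="C" TZ="UTC"
}
```

Keeping all exports in one function makes it easier to check that the local and pipeline scripts agree on the same values.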
Enhance build-hcl-kernel.sh to support reproducible builds when invoked from nix-build.sh or other reproducible environments.

Changes:
- Detect host architecture to avoid unnecessary cross-compilation
- Set CC explicitly to gcc/cross-compiler for Nix toolchain
- Add LOCALVERSION= to prevent '+' suffix in version string
- Add KCFLAGS=-fdebug-prefix-map to normalize debug paths
- Add SHA256 checksum output of vmlinux for verification
- Remove KBUILD_BUILD_ID=none (not needed)

When REPRODUCIBLE_BUILD=1:
- Uses Nix's gcc instead of system gcc for native builds
- Only uses cross-compiler when actually cross-compiling
- Ensures consistent compiler identification in kernel binary

Otherwise, let users continue using this script for dev work as before.

Signed-off-by: Naman Jain <namjain@linux.microsoft.com>

Add build-hcl-kernel-pipeline.sh for Azure DevOps CI integration with reproducible build support.

Features:
- Supports amd64 and arm64 architectures
- CVM config merge support via merge_cvm_config()
- Optional reproducible build mode (--reproducible flag)
- Generates kernel, headers, modules, and debug symbols
- Progress indicators for build stages [1/5] through [5/5]
- SHA256 checksum output for reproducibility verification

Key differences from build-hcl-kernel.sh:
- Standalone script that doesn't depend on nix-build.sh wrapper
- Implements complete build workflow in one script
- Uses KBUILD_OUTPUT=$BUILD_DIR/linux subdirectory structure
- Handles CVM config merging inline
- Moves artifacts from /linux subdirectory to BUILD_DIR root for pipeline
- When --reproducible: sets up Nix environment and reproducible variables

Build directory structure:
- $BUILD_DIR/linux/           # KBUILD_OUTPUT during build
- $BUILD_DIR/vmlinux          # Final artifacts at root
- $BUILD_DIR/linux-headers/
- $BUILD_DIR/debug_symbols/

Usage:
  ./build-hcl-kernel-pipeline.sh -s <source> -b <build> -c <config> -a <arch>
  ./build-hcl-kernel-pipeline.sh ... --reproducible
  ./build-hcl-kernel-pipeline.sh ... --cvm-config <config>

Signed-off-by: Naman Jain <namjain@linux.microsoft.com>

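The SHA256 output mentioned in these commits can be used to check that two independent builds produced the same artifact. A minimal sketch of such a comparison; the helper names `sha256_of` and `artifacts_match` and the example paths are illustrative and not part of the PR.

```shell
#!/usr/bin/env bash

# Print only the SHA-256 hex digest of a file.
sha256_of() {
    sha256sum < "$1" | awk '{print $1}'
}

# Succeed (exit 0) when two artifacts are bit-identical according to SHA-256.
# Returns 2 when either file is missing, 1 when the digests differ.
artifacts_match() {
    local first="$1" second="$2"
    [[ -f "$first" && -f "$second" ]] || return 2
    [[ "$(sha256_of "$first")" == "$(sha256_of "$second")" ]]
}

# Illustrative usage with hypothetical output locations:
#   artifacts_match build-a/vmlinux build-b/vmlinux && echo "reproducible"
```

Comparing a local nix-build.sh result against a pipeline `--reproducible` result this way would confirm that both paths normalize the same inputs.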
aaa90d1 to
a256190
OHCL-Linux-Kernel has a Microsoft/build-hcl-kernel.sh script that is used to build the kernel. However, the build pipelines do not use that script; they carry similar code of their own.
To implement reproducible builds, add this support to both the local build script (Microsoft/build-hcl-kernel.sh) and the pipeline code. Instead of adding the support to the pipeline directly, move the kernel build code from the pipeline into a new script, "Microsoft/build-hcl-kernel-pipeline.sh", and add the reproducible build changes there. The buddy/official pipelines would then call this script to build the kernel.