From f318585dc4d13332dc940138544b5af9981470c3 Mon Sep 17 00:00:00 2001
From: Bernardo Magri
Date: Thu, 30 Apr 2026 19:42:00 +0100
Subject: [PATCH] fix(installer): harden disk selection and partitioning phase
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The disk phase was the dominant source of incomplete installs. Six
concrete failure modes addressed in one pass:

1. Live-ISO USB excluded from the disk picker. select_disk previously
   filtered loop|ram|zram|sr but not the device the installer booted
   from; picking it would format the boot media mid-install. New
   detect_live_iso_devices walks /, /iso, /run/initramfs/live,
   /nix/.ro-store, /nix/store and resolves each backing device to its
   parent disk via lsblk -no PKNAME. Override with
   NOMARCHY_INSTALL_ALLOW_ISO_TARGET=1 for the developer case.

2. 10 GiB minimum-capacity preflight. Disko fails late and obscurely
   on undersized media; surface it while the picker is still open.

3. prewipe_target_drive rewritten:
   - Enumerates every active dm-crypt mapping via dmsetup ls and
     closes those whose backing device is on the target drive. The
     old version only knew about the hardcoded names "crypted" /
     "crypted_main" so an aborted multi-disk run or a non-Nomarchy
     install would leave a holder open and silently break the wipe.
   - Drops `|| true` from wipefs / sgdisk / dd. After the LUKS and
     swap teardown above, a real failure means something is still
     holding the device; surface that instead of papering over it.
   - udevadm settle bounded to 30s so a flapping USB can't hang.
   - Post-wipe sanity check: refuse to hand the disk to disko if
     anything is still mounted off it.

4. run_disko_with_retry wraps the disko call. On failure, shows the
   last 30 lines of output via gum style and offers Retry / View full
   log / Abort. set -e is suspended for the disko call so the exit
   code can be inspected. The previous bare `disko --mode disko`
   aborted the whole installer with output scrolled past.

5. Sed-templated disko-golden.nix + disko-btrfs-multi.nix pair
   replaced by a single disko-config.nix Nix function of
   { mainDrive, extraDrives ? [] } called via --argstr / --arg.
   Templating Nix via shell-escaped string substitution caused at
   least one production bug (3aadc36 fixed embedded-newline
   escaping); function arguments are the right shape and eliminate
   the entire class of escaping concerns. Single-disk path is
   `extraDrives = []`; multi-disk gets BTRFS `-d single -m raid1`
   plus the additional /dev/mapper/* devices. Hosts that shipped
   /etc/disko-golden.nix now ship /etc/disko-config.nix.

6. EXIT trap added so the tmpfs LUKS key file
   (/dev/shm/nomarchy-luks.key) is removed even if the script aborts
   between key-write and the explicit unset. Replaced redundant
   `shred -u` on tmpfs with `rm -f` (already in RAM).

Verification: bash -n on install.sh, nix-instantiate parse + strict
eval on disko-config.nix in both single and multi shapes, full
nix flake check --no-build evaluating all three NixOS configurations
(default, nomarchy-installer, nomarchy-live) plus the installerVm.

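The parse and strict-eval checks listed under Verification can be
sketched roughly as below. This is a minimal sketch, not a record of
the exact commands that were run: `nix_list_literal` and
`verify_disk_phase` are hypothetical helper names, the repo-root
argument is an assumption, and /dev/vda, /dev/vdb and /dev/vdc are
placeholder paths that the Nix evaluator only ever sees as strings.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this patch): render drive paths as a
# Nix list literal suitable for `--arg extraDrives`. No arguments -> [].
nix_list_literal() {
    if [ "$#" -eq 0 ]; then
        printf '[]\n'
        return 0
    fi
    local out="[" d
    for d in "$@"; do
        out="$out \"$d\""
    done
    printf '%s ]\n' "$out"
}

# Hypothetical wrapper: syntax-check install.sh, then parse and strictly
# evaluate disko-config.nix in the single-disk and multi-disk shapes.
# $1 is the repository root (assumed layout: installer/ underneath it).
verify_disk_phase() {
    local repo="${1:-.}"
    local cfg="$repo/installer/disko-config.nix"

    bash -n "$repo/installer/install.sh" || return 1
    nix-instantiate --parse "$cfg" >/dev/null || return 1

    # Single-disk shape: extraDrives is the empty list.
    nix-instantiate --eval --strict "$cfg" \
        --argstr mainDrive /dev/vda \
        --arg extraDrives "$(nix_list_literal)" >/dev/null || return 1

    # Multi-disk shape: two placeholder extra drives.
    nix-instantiate --eval --strict "$cfg" \
        --argstr mainDrive /dev/vda \
        --arg extraDrives "$(nix_list_literal /dev/vdb /dev/vdc)" \
        >/dev/null || return 1
}
```

From the repository root this would be invoked as
`verify_disk_phase .`. The flake check is left out of the sketch
because it depends on the full set of flake inputs.
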
Co-Authored-By: Claude Opus 4.7 --- docs/MIGRATION.md | 2 +- docs/ROADMAP.md | 1 + docs/STRUCTURE.md | 5 +- hosts/nomarchy-installer.nix | 4 +- hosts/nomarchy-live.nix | 4 +- installer/disko-btrfs-multi.nix | 76 ---------- installer/disko-config.nix | 127 ++++++++++++++++ installer/disko-golden.nix | 105 -------------- installer/install.sh | 247 +++++++++++++++++++++++--------- 9 files changed, 317 insertions(+), 254 deletions(-) delete mode 100644 installer/disko-btrfs-multi.nix create mode 100644 installer/disko-config.nix delete mode 100644 installer/disko-golden.nix diff --git a/docs/MIGRATION.md b/docs/MIGRATION.md index 086acd4..b3ceb35 100644 --- a/docs/MIGRATION.md +++ b/docs/MIGRATION.md @@ -232,7 +232,7 @@ nomarchy-test-live-iso # boots the ISO in QEMU to evaluate The ISO autologins to a Hyprland live session that points you at: - `sudo /etc/install.sh` — install (BTRFS + LUKS + subvolumes per - `installer/disko-golden.nix`, auto-detects hardware via `hardware-db.sh`, + `installer/disko-config.nix`, auto-detects hardware via `hardware-db.sh`, runs `home-manager switch` inside `nixos-enter` so the first login is fully themed). - `sudo /etc/install.sh --dry-run` — generate the flake into a tmpdir and diff --git a/docs/ROADMAP.md b/docs/ROADMAP.md index 25ebe2d..87b5532 100644 --- a/docs/ROADMAP.md +++ b/docs/ROADMAP.md @@ -136,6 +136,7 @@ Nomarchy is moving away from being a "flavor" of Omarchy to its own distinct ide (Move items here when they land — keep them brief, link the commit/PR.) +- _2026-04-30_ — Installer disk-phase reliability. 
Hardened `installer/install.sh` and consolidated the disko configs: (1) `select_disk` now hides the live-ISO boot device(s) so the installer can't format its own boot media (`NOMARCHY_INSTALL_ALLOW_ISO_TARGET=1` to override); (2) added a 10 GiB minimum-capacity preflight; (3) `prewipe_target_drive` enumerates every active dm-crypt mapping backed by the target drive and closes them, drops the silent `|| true` from `wipefs`/`sgdisk`/`dd`, bounds `udevadm settle` to 30s, and refuses to continue if anything is still mounted; (4) wrapped the disko call in `run_disko_with_retry` with last-30-lines + Retry / View full log / Abort dialog on failure; (5) replaced the sed-templated `disko-golden.nix` + `disko-btrfs-multi.nix` pair with a single `disko-config.nix` Nix function called via `--argstr mainDrive … --arg extraDrives '[…]'` — eliminates a class of escaping bugs (cf. `3aadc36`); (6) added an EXIT trap so the tmpfs LUKS key file is removed even on early abort. - _2026-04-30_ — Gaming home-side companion. New `nomarchy.gaming.enable` option (mirror of `nomarchy.system.gaming.enable`) and `core/home/gaming.nix` module that injects a Hyprland `windowrulev2 = fullscreen, class:^(steam_app_).*$` so Steam-launched games grab the whole screen. Closes the "Gaming — Hyprland window rule" Next-column row. - _2026-04-26_ — Default to highest resolution (`highres`) for monitors. Updated `features/desktop/hyprland/config/monitors.conf` and forced it in the live ISO (`nomarchy-live`) to resolve issues where some hardware would default to a low resolution (1024x768). - _2026-04-26_ — First-run welcome wizard (`nomarchy-welcome`). Extended from a one-shot greeter into a guided picker for theme, font, and panel position. Added Step 4 to generate a starter `home.nix` if missing. State is now persisted in `state.json` via `.welcome_done`. Added `nomarchy.panelPosition` option to Waybar. 
diff --git a/docs/STRUCTURE.md b/docs/STRUCTURE.md index c3f1f0e..b8a0a56 100644 --- a/docs/STRUCTURE.md +++ b/docs/STRUCTURE.md @@ -125,9 +125,8 @@ The `lib/` directory provides centralized logic and data structures to maintain ### `installer/` (Bootstrap) - **`install.sh`**: The interactive TTY-based installer. It handles disk partitioning, NixOS installation, and generating a clean "Downstream" flake for the user. -- **`disko-golden.nix`**: The standard partition layout (BTRFS on top of LUKS2). -- **`disko-btrfs-multi.nix`**: Multi-disk BTRFS RAID/Single layout template. -- **`disko-btrfs-luks.nix`**: A simpler reference layout for disk management. +- **`disko-config.nix`**: The disko partition layout (BTRFS on top of LUKS2). A Nix function of `{ mainDrive, extraDrives ? [] }` — single-disk path is `extraDrives = []`; multi-disk adds BTRFS `-d single -m raid1` across the extras. Invoked by `install.sh` via `disko --argstr mainDrive … --arg extraDrives '[…]'`. +- **`disko-btrfs-luks.nix`**: A simpler reference layout for disk management (not used by the installer). ### `hosts/` (Targets) - **`nomarchy-installer.nix`**: Configuration for the minimal, TTY-based installation ISO. 
diff --git a/hosts/nomarchy-installer.nix b/hosts/nomarchy-installer.nix index d5b6e79..025d7dd 100644 --- a/hosts/nomarchy-installer.nix +++ b/hosts/nomarchy-installer.nix @@ -96,8 +96,8 @@ # Symlink for easy access (merged into systemPackages above) # The nomarchy-install script is created by writeShellScriptBin in the main list - # Include disko configurations - environment.etc."disko-golden.nix".source = ../installer/disko-golden.nix; + # Include disko configuration + environment.etc."disko-config.nix".source = ../installer/disko-config.nix; # Include Nomarchy source for installation environment.etc."nomarchy".source = inputs.self; diff --git a/hosts/nomarchy-live.nix b/hosts/nomarchy-live.nix index d317a45..8e5876d 100644 --- a/hosts/nomarchy-live.nix +++ b/hosts/nomarchy-live.nix @@ -58,8 +58,8 @@ mode = "0644"; }; - environment.etc."disko-golden.nix" = { - source = ../installer/disko-golden.nix; + environment.etc."disko-config.nix" = { + source = ../installer/disko-config.nix; }; environment.etc."nomarchy".source = inputs.self; diff --git a/installer/disko-btrfs-multi.nix b/installer/disko-btrfs-multi.nix deleted file mode 100644 index 5f969b7..0000000 --- a/installer/disko-btrfs-multi.nix +++ /dev/null @@ -1,76 +0,0 @@ -{ - disko.devices = { - disk = { - main = { - type = "disk"; - device = "@MAIN_DRIVE@"; - content = { - type = "gpt"; - partitions = { - ESP = { - priority = 1; - name = "ESP"; - start = "1M"; - end = "1G"; - type = "EF00"; - content = { - type = "filesystem"; - format = "vfat"; - mountpoint = "/boot"; - mountOptions = [ "umask=0077" ]; - }; - }; - luks = { - size = "100%"; - content = { - type = "luks"; - name = "crypted_main"; - settings = { - allowDiscards = true; - passwordFile = "/dev/shm/nomarchy-luks.key"; - }; - content = { - type = "btrfs"; - extraArgs = [ "-f" "-d single" "-m raid1" @BTRFS_DEVICES@ ]; - subvolumes = { - "@" = { - mountpoint = "/"; - mountOptions = [ "compress=zstd" "noatime" ]; - }; - "@persist" = { - mountpoint = 
"/persist"; - mountOptions = [ "compress=zstd" "noatime" ]; - }; - "@home" = { - mountpoint = "/home"; - mountOptions = [ "compress=zstd" "noatime" ]; - }; - "@nix" = { - mountpoint = "/nix"; - mountOptions = [ "compress=zstd" "noatime" ]; - }; - "@log" = { - mountpoint = "/var/log"; - mountOptions = [ "compress=zstd" "noatime" ]; - }; - "@snapshots" = { - mountpoint = "/.snapshots"; - mountOptions = [ "compress=zstd" "noatime" ]; - }; - }; - postCreateHook = '' - MNTPOINT=$(mktemp -d) - mount -t btrfs /dev/mapper/crypted_main $MNTPOINT - btrfs subvolume snapshot -r $MNTPOINT/@ $MNTPOINT/root-blank - umount $MNTPOINT - ''; - }; - }; - }; - }; - }; - }; - @ADDITIONAL_DISKS@ - }; - }; -} diff --git a/installer/disko-config.nix b/installer/disko-config.nix new file mode 100644 index 0000000..7388018 --- /dev/null +++ b/installer/disko-config.nix @@ -0,0 +1,127 @@ +# Nomarchy Golden-Path Disk Configuration +# +# Single source of truth for the installer's disko layout. The single-disk +# and multi-disk paths differ only in (a) whether `extraDrives` is empty and +# (b) the BTRFS profile (multi adds `-d single -m raid1` plus the additional +# /dev/mapper/* devices). Pass arguments via: +# +# disko --argstr mainDrive /dev/nvme0n1 \ +# --arg extraDrives '[]' \ +# disko-config.nix +# +# disko --argstr mainDrive /dev/nvme0n1 \ +# --arg extraDrives '[ "/dev/sdb" "/dev/sdc" ]' \ +# disko-config.nix +# +# Replaces the previous sed-templated disko-golden.nix + disko-btrfs-multi.nix +# pair. Templating Nix via shell-escaped string substitution proved fragile +# (commit 3aadc36 fixed one escaping bug; another was waiting to happen) — +# function arguments are the right shape. +{ mainDrive +, extraDrives ? [] +}: + +let + hasExtras = extraDrives != []; + + # Sanitize a device path into something usable as both a disko attr name + # and a /dev/mapper/ name. /dev/sdb -> dev_sdb, /dev/nvme0n2 -> dev_nvme0n2. 
+ sanitize = path: builtins.replaceStrings [ "/" "-" ] [ "_" "_" ] path; + + extraName = drive: "extra_" + sanitize drive; + extraLuks = drive: "crypted_" + sanitize drive; + + mkExtraDisk = drive: { + name = extraName drive; + value = { + type = "disk"; + device = drive; + content = { + type = "gpt"; + partitions.luks = { + size = "100%"; + content = { + type = "luks"; + name = extraLuks drive; + settings.allowDiscards = true; + settings.passwordFile = "/dev/shm/nomarchy-luks.key"; + content.type = "btrfs"; + }; + }; + }; + }; + }; + + # BTRFS extraArgs: + # - single: just `-f` (force) — no need to enumerate devices, the FS is + # created on the one /dev/mapper/crypted device disko emits. + # - multi: `-f -d single -m raid1 ` — data striped + # across devices for capacity, metadata mirrored for safety. + btrfsExtraArgs = + if hasExtras then + [ "-f" "-d" "single" "-m" "raid1" ] + ++ map (d: "/dev/mapper/" + extraLuks d) extraDrives + else + [ "-f" ]; + + # The main LUKS mapping name varies between layouts so the postCreateHook + # (which mounts the freshly created BTRFS to take the impermanence-rollback + # snapshot) targets the right /dev/mapper entry. 
+ mainLuksName = if hasExtras then "crypted_main" else "crypted"; + + rootBtrfs = { + type = "btrfs"; + extraArgs = btrfsExtraArgs; + subvolumes = { + "@" = { mountpoint = "/"; mountOptions = [ "compress=zstd" "noatime" ]; }; + "@persist" = { mountpoint = "/persist"; mountOptions = [ "compress=zstd" "noatime" ]; }; + "@home" = { mountpoint = "/home"; mountOptions = [ "compress=zstd" "noatime" ]; }; + "@nix" = { mountpoint = "/nix"; mountOptions = [ "compress=zstd" "noatime" ]; }; + "@log" = { mountpoint = "/var/log"; mountOptions = [ "compress=zstd" "noatime" ]; }; + "@snapshots" = { mountpoint = "/.snapshots"; mountOptions = [ "compress=zstd" "noatime" ]; }; + }; + postCreateHook = '' + MNTPOINT=$(mktemp -d) + mount -t btrfs /dev/mapper/${mainLuksName} $MNTPOINT + btrfs subvolume snapshot -r $MNTPOINT/@ $MNTPOINT/root-blank + umount $MNTPOINT + ''; + }; + +in { + disko.devices.disk = { + main = { + type = "disk"; + device = mainDrive; + content = { + type = "gpt"; + partitions = { + # 1 GiB ESP — fits several kernel generations + initrd + Plymouth. 
+ ESP = { + priority = 1; + name = "ESP"; + start = "1M"; + end = "1G"; + type = "EF00"; + content = { + type = "filesystem"; + format = "vfat"; + mountpoint = "/boot"; + mountOptions = [ "umask=0077" ]; + }; + }; + luks = { + size = "100%"; + content = { + type = "luks"; + name = mainLuksName; + settings.allowDiscards = true; + settings.passwordFile = "/dev/shm/nomarchy-luks.key"; + content = rootBtrfs; + }; + }; + }; + }; + }; + } // builtins.listToAttrs (map mkExtraDisk extraDrives); +} diff --git a/installer/disko-golden.nix b/installer/disko-golden.nix deleted file mode 100644 index 56d1c10..0000000 --- a/installer/disko-golden.nix +++ /dev/null @@ -1,105 +0,0 @@ -# Nomarchy Golden Path Disk Configuration -# -# BTRFS + LUKS2 encryption with subvolumes optimized for: -# - Compression (zstd) -# - SSD optimization (noatime) -# - Impermanence support (root-blank snapshot) -# - Separate subvolumes for home, nix store, logs -# -# Replace @TARGET_DRIVE@ with the target device (e.g., /dev/nvme0n1) - -{ - disko.devices = { - disk = { - main = { - type = "disk"; - device = "@TARGET_DRIVE@"; - content = { - type = "gpt"; - partitions = { - # EFI System Partition. 1 GiB leaves room for several kernel - # generations + initrd + Plymouth assets without filling up. - ESP = { - priority = 1; - name = "ESP"; - start = "1M"; - end = "1G"; - type = "EF00"; - content = { - type = "filesystem"; - format = "vfat"; - mountpoint = "/boot"; - mountOptions = [ "umask=0077" ]; - }; - }; - - # LUKS-encrypted root partition. The installer writes the - # passphrase to an in-memory tmpfs (/dev/shm/nomarchy-luks.key) - # rather than the spinning /tmp so the secret never touches disk. 
- luks = { - size = "100%"; - content = { - type = "luks"; - name = "crypted"; - settings = { - allowDiscards = true; # Enable TRIM for SSDs - passwordFile = "/dev/shm/nomarchy-luks.key"; - }; - content = { - type = "btrfs"; - extraArgs = [ "-f" ]; # Force creation - subvolumes = { - # Root filesystem - "@" = { - mountpoint = "/"; - mountOptions = [ "compress=zstd" "noatime" ]; - }; - - # Persistent storage (for impermanence) - "@persist" = { - mountpoint = "/persist"; - mountOptions = [ "compress=zstd" "noatime" ]; - }; - - # User home directories - "@home" = { - mountpoint = "/home"; - mountOptions = [ "compress=zstd" "noatime" ]; - }; - - # Nix store (separate for better deduplication) - "@nix" = { - mountpoint = "/nix"; - mountOptions = [ "compress=zstd" "noatime" ]; - }; - - # System logs - "@log" = { - mountpoint = "/var/log"; - mountOptions = [ "compress=zstd" "noatime" ]; - }; - - # Snapshots — kept off the rolled-back root so tools like - # snapper / btrbk / nomarchy-rollback have a stable home. - "@snapshots" = { - mountpoint = "/.snapshots"; - mountOptions = [ "compress=zstd" "noatime" ]; - }; - }; - - # Create a read-only snapshot of root for impermanence rollback - postCreateHook = '' - MNTPOINT=$(mktemp -d) - mount -t btrfs /dev/mapper/crypted $MNTPOINT - btrfs subvolume snapshot -r $MNTPOINT/@ $MNTPOINT/root-blank - umount $MNTPOINT - ''; - }; - }; - }; - }; - }; - }; - }; - }; -} diff --git a/installer/install.sh b/installer/install.sh index 01b2346..bd54f0a 100755 --- a/installer/install.sh +++ b/installer/install.sh @@ -233,6 +233,38 @@ check_environment() { # STEP 2: DISK SELECTION # ============================================================================ +# Resolve the block device(s) backing the running live ISO so the disk +# picker can hide them. Picking the live USB by mistake destroys the +# installer's own boot media mid-run — always the worst-case outcome. 
+# We walk the live-ISO mountpoints (NixOS live ISO uses /iso for the +# squashfs source plus an overlay at /), resolve each to its parent +# disk via `lsblk -no PKNAME`, and emit a deduped list of /dev/ +# entries on stdout. Nothing emitted = no live-ISO devices detected +# (e.g. running the installer from a regular shell during development). +detect_live_iso_devices() { + local seen=" " + local mp src parent + for mp in / /iso /run/initramfs/live /nix/.ro-store /nix/store; do + src=$(findmnt -no SOURCE "$mp" 2>/dev/null) || continue + [[ "$src" == /dev/* ]] || continue + parent=$(lsblk -no PKNAME "$src" 2>/dev/null | head -n1) + if [[ -n "$parent" ]]; then + parent="/dev/$parent" + else + parent="$src" + fi + case "$seen" in + *" $parent "*) ;; + *) seen+="$parent "; printf '%s\n' "$parent" ;; + esac + done +} + +# Minimum total capacity across all picked drives. 10 GiB is the smallest +# size where the install completes without immediate disk-pressure failures +# (1 GiB ESP + ~5 GiB nix closure + working set). +_MIN_INSTALL_BYTES=$((10 * 1024 * 1024 * 1024)) + select_disk() { section "Disk Selection" @@ -245,8 +277,30 @@ select_disk() { # Columns: NAME, SIZE, TYPE (NVMe/USB/SSD/HDD), VENDOR, MODEL, SERIAL. # Empty fields render as "--" so column -t can still align them. local raw rows="" + + # Filter out pseudo-devices and the live-ISO boot media. The boot-media + # filter is the important one: without it the user can pick the USB + # they booted from and the installer will format its own boot device + # mid-run. NOMARCHY_INSTALL_ALLOW_ISO_TARGET=1 disables this guard + # for the rare case someone genuinely wants to install onto the same + # device (e.g. a developer testing in a VM without a second disk). 
+ local exclude_re='^(/dev/(loop|ram|zram|sr))' + local live_devices=() + if [[ "${NOMARCHY_INSTALL_ALLOW_ISO_TARGET:-0}" != "1" ]]; then + mapfile -t live_devices < <(detect_live_iso_devices) + local d + for d in "${live_devices[@]}"; do + [[ -n "$d" ]] || continue + # Anchor to end-of-line so /dev/sda doesn't also match /dev/sdaa. + exclude_re+="|^${d}$" + done + if (( ${#live_devices[@]} > 0 )); then + info "Excluding live-ISO device(s) from picker: ${live_devices[*]}" + fi + fi + raw=$(lsblk -d -n -p -o NAME,SIZE,ROTA,TRAN,VENDOR,MODEL,SERIAL 2>/dev/null \ - | grep -vE '^(/dev/(loop|ram|zram|sr))') + | grep -vE "$exclude_re") while IFS= read -r line; do if [[ -z "$line" ]]; then continue; fi @@ -312,6 +366,21 @@ select_disk() { fi if [[ "$DRY_RUN" != "true" ]]; then + # Total-capacity preflight. Disko fails late and obscurely on + # undersized media; surface it here while the picker is still open. + local total_bytes=0 sz d + for d in $TARGET_DRIVE; do + sz=$(lsblk -bdno SIZE "$d" 2>/dev/null) || sz=0 + total_bytes=$((total_bytes + sz)) + done + if (( total_bytes < _MIN_INSTALL_BYTES )); then + local human + human=$(numfmt --to=iec --suffix=B "$total_bytes" 2>/dev/null || echo "${total_bytes} B") + error "Total target capacity is $human; Nomarchy needs at least 10 GiB." + TARGET_DRIVE="" + return 130 + fi + echo "" nrun gum style --foreground 9 --bold "⚠ WARNING: All data on $TARGET_DRIVE will be DESTROYED!" echo "" @@ -951,32 +1020,114 @@ prewipe_target_drive() { info "Pre-wiping $drive (clearing stale signatures)..." - # Tear down anything a prior aborted run left active. + # Tear down anything a prior aborted run left active. Order matters: + # mount holders -> swap -> LUKS mappings -> wipe. umount -R /mnt 2>/dev/null || true - cryptsetup close crypted 2>/dev/null || true swapoff -a 2>/dev/null || true + # Enumerate every active dm-crypt mapping and close those whose backing + # device is on this drive. 
The previous version only knew about the + # hardcoded names "crypted" and "crypted_main"; an aborted multi-disk + # run, a manual experiment, or a non-Nomarchy install would leave a + # mapping with a different name holding the device busy and silently + # break the wipe. + if command -v dmsetup >/dev/null 2>&1; then + local name backing + while read -r name _; do + [[ -n "$name" && "$name" != "No" ]] || continue # "No devices found" + backing=$(cryptsetup status "$name" 2>/dev/null \ + | awk '/^[[:space:]]*device:/ { print $2; exit }') || continue + [[ -n "$backing" ]] || continue + if [[ "$backing" == "$drive" || "$backing" == "${drive}"* ]]; then + info " Closing stale LUKS mapping: $name (backed by $backing)" + cryptsetup close "$name" + fi + done < <(dmsetup ls --target crypt 2>/dev/null) + fi + + # Wipe partition signatures. No `|| true` — the LUKS/swap teardown + # above should have released every holder; if wipefs still fails the + # device is genuinely busy and we want to surface that, not silently + # paper over it and let disko fail later with a confusing blkid error. local part - if compgen -G "${drive}*" >/dev/null; then + if compgen -G "${drive}?*" >/dev/null; then for part in "${drive}"?*; do [[ -b "$part" ]] || continue - wipefs -af "$part" >/dev/null 2>&1 || true + wipefs -af "$part" >/dev/null done fi - wipefs -af "$drive" >/dev/null 2>&1 || true - - sgdisk --zap-all "$drive" >/dev/null 2>&1 || true + wipefs -af "$drive" >/dev/null + sgdisk --zap-all "$drive" >/dev/null # 16 MiB covers LUKS2 binary headers (0–4 MiB) and the BTRFS first # superblock (64 KiB) — wipefs alone misses damaged variants of these. - dd if=/dev/zero of="$drive" bs=1M count=16 conv=fsync status=none 2>/dev/null || true + dd if=/dev/zero of="$drive" bs=1M count=16 conv=fsync status=none partprobe "$drive" 2>/dev/null || true - udevadm settle + # Bound the settle so a flapping USB device can't hang the installer. 
+ udevadm settle --timeout=30 || info "udevadm settle timed out; continuing." + + # Sanity check: nothing should still be mounted off this drive after + # the wipe. If something is, refuse to hand the disk to disko. + if lsblk -no MOUNTPOINTS "$drive" 2>/dev/null | grep -qE '\S'; then + error "Drive $drive still has active mountpoints after pre-wipe." + error "Investigate with: lsblk $drive ; mount | grep $drive" + return 1 + fi success "Pre-wipe complete" } +_LUKS_KEY_PATH="/dev/shm/nomarchy-luks.key" + +# Wrap the disko invocation so a failure surfaces the last few lines of +# output and offers Retry / View full log / Abort. set -e is suspended for +# the disko call so we can inspect its exit code; restored on every path. +run_disko_with_retry() { + local main_drive="$1" + local extras_nix="$2" + local disko_file="$NOMARCHY_REPO/installer/disko-config.nix" + local log + log=$(mktemp --suffix=.disko.log) + + while true; do + local rc=0 + set +e + disko --mode disko \ + --argstr mainDrive "$main_drive" \ + --arg extraDrives "$extras_nix" \ + "$disko_file" 2>&1 | tee "$log" + rc=${PIPESTATUS[0]} + set -e + + if [[ $rc -eq 0 ]]; then + rm -f "$log" + return 0 + fi + + error "disko failed (exit $rc). Last lines of output:" + tail -n 30 "$log" | nrun gum style --foreground 9 --border normal --padding "0 1" + + local choice + choice=$(printf 'Retry\nView full log\nAbort\n' \ + | nrun gum choose --header "Disk partitioning failed. What now?") + case "$choice" in + Retry) + info "Re-running pre-wipe and retrying disko..." 
+ local d + for d in $TARGET_DRIVE; do prewipe_target_drive "$d"; done + ;; + "View full log") + nrun gum pager < "$log" || less -RFX "$log" || cat "$log" + ;; + *) + rm -f "$log" + return $rc + ;; + esac + done +} + execute_installation() { if [[ "$DRY_RUN" == "true" ]]; then execute_dry_run @@ -991,67 +1142,33 @@ execute_installation() { prewipe_target_drive "$d" done - local disko_file tmp_disko - tmp_disko=$(mktemp --suffix=.nix) - + # Build the extraDrives Nix-list literal for disko-config.nix. Empty + # list = single-disk path. The list is well-formed by construction + # here (each element is a /dev/* path the user already picked) so + # there's no escaping concern — unlike the previous sed-templated Nix. local drives=($TARGET_DRIVE) - if [[ ${#drives[@]} -gt 1 ]]; then - disko_file="$NOMARCHY_REPO/installer/disko-btrfs-multi.nix" - local main_drive="${drives[0]}" - local btrfs_devs="" - local additional_disks="" - + local main_drive="${drives[0]}" + local extras_nix="[]" + if (( ${#drives[@]} > 1 )); then + extras_nix="[" + local i for (( i=1; i<${#drives[@]}; i++ )); do - local d="${drives[$i]}" - local name="extra_$i" - local luks_name="crypted_$name" - - btrfs_devs+=", \"/dev/mapper/$luks_name\"" - - additional_disks+=" $name = { - type = \"disk\"; - device = \"$d\"; - content = { - type = \"gpt\"; - partitions = { - luks = { - size = \"100%\"; - content = { - type = \"luks\"; - name = \"$luks_name\"; - settings = { - allowDiscards = true; - passwordFile = \"/dev/shm/nomarchy-luks.key\"; - }; - content = { - type = \"btrfs\"; - }; - }; - }; - }; - }; - }; -" + extras_nix+=" \"${drives[$i]}\"" done - - # Escape newlines for sed - local escaped_disks - escaped_disks=$(printf '%s\n' "$additional_disks" | sed ':a;N;$!ba;s/\n/\\n/g') - sed "s|@MAIN_DRIVE@|${main_drive}|g; s|@BTRFS_DEVICES@|${btrfs_devs}|g; s|@ADDITIONAL_DISKS@|${escaped_disks}|g" "$disko_file" > "$tmp_disko" - else - disko_file="$NOMARCHY_REPO/installer/disko-golden.nix" - sed 
"s|@TARGET_DRIVE@|${TARGET_DRIVE}|g" "$disko_file" > "$tmp_disko" + extras_nix+=" ]" fi # Provide the LUKS passphrase via tmpfs so the secret never touches a - # spinning disk. /dev/shm is tmpfs on the live ISO. We restrict perms - # to root and shred the file (overwrite) on the way out, even though - # it's already in RAM — defense in depth. - local luks_key="/dev/shm/nomarchy-luks.key" - install -m 600 /dev/null "$luks_key" - printf '%s' "$LUKS_PASSWORD" > "$luks_key" - disko --mode disko "$tmp_disko" - shred -u "$luks_key" 2>/dev/null || rm -f "$luks_key" + # spinning disk. /dev/shm is tmpfs on the live ISO. The EXIT trap + # below guarantees the file is removed even if the script aborts + # between writing the key and the unset below. + install -m 600 /dev/null "$_LUKS_KEY_PATH" + trap 'rm -f "$_LUKS_KEY_PATH" 2>/dev/null || true' EXIT + printf '%s' "$LUKS_PASSWORD" > "$_LUKS_KEY_PATH" + + run_disko_with_retry "$main_drive" "$extras_nix" || exit 1 + + rm -f "$_LUKS_KEY_PATH" unset LUKS_PASSWORD success "Disk partitioned"