fix: complete hybridGPU wiring + make state-derived options overridable

Two related fixes that together close the "minimal wiring" gap behind
`nomarchy.system.features.hybridGPU`.

1. Complete the NVIDIA driver stack inside hardware.nix's hybridGPU
   mkIf block.

   Before: `hybridGPU = true` enabled supergfxd and... that was it.
   supergfxd manages mode switching by blacklisting/unblacklisting the
   nvidia kernel module, but without the rest of the NVIDIA stack
   configured there is no driver for the dGPU to load. Hyprland/Wayland
   silently stayed on the iGPU regardless of mode.

   After: `hybridGPU = true` also wires
     services.xserver.videoDrivers = ["nvidia"]   (loads the driver
                                                   under Wayland too)
     hardware.graphics.{enable,enable32Bit}
     hardware.nvidia.modesetting.enable           (required for
                                                   Wayland)
     hardware.nvidia.powerManagement.enable
     hardware.nvidia.open = false                 (proprietary kernel
                                                   module)
     hardware.nvidia.package = config.boot.kernelPackages
                                      .nvidiaPackages.stable
     boot.kernelParams += "nvidia-drm.modeset=1"

   Everything except boot.kernelParams is wired with lib.mkDefault
   (kernelParams is a list option, so downstream values merge with it
   rather than conflict). A downstream system.nix can therefore pin a
   beta driver, flip to the open kernel module, or set
   `hardware.nvidia.prime.{offload.enable, intelBusId, nvidiaBusId}`
   for render-offload. The bus IDs are per-machine (find them via
   `lspci -D`), so they stay user-supplied; docs/OPTIONS.md has the
   full recipe.
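
   A minimal sketch of what such a downstream override could look
   like, assuming the usual /etc/nixos/system.nix module shape. The
   beta pin and the open-module flip are illustrative choices, not
   recommendations; plain assignments take precedence over the
   module's lib.mkDefault values:

   ```nix
   # /etc/nixos/system.nix (downstream; illustrative only)
   { config, ... }:
   {
     nomarchy.system.features.hybridGPU = true;

     # Either knob can be set on its own; both are shown for illustration.
     hardware.nvidia.open = true;
     hardware.nvidia.package = config.boot.kernelPackages.nvidiaPackages.beta;
   }
   ```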

2. Add lib.mkDefault to every state.json-derived assignment in
   core/system/state.nix and core/home/state.nix.

   Same priority bug on both sides: assignments like
   `features.hybridGPU = systemState.features.hybridGPU or false`
   landed at default priority. A downstream system.nix saying
   `nomarchy.system.features.hybridGPU = true` would then conflict
   with the state-derived value at the same priority, and Nix would
   refuse the merge with "conflicting definition values" — the
   user's override couldn't take effect.

   Verified by an explicit eval: extending the default nixosConfig
   with `nomarchy.system.features.hybridGPU = true` now resolves
   cleanly and the full driver stack engages.

   Side-effect: core/system/state.nix now reads from
   lib/state-schema.nix like the home side does, completing the
   schema-centralization started two batches ago.
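
   The priority mechanics can be reproduced outside the repo with a
   small lib.evalModules sketch. The three inline modules below are
   hypothetical stand-ins for the option declaration,
   core/system/state.nix, and a downstream system.nix; the empty
   `systemState` stands in for a missing state.json. It assumes
   `<nixpkgs>` is available on NIX_PATH:

   ```nix
   # Evaluate with: nix-instantiate --eval example.nix
   let
     lib = import <nixpkgs/lib>;

     # Stand-in for the parsed state.json (empty: every key is missing).
     systemState = { };

     # Stand-in for the real option declaration.
     optionsModule = {
       options.nomarchy.system.features.hybridGPU = lib.mkOption {
         type = lib.types.bool;
         default = false;
       };
     };

     # Stand-in for core/system/state.nix. `or` supplies the fallback when
     # any attribute on the path is missing; mkDefault lowers the
     # definition's priority so a plain assignment elsewhere wins.
     stateModule = {
       config.nomarchy.system.features.hybridGPU =
         lib.mkDefault (systemState.features.hybridGPU or false);
     };

     # Stand-in for a downstream system.nix: plain assignment.
     userModule = {
       config.nomarchy.system.features.hybridGPU = true;
     };

     result = lib.evalModules {
       modules = [ optionsModule stateModule userModule ];
     };
   in
   result.config.nomarchy.system.features.hybridGPU
   ```

   With the mkDefault wrapper in place this should evaluate to `true`;
   removing the wrapper from `stateModule` should reproduce the
   "conflicting definition values" error described above.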

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Bernardo Magri
2026-05-18 18:12:09 +01:00
parent 9c672953bc
commit d264371b46
5 changed files with 99 additions and 28 deletions

@@ -14,36 +14,40 @@ let
   togglesState = nomarchyLib.readHomeState config.home.homeDirectory;
 in
 {
+  # Every assignment uses lib.mkDefault so a downstream /etc/nixos/home.nix
+  # can override the state.json-derived value. Without mkDefault, every
+  # option here would resolve at default priority and conflict on
+  # assignment from the user's config.
   config = {
     nomarchy = {
       toggles = {
-        suspend = togglesState.suspend or schema.home.suspend;
-        screensaver = togglesState.screensaver or schema.home.screensaver;
-        idle = togglesState.idle or schema.home.idle;
-        nightlight = togglesState.nightlight or schema.home.nightlight;
-        waybar = togglesState.waybar or schema.home.waybar;
-        skipVsCodeTheme = togglesState.skipVsCodeTheme or schema.home.skipVsCodeTheme;
+        suspend = lib.mkDefault (togglesState.suspend or schema.home.suspend);
+        screensaver = lib.mkDefault (togglesState.screensaver or schema.home.screensaver);
+        idle = lib.mkDefault (togglesState.idle or schema.home.idle);
+        nightlight = lib.mkDefault (togglesState.nightlight or schema.home.nightlight);
+        waybar = lib.mkDefault (togglesState.waybar or schema.home.waybar);
+        skipVsCodeTheme = lib.mkDefault (togglesState.skipVsCodeTheme or schema.home.skipVsCodeTheme);
       };
-      nightlightTemperature = togglesState.nightlightTemperature or schema.home.nightlightTemperature;
-      theme = togglesState.theme or schema.home.theme;
-      wallpaper = togglesState.wallpaper or schema.home.wallpaper;
-      panelPosition = togglesState.panelPosition or schema.home.panelPosition;
+      nightlightTemperature = lib.mkDefault (togglesState.nightlightTemperature or schema.home.nightlightTemperature);
+      theme = lib.mkDefault (togglesState.theme or schema.home.theme);
+      wallpaper = lib.mkDefault (togglesState.wallpaper or schema.home.wallpaper);
+      panelPosition = lib.mkDefault (togglesState.panelPosition or schema.home.panelPosition);
       hyprland = {
-        gaps_in = togglesState.hyprland.gaps_in or schema.home.hyprland.gaps_in;
-        gaps_out = togglesState.hyprland.gaps_out or schema.home.hyprland.gaps_out;
-        border_size = togglesState.hyprland.border_size or schema.home.hyprland.border_size;
+        gaps_in = lib.mkDefault (togglesState.hyprland.gaps_in or schema.home.hyprland.gaps_in);
+        gaps_out = lib.mkDefault (togglesState.hyprland.gaps_out or schema.home.hyprland.gaps_out);
+        border_size = lib.mkDefault (togglesState.hyprland.border_size or schema.home.hyprland.border_size);
       };
-      fonts.monospace = togglesState.font or schema.home.font;
+      fonts.monospace = lib.mkDefault (togglesState.font or schema.home.font);
       # Derived properties from the theme directory
-      isLightMode = nomarchyLib.isThemeLightMode {
+      isLightMode = lib.mkDefault (nomarchyLib.isThemeLightMode {
         themeName = togglesState.theme or schema.home.theme;
         inherit assetsPath;
-      };
-      iconsTheme = nomarchyLib.getIconsTheme {
+      });
+      iconsTheme = lib.mkDefault (nomarchyLib.getIconsTheme {
         themeName = togglesState.theme or schema.home.theme;
         inherit assetsPath;
-      };
+      });
     };
   };
 }

@@ -70,8 +70,47 @@ in
     })
     (mkIf config.nomarchy.system.features.hybridGPU {
+      # supergfxd manages mode switching (Integrated / Hybrid / Vfio /
+      # AsusEgpu). It blacklists/unblacklists the nvidia kernel module via
+      # /etc/modprobe.d/ depending on the active mode. ExecStartPre sleep
+      # gives udev time to settle so the daemon doesn't see a half-attached
+      # GPU on cold boot.
       services.supergfxd.enable = true;
       systemd.services.supergfxd.serviceConfig.ExecStartPre = "-${pkgs.coreutils}/bin/sleep 5";
+      # Load the NVIDIA driver so the dGPU has something to drive. Without
+      # these, supergfxd switches modes successfully but the X/Wayland
+      # stack has no NVIDIA driver loaded — render-offload silently no-ops
+      # and Hyprland renders everything on the iGPU regardless of mode.
+      # mkDefault throughout so downstream system.nix can override
+      # (pin to a beta driver, flip to the open kernel module, etc.).
+      services.xserver.videoDrivers = lib.mkDefault [ "nvidia" ];
+      hardware.graphics.enable = lib.mkDefault true;
+      hardware.graphics.enable32Bit = lib.mkDefault true;
+      hardware.nvidia = {
+        modesetting.enable = lib.mkDefault true;
+        powerManagement.enable = lib.mkDefault true;
+        open = lib.mkDefault false;
+        package = lib.mkDefault config.boot.kernelPackages.nvidiaPackages.stable;
+      };
+      # Required for Wayland compositors (Hyprland) to render via NVIDIA.
+      boot.kernelParams = [ "nvidia-drm.modeset=1" ];
+      # PRIME render-offload (the part that lets `nvidia-offload <cmd>`
+      # actually use the dGPU) needs bus IDs, which are per-machine.
+      # We deliberately don't enable `hardware.nvidia.prime.offload.enable`
+      # here — without the correct intelBusId / nvidiaBusId the nvidia
+      # kernel module panics on load. The user adds this to their own
+      # /etc/nixos/system.nix after running `lspci -D`:
+      #
+      #   hardware.nvidia.prime = {
+      #     offload.enable = true;
+      #     offload.enableOffloadCmd = true;
+      #     intelBusId = "PCI:0:2:0"; # or amdgpuBusId for AMD iGPU
+      #     nvidiaBusId = "PCI:1:0:0";
+      #   };
+      #
+      # See docs/OPTIONS.md for the full recipe.
     })
   ];
 }

@@ -2,19 +2,28 @@
 let
   nomarchyLib = import ../../lib { inherit lib; };
+  # Same canonical schema as core/home/state.nix and the options.nix
+  # files — keeps every state default in one place.
+  schema = import ../../lib/state-schema.nix { inherit lib; };
   systemState = nomarchyLib.readSystemState;
 in
 {
+  # Every assignment is lib.mkDefault so a downstream /etc/nixos/system.nix
+  # can still set e.g. `nomarchy.system.features.hybridGPU = true;`
+  # without colliding with the state.json-derived value. Without
+  # mkDefault, every state.json-driven option was unoverridable from
+  # Nix — flipping hybridGPU required jq'ing the state file rather
+  # than declaring it in your config.
   config.nomarchy.system = {
-    dns = systemState.dns or "DHCP";
-    customDns = systemState.customDns or [];
-    wifi.powersave = systemState.wifi.powersave or true;
-    timezone = systemState.timezone or "UTC";
+    dns = lib.mkDefault (systemState.dns or schema.system.dns);
+    customDns = lib.mkDefault (systemState.customDns or schema.system.customDns);
+    wifi.powersave = lib.mkDefault (systemState.wifi.powersave or schema.system.wifi.powersave);
+    timezone = lib.mkDefault (systemState.timezone or schema.system.timezone);
     features = {
-      fingerprint = systemState.features.fingerprint or false;
-      fido2 = systemState.features.fido2 or false;
-      hybridGPU = systemState.features.hybridGPU or false;
+      fingerprint = lib.mkDefault (systemState.features.fingerprint or schema.system.features.fingerprint);
+      fido2 = lib.mkDefault (systemState.features.fido2 or schema.system.features.fido2);
+      hybridGPU = lib.mkDefault (systemState.features.hybridGPU or schema.system.features.hybridGPU);
     };
-    theme = systemState.theme or "nord";
+    theme = lib.mkDefault (systemState.theme or schema.system.theme);
   };
 }

@@ -53,7 +53,25 @@ Wired in `features/desktop/waybar/default.nix` (filters the battery widget out o
 ### `nomarchy.system.features.hybridGPU`
-`bool`, default `false`. Enables `services.supergfxd.enable` for laptops with switchable GPUs.
+`bool`, default `false`. NVIDIA-hybrid laptop support. Wires:
+- `services.supergfxd.enable` for runtime mode switching (`Integrated` / `Hybrid` / `Vfio` / `AsusEgpu`), driven by `nomarchy-toggle-hybrid-gpu`.
+- The NVIDIA driver stack (`services.xserver.videoDrivers = ["nvidia"]`, `hardware.graphics.{enable,enable32Bit}`, `hardware.nvidia.{modesetting,powerManagement}.enable`, `boot.kernelParams = ["nvidia-drm.modeset=1"]`).
+All driver knobs use `lib.mkDefault`, so a downstream `system.nix` can pin a beta driver or flip to the open kernel module without forking the module.
+**You still have to add bus IDs** — they're per-machine and can't be derived from any flag. Find them with `lspci -D | grep -E 'VGA|3D'`, then in your `/etc/nixos/system.nix`:
+```nix
+hardware.nvidia.prime = {
+  offload.enable = true;
+  offload.enableOffloadCmd = true;
+  intelBusId = "PCI:0:2:0"; # or `amdgpuBusId` for AMD iGPU
+  nvidiaBusId = "PCI:1:0:0";
+};
+```
+Without prime config, supergfxd still switches modes but render-offload via `nvidia-offload <cmd>` is unavailable.
 ### `nomarchy.system.snapper.enable`

@@ -89,7 +89,7 @@ Each PR description should reference the row(s) in `docs/SCRIPTS.md` it closes,
 - Gaming preset (Next).
 - Vendor matchers in `installer/hardware-db.sh`: Steam Deck, Surface, ROG Ally, Snapdragon X laptops.
 - Surface support behind `nomarchy.hardware.isSurface` (Later).
-- Auto-detect dGPU presence and offer `programs.envycontrol`-style switching for the hybrid case (already gated behind `nomarchy.system.features.hybridGPU`, but the wiring is minimal).
+- Auto-detect dGPU presence in `installer/hardware-db.sh` and pre-fill `hardware.nvidia.prime.{intel,nvidia}BusId` in the generated `system.nix` (driver stack itself is Shipped — see entry below).
 ## 6. Pillar: Onboarding & docs
@@ -121,6 +121,7 @@ Each PR description should reference the row(s) in `docs/SCRIPTS.md` it closes,
 (Move items here when they land — keep them brief, link the commit/PR.)
+- _2026-05-18_ — Complete the hybrid-GPU wiring + fix unoverridable state-derived options. Two related fixes shipped together. **(1)** `nomarchy.system.features.hybridGPU = true` now wires the full NVIDIA driver stack (`services.xserver.videoDrivers = ["nvidia"]`, `hardware.graphics.{enable,enable32Bit}`, `hardware.nvidia.{modesetting,powerManagement}.enable`, `package = nvidiaPackages.stable`, `boot.kernelParams += "nvidia-drm.modeset=1"`) — was previously enabling only `supergfxd` mode-switching while leaving the system with no NVIDIA driver loaded, so mode switches silently no-op'd. All knobs use `lib.mkDefault` so a downstream `system.nix` can pin a beta driver, flip to the open kernel module, etc. Bus-ID prime config (per-machine) stays user-supplied — `docs/OPTIONS.md` has the full recipe. **(2)** Both `core/system/state.nix` and `core/home/state.nix` now use `lib.mkDefault` on every state.json-derived assignment, fixing a class of "I set X in my system.nix but it doesn't take effect" bugs (the state-derived value was at default priority and conflicted with the user's same-priority override). Side-effect cleanup: `core/system/state.nix` now also reads from `lib/state-schema.nix` like `core/home/state.nix` does, completing the schema-centralization started two batches ago. Verified `nix flake check` + an override test that flips hybridGPU via an overlay and confirms the entire driver stack engages.
 - _2026-05-18_ — Pillar 4: pre-flight resume polish. Fixed four resume-flow gaps in `installer/install.sh`: (1) `--resume` with a missing state file now errors loudly with a tmpfs explanation instead of silently falling through to a fresh prompt cycle (the most common operator confusion was "rebooted, forgot tmpfs eats /tmp/, watched the installer start over without realising"); (2) on resume, the saved target drive is validated as a block device before any disk-phase step runs — catches the live-ISO USB-unplugged / non-deterministic /dev/sdX class of mid-install failures; (3) `save_state` now stamps an ISO-8601 timestamp and `load_state` shows a `(saved Xm ago)` banner plus a `Target: /dev/X → user @ host` summary line, so the user can `Ctrl-C` if they're resuming onto the wrong host before any destructive prompt fires; (4) `--help` now documents the tmpfs limitation. `shellcheck --severity=error` passes.
 - _2026-05-18_ — Declarative-state defaults centralization. Made `lib/state-schema.nix` the single source of truth for every state-default that previously lived in three places (the schema itself, `core/system/options.nix` / `core/home/options.nix` `default = …` clauses, and `core/home/state.nix` `or …` fallbacks). Replaced ~25 hardcoded literals with `schema.<scope>.<key>` reads. Side-effect: fixed a lingering bug where `core/home/options.nix:theme` still defaulted to `"summer-night"` after the system-side was moved to `"nord"` — half the codebase's home option resolved to the wrong theme when state.json was missing/blank. `nix flake check --no-build` confirms zero semantic change for every other field. Doesn't touch the installer-written `state.json` (separate batch — needs schema → JSON generation).
 - _2026-05-18_ — Pillar 7 first step: Forgejo Actions CI (eval + lint). New `.forgejo/workflows/check.yml` runs on every push to `main` and every PR: (1) `nix flake check --no-build` to catch eval regressions, (2) `bash -n` + `shellcheck --severity=error` over every `nomarchy-*` bash script (whole-tree, not just changed files — gates branches that bypass the pre-commit hook), (3) `docs/SCRIPTS.md` drift check (fails loudly if a script change didn't regenerate the audit doc). All three checks pass locally on the current tree. Activation requires enabling Actions on the Forgejo repo and registering a `forgejo-runner`; the workflow itself is dormant until then. ISO build job is intentionally deferred — needs a binary cache (Cachix/Attic) to be tractable.