docs: add README, CLAUDE.md, AGENTS.md, and full docs/ suite
- README.md: project overview, quick start, command reference, workflow
- CLAUDE.md: AI safety rules, technical details, conventions
- AGENTS.md: agent workflows, file responsibility map, dependency matrix
- docs/architecture.md: script layers, data flow, unified memory, JSON schemas
- docs/optimization.md: step-by-step optimization walkthrough
- docs/benchmarking.md: methodology, test params, result interpretation
- docs/troubleshooting.md: common issues and fixes
- docs/references.md: centralized external links (single source of truth)
- docs/bios-vram-guide.md: add back-link to optimization workflow

Cross-linked non-redundantly: each doc owns one layer, others link to it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
docs/architecture.md (new file, 118 lines)
@@ -0,0 +1,118 @@
# Architecture

## Script Layers

```
bin/         User entry points (thin dispatchers)
  audit      → scripts/audit/
  monitor    → scripts/monitor/
  benchmark  → scripts/benchmark/
  optimize   → scripts/optimize/

scripts/     Implementation (sourcing lib/ for shared logic)
  audit/     System assessment
  monitor/   GPU/system monitoring + metrics logging
  benchmark/ llama-bench via toolbox containers
  optimize/  GRUB, tuned, BIOS guidance, verify, rollback

lib/         Shared bash libraries (sourced, not executed)
  common.sh  Logging, root checks, confirm prompts, paths
  detect.sh  Hardware/config detection from sysfs + system tools
  format.sh  Color output, human_bytes, status indicators, tables

configs/     Reference configuration templates
data/        Runtime output (gitignored)
docs/        Documentation
```

Every script sources the libs in order: `common.sh` → `detect.sh` → `format.sh`. `format.sh` depends on color variables defined in `common.sh`.

## Data Flow

```
/sys/class/drm/card1/device/  ──→  lib/detect.sh  ──→  scripts/audit/
/proc/cpuinfo, /proc/meminfo  ──→   (detect_*)    ──→  scripts/monitor/
/proc/cmdline                 ──→                 ──→  scripts/optimize/
tuned-adm, rpm, lspci         ──→                 ──→  scripts/benchmark/
                                                            │
                                                            ▼
                                                          data/
                                                          ├── audits/*.json
                                                          ├── logs/*.csv
                                                          ├── baselines/*/summary.json
                                                          └── benchmarks/*/summary.json
```

## Unified Memory Model

AMD Strix Halo shares physical RAM between CPU and GPU. Two allocation mechanisms:

| Type | Description | Configuration |
|------|-------------|---------------|
| **VRAM (dedicated)** | Permanently reserved for GPU framebuffer | BIOS setting (UMA Frame Buffer Size) |
| **GTT (dynamic)** | System RAM mapped into GPU address space on demand | Kernel boot params: `amdgpu.gttsize`, `ttm.pages_limit` |

**Optimal for LLM workloads**: minimize VRAM (0.5 GiB), maximize GTT (~60 GiB on a 64 GB system). The GPU borrows memory when needed and releases it when idle.

### Kernel Parameter Math (64 GB system)

```
Total physical RAM:  64 GiB
OS reserve:           4 GiB
Available for GTT:   60 GiB = 61440 MiB

amdgpu.gttsize  = 60 * 1024       = 61440    (MiB)
ttm.pages_limit = 60 * 1024 * 256 = 15728640 (4K pages)
iommu           = pt              (passthrough, lower latency)
```

The toolkit computes these dynamically via `recommended_gttsize_mib()` and `recommended_pages_limit()` in `lib/detect.sh`, based on detected total physical RAM (visible + VRAM).
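The same arithmetic can be sketched in shell (constants mirror the block above; variable names here are illustrative, the real source of truth is `lib/detect.sh`):

```bash
# GTT sizing math for a 64 GiB system with a 4 GiB OS reserve.
total_ram_gib=64
os_reserve_gib=4

gtt_mib=$(( (total_ram_gib - os_reserve_gib) * 1024 ))  # -> amdgpu.gttsize
pages_limit=$(( gtt_mib * 256 ))                        # 1 MiB = 256 x 4 KiB pages

echo "amdgpu.gttsize=${gtt_mib}"       # 61440
echo "ttm.pages_limit=${pages_limit}"  # 15728640
```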

### Sysfs Paths

| Path | Content |
|------|---------|
| `/sys/class/drm/card1/device/mem_info_vram_total` | Dedicated VRAM in bytes |
| `/sys/class/drm/card1/device/mem_info_vram_used` | VRAM currently in use |
| `/sys/class/drm/card1/device/mem_info_gtt_total` | GTT allocation in bytes |
| `/sys/class/drm/card1/device/mem_info_gtt_used` | GTT currently in use |
| `/sys/class/drm/card1/device/gpu_busy_percent` | GPU utilization 0-100 |
| `/sys/class/drm/card1/device/hwmon/hwmon*/temp1_input` | Temperature in millidegrees C |
| `/sys/class/drm/card1/device/hwmon/hwmon*/power1_average` | Power in microwatts |

Card number is auto-detected by `find_gpu_card()` (matches AMD vendor ID `0x1002`).
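As an illustration, these counters can be read and scaled directly (a sketch: `card1` and `hwmon0` are assumed here, and `read_counter` is a hypothetical helper; the toolkit resolves the card via `find_gpu_card()`):

```bash
# Read a sysfs counter, defaulting to 0 when the node is absent.
read_counter() {
  local path=$1 val
  read -r val 2>/dev/null < "$path" || val=0
  echo "$val"
}

card=/sys/class/drm/card1/device
vram_bytes=$(read_counter "$card/mem_info_vram_used")
temp_mc=$(read_counter "$card/hwmon/hwmon0/temp1_input")  # hwmon0 assumed

echo "VRAM used: $(( vram_bytes / 1024 / 1024 )) MiB"     # bytes -> MiB
echo "GPU temp:  $(( temp_mc / 1000 )) C"                 # millidegrees -> degrees
```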

## JSON Output Schemas

### system-state.json (from `audit --json`)

```json
{
  "timestamp": "20260325-120000",
  "hardware": { "cpu_model": "...", "cpu_cores": 16, "cpu_threads": 32, "gpu_name": "...", "gpu_device_id": "1586", "system_ram_kb": 32609248 },
  "memory": { "vram_total_bytes": 0, "vram_used_bytes": 0, "gtt_total_bytes": 0, "gtt_used_bytes": 0, "recommended_gttsize_mib": 0, "recommended_pages_limit": 0 },
  "kernel": { "version": "...", "cmdline": "...", "param_iommu": "", "param_gttsize": "", "param_pages_limit": "" },
  "firmware": "...", "tuned_profile": "...", "rocm_version": "...",
  "vulkan": { "driver": "...", "version": "..." },
  "sensors": { "gpu_temp_mc": 0, "gpu_power_uw": 0, "gpu_busy_pct": 0 },
  "toolboxes": [], "stacks": { "ollama": "...", "lmstudio": "...", "llamacpp": "...", "opencode": "..." }
}
```

### summary.json (from benchmark runs)

```json
{
  "results": [
    { "file": "model__backend__fa1.log", "model": "...", "size": "...", "backend": "Vulkan", "test": "pp512", "tokens_per_sec": 548.18, "raw": "548.18 +/- 1.59" }
  ]
}
```

### metrics.csv (from `monitor --log`)

```
timestamp,gpu_busy_pct,vram_used_mib,gtt_used_mib,gpu_temp_c,gpu_power_w,cpu_pct,ram_used_mib
```

Sampled every 2 seconds by default. Pure bash implementation (no subshell forks per sample).
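The fork-free sampling that last sentence describes can be sketched with bash builtins alone (a minimal sketch: card path and column subset assumed for illustration):

```bash
# One sample in pure bash: printf's %(...)T builtin formats the timestamp and
# read pulls the sysfs value, so no external command is forked per sample.
card=/sys/class/drm/card1/device            # assumed; the toolkit auto-detects
printf -v ts '%(%Y-%m-%dT%H:%M:%S)T' -1     # timestamp without forking `date`
read -r busy 2>/dev/null < "$card/gpu_busy_percent" || busy=0
printf '%s,%s\n' "$ts" "$busy"              # one CSV row (subset of columns)
```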
docs/benchmarking.md (new file, 94 lines)
@@ -0,0 +1,94 @@
# Benchmarking

## What We Measure

All benchmarks use [llama-bench](https://github.com/ggml-org/llama.cpp) (part of llama.cpp) running inside toolbox containers. Two test types:

| Metric | Meaning | Test Params |
|--------|---------|-------------|
| **pp** (prompt processing) | How fast the model ingests input tokens | Default: 512 tokens |
| **tg** (token generation) | How fast the model produces output tokens | Default: 128 tokens |

Results are in **tokens/second (t/s)**. Higher is better.

## Test Parameters

### Standard Test

```
-ngl 99 -mmp 0 -fa 1 -r 5
```

- `-ngl 99` — all layers on GPU
- `-mmp 0` — disable memory mapping (`--no-mmap`)
- `-fa 1` — flash attention enabled
- `-r 5` — 5 repetitions for statistical confidence
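Assembled into a full command, a standard run might look like the following (a sketch: the container name is from the backends table below, and the model path is an assumption):

```bash
toolbox run llama-vulkan-radv -- llama-bench \
  -m data/models/Qwen3-4B-Q4_K_M.gguf \
  -ngl 99 -mmp 0 -fa 1 -r 5
```

In practice the `make benchmark` targets wrap this for you.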

### Long-Context Test

```
-ngl 99 -mmp 0 -fa 1 -p 2048 -n 32 -d 32768 -ub SIZE -r 3
```

- `-p 2048` — 2048 prompt tokens
- `-n 32` — generate 32 tokens
- `-d 32768` — 32K context window
- `-ub SIZE` — micro-batch size (512 for Vulkan, 2048 for ROCm)
- `-r 3` — 3 repetitions (long-context tests are slow)

The `-ngl 99 -mmp 0 -fa 1` flags (all layers on GPU, no mmap, flash attention) are **mandatory** on Strix Halo to avoid crashes.

## Available Backends

| Backend | Container | Technology | Notes |
|---------|-----------|------------|-------|
| `llama-vulkan-radv` | Mesa RADV | Vulkan | Most stable, recommended default |
| `llama-vulkan-amdvlk` | AMDVLK | Vulkan | Fastest when it works, 2 GB buffer limit |
| `llama-rocm-6.4.4` | ROCm 6.4.4 | HIP | Proven stable |
| `llama-rocm-7.2` | ROCm 7.2 | HIP | Latest, compiler fixes applied |

Containers are from [kyuz0/amd-strix-halo-toolboxes](https://github.com/kyuz0/amd-strix-halo-toolboxes). Set up with `make benchmark-setup`.

## Workflow

```bash
# 1. Setup (one-time)
make benchmark-setup

# 2. Capture baseline (before optimization)
make benchmark-baseline

# 3. After optimizing, run again
make benchmark   # or: bin/benchmark run --tag post-opt

# 4. Compare
make benchmark-compare BEFORE=data/baselines/20260325-120000 AFTER=data/benchmarks/post-opt-20260326-100000
```

## Result Format

Each run produces a directory under `data/baselines/` or `data/benchmarks/`:

```
TIMESTAMP/
  system-state.json   # Full system audit snapshot
  summary.json        # Parsed results (model, backend, test, t/s)
  metrics.csv         # GPU/CPU metrics during the run
  *.log               # Raw llama-bench output per backend+model+test
```

### Comparison Output

```
Backend     | Model    | Test  | Before  | After   | Delta
vulkan-radv | qwen3-4b | pp512 | 548 t/s | 612 t/s | +11.7%
vulkan-radv | qwen3-4b | tg128 | 13.9    | 15.2    | +9.4%
```

Configuration changes between runs (VRAM, GTT, kernel params, tuned profile) are shown if `system-state.json` differs.
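The Delta column is plain percent change over the baseline; for instance (awk used here just for the float math):

```bash
# Percent change between baseline and post-optimization pp512 throughput.
before=548 after=612
awk -v b="$before" -v a="$after" 'BEGIN { printf "%+.1f%%\n", (a - b) / b * 100 }'
# prints +11.7%
```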

## Recommended Test Models

| Size | Model | File | Disk | Use Case |
|------|-------|------|------|----------|
| Small | Qwen3-4B | Q4_K_M.gguf | ~3 GB | Quick smoke tests |
| Medium | Qwen3-14B | Q4_K_M.gguf | ~9 GB | Standard benchmarks |
| Large | Qwen3-32B | Q4_K_M.gguf | ~20 GB | Memory pressure tests |

Place models in `data/models/`. The VRAM estimator from the [toolboxes project](https://github.com/kyuz0/amd-strix-halo-toolboxes) (`gguf-vram-estimator.py`) can help plan which models fit.
docs/bios-vram-guide.md (modified)
@@ -1,5 +1,7 @@
# BIOS VRAM Configuration — HP ZBook Ultra G1a

> Part of the [optimization workflow](optimization.md). For the full context on unified memory, see [architecture.md](architecture.md).

## Why Change VRAM?

AMD Strix Halo uses **unified memory** — the CPU and GPU share the same physical RAM. By default, the HP ZBook allocates **32 GB as dedicated VRAM**, permanently locking that memory away from the OS even when the GPU isn't using it.
docs/optimization.md (new file, 84 lines)
@@ -0,0 +1,84 @@
# Optimization Guide

Complete walkthrough for optimizing AMD Strix Halo for LLM workloads.

**Prerequisites**: Run `make audit` first to see your current state. Run `make benchmark-baseline` to capture pre-optimization performance numbers.

## Step 1: Tuned Profile (no reboot)

```bash
sudo make optimize-tuned
```

Switches from `throughput-performance` to `accelerator-performance`, which keeps the CPU out of higher-latency idle (C-)states. Provides a 5-8% improvement in prompt processing throughput.

Takes effect immediately. The previous profile is saved for rollback.

## Step 2: Kernel Boot Parameters (reboot required)

```bash
sudo make optimize-kernel
```

Adds three parameters to GRUB:

| Parameter | Value (64 GB) | Purpose |
|-----------|---------------|---------|
| `iommu=pt` | — | IOMMU passthrough, reduces memory access latency |
| `amdgpu.gttsize` | `60416` | Max GPU-addressable system RAM in MiB |
| `ttm.pages_limit` | `15466496` | Max pinnable 4K pages for GPU memory |

Values are computed dynamically based on your system's total physical RAM. The script backs up `/etc/default/grub` before modifying it.

See [docs/architecture.md](architecture.md) for the math behind these values.
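For example, with the 64 GB values from the table above, the resulting `/etc/default/grub` line would look roughly like this (a sketch; the script edits this for you, and the leading `...` stands for your existing arguments):

```shell
GRUB_CMDLINE_LINUX="... iommu=pt amdgpu.gttsize=60416 ttm.pages_limit=15466496"
```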

## Step 3: BIOS VRAM Reduction (reboot + BIOS access)

```bash
make optimize-vram
```

This prints guidance — it cannot modify the BIOS directly. The goal is to reduce dedicated VRAM from 32 GB to 0.5 GB, freeing 31.5 GB back to the OS for dynamic GPU access via GTT.

See [docs/bios-vram-guide.md](bios-vram-guide.md) for the full BIOS walkthrough.

**Combine Steps 2 and 3 into a single reboot**: apply the kernel params, then reboot into BIOS (F10) to change VRAM, then boot normally.

## Step 4: Verify

```bash
make verify
```

Checks 9 criteria and reports a score. Target: 9/9.
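Each criterion is a simple check against live system state; one such check might look like this (a sketch, not the script's exact logic or output):

```bash
# Verify-style check: is iommu=pt present on the booted kernel's cmdline?
if grep -qw 'iommu=pt' /proc/cmdline; then
  echo "PASS iommu=pt"
else
  echo "FAIL iommu=pt missing"
fi
```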

## Step 5: Measure Impact

```bash
make benchmark
make benchmark-compare BEFORE=data/baselines/TIMESTAMP AFTER=data/benchmarks/TAG-TIMESTAMP
```

See [docs/benchmarking.md](benchmarking.md) for methodology and result interpretation.

## Expected Impact

| Optimization | pp512 Improvement | tg128 Improvement |
|--------------|-------------------|-------------------|
| Tuned profile | +5-8% | +2-3% |
| Kernel params + BIOS VRAM | +10-20% | +5-15% |
| **Combined** | **+15-25%** | **+8-18%** |

Numbers vary by model size and backend. Larger models see bigger gains from GTT expansion.

## Rollback

```bash
sudo make rollback
```

Restores the GRUB backup and the previous tuned profile. BIOS VRAM must be reverted manually (F10 → restore the previous UMA Frame Buffer Size).

## Troubleshooting

If anything goes wrong, see [docs/troubleshooting.md](troubleshooting.md).
docs/references.md (new file, 49 lines)
@@ -0,0 +1,49 @@
# External References

Single source of truth for all external links used across this project.

## AMD Official

- [ROCm Strix Halo Optimization Guide](https://rocm.docs.amd.com/en/latest/how-to/system-optimization/strixhalo.html) — BIOS, kernel params, GTT/TTM configuration
- [ROCm System Optimization Index](https://rocm.docs.amd.com/en/latest/how-to/system-optimization/index.html) — General ROCm tuning
- [ROCm Installation Guide (Linux)](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/) — Package installation
- [AMD SMI Documentation](https://rocm.docs.amd.com/projects/amdsmi/en/latest/) — GPU monitoring API
- [ROCm GitHub](https://github.com/ROCm/ROCm) — Source and issue tracker

## Strix Halo Toolboxes (Donato Capitella)

The most comprehensive community resource for Strix Halo LLM optimization.

- [strix-halo-toolboxes.com](https://strix-halo-toolboxes.com/) — Documentation, benchmarks, guides
- [GitHub: kyuz0/amd-strix-halo-toolboxes](https://github.com/kyuz0/amd-strix-halo-toolboxes) — Container images, benchmark scripts, VRAM estimator
- [Benchmark Results Viewer](https://kyuz0.github.io/amd-strix-halo-toolboxes/) — Interactive performance charts

## Community

- [Strix Halo Wiki — AI Capabilities](https://strixhalo.wiki/AI/AI_Capabilities_Overview) — Community benchmarks, model compatibility
- [Level1Techs Forum — HP G1a Guide](https://forum.level1techs.com/t/the-ultimate-arch-secureboot-guide-for-ryzen-ai-max-ft-hp-g1a-128gb-8060s-monster-laptop/230652) — Laptop-specific configuration
- [Framework Community — GPU Performance Tests](https://community.frame.work/t/amd-strix-halo-ryzen-ai-max-395-gpu-llm-performance-tests/72521) — Framework Desktop results
- [LLM Tracker — Strix Halo](https://llm-tracker.info/_TOORG/Strix-Halo) — Centralized performance database

## Other Strix Halo Repos

- [pablo-ross/strix-halo-gmktec-evo-x2](https://github.com/pablo-ross/strix-halo-gmktec-evo-x2) — GMKtec EVO X2 optimization
- [kyuz0/amd-strix-halo-llm-finetuning](https://github.com/kyuz0/amd-strix-halo-llm-finetuning) — Fine-tuning guides (Gemma-3, Qwen-3)

## Monitoring Tools

- [amdgpu_top](https://github.com/Umio-Yasuno/amdgpu_top) — Best AMD GPU monitor (TUI/GUI/JSON)
- [nvtop](https://github.com/Syllo/nvtop) — Cross-vendor GPU monitor
- [btop](https://github.com/aristocratos/btop) — System resource monitor

## LLM Inference

- [llama.cpp](https://github.com/ggml-org/llama.cpp) — LLM inference engine (Vulkan + ROCm)
- [ollama](https://ollama.com/) — LLM runtime with model management
- [vLLM](https://github.com/vllm-project/vllm) — High-throughput serving
- [llama-benchy](https://github.com/eugr/llama-benchy) — Multi-backend LLM benchmarking

## AMD GPU Profiling

- [Radeon GPU Profiler (RGP)](https://gpuopen.com/rgp/) — Hardware-level Vulkan/HIP profiling
- [Radeon GPU Analyzer (RGA)](https://gpuopen.com/rga/) — Offline shader/kernel analysis
docs/troubleshooting.md (new file, 96 lines)
@@ -0,0 +1,96 @@
# Troubleshooting

## Firmware: linux-firmware 20251125 Causes ROCm Crashes

**Symptoms**: Arbitrary crashes, instability, or mysterious failures with ROCm workloads.

**Check**: `rpm -qa | grep linux-firmware`

**Fix**: Downgrade to 20251111 or upgrade to 20260110+. After changing firmware, rebuild the initramfs:

```bash
sudo dracut -f --kver $(uname -r)
```

The toolkit checks this automatically — `make audit` shows firmware status.

## amdgpu_top: Cargo Build Fails (gix-hash error)

**Symptoms**: `error: Please set either the sha1 or sha256 feature flag` during `cargo install amdgpu_top`.

**Cause**: Rust toolchain version incompatibility with the `gix-hash` dependency.

**Fix**: Use the pre-built RPM instead:

```bash
make monitor-install
```

The install script downloads the RPM from GitHub releases, bypassing cargo entirely.

## Toolbox GPU Access Failure

**Symptoms**: `llama-cli --list-devices` shows no GPU inside a toolbox container.

**Check**: Device mappings when creating the toolbox:

- Vulkan backends need: `--device /dev/dri`
- ROCm backends need: `--device /dev/dri --device /dev/kfd`

**Fix**: Recreate the toolbox with the correct device flags. The [refresh-toolboxes.sh](https://github.com/kyuz0/amd-strix-halo-toolboxes) script handles this automatically.

Also ensure your user is in the `video` and `render` groups:

```bash
sudo usermod -aG video,render $USER
```

## GRUB Changes Not Taking Effect

**Symptoms**: After `make optimize-kernel` and a reboot, `make audit` still shows missing params.

**Possible causes**:

1. **BLS (Boot Loader Spec)**: Modern Fedora uses BLS entries. The script uses `grubby` when available, but verify:
   ```bash
   grubby --info=ALL | grep args
   ```

2. **Wrong GRUB config path**: Check which config is actually used:
   ```bash
   cat /proc/cmdline      # what the kernel actually booted with
   cat /etc/default/grub  # what the script modified
   ```

3. **GRUB not regenerated**: Manually regenerate:
   ```bash
   sudo grub2-mkconfig -o /boot/grub2/grub.cfg
   ```

## Memory Unchanged After BIOS Change

**Symptoms**: Changed VRAM in BIOS but `make audit` still shows 32 GiB.

**Check**:

```bash
cat /sys/class/drm/card1/device/mem_info_vram_total
```
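The value is reported in bytes; after a successful change to 0.5 GiB it should read roughly 536870912. To convert by hand:

```bash
# Bytes -> GiB for a sysfs counter (example value: 0.5 GiB).
bytes=536870912
awk -v b="$bytes" 'BEGIN { printf "%.1f GiB\n", b / 1024^3 }'
# prints 0.5 GiB
```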

**Possible causes**:

- BIOS change not saved (verify by re-entering BIOS)
- Wrong BIOS setting modified (look for "UMA Frame Buffer Size", not "Shared Memory")
- Kernel params not applied (VRAM reduction requires the kernel params to be useful)

## Benchmark Failures

**Symptoms**: `make benchmark-baseline` reports "FAILED" for some backends.

**Common fixes**:

- Ensure the model exists: `ls data/models/*.gguf`
- Check the model fits in memory: use small models (4B) for initial testing
- Try `llama-vulkan-radv` first (most stable backend)
- Check dmesg for GPU errors: `dmesg | tail -30`

## Rollback

If optimization causes issues:

```bash
sudo make rollback
```

This restores the GRUB backup and the previous tuned profile. BIOS changes must be reverted manually (F10 at boot). See [docs/optimization.md](optimization.md) for the full rollback procedure.