Optimization Guide
Complete walkthrough for optimizing AMD Strix Halo for LLM workloads.
Prerequisites: Run make audit first to see your current state. Run make benchmark-baseline to capture pre-optimization performance numbers.
Step 1: Tuned Profile (no reboot)
sudo make optimize-tuned
Switches from throughput-performance to accelerator-performance, which disables higher-latency CPU idle states (called STOP states in the profile's own description; on x86 these are the deeper C-states). Provides a 5-8% improvement in prompt processing throughput.
Takes effect immediately. Previous profile is saved for rollback.
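To confirm the switch by hand, the output of `tuned-adm active` can be inspected. The sketch below is not part of the project's scripts; `active_profile` and `check_profile` are hypothetical helper names, and it assumes the usual `Current active profile: NAME` output format.

```shell
# Extract the profile name from `tuned-adm active` output,
# e.g. "Current active profile: accelerator-performance".
active_profile() {
  printf '%s\n' "$1" | sed -n 's/^Current active profile: //p'
}

# Hypothetical helper: succeed only if the expected profile is active.
check_profile() {
  expected="accelerator-performance"
  current="$(active_profile "$1")"
  if [ "$current" = "$expected" ]; then
    echo "ok: $current"
  else
    echo "mismatch: got '${current:-none}', expected '$expected'"
    return 1
  fi
}

# On a live system (requires tuned to be installed):
#   check_profile "$(tuned-adm active)"
```

The parsing is kept separate from the `tuned-adm` call so the check can be tried on a captured string without touching the system.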
Step 2: Kernel Boot Parameters (reboot required)
sudo make optimize-kernel
Adds three parameters to GRUB:
| Parameter | Value (64 GB) | Purpose |
|---|---|---|
| iommu=pt | n/a | IOMMU passthrough, reduces memory access latency |
| amdgpu.gttsize | 60416 | Max GPU-addressable system RAM in MiB |
| ttm.pages_limit | 15466496 | Max pinnable 4K pages for GPU memory |
Values are computed dynamically based on your system's total physical RAM. The script backs up /etc/default/grub before modifying it.
See docs/architecture.md for the math behind these values.
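The two numeric values in the 64 GB example are consistent with each other: 60416 MiB expressed in 4 KiB pages is 60416 × 256 = 15466496. The sketch below shows only that relationship. How the script derives the GTT size from total RAM is not covered here, and `gtt_pages` and `kernel_params` are hypothetical names, not functions from the project.

```shell
# Convert a GTT size in MiB to a count of 4 KiB pages (1 MiB = 256 pages).
gtt_pages() {
  echo $(( $1 * 1024 / 4 ))
}

# Build the kernel parameter string for a given GTT size in MiB.
kernel_params() {
  echo "iommu=pt amdgpu.gttsize=$1 ttm.pages_limit=$(gtt_pages "$1")"
}

kernel_params 60416
# prints: iommu=pt amdgpu.gttsize=60416 ttm.pages_limit=15466496
```

After the reboot, the same three parameters should appear in /proc/cmdline.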
Step 3: BIOS VRAM Reduction (reboot + BIOS access)
make optimize-vram
This target only prints guidance; it cannot modify BIOS settings directly. The goal is to reduce dedicated VRAM from 32 GB to 0.5 GB, returning 31.5 GB to the OS, where the GPU can still reach it dynamically via GTT.
See docs/bios-vram-guide.md for the full BIOS walkthrough.
Combine Steps 2 and 3 into a single reboot: apply kernel params, then reboot into BIOS (F10) to change VRAM, then boot normally.
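Once the system is back up, the amdgpu driver's sysfs counters offer a quick manual spot check of the new split. This is a minimal sketch, separate from `make verify`. It assumes the GPU is exposed as `card0` (the index can differ per system), and `bytes_to_mib` and `show_gpu_memory` are hypothetical names.

```shell
# Convert a byte count (as reported by sysfs) to whole MiB.
bytes_to_mib() {
  echo $(( $1 / 1024 / 1024 ))
}

# Print VRAM and GTT totals for one card; card0 is an assumption.
show_gpu_memory() {
  dev="/sys/class/drm/${1:-card0}/device"
  for kind in vram gtt; do
    file="$dev/mem_info_${kind}_total"
    if [ -r "$file" ]; then
      echo "$kind: $(bytes_to_mib "$(cat "$file")") MiB"
    else
      echo "$kind: not available at $file"
    fi
  done
}

show_gpu_memory card0
```

With the 64 GB example values, the VRAM line would be expected to read about 512 MiB (0.5 GB) and the GTT line about 60416 MiB.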
Step 4: Verify
make verify
Checks 9 criteria and reports a score. Target: 9/9.
Step 5: Measure Impact
make benchmark
make benchmark-compare BEFORE=data/baselines/TIMESTAMP AFTER=data/benchmarks/TAG-TIMESTAMP
See docs/benchmarking.md for methodology and result interpretation.
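When comparing two runs by hand, an improvement figure is the relative change in tokens per second against the baseline. The sketch below illustrates that arithmetic only; it is not the implementation of `make benchmark-compare`. `pct_change` is a hypothetical helper and the sample numbers are made up.

```shell
# Percentage change from a baseline to a new tokens-per-second value.
pct_change() {
  awk -v before="$1" -v after="$2" 'BEGIN {
    if (before == 0) { print "n/a"; exit }   # avoid division by zero
    printf "%+.1f%%\n", (after - before) / before * 100
  }'
}

pct_change 100 115   # hypothetical pp512 tokens/s, before and after
# prints: +15.0%
```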
Expected Impact
| Optimization | pp512 Improvement | tg128 Improvement |
|---|---|---|
| Tuned profile | +5-8% | +2-3% |
| Kernel params + BIOS VRAM | +10-20% | +5-15% |
| Combined | +15-25% | +8-18% |
Numbers vary by model size and backend. Larger models see bigger gains from GTT expansion.
Rollback
sudo make rollback
Restores the GRUB backup and the previous tuned profile. A reboot is needed for the restored kernel parameters to take effect. BIOS VRAM must be reverted manually (F10 → restore previous UMA Frame Buffer Size).
Troubleshooting
If anything goes wrong, see docs/troubleshooting.md.