strix-halo-optimizations/docs/optimization.md
Felipe Cardoso 5b81437637 docs: add README, CLAUDE.md, AGENTS.md, and full docs/ suite
2026-03-25 20:50:00 +01:00

Optimization Guide

Complete walkthrough for optimizing AMD Strix Halo for LLM workloads.

Prerequisites: Run make audit first to see your current state. Run make benchmark-baseline to capture pre-optimization performance numbers.

Step 1: Tuned Profile (no reboot)

sudo make optimize-tuned

Switches the tuned profile from throughput-performance to accelerator-performance, which disables higher-latency CPU STOP states (deep idle states). Expect roughly a 5-8% improvement in prompt processing throughput (pp512).

Takes effect immediately. Previous profile is saved for rollback.
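
To confirm the switch by hand, the active profile can be read back with `tuned-adm active`. The sketch below is not part of the project's scripts; `parse_active_profile` and `profile_is_active` are hypothetical helper names, and it assumes the usual `Current active profile: <name>` output line.

```shell
#!/usr/bin/env bash
# Sketch only: these helpers are illustrative and not shipped with the project.

# Extract the profile name from `tuned-adm active` output read on stdin.
# Assumes a line of the form "Current active profile: <name>".
parse_active_profile() {
  sed -n 's/^Current active profile: *//p'
}

# Succeed only if the currently active tuned profile equals $1.
profile_is_active() {
  local expected="$1" active
  active="$(tuned-adm active 2>/dev/null | parse_active_profile)"
  [ "$active" = "$expected" ]
}

# Usage (requires tuned to be installed and running):
#   profile_is_active accelerator-performance && echo "tuned profile OK"
```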

Step 2: Kernel Boot Parameters (reboot required)

sudo make optimize-kernel

Adds three parameters to GRUB:

| Parameter | Value (64 GB) | Purpose |
|---|---|---|
| iommu=pt | | IOMMU passthrough, reduces memory access latency |
| amdgpu.gttsize | 60416 | Max GPU-addressable system RAM in MiB |
| ttm.pages_limit | 15466496 | Max pinnable 4K pages for GPU memory |

Values are computed dynamically based on your system's total physical RAM. The script backs up /etc/default/grub before modifying it.

See docs/architecture.md for the math behind these values.
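
The two numeric values in the table are related by page size: one MiB holds 256 pages of 4 KiB, and 60416 × 256 = 15466496. The sketch below reproduces only that conversion. How the GTT size itself is derived from total RAM is covered in docs/architecture.md and is not reproduced here; the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: illustrates the MiB-to-pages relationship seen in the table.

# Number of 4 KiB pages in one MiB: 1024 KiB / 4 KiB = 256.
PAGES_PER_MIB=256

# Convert a GTT size in MiB to a count of 4 KiB pages.
gtt_mib_to_pages() {
  echo $(( $1 * PAGES_PER_MIB ))
}

# Build the three-parameter string for a given GTT size in MiB.
kernel_params_for_gtt() {
  local gtt_mib="$1"
  echo "iommu=pt amdgpu.gttsize=${gtt_mib} ttm.pages_limit=$(gtt_mib_to_pages "$gtt_mib")"
}

kernel_params_for_gtt 60416
# prints: iommu=pt amdgpu.gttsize=60416 ttm.pages_limit=15466496
```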

Step 3: BIOS VRAM Reduction (reboot + BIOS access)

make optimize-vram

This command only prints guidance; it cannot modify BIOS settings directly. The goal is to reduce dedicated VRAM from 32 GB to 0.5 GB, freeing 31.5 GB back to the OS for dynamic GPU access via GTT.

See docs/bios-vram-guide.md for the full BIOS walkthrough.
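
Before and after the BIOS change, the dedicated VRAM and GTT sizes reported by the amdgpu driver can be read from sysfs. This is a minimal sketch rather than the project's own check: it assumes the GPU is exposed as `card0` (the index can differ per system) and uses the driver's `mem_info_vram_total` and `mem_info_gtt_total` files, which report bytes. The function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: reads amdgpu memory totals from sysfs.

# Convert a byte count to whole MiB.
bytes_to_mib() {
  echo $(( $1 / 1024 / 1024 ))
}

# Print VRAM and GTT totals in MiB for one DRM device directory.
# Assumption: the GPU is card0; pass another directory as $1 if not.
print_gpu_memory() {
  local dev="${1:-/sys/class/drm/card0/device}"
  local name file
  for name in vram gtt; do
    file="$dev/mem_info_${name}_total"
    if [ -r "$file" ]; then
      echo "$name: $(bytes_to_mib "$(cat "$file")") MiB"
    else
      echo "$name: not available at $file"
    fi
  done
}

# Usage:
#   print_gpu_memory
```

With a 0.5 GB UMA frame buffer, the `vram` line would be expected to show about 512 MiB.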

Combine Steps 2 and 3 into a single reboot: apply kernel params, then reboot into BIOS (F10) to change VRAM, then boot normally.

Step 4: Verify

make verify

Checks 9 criteria and reports a score. Target: 9/9.
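
The nine criteria are defined by the project's verify script and are not listed here. As an illustration of what one such check might look like, the sketch below tests whether the three kernel parameters from Step 2 are present on the running kernel's command line. `cmdline_has_params` is a hypothetical helper, not the script's actual logic.

```shell
#!/usr/bin/env bash
# Sketch only: one possible manual check, not the logic of `make verify`.

# Succeed if the kernel command line in $1 contains every parameter name
# given in the remaining arguments (each followed by "=value").
cmdline_has_params() {
  local cmdline="$1"; shift
  local param
  for param in "$@"; do
    case " $cmdline " in
      *" $param="*) ;;                       # present with some value
      *) echo "missing: $param"; return 1 ;;
    esac
  done
}

# Usage on a live system:
#   cmdline_has_params "$(cat /proc/cmdline)" iommu amdgpu.gttsize ttm.pages_limit
```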

Step 5: Measure Impact

make benchmark
make benchmark-compare BEFORE=data/baselines/TIMESTAMP AFTER=data/benchmarks/TAG-TIMESTAMP

See docs/benchmarking.md for methodology and result interpretation.

Expected Impact

| Optimization | pp512 Improvement | tg128 Improvement |
|---|---|---|
| Tuned profile | +5-8% | +2-3% |
| Kernel params + BIOS VRAM | +10-20% | +5-15% |
| Combined | +15-25% | +8-18% |

Numbers vary by model size and backend. Larger models see bigger gains from GTT expansion.
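
To compare your own results against these ranges, the relative change between a baseline and a post-optimization throughput figure can be computed by hand. The sketch below is independent of `make benchmark-compare`; `pct_change` is a hypothetical helper and the numbers in the usage line are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: percentage change between two tokens-per-second figures.

# Print the signed percentage change from $1 (before) to $2 (after).
pct_change() {
  awk -v before="$1" -v after="$2" 'BEGIN {
    if (before == 0) { print "n/a"; exit }   # avoid division by zero
    printf "%+.1f%%\n", (after - before) / before * 100
  }'
}

pct_change 100 115   # prints +15.0%
```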

Rollback

sudo make rollback

Restores GRUB backup and previous tuned profile. BIOS VRAM must be reverted manually (F10 → restore previous UMA Frame Buffer Size).

Troubleshooting

If anything goes wrong, see docs/troubleshooting.md.