# Optimization Guide

Complete walkthrough for optimizing AMD Strix Halo for LLM workloads.

**Prerequisites**: Run `make audit` first to see your current state. Run `make benchmark-baseline` to capture pre-optimization performance numbers.

## Step 1: Tuned Profile (no reboot)

```bash
sudo make optimize-tuned
```

Switches the tuned profile from `throughput-performance` to `accelerator-performance`, which disables higher-latency CPU idle (STOP) states. Expect a 5-8% improvement in prompt processing throughput.

Takes effect immediately. Previous profile is saved for rollback.
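To confirm the switch, the active profile can be read back with `tuned-adm active`. The sketch below assumes the usual `Current active profile: <name>` output line; `active_profile_name` is a helper made up for illustration and is not part of this repo.

```bash
# Hypothetical helper: extract the profile name from `tuned-adm active` output.
active_profile_name() {
  printf '%s\n' "$1" | sed -n 's/^Current active profile: //p'
}

if command -v tuned-adm >/dev/null 2>&1; then
  current=$(active_profile_name "$(tuned-adm active 2>/dev/null)")
  if [ "$current" = "accelerator-performance" ]; then
    echo "OK: accelerator-performance is active"
  else
    echo "Active profile: ${current:-unknown}"
  fi
else
  echo "tuned-adm not found; is tuned installed?"
fi
```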

## Step 2: Kernel Boot Parameters (reboot required)

```bash
sudo make optimize-kernel
```

Adds three parameters to the kernel command line in `/etc/default/grub`:

| Parameter | Value (64 GB) | Purpose |
|-----------|--------------|---------|
| `iommu=pt` | — | IOMMU passthrough, reduces memory access latency |
| `amdgpu.gttsize` | `60416` | Max GPU-addressable system RAM in MiB |
| `ttm.pages_limit` | `15466496` | Max pinnable 4 KiB pages for GPU memory |

Values are computed dynamically based on your system's total physical RAM. The script backs up `/etc/default/grub` before modifying it.
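The two size values in the table are the same limit in different units: `amdgpu.gttsize` is in MiB and `ttm.pages_limit` counts 4 KiB pages, so there are 256 pages per MiB. A quick sanity check of the 64 GB figures follows. It only shows the unit conversion; how the script derives the 60416 MiB figure from total RAM is not covered here.

```bash
gtt_mib=60416                              # amdgpu.gttsize from the table above
pages_per_mib=$(( 1024 * 1024 / 4096 ))    # 256 pages of 4 KiB per MiB
pages_limit=$(( gtt_mib * pages_per_mib ))
echo "ttm.pages_limit=${pages_limit}"      # prints ttm.pages_limit=15466496
```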

See [docs/architecture.md](architecture.md) for the math behind these values.

## Step 3: BIOS VRAM Reduction (reboot + BIOS access)

```bash
make optimize-vram
```

This only prints guidance; it cannot modify BIOS settings directly. The goal is to reduce dedicated VRAM from 32 GB to 0.5 GB, returning 31.5 GB to the OS, where the GPU can claim it dynamically via GTT.

See [docs/bios-vram-guide.md](bios-vram-guide.md) for the full BIOS walkthrough.

**Combine Steps 2 and 3 into a single reboot**: apply kernel params, then reboot into BIOS (F10) to change VRAM, then boot normally.
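After the reboot, the resulting split can be read from the amdgpu sysfs counters. This sketch assumes the driver exposes `mem_info_vram_total` and `mem_info_gtt_total` under `/sys/class/drm/card*/device/` (the card index varies by system); `bytes_to_mib` is a helper made up for illustration.

```bash
# Hypothetical helper: convert a byte count to whole MiB.
bytes_to_mib() {
  echo $(( $1 / 1024 / 1024 ))
}

found=0
for dev in /sys/class/drm/card*/device; do
  # Skip entries that are not amdgpu devices (or an unmatched glob).
  [ -r "$dev/mem_info_vram_total" ] && [ -r "$dev/mem_info_gtt_total" ] || continue
  found=1
  vram=$(cat "$dev/mem_info_vram_total")
  gtt=$(cat "$dev/mem_info_gtt_total")
  echo "$dev: VRAM $(bytes_to_mib "$vram") MiB, GTT $(bytes_to_mib "$gtt") MiB"
done
[ "$found" -eq 1 ] || echo "No amdgpu memory info found under /sys/class/drm"
```

With the Step 3 setting applied, the VRAM figure should come out near 512 MiB.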

## Step 4: Verify

```bash
make verify
```

Checks 9 criteria and reports a score. Target: 9/9.
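The nine criteria are defined by the repo's verify target and are not listed here. As a manual spot-check of just the Step 2 parameters, the running kernel's command line can be inspected. `has_param` is a hypothetical helper for this example, not something `make verify` is known to use.

```bash
# Hypothetical helper: succeed if the command line in $1 contains a word
# that is exactly $2 or starts with "$2=".
has_param() {
  for word in $1; do
    case "$word" in
      "$2"|"$2"=*) return 0 ;;
    esac
  done
  return 1
}

cmdline=$(cat /proc/cmdline 2>/dev/null)
for p in iommu=pt amdgpu.gttsize ttm.pages_limit; do
  if has_param "$cmdline" "$p"; then
    echo "present: $p"
  else
    echo "missing: $p"
  fi
done
```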

## Step 5: Measure Impact

```bash
make benchmark
make benchmark-compare BEFORE=data/baselines/TIMESTAMP AFTER=data/benchmarks/TAG-TIMESTAMP
```

Replace `TIMESTAMP` and `TAG-TIMESTAMP` with the directory names produced by your own runs. See [docs/benchmarking.md](benchmarking.md) for methodology and result interpretation.
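Improvements are expressed relative to the baseline run. As a minimal illustration of that arithmetic, the sketch below uses an invented helper `pct_change` and placeholder numbers rather than real measurements. How `make benchmark-compare` itself formats its report is not shown here.

```bash
# Hypothetical helper: percentage change from a baseline value to a new value.
pct_change() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%+.1f%%\n", (after - before) / before * 100 }'
}

# Placeholder throughput values, not measurements.
pct_change 400 460   # prints +15.0%
```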

## Expected Impact

| Optimization | pp512 Improvement | tg128 Improvement |
|-------------|-------------------|-------------------|
| Tuned profile | +5-8% | +2-3% |
| Kernel params + BIOS VRAM | +10-20% | +5-15% |
| **Combined** | **+15-25%** | **+8-18%** |

pp512 is prompt processing over a 512-token prompt; tg128 is generation of 128 tokens. Numbers vary by model size and backend. Larger models see bigger gains from GTT expansion.

## Rollback

```bash
sudo make rollback
```

Restores the GRUB backup and the previous tuned profile. Reboot for the restored kernel parameters to take effect. BIOS VRAM must be reverted manually (F10 → restore previous UMA Frame Buffer Size).

## Troubleshooting

If anything goes wrong, see [docs/troubleshooting.md](troubleshooting.md).