strix-halo-optimizations

Author	SHA1	Message	Date
Felipe Cardoso	ea70687cd2	docs: update optimization guide with measured hardware data Replace estimated values with clpeak measurements: DRAM 216-233 GB/s, GPU clocks confirmed 2900 MHz under load (ROCm #5750 is sysfs reporting only). Correct backend recommendation to Vulkan RADV (2.7x faster tg than ROCm at 131K). Update KV cache recommendation to q4_0. Add Nemotron-Cascade-2 to coder shootout results. Remove Nemotron-3-Nano from catalog (replaced by Cascade-2). Update Q4_K_L to Q4_K_XL entry.	2026-03-30 19:56:18 +02:00
Felipe Cardoso	1549bc27c0	feat(optimize): add Phase 2 power profile and system tuning Add `make optimize-power` (ryzenadj 85W, sysctl, THP, RADV nogttspill) with systemd services for boot/resume persistence. Integrate into `make optimize --all` as Phase 2. Update optimization log with RyzenAdj results (+46% tg at 70W sustained), KV sweep data, and quant shootout. Add Qwen3-Coder-30B and Nemotron-Cascade-2 to model catalog.	2026-03-30 18:53:52 +02:00
Felipe Cardoso	f92b710492	fix(benchmark): parse llama-bench output with variable column count KV cache quantization adds type_k/type_v columns to llama-bench output, shifting test and t/s to different indices. Parse from end of row instead of hardcoded positions. Also fix KV suffix separator (underscore to dash) to avoid regex ambiguity with type names like q8_0. Add 5-phase optimization guide, optimization log for tracking results, and research docs on llama.cpp and inference landscape optimizations.	2026-03-27 14:54:19 +01:00
Felipe Cardoso	58124cd657	feat: add Qwen3.5 model catalog and agentic evaluation framework Models: - configs/models.conf: catalog with Qwen3.5-35B-A3B (MoE, top pick), Qwen3.5-27B (dense), Qwen3-Coder-30B-A3B (agentic/coding) - Updated benchmark setup to show catalog with download status - docs/model-recommendations.md: memory planning, quantization guide Agentic evaluation: - scripts/agentic/setup.sh: installs inspect-ai, evalplus, bigcodebench in a Python venv - scripts/agentic/run-eval.sh: runs evaluations against local LLM server (ollama or llama.cpp). Suites: quick (HumanEval+IFEval), code (EvalPlus+BigCodeBench), tooluse (BFCL), full (all) - bin/agentic: dispatcher with help - docs/agentic-benchmarks.md: methodology, framework comparison, model recommendations for agentic use Updated: Makefile (6 new targets), README, CLAUDE.md, docs/references.md Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 00:20:23 +01:00
Felipe Cardoso	da2c4c6b8a	fix(docs): address review findings — accuracy, consistency, completeness - architecture.md: fix kernel param math to match actual computed values, use cardN placeholder in sysfs paths, clarify system_ram_kb is OS-visible - benchmarking.md: normalize flags to -ngl 99 / -mmp 0 (matching code), add llama-rocm7-nightlies backend - CLAUDE.md: clarify HSA_OVERRIDE_GFX_VERSION is set in containers not scripts, fix lib sourcing description, specify which scripts need root - detect.sh: document detect_cpu_cores returns threads not cores - troubleshooting.md: add link to references.md - README.md: remove unsupported Fedora 42 claim, describe configs/ content Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-25 21:44:16 +01:00
Felipe Cardoso	5b81437637	docs: add README, CLAUDE.md, AGENTS.md, and full docs/ suite - README.md: project overview, quick start, command reference, workflow - CLAUDE.md: AI safety rules, technical details, conventions - AGENTS.md: agent workflows, file responsibility map, dependency matrix - docs/architecture.md: script layers, data flow, unified memory, JSON schemas - docs/optimization.md: step-by-step optimization walkthrough - docs/benchmarking.md: methodology, test params, result interpretation - docs/troubleshooting.md: common issues and fixes - docs/references.md: centralized external links (single source of truth) - docs/bios-vram-guide.md: add back-link to optimization workflow Cross-linked non-redundantly: each doc owns one layer, others link to it. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-25 20:50:00 +01:00
Felipe Cardoso	c596e38e9e	Initial commit	2026-03-25 20:13:15 +01:00

7 Commits
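
Note on commit f92b710492: the message says that KV cache quantization adds type_k/type_v columns to llama-bench output, so the test and t/s fields are now read from the end of each row instead of from fixed positions. The Python sketch below illustrates that idea only; it is not the repository's parser. The function name `parse_bench_row`, the sample rows, and the assumed "mean ± stddev" layout of the t/s cell are hypothetical and chosen for illustration.

```python
def parse_bench_row(line):
    """Return (test, tokens_per_second) from one pipe-delimited row, or None.

    Assumption: the last cell holds t/s and the second-to-last holds the
    test name, regardless of how many columns precede them.
    """
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    if len(cells) < 2:
        return None
    test, rate = cells[-2], cells[-1]
    # Assumed cell layout: "52.10 ± 0.30"; keep only the mean.
    try:
        tps = float(rate.split()[0])
    except (ValueError, IndexError):
        return None  # header row, separator row, or empty rate cell
    return test, tps


# Hypothetical rows: one without and one with the extra type_k/type_v columns.
plain = "| model-a | 1 GiB | 3 B | Vulkan | 99 | tg128 | 52.10 ± 0.30 |"
with_kv = "| model-a | 1 GiB | 3 B | Vulkan | 99 | q4_0 | q4_0 | tg128 | 52.10 ± 0.30 |"

for row in (plain, with_kv):
    print(parse_bench_row(row))
```

Both sample rows give the same test name and rate even though the second has two extra columns, which is the property the commit message describes. Header and separator rows return None because their last cell is not numeric.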