strix-halo-optimizations

Author	SHA1	Message	Date
Felipe Cardoso	474d94a07e	chore: update model catalog with gemma 4, opus distill, and hw-bandwidth target	2026-04-03 20:03:53 +02:00
Felipe Cardoso	ea70687cd2	docs: update optimization guide with measured hardware data. Replace estimated values with clpeak measurements: DRAM 216-233 GB/s, GPU clocks confirmed 2900 MHz under load (ROCm #5750 is sysfs reporting only). Correct backend recommendation to Vulkan RADV (2.7x faster tg than ROCm at 131K). Update KV cache recommendation to q4_0. Add Nemotron-Cascade-2 to coder shootout results. Remove Nemotron-3-Nano from catalog (replaced by Cascade-2). Update Q4_K_L to Q4_K_XL entry.	2026-03-30 19:56:18 +02:00
Felipe Cardoso	1549bc27c0	feat(optimize): add Phase 2 power profile and system tuning. Add `make optimize-power` (ryzenadj 85W, sysctl, THP, RADV nogttspill) with systemd services for boot/resume persistence. Integrate into `make optimize --all` as Phase 2. Update optimization log with RyzenAdj results (+46% tg at 70W sustained), KV sweep data, and quant shootout. Add Qwen3-Coder-30B and Nemotron-Cascade-2 to model catalog.	2026-03-30 18:53:52 +02:00
Felipe Cardoso	eb52ea52ce	fix: follow symlinks in model discovery, update model catalog. Add -L flag to find in benchmark scripts (follows symlinks to /data/models/llms/); exclude mmproj-*.gguf (vision projection files, not LLM models); update configs/models.conf: remove Qwen3-Coder (user prefers Qwen3.5-35B-A3B), add Qwen3.5-27B-Q4_K_M and Q8_0 variant, reflect actual downloaded models. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 09:44:16 +01:00
Felipe Cardoso	58124cd657	feat: add Qwen3.5 model catalog and agentic evaluation framework. Models: configs/models.conf: catalog with Qwen3.5-35B-A3B (MoE, top pick), Qwen3.5-27B (dense), Qwen3-Coder-30B-A3B (agentic/coding); updated benchmark setup to show catalog with download status; docs/model-recommendations.md: memory planning, quantization guide. Agentic evaluation: scripts/agentic/setup.sh: installs inspect-ai, evalplus, bigcodebench in a Python venv; scripts/agentic/run-eval.sh: runs evaluations against local LLM server (ollama or llama.cpp), with suites quick (HumanEval+IFEval), code (EvalPlus+BigCodeBench), tooluse (BFCL), full (all); bin/agentic: dispatcher with help; docs/agentic-benchmarks.md: methodology, framework comparison, model recommendations for agentic use. Updated: Makefile (6 new targets), README, CLAUDE.md, docs/references.md. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 00:20:23 +01:00

5 Commits