External References
Single source of truth for all external links used across this project.
AMD Official
- ROCm Strix Halo Optimization Guide — BIOS, kernel params, GTT/TTM configuration
- ROCm System Optimization Index — General ROCm tuning
- ROCm Installation Guide (Linux) — Package installation
- AMD SMI Documentation — GPU monitoring API
- ROCm GitHub — Source and issue tracker
Strix Halo Toolboxes (Donato Capitella)
The most comprehensive community resource for Strix Halo LLM optimization.
- strix-halo-toolboxes.com — Documentation, benchmarks, guides
- GitHub: kyuz0/amd-strix-halo-toolboxes — Container images, benchmark scripts, VRAM estimator
- Benchmark Results Viewer — Interactive performance charts
Community
- Strix Halo Wiki — AI Capabilities — Community benchmarks, model compatibility
- Level1Techs Forum — HP G1a Guide — Laptop-specific configuration
- Framework Community — GPU Performance Tests — Framework Desktop results
- LLM Tracker — Strix Halo — Centralized performance database
Other Strix Halo Repos
- pablo-ross/strix-halo-gmktec-evo-x2 — GMKtec EVO X2 optimization
- kyuz0/amd-strix-halo-llm-finetuning — Fine-tuning guides (Gemma-3, Qwen-3)
Monitoring Tools
- amdgpu_top — Best AMD GPU monitor (TUI/GUI/JSON)
- nvtop — Cross-vendor GPU monitor
- btop — System resource monitor
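amdgpu_top can emit machine-readable JSON (the `--json` flag), which is handy for logging GPU and VRAM usage alongside benchmark runs. The sketch below parses a simplified sample; the field names are illustrative assumptions, not amdgpu_top's exact schema — inspect the real output on your own machine before relying on specific keys.

```python
import json

# Illustrative sample only -- amdgpu_top's real JSON schema differs;
# run `amdgpu_top --json` locally to see the actual keys.
sample = '''
{
  "DeviceName": "AMD Radeon Graphics (Strix Halo)",
  "VRAM": {"Total VRAM Usage": {"value": 18432, "unit": "MiB"}},
  "GRBM": {"Graphics Pipe": {"value": 87, "unit": "%"}}
}
'''

def summarize(raw: str) -> str:
    """Pull device name, VRAM usage, and GPU busy percentage from one JSON frame."""
    d = json.loads(raw)
    vram = d["VRAM"]["Total VRAM Usage"]
    busy = d["GRBM"]["Graphics Pipe"]
    return (f'{d["DeviceName"]}: {vram["value"]} {vram["unit"]} VRAM, '
            f'{busy["value"]}{busy["unit"]} busy')

print(summarize(sample))
```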
LLM Inference
- llama.cpp — LLM inference engine (Vulkan + ROCm)
- ollama — LLM runtime with model management
- vLLM — High-throughput serving
- llama-benchy — Multi-backend LLM benchmarking
Qwen3.5 Models (GGUF)
- unsloth/Qwen3.5-35B-A3B-GGUF — Top pick for 64GB Strix Halo (MoE, 3B active)
- unsloth/Qwen3.5-27B-GGUF — Dense 27B
- unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF — Best for agentic/coding
- Qwen3.5 Official — Model family overview
- Unsloth Dynamic 2.0 — Adaptive quantization methodology
- Unsloth Studio — Training + inference UI (beta)
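The catalog above pairs fairly large models with a 64 GiB unified-memory machine, so a back-of-envelope memory check is useful before downloading. The sketch below is a rough heuristic only: the bits-per-weight figures are approximations (real GGUF files mix tensor types), and the fixed headroom stands in for KV cache, compute buffers, and the OS, which depend on context length and architecture.

```python
# Approximate effective bits per weight for common llama.cpp quant types.
# These are rough averages, not exact GGUF sizes.
BITS_PER_WEIGHT = {
    "Q4_K_M": 4.8,
    "Q5_K_M": 5.7,
    "Q6_K": 6.6,
    "Q8_0": 8.5,
}

def model_gib(params_b: float, quant: str) -> float:
    """Approximate in-memory size in GiB for a quantized model."""
    total_bytes = params_b * 1e9 * BITS_PER_WEIGHT[quant] / 8
    return total_bytes / 2**30

def fits(params_b: float, quant: str, budget_gib: float,
         headroom_gib: float = 8.0) -> bool:
    """Check fit, reserving headroom for KV cache, buffers, and the OS."""
    return model_gib(params_b, quant) + headroom_gib <= budget_gib

# Example: a 35B-parameter model at Q4_K_M against a 64 GiB budget.
size = model_gib(35, "Q4_K_M")
print(f"~{size:.1f} GiB, fits in 64 GiB: {fits(35, 'Q4_K_M', 64)}")
```

For a MoE model like the 35B-A3B entries above, total parameter count (not active-parameter count) drives memory; the 3B active figure mainly predicts speed, not footprint.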
Agentic Evaluation
- Inspect AI — All-in-one eval framework (HumanEval, BFCL, IFEval, GAIA)
- EvalPlus — HumanEval+ / MBPP+ with native ollama support
- BigCodeBench — 1,140 coding tasks across 139 libraries
- BFCL — Berkeley Function Calling Leaderboard
- SWE-bench — Real GitHub issue resolution
- Qwen-Agent — Optimized agentic framework for Qwen models
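At its core, BFCL-style tool-use evaluation checks whether a model's emitted tool call matches an expected function name and arguments. The toy checker below illustrates that idea only; it is not BFCL's actual harness, whose matching handles type coercion, optional parameters, and multiple acceptable answers.

```python
import json

def check_tool_call(model_output: str, expected_name: str,
                    expected_args: dict) -> bool:
    """Toy exact-match check of a model's JSON tool call against a reference."""
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError:
        return False  # unparseable output counts as a miss
    return call.get("name") == expected_name and call.get("arguments") == expected_args

# Hypothetical model outputs for a weather-lookup tool.
good = '{"name": "get_weather", "arguments": {"city": "Berlin", "unit": "celsius"}}'
bad = '{"name": "get_weather", "arguments": {"city": "Berlin"}}'

expected = {"city": "Berlin", "unit": "celsius"}
print(check_tool_call(good, "get_weather", expected))  # True
print(check_tool_call(bad, "get_weather", expected))   # False (missing "unit")
```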
AMD GPU Profiling
- Radeon GPU Profiler (RGP) — Hardware-level Vulkan/HIP profiling
- Radeon GPU Analyzer (RGA) — Offline shader/kernel analysis