feat(optimize): add Phase 2 power profile and system tuning
Add `make optimize-power` (ryzenadj 85W, sysctl, THP, RADV nogttspill) with systemd services for boot/resume persistence. Integrate into `make optimize --all` as Phase 2. Update optimization log with RyzenAdj results (+46% tg at 70W sustained), KV sweep data, and quant shootout. Add Qwen3-Coder-30B and Nemotron-Cascade-2 to model catalog.
@@ -21,3 +21,10 @@ qwen3.5-27b-q4|unsloth/Qwen3.5-27B-GGUF|Qwen3.5-27B-Q4_K_M.gguf|17|dense|Dense 2
qwen3.5-35b-a3b-q4|unsloth/Qwen3.5-35B-A3B-GGUF|Qwen3.5-35B-A3B-UD-Q4_K_L.gguf|19|moe|MoE 35B, 3B active, Unsloth dynamic
qwen3.5-35b-a3b-q8|unsloth/Qwen3.5-35B-A3B-GGUF|Qwen3.5-35B-A3B-Q8_0.gguf|37|moe|MoE 35B Q8, near-full precision
nemotron-30b-a3b-q4|lmstudio-community/NVIDIA-Nemotron-3-Nano-30B-A3B-GGUF|NVIDIA-Nemotron-3-Nano-30B-A3B-Q4_K_M.gguf|23|moe|Nemotron MoE 30B, 3B active
nemotron-cascade2-q8|bartowski/nvidia_Nemotron-Cascade-2-30B-A3B-GGUF|nvidia_Nemotron-Cascade-2-30B-A3B-Q8_0.gguf|31|moe|Nemotron Cascade 2, Mamba-2 hybrid

# ── Coding models ─────────────────────────────────────────
qwen3-coder-30b-a3b-q6|unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF|Qwen3-Coder-30B-A3B-Instruct-UD-Q6_K_XL.gguf|26|moe|Agentic coding MoE, pure Transformer

# ── Draft models (speculative decoding) ───────────────────
qwen3.5-0.8b-q8-draft|unsloth/Qwen3.5-0.8B-GGUF|Qwen3.5-0.8B-Q8_0.gguf|0.8|draft|Draft for Qwen3.5 speculative decoding
14
configs/ryzenadj-llm.service
Normal file
@@ -0,0 +1,14 @@
[Unit]
Description=Apply RyzenAdj power limits for LLM inference
After=multi-user.target

[Service]
Type=oneshot
ExecStart=/usr/local/bin/ryzenadj --stapm-limit=85000 --fast-limit=85000 --slow-limit=85000 --apu-slow-limit=85000
RemainAfterExit=yes

# Short settle delay after applying the limits. HP firmware resets them on
# resume from sleep/hibernate; ryzenadj-resume.service re-applies them then.
ExecStartPost=/bin/sleep 2

[Install]
WantedBy=multi-user.target
10
configs/ryzenadj-resume.service
Normal file
@@ -0,0 +1,10 @@
[Unit]
Description=Re-apply RyzenAdj power limits after resume
After=suspend.target hibernate.target hybrid-sleep.target suspend-then-hibernate.target

[Service]
Type=oneshot
ExecStart=/usr/local/bin/ryzenadj --stapm-limit=85000 --fast-limit=85000 --slow-limit=85000 --apu-slow-limit=85000

[Install]
WantedBy=suspend.target hibernate.target hybrid-sleep.target suspend-then-hibernate.target