Felipe Cardoso
15bb6a8ed9
feat(serve): set APEX I-Compact as default, harden benchmark workflow
Serving:
- make serve now launches Claude-distilled APEX 35B-A3B (16GB) with 2
parallel slots and 256K context as the daily driver
- add serve-custom for ad-hoc model testing
- add flush-gpu to reclaim unified memory after stuck runs
Benchmarks:
- default Vulkan-only backends (ROCm trails at long context)
- add --backends filter to run-baseline.sh
- fix backend filter substring bug (grep -qFx for exact line match)
- fix model filter regex metacharacter bug (grep -qiF for literal)
- respect --tg in long-context tests instead of hardcoded n=32
ROCm bump to 7.2.1 (kernel 6.18.4+ patch); keep 7.2 as optional.
Catalog:
- add mudler APEX I-Compact (Claude-distilled 35B, 17GB)
- add 0xSero REAP-40 (pruned 122B-A10B, 46GB)
- update download instructions: hf download (huggingface-cli is gone)
2026-04-13 01:11:46 +02:00
..
2026-03-26 00:20:23 +01:00
2026-03-25 21:44:16 +01:00
2026-04-13 01:11:46 +02:00
2026-03-25 20:50:00 +01:00
2026-03-27 14:54:19 +01:00
2026-03-27 14:54:19 +01:00
2026-03-26 00:20:23 +01:00
2026-03-30 21:12:30 +02:00
2026-03-30 19:56:18 +02:00
2026-03-27 14:54:19 +01:00
2026-03-25 21:44:16 +01:00