strix-halo-optimizations

cardosofelipe/strix-halo-optimizations

Commit Graph

Author: Felipe Cardoso
SHA1: f92b710492
Date: 2026-03-27 14:54:19 +01:00

fix(benchmark): parse llama-bench output with variable column count

KV cache quantization adds type_k/type_v columns to llama-bench output,
shifting test and t/s to different indices. Parse from end of row instead
of hardcoded positions. Also fix KV suffix separator (underscore to dash)
to avoid regex ambiguity with type names like q8_0.

Add 5-phase optimization guide, optimization log for tracking results,
and research docs on llama.cpp and inference landscape optimizations.
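The two fixes described in the commit message can be illustrated with a minimal Python sketch. It is not the repository's code: the function names `parse_bench_row` and `parse_kv_suffix` and the `kv_<type_k>-<type_v>` suffix layout are hypothetical, and the assertion rows in the usage note are placeholders rather than measured results. It assumes llama-bench prints a markdown table whose last two cells are the test name and the t/s value.

```python
import re


def parse_bench_row(line):
    """Return (test, tokens_per_second) from one markdown table row, or None.

    The last two cells are read by negative index, so extra columns such as
    type_k/type_v inserted earlier in the row do not shift the result.
    """
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    if len(cells) < 2:
        return None
    test, rate = cells[-2], cells[-1]
    # The t/s cell is assumed to look like "123.45 ± 0.67"; keep the mean only.
    match = re.match(r"([0-9]+(?:\.[0-9]+)?)", rate)
    if match is None:
        return None  # header row or separator row
    return test, float(match.group(1))


# Hypothetical suffix layout: "kv_<type_k>-<type_v>". The dash cannot occur
# inside the type names shown in the commit (e.g. q8_0), so the split point
# between the two types is unambiguous.
KV_SUFFIX = re.compile(r"kv_([a-z0-9_]+)-([a-z0-9_]+)")


def parse_kv_suffix(suffix):
    """Return (type_k, type_v) from a dash-separated KV suffix, or None."""
    match = KV_SUFFIX.fullmatch(suffix)
    return match.groups() if match else None


# With an underscore separator, a greedy pattern splits at the wrong place.
ambiguous = re.fullmatch(r"kv_(.+)_(.+)", "kv_q8_0_q4_0").groups()
```

In this sketch `ambiguous` comes out as `("q8_0_q4", "0")`, which is the kind of mis-split the underscore separator allows, while `parse_kv_suffix("kv_q8_0-q4_0")` yields `("q8_0", "q4_0")`. Header and separator rows return `None` from `parse_bench_row` because their last cell does not start with a digit.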

1 Commit