Commit Graph

  • c847991740 docs: add agentic coding evaluation landscape research (main) Felipe Cardoso 2026-04-15 15:55:04 +02:00
  • 15bb6a8ed9 feat(serve): set APEX I-Compact as default, harden benchmark workflow Felipe Cardoso 2026-04-13 01:11:46 +02:00
  • 474d94a07e chore: update model catalog with gemma 4, opus distill, and hw-bandwidth target Felipe Cardoso 2026-04-03 20:03:53 +02:00
  • 6ab08537ca fix: address code review findings — batch args, venv path, serve flags Felipe Cardoso 2026-03-31 10:10:48 +02:00
  • dd403a907c feat(serve): add optimized llama-server launcher with n-gram speculation Felipe Cardoso 2026-03-30 21:12:30 +02:00
  • ba24091791 feat(benchmark): add -b/--batch flag, test MoE batch size impact Felipe Cardoso 2026-03-30 20:01:24 +02:00
  • ea70687cd2 docs: update optimization guide with measured hardware data Felipe Cardoso 2026-03-30 19:56:18 +02:00
  • 1549bc27c0 feat(optimize): add Phase 2 power profile and system tuning Felipe Cardoso 2026-03-30 18:53:52 +02:00
  • f92b710492 fix(benchmark): parse llama-bench output with variable column count Felipe Cardoso 2026-03-27 14:54:19 +01:00
  • 7531f6fa74 feat(benchmark): add --kv-types flag for KV cache quantization sweep Felipe Cardoso 2026-03-27 12:29:19 +01:00
  • 38daf953bf feat: add --pp and --tg flags for realistic benchmark workloads Felipe Cardoso 2026-03-26 22:48:32 +01:00
  • 3686783f4d feat: add --context flag for configurable long-context benchmarks Felipe Cardoso 2026-03-26 22:46:16 +01:00
  • 1b5b193e81 fix: suppress exit code 143 from metric logger cleanup Felipe Cardoso 2026-03-26 22:38:48 +01:00
  • fb1e57f1bf feat: make llama-rocm-7.2 a required toolbox in benchmark setup Felipe Cardoso 2026-03-26 19:23:03 +01:00
  • 7c8be55bfe fix: resolve model paths for toolbox container access Felipe Cardoso 2026-03-26 19:17:16 +01:00
  • d22c062ca7 fix: model catalog shows download status, GPU detection in toolbox Felipe Cardoso 2026-03-26 19:14:31 +01:00
  • 6f197a1455 fix: pass ARGS through in benchmark Makefile targets Felipe Cardoso 2026-03-26 19:10:59 +01:00
  • cb25fa3f6f feat: add benchmark filtering (--max-size, --category, --skip-longctx) Felipe Cardoso 2026-03-26 19:07:24 +01:00
  • eb52ea52ce fix: follow symlinks in model discovery, update model catalog Felipe Cardoso 2026-03-26 09:44:16 +01:00
  • 58124cd657 feat: add Qwen3.5 model catalog and agentic evaluation framework Felipe Cardoso 2026-03-26 00:20:23 +01:00
  • 71053997be chore: remove .idea from tracking, add to .gitignore Felipe Cardoso 2026-03-25 23:58:18 +01:00
  • e9cb5c491f fix+test: improve test suite, fix 2 bugs found by tests Felipe Cardoso 2026-03-25 22:22:41 +01:00
  • a403dd9ce0 test: add BATS test suite (79 tests) Felipe Cardoso 2026-03-25 22:15:34 +01:00
  • da2c4c6b8a fix(docs): address review findings — accuracy, consistency, completeness Felipe Cardoso 2026-03-25 21:44:16 +01:00
  • 5b81437637 docs: add README, CLAUDE.md, AGENTS.md, and full docs/ suite Felipe Cardoso 2026-03-25 20:50:00 +01:00
  • af0515d05d fix: address code review findings (HIGH + MEDIUM) Felipe Cardoso 2026-03-25 20:19:44 +01:00
  • c596e38e9e Initial commit Felipe Cardoso 2026-03-25 20:13:15 +01:00