strix-halo-optimizations/scripts/benchmark/run-baseline.sh at 38daf953bfe7ba2787df298da725d3a7402c58d5

Felipe Cardoso 38daf953bf feat: add --pp and --tg flags for realistic benchmark workloads

Standard benchmarks use pp512/tg128, which underrepresents real-world
agentic coding, where responses run 500-2000 tokens. Both are now configurable:

  --pp N    Prompt processing tokens (default: 512)
  --tg N    Token generation count (default: 128)

Examples:
  benchmark run --tag realistic --tg 1024 --pp 2048 --category moe
  benchmark run --tag full-response --tg 2048 --category moe --reps 3
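The flags and defaults above could be parsed along these lines. This is a minimal sketch, not the script's actual code: the function name `parse_workload_flags` and the variables `PP_TOKENS` and `TG_TOKENS` are hypothetical, and every other flag is simply skipped.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: parse --pp / --tg using the defaults from the commit message.
# Names (parse_workload_flags, PP_TOKENS, TG_TOKENS) are assumptions for illustration.
parse_workload_flags() {
    PP_TOKENS=512   # default prompt-processing tokens
    TG_TOKENS=128   # default token-generation count
    while [ "$#" -gt 0 ]; do
        case "$1" in
            --pp)
                # Require a value so a trailing flag cannot stall the loop.
                [ "$#" -ge 2 ] || { echo "--pp needs a value" >&2; return 1; }
                PP_TOKENS="$2"; shift 2 ;;
            --tg)
                [ "$#" -ge 2 ] || { echo "--tg needs a value" >&2; return 1; }
                TG_TOKENS="$2"; shift 2 ;;
            *)
                shift ;;   # other flags and their values are ignored in this sketch
        esac
    done
}

# Same arguments as the first example invocation above.
parse_workload_flags --tag realistic --tg 1024 --pp 2048 --category moe
echo "pp=${PP_TOKENS} tg=${TG_TOKENS}"   # prints "pp=2048 tg=1024"
```

Resetting both variables at the top of the function means an omitted flag always falls back to the pp512/tg128 baseline.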

Log filenames include pp/tg when non-default (e.g., model__backend__fa1__pp2048_tg1024.log)
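The naming rule above can be sketched as follows. The helper `log_name` is hypothetical. Two things are assumed rather than stated in the commit message: the default-case name is taken to be the example with the suffix dropped, and both values are written whenever either one differs from its default.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the log-naming rule; log_name is not from the script.
# Assumption: the pp/tg suffix is added when EITHER value is non-default.
log_name() {
    local model="$1" backend="$2" fa="$3"
    local pp="${4:-512}" tg="${5:-128}"
    local suffix=""
    if [ "$pp" -ne 512 ] || [ "$tg" -ne 128 ]; then
        suffix="__pp${pp}_tg${tg}"
    fi
    printf '%s__%s__fa%s%s.log\n' "$model" "$backend" "$fa" "$suffix"
}

log_name model backend 1 2048 1024   # prints "model__backend__fa1__pp2048_tg1024.log"
log_name model backend 1             # prints "model__backend__fa1.log"
```

Leaving the default case unsuffixed would keep earlier pp512/tg128 logs comparable by filename with new default runs.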

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-26 22:48:32 +01:00

12 KiB
