Both run-baseline.sh and run-suite.sh now accept --context N to set
the long-context depth (default: 32768). Prompt tokens auto-scale to
~1/16 of context depth for larger windows.
Examples:
benchmark run --tag ctx64k --context 65536 --category moe
benchmark run --tag ctx128k --context 131072 --category moe --reps 3
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>