Throughput

Appears in 1 paper

The number of queries a system can handle per unit time.

As used in Paper 23: Scaling LLM Test-Time Compute Optimally Can be More Effective than Scaling Model Parameters

In this paper, throughput is the number of queries a system can handle per unit time. Increasing test-time compute (TTC) per query typically decreases throughput (fewer queries per second), because each query occupies the system for longer.
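The trade-off between test-time compute and throughput can be illustrated with a little arithmetic. The sketch below is not taken from the paper: it assumes a simplified serving model in which each worker handles one query at a time, and the function name `throughput_qps`, the worker count, and the per-query times are hypothetical values chosen only for illustration.

```python
def throughput_qps(seconds_per_query: float, parallel_workers: int = 1) -> float:
    """Queries per second under a simplified model (an assumption, not the
    paper's): each worker serves exactly one query at a time."""
    if seconds_per_query <= 0:
        raise ValueError("seconds_per_query must be positive")
    if parallel_workers < 1:
        raise ValueError("parallel_workers must be at least 1")
    return parallel_workers / seconds_per_query


# Hypothetical numbers: a query that takes 0.5 s with little test-time compute,
# versus 2.0 s when more test-time compute is spent on it.
baseline = throughput_qps(0.5)
scaled = throughput_qps(2.0)

print(f"baseline: {baseline} queries/s")
print(f"more test-time compute: {scaled} queries/s")
```

Under this model, making each query four times slower cuts throughput by a factor of four. Restoring the original throughput would take four times as many workers, for example `throughput_qps(2.0, parallel_workers=4)`.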