Pass@K

Appears in 1 paper

A metric that evaluates whether at least one out of K generated solutions is correct.

As used in Paper 23 — Scaling LLM Test-Time Compute Optimally Can be More Effective than Scaling Model Parameters →

Used to measure the effectiveness of sampling-based approaches like Best-of-N: a problem counts as solved if at least one of the K generated solutions is correct. If each generated solution is independently correct with probability p, the probability that at least one of K solutions is correct is 1 - (1-p)^K.
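The formula above can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper. The function names are hypothetical. The first function assumes every sample is independently correct with the same probability p. The second is a combinatorial estimator commonly used when n samples have been drawn and c of them are correct; it is not mentioned in this entry and is included only as a related way to estimate the same quantity.

```python
from math import comb


def pass_at_k_independent(p: float, k: int) -> float:
    """Probability that at least one of k samples is correct.

    Assumes (for illustration) that each sample is independently
    correct with the same probability p: 1 - (1 - p)^k.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if k < 1:
        raise ValueError("k must be at least 1")
    return 1.0 - (1.0 - p) ** k


def pass_at_k_from_samples(n: int, c: int, k: int) -> float:
    """Estimate Pass@K from n drawn samples, c of which are correct.

    Computes 1 - C(n - c, k) / C(n, k): one minus the chance that a
    random size-k subset of the n samples contains no correct sample.
    This estimator is an assumption added here, not part of the entry.
    """
    if not 0 <= c <= n:
        raise ValueError("c must be in [0, n]")
    if not 1 <= k <= n:
        raise ValueError("k must be in [1, n]")
    # comb(n - c, k) is 0 when fewer than k incorrect samples exist,
    # so the estimate is exactly 1.0 in that case.
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For example, with p = 0.5 and K = 2, `pass_at_k_independent(0.5, 2)` evaluates to 0.75: either sample alone succeeds half the time, and both fail a quarter of the time.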