Best-of-N (BoN)
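A strategy where you generate N independent solutions to the same problem and select the best one according to some criterion (e.g., a Process Reward Model score). If each solution is correct independently with probability p (the base success rate), the probability that at least one of the N solutions is correct is 1 - (1-p)^N. This is an upper bound on Best-of-N accuracy, because the selection criterion still has to pick a correct solution from the pool.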
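The formula and the selection step can be illustrated with a minimal Python sketch. The function names (prob_at_least_one_correct, best_of_n, simulate) and the scoring callable are hypothetical, introduced only for illustration; the sketch assumes every sample succeeds independently with the same probability p.

```python
import random
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def prob_at_least_one_correct(p: float, n: int) -> float:
    """Closed form 1 - (1 - p)^N, assuming independent samples."""
    return 1.0 - (1.0 - p) ** n


def best_of_n(candidates: Sequence[T], score: Callable[[T], float]) -> T:
    """Return the candidate with the highest score.

    `score` is a stand-in for whatever criterion is used
    (for example, a reward model score); it is an assumption here.
    """
    return max(candidates, key=score)


def simulate(p: float, n: int, trials: int = 20_000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(at least one of n samples is correct)."""
    rng = random.Random(seed)
    hits = sum(
        any(rng.random() < p for _ in range(n))
        for _ in range(trials)
    )
    return hits / trials


if __name__ == "__main__":
    p, n = 0.2, 5
    print("closed form:", prob_at_least_one_correct(p, n))
    print("simulated:  ", simulate(p, n))
```

For example, with p = 0.2 and N = 5 the closed form is 1 - 0.8^5, roughly 0.672, and the simulated estimate should land close to that value.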