Accuracy / Performance Metric
In the paper, accuracy is the percentage of problems a model solves correctly out of all problems in a benchmark. Example: on GSM8K, PaLM achieved 25% accuracy with standard prompting and 58% with chain-of-thought (CoT) prompting. That is a gain of 33 percentage points, and accuracy more than doubles (58 / 25 ≈ 2.3×), which is considered a very large gain in the field. Accuracy is measured on held-out test sets whose specific problems the model has not seen during training.
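The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration: the function names `accuracy` and `compare` are hypothetical and do not come from the paper, and the inputs are simply the 25% and 58% figures quoted above. It shows the difference between an absolute gain (in percentage points) and a relative one (a ratio).

```python
def accuracy(num_correct, num_total):
    """Return accuracy as a percentage of problems solved correctly."""
    if num_total <= 0:
        raise ValueError("num_total must be positive")
    return 100.0 * num_correct / num_total


def compare(baseline_pct, improved_pct):
    """Return (gain in percentage points, ratio of improved to baseline)."""
    gain_points = improved_pct - baseline_pct
    ratio = improved_pct / baseline_pct
    return gain_points, ratio


# Figures quoted above for PaLM on GSM8K (standard prompting vs. CoT).
gain_points, ratio = compare(25.0, 58.0)
print(f"{gain_points:.0f} percentage points, {ratio:.2f}x")
# prints "33 percentage points, 2.32x"
```

Reporting both numbers matters because the same 33-point gain would be a much smaller relative change if the baseline were already high, for example going from 60% to 93%.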