Training-Time Compute
Computation used to train the model initially.
Computation used to train the model initially. Larger models require more training compute. Training-time scale has historically been the primary way to improve model capability, but test-time compute offers a complementary path.