Next-token prediction

Appears in 1 paper

The pre-training objective: given all previous tokens, predict the probability distribution over the next token.

As used in Paper 10 — Improving Language Understanding by Generative Pre-Training →

The pre-training objective: given all previous tokens, predict the probability distribution over the next token. Training on this objective is equivalent to maximising the log-likelihood of the training text.
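
The link between next-token prediction and log-likelihood can be illustrated with a minimal Python sketch. It assumes a model has already produced one vector of raw scores (logits) per position; the function names and toy numbers are hypothetical and are not taken from the paper.

```python
import math


def log_softmax(logits):
    """Turn raw scores into log-probabilities over the vocabulary."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def sequence_log_likelihood(logits_per_step, targets):
    """Sum of log P(next token | previous tokens) over all positions.

    logits_per_step[i] holds the scores the model assigned to every
    vocabulary item at step i; targets[i] is the index of the token
    that actually came next in the training text.
    """
    total = 0.0
    for logits, target in zip(logits_per_step, targets):
        total += log_softmax(logits)[target]
    return total


def next_token_loss(logits_per_step, targets):
    """Average negative log-likelihood (cross-entropy) per token."""
    return -sequence_log_likelihood(logits_per_step, targets) / len(targets)


# Toy example (assumed values): vocabulary of 4 tokens, 3 prediction steps.
# A model that scores every token equally has a loss of log(4), about 1.386.
uniform_logits = [[0.0, 0.0, 0.0, 0.0]] * 3
targets = [0, 1, 2]
print(next_token_loss(uniform_logits, targets))
```

Because the loss is the negated, averaged log-likelihood, lowering it and raising the log-likelihood of the training text are the same thing in this sketch. Giving the correct token a higher score at any step reduces the loss.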