BPE (Byte Pair Encoding)

Appears in 1 paper

A subword tokenisation algorithm that splits words into common subunits.

As used in Paper 10 — Improving Language Understanding by Generative Pre-Training →

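BPE builds its vocabulary by repeatedly merging the most frequent pair of adjacent symbols, so common words stay whole while rare words are decomposed into smaller, reusable pieces. For example, "unhappiness" might be split into "un" + "happi" + "ness" and stored as three tokens; the exact split depends on the merges learned from the training corpus. GPT-1's vocabulary: 40,478 BPE tokens.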
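
To make the decomposition concrete, here is a minimal sketch of the usual BPE procedure: start from single characters, then repeatedly merge the most frequent adjacent pair. It is an illustration only, not GPT-1's tokeniser. The function names (learn_bpe, merge_pair, encode) and the toy corpus are assumptions made for this example, and details that real tokenisers typically add, such as end-of-word markers and text normalisation, are left out.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace each adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    merged = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def learn_bpe(word_counts, num_merges):
    """Learn an ordered list of merges from a {word: frequency} mapping (hypothetical helper)."""
    # Every word starts as a sequence of single characters.
    vocab = {tuple(word): count for word, count in word_counts.items()}
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pair_counts = Counter()
        for symbols, count in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += count
        if not pair_counts:
            break  # every word is already a single symbol
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        vocab = {merge_pair(symbols, best): count for symbols, count in vocab.items()}
    return merges


def encode(word, merges):
    """Split `word` into subword tokens by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


# Toy corpus (made up for illustration): word -> frequency.
corpus = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
merges = learn_bpe(corpus, num_merges=10)
print(encode("lowest", merges))  # a word that never appears in the corpus
```

Because an unseen word can always fall back to smaller pieces, down to single characters in the worst case, the tokeniser can still encode it as long as its characters occurred in the training data. The number of merges sets the vocabulary size; the toy value of 10 is arbitrary.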
