[CLS] token

Appears in 1 paper

A special token prepended to every BERT input.

As used in Paper 11, BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

A special token prepended to every BERT input. Its final hidden state serves as a single aggregate vector representing the entire sequence, and it is fed to the classifier head for sentence-level tasks such as sentiment classification, next sentence prediction (NSP), and entailment.
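
The flow described above can be sketched in a few lines of plain Python. This is a toy illustration, not BERT's actual implementation: the helper names (build_input, classify_from_cls), the 3-dimensional hidden states, and the weight values are all hypothetical, and the encoder output is replaced by hand-written numbers. The head is assumed to be a single bias-free linear layer followed by softmax.

```python
import math

CLS, SEP = "[CLS]", "[SEP]"

def build_input(tokens_a, tokens_b=None):
    """Prepend [CLS] and close each segment with [SEP]."""
    tokens = [CLS] + list(tokens_a) + [SEP]
    if tokens_b is not None:
        tokens += list(tokens_b) + [SEP]
    return tokens

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify_from_cls(hidden_states, weights):
    """Apply a linear head and softmax to the final hidden state at position 0."""
    cls_vector = hidden_states[0]  # position 0 always holds [CLS]
    logits = [sum(w * h for w, h in zip(row, cls_vector)) for row in weights]
    return softmax(logits)

tokens = build_input(["the", "film", "was", "great"])

# Hypothetical stand-in for the encoder's final layer: one vector per token.
hidden_states = [[0.5, -1.0, 2.0]] + [[0.0, 0.0, 0.0] for _ in tokens[1:]]

# Hypothetical classifier weights: 2 classes x 3 hidden dimensions.
weights = [[1.0, 0.0, 0.0],
           [0.0, 0.0, 1.0]]

probs = classify_from_cls(hidden_states, weights)
```

Only the vector at position 0 reaches the classifier; the other token vectors are ignored by this head, which is why the [CLS] state has to carry information about the whole sequence.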