Context vector

Appears in 2 papers

Also called the **thought vector**.

As used in Paper 06 — Sequence to Sequence Learning with Neural Networks →

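Also called the thought vector. In a standard seq2seq model, it is the single fixed-length vector, taken from the encoder's final hidden state, that summarizes the entire source sequence and conditions the decoder for every output step.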

As used in Paper 07 — Neural Machine Translation by Jointly Learning to Align and Translate →

A vector computed at each decoding step t as the attention-weighted sum of all encoder hidden states: cₜ = Σᵢ αₜᵢ hᵢ, where the weights αₜᵢ are non-negative and sum to 1 over the source positions i. Unlike the fixed seq2seq context vector, it differs at every decoding step: it is a dynamic, query-dependent read of the source.
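
The weighted sum cₜ = Σᵢ αₜᵢ hᵢ can be illustrated with a minimal pure-Python sketch. It rests on assumptions that are not from the papers: the function names (`softmax`, `attention_context`), the toy two-dimensional states, and the use of a plain dot product for the alignment scores are all illustrative choices. Paper 07 scores alignments with a small learned network rather than a dot product. The last line shows the fixed alternative for contrast, taking the final encoder state as the only context.

```python
import math


def softmax(scores):
    """Turn raw alignment scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attention_context(decoder_state, encoder_states):
    """Return (c_t, alpha_t) for one decoding step.

    Assumption: scores are dot products between the decoder state (the
    query) and each encoder hidden state h_i. This is a stand-in for the
    learned alignment model, chosen only to keep the sketch short.
    """
    scores = [dot(decoder_state, h) for h in encoder_states]
    alphas = softmax(scores)
    dim = len(encoder_states[0])
    # c_t[k] = sum_i alpha_ti * h_i[k]
    context = [
        sum(a * h[k] for a, h in zip(alphas, encoder_states))
        for k in range(dim)
    ]
    return context, alphas


# Toy encoder hidden states h_1..h_3 (hypothetical values).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Two different decoder states give two different context vectors.
c1, a1 = attention_context([2.0, 0.0], encoder_states)
c2, a2 = attention_context([0.0, 2.0], encoder_states)

# Fixed seq2seq-style context: the same vector at every decoding step.
fixed_context = encoder_states[-1]
```

Calling `attention_context` with a different decoder state shifts the weights toward the encoder states that best match it, so `c1` and `c2` differ, while `fixed_context` stays the same regardless of the decoding step.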