Instruction Following / Alignment

Appears in 1 paper

Teaching language models to follow user instructions accurately and safely.

As used in Paper 14: Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

Chain-of-thought (CoT) prompting later became relevant to instruction-following systems such as InstructGPT and ChatGPT, where spelling out intermediate reasoning steps can help the model carry out complex, multi-step instructions. Reward models used in reinforcement learning from human feedback (RLHF) are also described as increasingly favoring outputs that include explicit reasoning chains.