Data Annotation / Labeling

Appears in 1 paper

The process of having humans provide labels (e.g., preference comparisons) for training data.

As used in Paper 15 — Training Language Models to Follow Instructions with Human Feedback →

Human labeling is a major cost in RLHF (reinforcement learning from human feedback). This paper hired about 40 contractors, who wrote demonstrations for roughly 13k training prompts and ranked model outputs for roughly 33k training prompts. Scaling RLHF therefore requires either more raters or AI-generated feedback, as in Constitutional AI.
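
As a rough illustration of what a preference-comparison label can look like in practice, the sketch below expands one rater's best-to-worst ranking of several model responses into pairwise (preferred, rejected) records, the form a reward model is usually trained on. The `Comparison` schema, the `ranking_to_comparisons` helper, and the sample strings are hypothetical and are not taken from the paper's tooling.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Comparison:
    """One pairwise preference label (hypothetical schema)."""
    prompt: str
    preferred: str
    rejected: str


def ranking_to_comparisons(prompt: str, ranked_responses: list[str]) -> list[Comparison]:
    """Expand a best-to-worst ranking into all pairwise comparisons.

    A ranking of K responses yields K * (K - 1) / 2 pairs.
    """
    # combinations() preserves input order, so the first element of each
    # pair is always the one the rater ranked higher.
    return [
        Comparison(prompt, better, worse)
        for better, worse in combinations(ranked_responses, 2)
    ]


# Illustrative data only: four placeholder responses, ranked best to worst.
pairs = ranking_to_comparisons(
    "Explain the moon landing to a six-year-old.",
    ["response A", "response B", "response C", "response D"],
)
print(len(pairs))  # 4 ranked responses -> 6 pairwise comparisons
```

Under this assumption, a single ranking task yields several training pairs, so the number of pairwise comparisons can be considerably larger than the number of prompts a rater actually sees.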