
Reinforcement Learning from Human Feedback (RLHF)

#alignment #rlhf


RLHF aligns model outputs with human preference signals: annotators compare candidate responses, a reward model is trained on those comparisons, and the language model is then fine-tuned with reinforcement learning (typically PPO) to maximize the learned reward.
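
A minimal sketch of the reward-modelling step, assuming a PyTorch setup: the pairwise (Bradley-Terry) loss below trains a reward model to score the human-preferred response above the rejected one. The function and tensor names here are illustrative, not taken from any particular library.

```python
import torch
import torch.nn.functional as F

def preference_loss(reward_chosen: torch.Tensor, reward_rejected: torch.Tensor) -> torch.Tensor:
    # Bradley-Terry pairwise loss: maximize the margin by which the
    # human-preferred response out-scores the rejected response.
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Illustrative usage with made-up scalar rewards for three comparison pairs.
chosen = torch.tensor([1.2, 0.3, 2.1])
rejected = torch.tensor([0.4, 0.9, 1.5])
print(preference_loss(chosen, rejected).item())
```

The resulting reward model then supplies the scalar reward that the RL stage (e.g., PPO) optimizes against.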
