Confidence-Reward Driven Preference Optimization for Machine Translation

The paper "CRPO: Confidence-Reward Driven Preference Optimization for Machine Translation" proposes improving machine translation (MT) performance by using both reward scores and model confidence to select data during fine-tuning.

Show: Neural intel Pod
Frequency: Updated Daily
Published: February 9, 2025 at 4:49 AM UTC
Length: 21 min
Rating: Clean

Information