Confidence-Reward Driven Preference Optimization for Machine Translation

Neural intel Pod

The paper "CRPO: Confidence-Reward Driven Preference Optimization for Machine Translation" introduces a novel approach to improving machine translation (MT) performance by leveraging both reward scores and model confidence for data selection during fine-tuning.
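The core idea, selecting training pairs using both a reward signal and the model's own confidence, can be illustrated with a minimal sketch. This is not the paper's exact formulation; the scoring rule, candidate format, and function names below are hypothetical, assuming confidence is approximated by average token log-probability and that the most informative rejected example is one the model is confident in despite a low reward.

```python
import math

def confidence(token_logprobs):
    # Proxy for model confidence (assumption, not the paper's exact
    # definition): exponentiated average token log-probability.
    return math.exp(sum(token_logprobs) / len(token_logprobs))

def select_preference_pair(candidates):
    """Pick a (chosen, rejected) pair from candidate translations.

    Each candidate is (text, reward, token_logprobs). Hypothetical
    selection rule: choose the highest-reward candidate, and reject the
    candidate where confidence most exceeds reward -- i.e. a fluent-
    sounding output the reward model dislikes, which is an informative
    negative for preference optimization.
    """
    scored = [(text, reward, confidence(lp)) for text, reward, lp in candidates]
    chosen = max(scored, key=lambda c: c[1])          # highest reward
    rejected = max((c for c in scored if c is not chosen),
                   key=lambda c: c[2] - c[1])         # confident but low-reward
    return chosen[0], rejected[0]

# Toy usage with made-up candidates and scores:
candidates = [
    ("accurate translation", 0.9, [-0.5, -0.4]),
    ("fluent but wrong",     0.2, [-0.1, -0.1]),
    ("garbled translation",  0.1, [-2.0, -2.0]),
]
chosen, rejected = select_preference_pair(candidates)
```

Here the fluent-but-wrong candidate is rejected rather than the obviously garbled one, since the model's high confidence in it makes it the more informative contrastive example for fine-tuning.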
