This paper explains Anthropic’s constitutional AI approach, which is largely an extension of RLHF, but with AIs replacing the human demonstrators and human evaluators.
Everything in this paper is relevant to this week's learning objectives, and we recommend you read it in its entirety. It summarises the limitations of conventional RLHF, explains the constitutional AI approach, shows how it performs, and suggests where future research might be directed.
If you are in a rush, focus on sections 1.2, 3.1, 3.4, 4.1, 6.1 and 6.2.
A podcast by BlueDot Impact.
Learn more on the AI Safety Fundamentals website.
Information
- Show
- Frequency: Updated daily
- Published: 13 Muharram 1446 AH at 7:00 PM UTC
- Length: 1 h 2 min
- Season: 3
- Episode: 2
- Rating: Clean