2024/05/18
38 minutes

Pre-training LLMs: One Model To Rule Them All? with Talfan Evans, DeepMind

Talfan Evans is a research engineer at DeepMind, where he focuses on data curation and foundational research for pre-training LLMs and multimodal models like Gemini. I ask Talfan:

Will one model rule them all?
What does "high quality data" actually mean in the context of LLM training?
Is language model pre-training becoming commoditized?
Are companies like Google and OpenAI keeping their AI secrets to themselves?
Does the startup or open source community stand a chance next to the giants?

Also check out Talfan's latest paper at DeepMind, Bad Students Make Good Teachers.

Show: Thinking Machines: AI & Philosophy
Frequency: Updated twice weekly
Published: May 18, 2024, 3:33 PM UTC
Length: 38 minutes
Rating: Clean
