1 hr 53 min

Combining Vision & Language in AI perception and the era of LLMs & LMMs | Dr. Yezhou Yang Jay Shah Podcast

- Science

Dr. Yezhou Yang is an Associate Professor at Arizona State University and director of the Active Perception Group at ASU. His research interests are in Cognitive Robotics and Computer Vision, particularly understanding human actions from visual input and grounding them in natural language. Prior to joining ASU, he completed his Ph.D. at the University of Maryland and his postdoctoral work at the Computer Vision Lab and the Perception and Robotics Lab.

Timestamps of the conversation
00:01:02 Introduction
00:01:46 Interest in AI
00:17:04 Entry in Robotics & AI Perception
00:20:59 Combining Vision & language to Improve Robot Perception
00:23:30 End-to-end learning vs traditional knowledge graphs
00:28:28 What do LLMs learn?
00:30:30 Nature of AI research
00:36:00 Why vision & language in AI?
00:45:40 Learning vs Reasoning in neural networks
00:53:05 Bringing AI to the general crowd
01:00:10 Transformers in Vision
01:08:54 Democratization of AI
01:13:42 Motivation for research: theory or application?
01:18:50 Surpassing human intelligence
01:25:13 Open challenges in computer vision research
01:30:19 Doing research is a privilege
01:35:00 Rejections, tips to read & write good papers
01:43:37 Tips for AI Enthusiasts
01:47:35 What is a good research problem?
01:50:30 Dos and Don'ts in AI research

More about Dr. Yang: https://yezhouyang.engineering.asu.edu/
And his Twitter handle: https://twitter.com/Yezhou_Yang

About the Host:
Jay is a PhD student at Arizona State University.
Linkedin: https://www.linkedin.com/in/shahjay22/
Twitter: https://twitter.com/jaygshah22
Homepage (for any queries): https://www.public.asu.edu/~jgshah1/

Check out Rora: https://teamrora.com/jayshah
Guide to STEM PhD AI Researcher + Research Scientist pay: https://www.teamrora.com/post/ai-researchers-salary-negotiation-report-2023

Stay tuned for upcoming webinars!

***Disclaimer: The information contained in this video represents the views and opinions of the speaker and does not necessarily represent the views or opinions of any institution. It does not constitute an endorsement of such video content by any institution or its affiliates.***