334 episodes

MLOps.community Demetrios Brinkmann

- Technology

Weekly talks and fireside chats about everything that has to do with the new space emerging around DevOps for Machine Learning aka MLOps aka Machine Learning Operations.

- 17 MAY 2024
Retrieval Augmented Generation

Join us at our first in-person conference on June 25 all about AI Quality: https://www.aiqualityconference.com/

Syed Asad is an innovator, a Generative AI & Machine Learning Engineer, and a champion for ethical AI.
MLOps podcast #233 with Syed Asad, Lead AI/ML Engineer at KiwiTech // Retrieval Augmented Generation.

A big thank you to AWS for sponsoring this episode!

// Abstract
Everything and anything around RAG.

// Bio
Currently Exploring New Horizons:
Syed is diving deep into the exciting world of Semantic Vector Searches and Vector Databases. These innovative technologies are reshaping how we interact with and interpret vast data landscapes, opening new avenues for discovery and innovation.

Specializing in Retrieval Augmented Generation (RAG):
Syed's current focus also includes mastering Retrieval Augmented Generation Techniques (RAGs). This cutting-edge approach combines the power of information retrieval with generative models, setting new benchmarks in AI's capability and application.

// MLOps Jobs board
https://mlops.pallet.xyz/jobs

// MLOps Swag/Merch
https://mlops-community.myshopify.com/

// Related Links

--------------- ✌️Connect With Us ✌️ -------------
Join our slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/

Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Syed on LinkedIn: https://www.linkedin.com/in/syed-asad-76815246/
- 44 min
- 16 MAY 2024
RecSys at Spotify // Sanket Gupta // #232

Sanket is a Senior Machine Learning Engineer at Spotify, where he builds end-to-end audio recommender systems. Models built by his team are used across Spotify in many different products, including Discover Weekly and Autoplay.

MLOps podcast #232 with Sanket Gupta, Senior Machine Learning Engineer at Spotify // RecSys at Spotify.

A big thank you to LatticeFlow for sponsoring this episode! LatticeFlow - https://latticeflow.ai/

// Abstract
LLMs with foundational embeddings have changed the way we approach AI today. Instead of re-training models from scratch end-to-end, we rely on fine-tuning existing foundation models to perform transfer learning.
Is there a similar approach we can take with recommender systems?
In this episode, we talk about:
a) how Spotify builds and maintains large-scale recommender systems,
b) how foundational user and item embeddings can enable transfer learning across multiple products,
c) how we evaluate this system,
d) MLOps challenges with these systems.

// Bio
Sanket works as a Senior Machine Learning Engineer on a team at Spotify building production-grade recommender systems. Models built by his team are used in Autoplay, Daily Mix, Discover Weekly, etc.
Currently, his passion is building systems that understand user taste: how to balance long-term and short-term understanding of users to enable a great personalized experience.

// Related Links
Website: https://sanketgupta.substack.com/
Our paper on this topic "Generalized User Representations for Transfer Learning": https://arxiv.org/abs/2403.00584
Sanket's blogs on Medium in the past: https://medium.com/@sanket107

Connect with Sanket on LinkedIn: www.linkedin.com/in/sanketgupta107
- 50 min
- 10 MAY 2024
From A Coding Startup to AI Development in the Enterprise // Ryan Carson // #231

Ryan Carson has been a CEO and founder for 20 years. He has built and sold 3 startups and is now helping build a global community of AI devs with Intel.

MLOps podcast #231 with Ryan Carson, Senior AI Dev Community Lead at Intel, From A Coding Startup to AI Development in the Enterprise.

Huge thank you to Zilliz for sponsoring this episode. Zilliz - https://zilliz.com/

// Abstract
Ryan traces his professional journey from building Treehouse to joining Intel. The conversation turns to his aspiration to democratize access to AI development and to the prospects of new hardware such as Gaudi 3, a new ASIC for AI workloads. Ryan emphasizes the need to drive competition in compute to lower prices and increase access. He also underlines the importance of tying individual work to company OKRs or KPIs, ideally the top-level ones, and of forging quality relationships in professional settings. They explore how AI could help build and maintain professional relationships, then turn to practical applications: smaller projects, the possibility of one-person companies, and the role of AI in daily interactions. The episode closes on a note of optimism about how technological advances will shape the future.

// Bio
Ryan has been a founder, entrepreneur, and CEO for 20 years, successfully building, scaling, and selling three companies. He's passionate about empowering people to become developers and then connecting them together in a global community.

After earning a degree in Computer Science in Colorado, Ryan moved to the UK and worked as a web developer. He then organized global tech conferences, hosting thousands of attendees and influential speakers such as Mark Zuckerberg, the founders of Android, Instagram, and Twitter, among others. His company also produced Twitter’s and Stack Overflow’s developer conferences.

Following that, Ryan started an online Computer Science school. Under his leadership, the team grew to over 100 employees, educating more than 1,000,000 students. During this period, he secured $23 million in venture capital and earned recognition as Entrepreneur of the Year.

Over the last two years, Ryan has dived deep into AI and LLMs. He built an educational proof-of-concept called maple.coach, which focuses on teaching sales. The platform is built using technologies like Next.js, TypeScript, GPT-4, and Vercel.

Outside of work, Ryan shares his life with his wife of 20 years and their two teenagers in Connecticut. They enjoy spending their free time sailing and taking walks with their Sheltie, Brinkley.

// Related Links
Website: ryancarson.com

Connect with Ryan on LinkedIn: https://www.linkedin.com/in/ryancarson/
- 58 min
- 7 MAY 2024
FedML Nexus AI: Your Generative AI Platform at Scale // Salman Avestimehr // #230

Salman Avestimehr is a Dean's Professor, the inaugural director of the USC-Amazon Center for Secure and Trusted Machine Learning (Trusted AI), and director of the Information Theory and Machine Learning (vITAL) research lab. He is also the CEO and co-founder of FedML.

MLOps podcast #230 with Salman Avestimehr, CEO & Founder of FedML, FedML Nexus AI: Your Generative AI Platform at Scale.

A big thank you to FEDML for sponsoring this episode!

// Abstract
FedML is your generative AI platform at scale, enabling developers and enterprises to build and commercialize their own generative AI applications easily, scalably, and economically. Its flagship product, FedML Nexus AI, provides unique features in enterprise AI platforms, model deployment, model serving, AI agent APIs, launching training/inference jobs on serverless/decentralized GPU cloud, experiment tracking for distributed training, federated learning, security, and privacy.

// Bio
Salman is a professor, the inaugural director of the USC-Amazon Center for Secure and Trusted Machine Learning (Trusted AI), and the director of the Information Theory and Machine Learning (vITAL) research lab at the Electrical and Computer Engineering Department and Computer Science Department of the University of Southern California. Salman is also the co-founder and CEO of FedML. He received his Ph.D. in Electrical Engineering and Computer Sciences from UC Berkeley in 2008. Salman does research in the areas of information theory, decentralized and federated machine learning, secure and privacy-preserving learning, and computing.

// Related Links
https://www.avestimehr.com/
https://fedml.ai/

Connect with Salman on LinkedIn: https://www.linkedin.com/company/fedml/

Timestamps:
[00:00] AI Quality: First in-person conference on June 25
[01:28] Salman's preferred coffee
[01:49] Takeaways
[03:33] Please like, share, leave a review, and subscribe to our MLOps channels!
[03:53] Challenges that inspired Salman's work
[06:20] Controlled ownership
[08:11] Dealing with data leakage and privacy problems
[10:45] In-house ML Model Deployment
[13:36] FEDML: Comprehensive Model Deployment
[17:27] Integrating FEDML with Kubernetes
[19:46] AI Evaluation Trends
[24:37] Enhancing NLP with ML
[25:48] FEDML: Canary, A/B, Confidence
[29:36] FEDML customers
[33:21] On-premise platform for secure data management
[37:16] Future prediction: data's crucial for better applications
[38:18] Maturity in evaluating and improving steps
[41:38] Focus on ownership
[45:12] Benefits of smaller models for specific use cases
[48:57] Verify sensitive tasks, trust quick, important mobile content creation
[51:50] Wrap up
- 52 min
- 3 MAY 2024
What is AI Quality? // Mohamed Elgendy // #229

Mohamed Elgendy is the Co-Founder & CEO at Kolena. Previously, he was Director of Product and Engineering at Synapse Technology Corporation.

MLOps podcast #229 with Mohamed Elgendy, Co-founder & CEO of Kolena Inc., What is AI Quality?

// Abstract
Delve into the multifaceted concept of AI Quality. Demetrios and Mo explore the idea that AI quality depends on the specific domain, comparable to the difference in desired qualities between a $1 pen and a $100 pen. Mo names two pillars of AI Quality: a product performing in line with its intended functionality, and the absence of unknown risks. They emphasize the need for comprehensive quality checks and for standards that adapt to differing product traits. Issues affecting edge deployments, such as latency, are also highlighted. The core of the discussion covers the formation of gold standards for AI, the nuanced needs of various use cases, and the need for collaboration among AI builders, regulators, and infrastructure firms. Mo introduces the AI Quality Conference, which aims to set tangible, effective, but innovation-friendly quality standards for AI. The dialogue also stresses the urgent need for diversity and representation in the tech industry, the variability of standards and regulations, and the pivotal role of testing in AI and machine learning. The episode concludes with a look at how better testing can streamline the entire machine learning process.

// Bio
Mohamed is the Co-founder & CEO of Kolena and the author of the book “Deep Learning for Vision Systems”. Previously, he built and managed AI/ML organizations at Amazon, Twilio, Rakuten, and Synapse. Mohamed regularly speaks at AI conferences like Amazon's DevCon, O'Reilly's AI conference, and Google's I/O.

// Related Links
Website: www.kolena.io
Deep Learning for Vision Systems book: https://www.amazon.com/Learning-Vision-Systems-Mohamed-Elgendy/dp/1617296198/

Connect with Mo on LinkedIn: https://www.linkedin.com/in/moelgendy/

Timestamps:
[00:00] Mo's preferred coffee
[00:07] Takeaways
[02:52] See you all in San Francisco on June 25!
[03:04] Please like, share, leave a review, and subscribe to our MLOps channels!
[03:22] AI Quality in Mo's eyes
[08:36] Quality Standards for Software
[14:11] Common Chatbot Functionality
[19:20] The Birth of Innovation
[24:27] Transforming Insights into Standards
[30:27] Testing: One step to quality
[34:58] Two different data points to be harmonized
[37:29] Model cards
[39:12] Test Coverage Democratizes Collaboration
[42:55] Representation matters
[44:50] Wrap up
- 45 min
- 30 APR 2024
Handling Multi-Terabyte LLM Checkpoints // Simon Karasik // #228

Simon Karasik is a proactive and curious ML Engineer with 5 years of experience. He has developed and deployed ML models at web and big scale for Ads and Tax.

Huge thank you to Nebius AI for sponsoring this episode. Nebius AI - https://nebius.ai/

MLOps podcast #228 with Simon Karasik, Machine Learning Engineer at Nebius AI, Handling Multi-Terabyte LLM Checkpoints.

// Abstract
The talk provides a gentle introduction to LLM checkpointing: why it is hard and how big the checkpoints are. It covers various tips and tricks for saving and loading multi-terabyte checkpoints, as well as the selection of cloud storage options for checkpointing.
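
To give a rough sense of why checkpoints reach the multi-terabyte range, here is a back-of-the-envelope sketch. It is not taken from the talk. The per-parameter byte counts assume mixed-precision training with an Adam-style optimizer (bf16 weights, an fp32 master copy, and two fp32 optimizer moments), and the model sizes are illustrative round numbers.

```python
# Back-of-the-envelope checkpoint size estimate.
# ASSUMPTION: mixed-precision training with an Adam-style optimizer.
# These byte counts are illustrative, not figures from the episode.
BYTES_PER_PARAM = {
    "bf16_weights": 2,         # half-precision model weights
    "fp32_master_weights": 4,  # full-precision copy kept for the optimizer
    "adam_first_moment": 4,    # fp32 running mean of gradients
    "adam_second_moment": 4,   # fp32 running mean of squared gradients
}


def checkpoint_size_bytes(num_params: int) -> int:
    """Total bytes needed to store weights plus optimizer state."""
    return num_params * sum(BYTES_PER_PARAM.values())


def to_terabytes(num_bytes: int) -> float:
    """Convert bytes to decimal terabytes (1 TB = 10**12 bytes)."""
    return num_bytes / 1e12


if __name__ == "__main__":
    # Hypothetical model sizes, chosen only to show the scaling.
    for name, n in [("7B", 7_000_000_000),
                    ("70B", 70_000_000_000),
                    ("175B", 175_000_000_000)]:
        size_tb = to_terabytes(checkpoint_size_bytes(n))
        print(f"{name} parameters: ~{size_tb:.2f} TB per checkpoint")
```

Under these assumptions each parameter costs 14 bytes, so a 175B-parameter model would need about 2.45 TB per full checkpoint, before counting multiple retained checkpoints.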

// Bio
Full-stack Machine Learning Engineer, currently working on infrastructure for LLM training, with previous experience in ML for Ads, Speech, and Tax.

// Related Links

Connect with Simon on LinkedIn: https://www.linkedin.com/in/simon-karasik/

Timestamps:
[00:00] Simon's preferred beverage
[01:23] Takeaways
[04:22] Simon's tech background
[08:42] Zombie models garbage collection
[10:52] The road to LLMs
[15:09] Trained models Simon worked on
[16:26] LLM Checkpoints
[20:36] Confidence in AI Training
[22:07] Different Checkpoints
[25:06] Checkpoint parts
[29:05] Slurm vs Kubernetes
[30:43] Storage choices lessons
[36:02] Paramount components for setup
[37:13] Argo workflows
[39:49] Kubernetes node troubleshooting
[42:35] Cloud virtual machines have pre-installed monitoring
[45:41] Fine-tuning
[48:16] Storage, networking, and complexity in network design
[50:56] Start simple before advanced; consider model needs.
[53:58] Join us at our first in-person conference on June 25 all about AI Quality
- 55 min