Today we have Philip Kiely from Baseten on the show. Baseten is a Series B startup focused on providing infrastructure for AI workloads.
We go deep on inference optimization: choosing a model, the hype around compound AI, choosing an inference engine, optimization techniques such as quantization and speculative decoding, and all the way down to your GPU choice.
Information
- Show
- Frequency: Updated Daily
- Published: November 5, 2024 at 9:14 PM UTC
- Length: 1h 4m
- Rating: Clean