
Kaito: AI/ML inference & tuning in Kubernetes: with Ernest Wong: Upstream@AKS: Azure Kubernetes Svc
Lachie Evenson and Ernest Wong discuss KAITO.
KAITO is an operator that automates AI/ML model inference and tuning workloads in a Kubernetes cluster. It targets popular open-source large models such as phi-4 and llama.

Related Links
- CNCF: https://www.cncf.io/
- KAITO: https://github.com/kaito-project/kaito
- Get involved with KAITO: https://github.com/kaito-project/kaito?tab=readme-ov-file#get-involved
- Packaging models as a container image: https://kaito-project.github.io/kaito/docs/model-as-oci-artifacts
- KAITO preset models: https://kaito-project.github.io/kaito/docs/presets
- GPU Operator: https://github.com/NVIDIA/gpu-operator
- Retrieval-Augmented Generation (RAG): https://kaito-project.github.io/kaito/docs/rag/
Information
- Show
- Frequency: Updated weekly
- Published: September 29, 2025 at 11:00 AM UTC
- Length: 19 minutes
- Season: 1
- Episode: 6
- Rating: Clean