September 29
Season 1, Episode 6
19 minutes

Kaito: AI/ML inference & tuning in Kubernetes: with Ernest Wong: Upstream@AKS: Azure Kubernetes Svc

Lachie Evenson and Ernest Wong discuss Kaito.

KAITO is an operator that automates AI/ML model inference and tuning workloads in a Kubernetes cluster. The target models are popular open-source large models such as phi-4 and llama.

Related Links
CNCF: https://www.cncf.io/
KAITO: https://github.com/kaito-project/kaito
Get involved with KAITO: https://github.com/kaito-project/kaito?tab=readme-ov-file#get-involved
Packaging models as a container image: https://kaito-project.github.io/kaito/docs/model-as-oci-artifacts
KAITO preset models: https://kaito-project.github.io/kaito/docs/presets
GPU Operator: https://github.com/NVIDIA/gpu-operator
Retrieval-Augmented Generation (RAG): https://kaito-project.github.io/kaito/docs/rag/

Show: Upstream @ AKS
Frequency: Updated weekly
Published: September 29, 2025 at 11:00 AM UTC
Length: 19 minutes
Season: 1
Episode: 6
Age rating: Suitable for children
