Scaling AI inference with open source ft. Brian Stevens | Technically Speaking with Chris Wright

Video description

How are enterprises re-imagining AI for real-world impact? Chris Wright, Red Hat CTO and SVP of Global Engineering, sits down with Brian Stevens, Red Hat SVP and AI CTO, to discuss the journey toward production-quality AI inference at scale. They explore the critical role of open source projects like vLLM, the evolution from CPU to GPU optimization, and the parallels between today's AI challenges and the early days of enterprise Linux.

00:00:37 - Brian Stevens on returning to Red Hat & parallels with early Linux
00:02:00 - The path from cloud to AI & the impact of ChatGPT
00:03:58 - Pivoting to GPUs & the rise of vLLM for generative AI
00:05:48 - From CPU sparsification to GPU model compression
00:08:00 - Optimizing for modern GPUs with vLLM
00:11:38 - An "AI Operating System"? Integrating vLLM with Kubernetes
00:15:31 - vLLM: A common platform for diverse AI hardware & models
00:17:41 - The importance of distributed KV cache for scalable inference
00:22:53 - Inference-time scaling, reasoning, and platform demands
00:25:10 - Ecosystem & Community: The key to AI's future

Learn More:
Red Hat AI Solutions: https://www.redhat.com/en/products/ai
vLLM Project: https://docs.vllm.ai/
vLLM GitHub: https://github.com/vllm-project/vllm

Follow us:
Chris Wright (LinkedIn):   / chris-wright-b733851
Brian Stevens (LinkedIn):   / brianmarkstevens

What is Technically Speaking?
Technically Speaking taps into emerging technology trends with insights from leading experts across the globe and Red Hat CTO Chris Wright. The series blends deep-dive discussions, tech updates, and creative short-form content, solidifying Red Hat’s role as a pioneer in technology innovation and open source thought leadership.

Want to participate? Leave us a comment if there's a topic or a guest you'd like to see featured.

Watch More Technically Speaking:
YouTube playlist: Technically Speaking with Chris Wright
Show Page: https://www.redhat.com/en/technically...
Subscribe to Red Hat's YouTube channel: https://www.youtube.com/redhat/?sub_c...

#RedHat #TechnicallySpeaking #AIInference #vLLM #EnterpriseAI #OpenSource #BrianStevens #PracticalAI #llmd
