Optimization and Practice of Large Language Model Inference Systems | SSSC

Video description

Speaker: Xiaozhe Yao
PhD Candidate, Department of Computer Science, ETH Zürich

Xiaozhe Yao is a PhD student in the Department of Computer Science at ETH Zürich and holds a Master’s degree from the University of Zurich. His research focuses on achieving more efficient machine learning systems through innovations in system design, data organization, and data quality management. His work aims to better coordinate the interplay between computational efficiency, data management, and algorithmic innovation in modern ML systems.

Talk: “Optimization and Practice of Large Language Model Inference Systems”

With the rapid adoption of Large Language Models (LLMs) in recent years, efficient inference and deployment have become critical challenges in modern machine learning infrastructure. Because LLMs generate output auto-regressively, one token at a time, traditional inference methods often suffer from high data movement overhead, low compute utilization, and high serving costs.

Improving inference performance requires system-level optimizations across data flow, parallel computation, and memory management. In this talk, Xiaozhe will introduce various optimization techniques and the scenarios each one suits, and share insights from practical work on the SwissAI Serving Platform. The session will cover key challenges and opportunities in building scalable, efficient inference and serving systems for large language models.
