Towards Reliable Evaluation of Large Language Models (LLMs)


Large Language Models (LLMs) have become ubiquitous in today's technology landscape due to their remarkable ability to "understand" and generate human-like text. They are used in a wide array of applications, from chatbots to content creation. But how do you properly evaluate the quality of such models?
This talk gives an overview of current approaches to evaluating LLMs and their respective shortcomings. It then presents a statistical framework, developed by researchers at the ZHAW Datalab, to determine how reliable an evaluation method is and how much data (human-annotated vs. automatically generated) is needed. Finally, the talk shows how this framework can be used to implement trustworthy real-world evaluation settings for LLMs.
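The abstract does not spell out the framework itself, but the basic question (how reliable is an evaluation method, and how much human-annotated data is needed) can be illustrated with a generic bootstrap estimate of the agreement between an automatic metric and human judgments. The sketch below is a minimal, hypothetical example with synthetic data; it is not the ZHAW Datalab framework, and all names and numbers are made up for illustration.

```python
import numpy as np

def bootstrap_ci(metric_scores, human_scores, n_boot=2000, alpha=0.05, seed=0):
    """Bootstrap confidence interval for the Pearson correlation between
    an automatic metric and human judgments on the same set of outputs.
    (Illustrative only; the talk's actual framework may differ.)"""
    rng = np.random.default_rng(seed)
    n = len(metric_scores)
    corrs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)  # resample outputs with replacement
        corrs.append(np.corrcoef(metric_scores[idx], human_scores[idx])[0, 1])
    lo, hi = np.quantile(corrs, [alpha / 2, 1 - alpha / 2])
    return lo, hi

# Synthetic example: human scores plus a noisy automatic metric.
rng = np.random.default_rng(42)
human = rng.normal(size=500)
metric = 0.7 * human + rng.normal(scale=0.7, size=500)

# The interval narrows as more human-annotated examples are used,
# which is one way to reason about "how much data is needed".
for n in (50, 200, 500):
    lo, hi = bootstrap_ci(metric[:n], human[:n])
    print(f"n={n:3d}: 95% CI for correlation = [{lo:.2f}, {hi:.2f}]")
```

Running this shows the confidence interval for the metric-human correlation shrinking as the number of annotated examples grows, which is the kind of trade-off between annotation cost and evaluation reliability the talk addresses.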

Speaker:

Mark Cieliebak
Professor
Zurich University of Applied Sciences
