Towards Cost-Efficient Use of Pre-trained Models

Abstract: Large language models are leading to many exciting breakthroughs, but this comes at a significant cost in terms of both computational and data labeling expenses. Training state-of-the-art models requires access to high-end GPUs for pre-training and inference, in addition to labeled data for fine-tuning. In this talk I will examine the tradeoff between these costs, with the goal of supporting better decisions. Conventional wisdom holds that annotating data is expensive, so computational methods that use unlabeled data to improve performance can present an economical alternative. I will examine this assumption in the context of pretraining-based adaptation, which requires significant computation for each new domain. As a second example where the tradeoff between computation and annotation arises, I will show that training and then distilling large models can be an economical strategy for improving performance. Finally, I will discuss applications on chemical synthesis protocols, and show a demo of a system that can help chemists to more efficiently find experimental conditions described in the literature. I will also present a new approach to extracting data from tables in scientific articles where the only supervision provided to the model is a database schema, eliminating the need for labeled data or custom extraction pipelines.

Bio: Alan Ritter is an associate professor in the College of Computing at Georgia Tech. His research on natural language processing aims to solve technical challenges that help machines read the web and engage in safe and helpful dialogue with people. In a recent project, covered by WIRED (https://www.wired.com/story/machine-l...), Alan's group built a system that reads millions of online messages for mentions of new software vulnerabilities. He completed his Ph.D. at the University of Washington and was a postdoctoral fellow in the Machine Learning Department at Carnegie Mellon. Alan is the recipient of an NSF CAREER award and an Amazon Research Award.
