AI Seminar Series: Marlos C. Machado - Autonomous nav of stratospheric balloons using RL (Jan 22)

Marlos C. Machado presents "Autonomous navigation of stratospheric balloons using reinforcement learning" at the AI Seminar (January 22, 2021).

The Artificial Intelligence (AI) Seminar is a weekly meeting at the University of Alberta where researchers interested in AI can share their research. Presenters include both local speakers from the University of Alberta and visitors from other institutions. Topics related in any way to artificial intelligence, from foundational theoretical work to innovative applications of AI techniques to new fields and problems, are explored.

Bio:
Marlos C. Machado is a research scientist at DeepMind Alberta. His research interests lie broadly in artificial intelligence and particularly focus on reinforcement learning. He received his B.Sc. and M.Sc. from Universidade Federal de Minas Gerais, in Brazil, and his Ph.D. from the University of Alberta, where he introduced the idea of temporally extended exploration through options. He was a research scientist at Google Brain from 2019 to 2021, during which time he made major contributions to reinforcement learning, in particular the introduction of an operator view of policy gradient methods and the application of deep reinforcement learning to control Loon’s stratospheric balloons. Marlos C. Machado is also an adjunct professor at the University of Alberta.

Abstract:
Efficiently navigating a superpressure balloon in the stratosphere requires the integration of a multitude of cues, such as wind speed and solar elevation, and the process is complicated by forecast errors and sparse wind measurements. Coupled with the need to make decisions in real time, these factors rule out the use of conventional control techniques. This talk describes the use of reinforcement learning to create a high-performing flight controller for Loon superpressure balloons. Our algorithm uses data augmentation and a self-correcting design to overcome the key technical challenge of reinforcement learning from imperfect data, which has proved to be a major obstacle to its application to physical systems. We deployed our controller to station Loon balloons at multiple locations across the globe, including a 39-day controlled experiment over the Pacific Ocean. Analyses show that the controller outperforms Loon’s previous algorithm and is robust to the natural diversity in stratospheric winds. These results demonstrate that reinforcement learning is an effective solution to real-world autonomous control problems in which neither conventional methods nor human intervention suffice, offering clues about what may be needed to create artificially intelligent agents that continuously interact with real, dynamic environments.
