Mechanistic Interpretability for AI Alignment with Callum McDougall

Dive into the fascinating world of mechanistic interpretability with Callum McDougall. Explore real-world case studies, from indirect object identification to steering vectors, and discover how reverse-engineering AI models can contribute to alignment. Learn practical approaches to making AI systems less of a black box through causal interventions and circuit analysis.

See the hackathon and sign up here: https://www.apartresearch.com/event/r.... Join us online for the live Q&A and project presentations at https://discord.gg/4HtDthHe?event=130...

Join future hackathons at https://apartresearch.com/sprints.
