AI Agents: Substance or Snake Oil with Arvind Narayanan - 704

Today, we're joined by Arvind Narayanan, professor of computer science at Princeton University, to discuss his recent work: AI Agents That Matter - https://arxiv.org/abs/2407.01502 and AI Snake Oil - https://www.aisnakeoil.com/. In “AI Agents That Matter”, we explore the range of agentic behaviors, the challenges in benchmarking agents, and the ‘capability and reliability gap’, which creates risks when deploying AI agents in real-world applications. We also discuss the importance of verifiers as a technique for safeguarding agent behavior. We then dig into the AI Snake Oil book, which uncovers examples of problematic and overhyped claims in AI. Arvind shares various examples of failed AI applications, outlines a taxonomy of AI risks, and offers his perspective on AI’s catastrophic risks. Additionally, we touch on different approaches to LLM-based reasoning, his views on tech policy and regulation, and his work on CORE-Bench - https://arxiv.org/abs/2409.11363, a benchmark designed to measure AI agents' accuracy on computational reproducibility tasks.

🎧 / 🎥 Listen or watch the full episode on our page: https://twimlai.com/go/704.

🔔 Subscribe to our channel for more great content just like this: https://youtube.com/twimlai?sub_confi...


🗣️ CONNECT WITH US!
===============================
Subscribe to the TWIML AI Podcast: https://twimlai.com/podcast/twimlai/
Follow us on Twitter:   / twimlai  
Follow us on LinkedIn:   / twimlai  
Join our Slack Community: https://twimlai.com/community/
Subscribe to our newsletter: https://twimlai.com/newsletter/
Want to get in touch? Send us a message: https://twimlai.com/contact/


📖 CHAPTERS
===============================
00:00 - Introduction
02:40 - Motivations for research agenda
06:03 - Challenges in AI agents
10:59 - Constrained environments for agents
16:51 - Factors in defining agents
19:26 - AI Agents That Matter
23:03 - CORE-Bench
30:54 - Approaches to LLM-based reasoning
35:41 - Snake Oil
37:04 - Problematic and overhyped claims
40:06 - Examples of dangerous applications of AI
43:57 - Tech policy and legislation
49:28 - AI catastrophic risks


🔗 LINKS & RESOURCES
===============================
AI Agents That Matter - https://arxiv.org/abs/2407.01502
AI Snake Oil - https://www.aisnakeoil.com/
CORE-Bench: Fostering the Credibility of Published Research Through a Computational Reproducibility Agent Benchmark - https://arxiv.org/abs/2409.11363
On the Societal Impact of Open Foundation Models - https://crfm.stanford.edu/open-fms/pa...


📸 Camera: https://amzn.to/3TQ3zsg
🎙️Microphone: https://amzn.to/3t5zXeV
🚦Lights: https://amzn.to/3TQlX49
🎛️ Audio Interface: https://amzn.to/3TVFAIq
🎚️ Stream Deck: https://amzn.to/3zzm7F5
