Hosted on September 11th, 2025 at UWaterloo. Event link: https://luma.com/9i3ogts7
Papers covered:
1. Physical Intelligence, 2024, Pi0: A Vision-Language-Action Flow Model for General Robot Control, presented by Steven Gong
2. TRI LBM Team, 2025, A Careful Examination of Large Behavior Models for Multitask Dexterous Manipulation, presented by Krish Mehta
Additional Reading List:
Brohan, et al., 2022, RT-1: Robotics Transformer for Real-World Control at Scale
Brohan, et al., 2023, RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control
Open X-Embodiment Collaboration, 2023, Open X-Embodiment: Robotic Learning Datasets and RT-X Models
Chi, et al., 2023, Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
Liu, et al., 2024, RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation
Etukuru, et al., 2024, Robot Utility Models: General Policies for Zero-Shot Deployment in New Environments
Kim, et al., 2024, OpenVLA: An Open-Source Vision-Language-Action Model
Cheang, et al., 2024, GR-2: A Generative Video-Language-Action Model with Web-Scale Knowledge for Robot Manipulation
Octo Model Team, 2024, Octo: An Open-Source Generalist Robot Policy
Fang, et al., 2025, Robix: A Unified Model for Robot Interaction, Reasoning and Planning
NVIDIA, 2025, GR00T N1: An Open Foundation Model for Generalist Humanoid Robots
Yang, et al., 2025, FP3: A 3D Foundation Policy for Robotic Manipulation
Lee, et al., 2025, MolmoAct: Action Reasoning Models that can Reason in Space