Overview
Complex, long-horizon, sparse-reward robotics tasks are challenging for end-to-end learning methods to scale to. By contrast, planning-based approaches have shown great potential for handling such tasks effectively. One of the major criticisms of planning-based approaches, however, has been the lack of accurate world models (i.e., abstractions) to plan with.
There has been a renewed interest in using learning-based approaches to learn symbolic representations that support planning. However, this research is often fragmented into disjoint sub-communities such as task and motion planning, reinforcement learning (hierarchical, model-based), planning with formal logic, planning with natural language (language models), and neuro-symbolic AI. This workshop aims to create a common forum to share insights, discuss key questions, and chart a path forward via abstraction.
We aim to facilitate this bridge-building in two ways: (1) a diverse selection of papers and invited speakers; and (2) a highly interactive workshop. Concretely, the workshop will highlight approaches that use different learning methods, mainly to learn symbolic and composable representations of world models. Key questions for discussion include:
- What is the right objective for abstraction learning for robotic planning? To what extent should we consider factors such as soundness, completeness, the target planner, planning efficiency, and the task distribution?
- How can abstraction-based systems, in the role of data collectors, facilitate contemporary learning methods such as imitation learning? Can we identify the explicit and implicit priors that enable efficient long-horizon task learning, and can we derive them from abstraction-based planning systems?
- What level of abstraction is needed for it to be effective? How general-purpose or task-specific do these abstractions have to be for long-term autonomy? Do learned abstractions need to be hierarchical, or can a single level suffice?
- How can existing pre-trained foundation models, such as large language models (LLMs) and vision-language models (VLMs), be used to learn symbolic abstractions while ensuring guarantees of soundness and correctness? How can we incorporate cost-sensitive reasoning and plan quality into such abstractions to produce efficient and executable plans for embodied agents?
- How can learned abstractions enable safer and decidable outcomes for robot skills learned in the form of robot action foundation models such as OpenVLA, Pi, and LBMs? Can we learn symbolic models of such action foundation models that enable off-the-shelf planners to be used in open-world settings?
- When, where, and from what data should abstractions be learned? Should they be learned as priors in the robot factory, using expert demonstrations, or in the “wild” from interaction with humans or the world? What are the trade-offs between top-down operator construction (e.g., via symbolic abstraction or language) and bottom-up operator discovery (e.g., through exploration or policy learning), and how do they affect generalization and planning efficiency?
Following the success of the previous two offerings of the workshop at CoRL 2023 and CoRL 2024, we propose a third iteration at CoRL 2025. The previous iterations received 26 and 55 submissions, respectively; these submissions highlighted important characteristics and challenges of learning abstractions while showcasing the abstraction capabilities of pre-trained LLMs and VLMs. This iteration will place a stronger emphasis on work that uses high-dimensional data and pre-trained models for learning sound and complete abstractions that enable cost-effective and reliable planning.
Areas of Interest
We solicit papers on the following topics:
- Learning generalizable and composable representations for robot planning
- Learning for task and motion planning (TAMP)
- Learning state abstractions and action abstractions
- Natural language as an abstraction for learning-based planning
- Learning other knowledge representations for planning
- Learning for hierarchical planning
- Learning for LTL-based planning
- Neuro-symbolic approaches for task and motion planning
- Hierarchical reinforcement learning for robotics
Submission Guidelines
We solicit workshop paper submissions relevant to the above call, of the following types:
- Long papers - up to 8 pages plus unlimited references / appendices
- Short papers - up to 4 pages plus unlimited references / appendices
Please format submissions in the CoRL or IEEE conference (ICRA or IROS) style. Submissions do not need to be anonymized. If you are submitting a paper rejected from another conference, please ensure that the reviewers' comments are addressed prior to submission.
Note: Please feel free to submit work under review or accepted for presentation at other workshops and/or conferences as we will not require copyright transfer.
We are now accepting submissions through our OpenReview portal.
Note: The CoRL workshop organizers have requested that we do not accept submissions that are already accepted at the main CoRL conference. We kindly ask authors to respect this policy when submitting to our workshop.
Important Dates
| Event | Date |
| --- | --- |
| Paper Submission Deadline | Aug 20, 2025 (Early); Sep 5, 2025 (Late) |
| Author Notification | Sep 5, 2025 |
| Camera-ready Version Due | Sep 22, 2025 |
| Workshop | Sep 27, 2025 |
We offer two submission deadlines to accommodate different planning needs: an early deadline and a late deadline. The early deadline is intended for authors who may require quicker notification for travel, visa, or funding arrangements. Submissions received by the early deadline will be reviewed promptly, and notifications will be sent within approximately two weeks of submission. Authors who do not require early feedback are welcome to submit by the late deadline.
Schedule
Session 1

| Time | Event |
| --- | --- |
| 1:30 PM - 1:35 PM | Welcome Remarks |
| 1:35 PM - 2:05 PM | Invited Talk: Danfei Xu |
| 2:05 PM - 2:35 PM | Invited Talk: Emre Ugur |
| 2:35 PM - 3:00 PM | Poster Lightning Talks |
| 3:00 PM - 3:30 PM | Coffee Break + Poster Session |
| 3:30 PM - 3:55 PM | Poster Session |
| 3:55 PM - 4:00 PM | Best Paper Award |
| 4:05 PM - 4:35 PM | Invited Talk: Rohan Paul |
Invited Speakers
- Danfei Xu, Georgia Institute of Technology, USA
- Emre Ugur, Bogazici University, Turkey
- Rohan Paul, IIT Delhi, India
Danfei Xu: Generative Task and Motion Planning
Long-horizon planning is fundamental to our ability to solve complex physical problems, from using tools to cooking dinner. Despite recent progress in commonsense-rich foundation models, robots still lack this ability. In this talk, I will present a body of work that aims to transform Task and Motion Planning (one of the most powerful computational frameworks in manipulation planning) into a fully generative model framework, enabling compositional generalization in a predominantly data-driven approach. I will explore how to chain together modular diffusion-based skills through iterative forward-backward denoising, how to formulate TAMP as a factor graph problem with generative models serving as learned constraints for planning, and how to integrate task and motion planning within a single generative process.
Emre Ugur: DeepSym: A Neuro-symbolic Approach for Symbol Emergence and Planning
Abstract reasoning is among the most essential characteristics of high-level intelligence that distinguish humans from other animals. If robots can achieve abstract reasoning on their own, they can perform new tasks in completely novel environments by updating their cognitive skills or by discovering new symbols and rules. Towards this goal, we propose a novel general framework, DeepSym, which discovers interaction-grounded, discrete object, action, and effect categories and builds probabilistic rules for non-trivial action planning. In DeepSym, our robot interacts with objects using an initial action repertoire and observes the effects it can create in the environment. To form interaction-grounded object, action, effect, and relational categories, we employ a binary bottleneck layer in a predictive, deep encoder-decoder network that takes the image of the scene and the action parameters as input and generates the resulting effects in the scene in pixel coordinates. The knowledge represented by the neural network is distilled into rules and expressed in the Probabilistic Planning Domain Definition Language (PPDDL), allowing off-the-shelf planners to operate on the knowledge extracted from the robot's sensorimotor experience.
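The key mechanism in the abstract above, a binary bottleneck that turns continuous encoder activations into discrete symbols, can be illustrated with a minimal numpy sketch. All dimensions, weights, and function names here are toy placeholders, not the actual DeepSym architecture, which operates on scene images and is trained end-to-end:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical): observation, action, symbol, and effect sizes.
OBS_DIM, ACT_DIM, SYM_DIM, EFF_DIM = 16, 4, 3, 16

# Randomly initialized weights stand in for a trained encoder/decoder.
W_enc = rng.standard_normal((OBS_DIM + ACT_DIM, SYM_DIM))
W_dec = rng.standard_normal((SYM_DIM + ACT_DIM, EFF_DIM))

def binary_bottleneck(logits):
    """Hard 0/1 activations. During training, a straight-through-style
    estimator would let gradients flow through this discretization."""
    return (logits > 0.0).astype(np.float64)

def forward(obs, act):
    # Encoder: observation + action parameters -> discrete symbol code.
    symbols = binary_bottleneck(np.concatenate([obs, act]) @ W_enc)
    # Decoder: symbols + action parameters -> predicted effect.
    effect = np.concatenate([symbols, act]) @ W_dec
    return symbols, effect

obs = rng.standard_normal(OBS_DIM)
act = rng.standard_normal(ACT_DIM)
symbols, effect = forward(obs, act)
print(symbols)  # a 3-bit discrete code, e.g. ones and zeros
```

Because the bottleneck is binary, each distinct code can be treated as a symbolic category, which is what makes it possible to distill the network's predictions into PPDDL-style rules for off-the-shelf planners.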
Rohan Paul: Towards Generalisation in Robot Instruction Following via Abstractions
Robot instruction following entails translating a human's intent, expressed as high-level language descriptions, into contextually grounded plans for the robot to execute. The problem is challenging due to the abstract nature of human instructions, the large variability of possible tasks, and the long-horizon spatio-temporal reasoning required for plan synthesis. This talk will discuss recent work on acquiring grounded abstractions from human-annotated demonstrations, leveraging sub-goals in long-horizon skill learning, and using VLMs as a source of instruction-specific symbolic knowledge. Overall, we hope to uncover the role of abstractions in aiding long-range reasoning as well as in bridging the human's intent and the robot's world model.
Accepted Papers
- From Skills to TAMP: Learning Portable Symbolic Representations for Task and Motion Planning
Naman Shah, Benned Hedegaard, Yichen Wei, Ziyi Yang, Alper Ahmetoglu, Stefanie Tellex, George Konidaris
- Learning to Plan & Schedule with Reinforcement-Learned Bimanual Robot Skills
Weikang Wan, Fabio Ramos, Xuning Yang, Caelan Reed Garrett
- SymSkill: Symbol and Skill Co-Invention for Data-Efficient and Real-Time Long-Horizon Manipulation
Yifei Simon Shao, Yuchen Zheng, Sunan Sun, Nadia Figueroa
- Preference-Based Long-Horizon Robotic Stacking with Multimodal Large Language Models
Wanming Yu, Adrian Röfer, Abhinav Valada, Sethu Vijayakumar
- Zero-order optimization with contact priors for locomotion
Victor Dhédin, Majid Khadiv
- Uncertainty-Aware Planning with Generative World- and Language-Models via Monte Carlo Tree Search
Magà Dalmau-Moreno, Nestor Garcia, Vicenç Gomez
- Learning Sound Symbolic Abstractions from VLMs for Efficient Task and Motion Planning on CALVIN
Bayron Jossue Serrano Mena
- Gondola: Grounded Vision Language Planning for Generalizable Robotic Manipulation
Shizhe Chen, Ricardo Garcia Pinel, Paul Pacaud, Cordelia Schmid
- SEAL: Safe and Efficient Abstraction Learning for Robotic Planning
Akhil R Kurup, Ravi Prakash
- Diffusion-Guided Q-Learning for Offline RL: Adaptive Revaluation for Long-Horizon Decision Making
Jaehyun Park, Yunho Kim, Sejin Kim, Byung-Jun Lee, Sundong Kim
- SPUR: Scaling Reward Learning from Human Demonstrations
Anthony Liang, Yigit Korkmaz, Jiahui Zhang, Jesse Zhang, Abrar Anwar, Sidhant Kaushik, Yufei Wang, Yu Xiang, David Held, Dieter Fox, Abhishek Gupta, Stephen Tu, Erdem Biyik
- Hierarchical Vision-Language-Action Policies for Global Reasoning in Assembly Tasks
Moritz Hesche, Moritz Reuss, Marc Forstenhäusler, Rudolf Lioutikov
- BEAST: Efficient Tokenization of B-Splines Encoded Action Sequences for Imitation Learning
Hongyi Zhou, Weiran Liao, Xi Huang, Yucheng Tang, Fabian Otto, Xiaogang Jia, Xinkai Jiang, Simon Hilber, Ge Li, Qian Wang, Ömer Erdinç Yağmurlu, Nils Blank, Moritz Reuss, Rudolf Lioutikov
- Unleashing Humanoid Reaching Potential via Real-world-Ready Skill Space
Zhikai Zhang, Chao Chen, Han Xue, Jilong Wang, Sikai Liang, Yun Liu, Zongzhang Zhang, He Wang, Li Yi
- Learning 3D Scene Analogies with Neural Contextual Scene Maps
Junho Kim, Gwangtak Bae, Eun Sun Lee, Young Min Kim
- Improving Pre-Trained Vision-Language-Action Policies with Model-Based Search
Cyrus Neary, Omar G. Younis, Artur Kuramshin, Ozgur Aslan, Glen Berseth
- If You Can Make an Omelette, Can You Crack an Egg? Probing Zero-Shot Subtask Generalization in Vision-Language-Action Models
Grigorii Guz, Giuseppe Carenini, Mathias Lécuyer, Michiel van de Panne, Vered Shwartz
- From Code to Action: Hierarchical Learning of Diffusion-VLM Policies
Markus Peschl, Pietro Mazzaglia, Daniel Dijkman
- HELIOS: Hierarchical Exploration for Language-grounded Interaction in Open Scenes
Katrina Ashton, Chahyon Ku, Shrey Shah, Wen Jiang, Kostas Daniilidis, Bernadette Bucher
- Robotic Manipulation by Imitating Generated Videos Without Physical Demonstrations
Shivansh Patel, Shraddhaa Mohan, Hanlin Mai, Unnat Jain, Svetlana Lazebnik, Yunzhu Li
- Fac-TDMPC: Learning an Efficient and Robust Factored World Model for Robot Planning
Yuan Zhang, Jianhong Wang, Joschka Boedecker
- Model Predictive Adversarial Imitation Learning for Planning from Observation
Tyler Han, Yanda Bao, Bhaumik Mehta, Gabriel Guo, Anubhav Vishwakarma, Emily Kang, Sanghun Jung, Rosario Scalise, Jason Liren Zhou, Bryan Xu, Byron Boots
- Hybrid Thinking in Vision-Language-Action Models
Pietro Mazzaglia, Cansu Sancaktar, Markus Peschl, Daniel Dijkman
- Touch begins where vision ends: Generalizable policies for contact-rich manipulation
Zifan Zhao, Siddhant Haldar, Jinda Cui, Lerrel Pinto, Raunaq Bhirangi
Organizing Committee
- AI2, USA
- Princeton University, USA
- George Mason University, USA
- Georgia Institute of Technology, USA
- Stanford University, USA
- Korea Advanced Institute of Science & Technology, South Korea
- TU Darmstadt, Germany