# arXiv Paper Digest - 2026-03-30

*Generated on 2026-03-30 10:17:09*

Found 10 relevant papers.

---

## 1. [Knowledge-Guided Manipulation Using Multi-Task Reinforcement Learning](http://arxiv.org/abs/2603.24083v1)

**Authors:** Aditya Narendra, Mukhammadrizo Maribjonov, Dmitry Makarov, Dmitry Yudin, Aleksandr Panov

**Published:** 2026-03-25

**Categories:** cs.RO, cs.AI, cs.LG

**Relevance Score:** 0.737 (Keyword matches: 3)

**Topic:** Robotics Manipulation

**Abstract:** This paper introduces Knowledge Graph based Massively Multi-task Model-based Policy Optimization (KG-M3PO), a framework for multi-task robotic manipulation in partially observable settings that unifies Perception, Knowledge, and Policy. The method augments egocentric vision with an online 3D scene g...

[PDF](https://arxiv.org/pdf/2603.24083v1) | [arXiv](http://arxiv.org/abs/2603.24083v1)

---

## 2. [Meta-Adaptive Beam Search Planning for Transformer-Based Reinforcement Learning Control of UAVs with Overhead Manipulators under Flight Disturbances](http://arxiv.org/abs/2603.26612v1)

**Authors:** Hazim Alzorgan, Sayed Pedram Haeri Boroujeni, Abolfazl Razi

**Published:** 2026-03-27

**Categories:** cs.RO

**Relevance Score:** 0.647 (Keyword matches: 2)

**Topic:** Robotics Manipulation

**Abstract:** Drones equipped with overhead manipulators offer unique capabilities for inspection, maintenance, and contact-based interaction. However, the motion of the drone and its manipulator is tightly linked, and even small attitude changes caused by wind or control imperfections shift the end-effector away...

[PDF](https://arxiv.org/pdf/2603.26612v1) | [arXiv](http://arxiv.org/abs/2603.26612v1)

---

## 3. [Closed-Loop Verbal Reinforcement Learning for Task-Level Robotic Planning](http://arxiv.org/abs/2603.22169v1)

**Authors:** Dmitrii Plotnikov, Iaroslav Kolomiets, Dmitrii Maliukov, Dmitrij Kosenkov, Daniia Zinniatullina, et al. (+8 more)

**Published:** 2026-03-23

**Categories:** cs.RO

**Relevance Score:** 0.604 (Keyword matches: 2)

**Topic:** Robotics Manipulation

**Abstract:** We propose a new Verbal Reinforcement Learning (VRL) framework for interpretable task-level planning in mobile robotic systems operating under execution uncertainty. The framework follows a closed-loop architecture that enables iterative policy improvement through interaction with the physical envir...

[PDF](https://arxiv.org/pdf/2603.22169v1) | [arXiv](http://arxiv.org/abs/2603.22169v1)

---

## 4. [DiffusionAnything: End-to-End In-context Diffusion Learning for Unified Navigation and Pre-Grasp Motion](http://arxiv.org/abs/2603.26322v1)

**Authors:** Iana Zhura, Yara Mahmoud, Jeffrin Sam, Hung Khang Nguyen, Didar Seyidov, et al. (+2 more)

**Published:** 2026-03-27

**Categories:** cs.RO

**Relevance Score:** 0.602

**Topic:** Robotics Manipulation

**Abstract:** Efficiently predicting motion plans directly from vision remains a fundamental challenge in robotics, where planning typically requires explicit goal specification and task-specific design. Recent vision-language-action (VLA) models infer actions directly from visual input but demand massive computa...

[PDF](https://arxiv.org/pdf/2603.26322v1) | [arXiv](http://arxiv.org/abs/2603.26322v1)

---

## 5. [Toward Generalist Neural Motion Planners for Robotic Manipulators: Challenges and Opportunities](http://arxiv.org/abs/2603.24318v1)

**Authors:** Davood Soleymanzadeh, Ivan Lopez-Sanchez, Hao Su, Yunzhu Li, Xiao Liang, et al. (+1 more)

**Published:** 2026-03-25

**Categories:** cs.RO, cs.AI

**Relevance Score:** 0.600

**Topic:** Robotics Manipulation

**Abstract:** State-of-the-art generalist manipulation policies have enabled the deployment of robotic manipulators in unstructured human environments. However, these frameworks struggle in cluttered environments primarily because they utilize auxiliary modules for low-level motion planning and control. Motion pl...

[PDF](https://arxiv.org/pdf/2603.24318v1) | [arXiv](http://arxiv.org/abs/2603.24318v1)

---

## 6. [LaMP: Learning Vision-Language-Action Policies with 3D Scene Flow as Latent Motion Prior](http://arxiv.org/abs/2603.25399v1)

**Authors:** Xinkai Wang, Chenyi Wang, Yifu Xu, Mingzhe Ye, Fu-Cheng Zhang, et al. (+5 more)

**Published:** 2026-03-26

**Categories:** cs.CV, cs.RO

**Relevance Score:** 0.597

**Topic:** Robotics Manipulation

**Abstract:** We introduce **LaMP**, a dual-expert Vision-Language-Action framework that embeds dense 3D scene flow as a latent motion prior for robotic manipulation. Existing VLA models regress actions directly from 2D semantic visual features, forcing them to learn complex 3D physical interactions implicit...

[PDF](https://arxiv.org/pdf/2603.25399v1) | [arXiv](http://arxiv.org/abs/2603.25399v1)

---

## 7. [SG-VLA: Learning Spatially-Grounded Vision-Language-Action Models for Mobile Manipulation](http://arxiv.org/abs/2603.22760v1)

**Authors:** Ruisen Tu, Arth Shukla, Sohyun Yoo, Xuanlin Li, Junxi Li, et al. (+3 more)

**Published:** 2026-03-24

**Categories:** cs.RO

**Relevance Score:** 0.591 (Keyword matches: 1)

**Topic:** Robotics Manipulation

**Abstract:** Vision-Language-Action (VLA) models show promise for robotic control, yet performance in complex household environments remains sub-optimal. Mobile manipulation requires reasoning about global scene layout, fine-grained geometry, and high-dimensional continuous actions, making standard imitation lea...

[PDF](https://arxiv.org/pdf/2603.22760v1) | [arXiv](http://arxiv.org/abs/2603.22760v1)

---

## 8. [Persistent Robot World Models: Stabilizing Multi-Step Rollouts via Reinforcement Learning](http://arxiv.org/abs/2603.25685v1)

**Authors:** Jai Bardhan, Patrik Drozdik, Josef Sivic, Vladimir Petrik

**Published:** 2026-03-26

**Categories:** cs.RO, cs.CV

**Relevance Score:** 0.589 (Keyword matches: 2)

**Topic:** Robotics Manipulation

**Abstract:** Action-conditioned robot world models generate future video frames of the manipulated scene given a robot action sequence, offering a promising alternative for simulating tasks that are difficult to model with traditional physics engines. However, these models are optimized for short-term prediction...

[PDF](https://arxiv.org/pdf/2603.25685v1) | [arXiv](http://arxiv.org/abs/2603.25685v1)

---

## 9. [DFM-VLA: Iterative Action Refinement for Robot Manipulation via Discrete Flow Matching](http://arxiv.org/abs/2603.26320v1)

**Authors:** Jiayi Chen, Wenxuan Song, Shuai Chen, Jingbo Wang, Zhijun Li, et al. (+1 more)

**Published:** 2026-03-27

**Categories:** cs.RO, cs.CV

**Relevance Score:** 0.589 (Keyword matches: 1)

**Topic:** Robotics Manipulation

**Abstract:** Vision-Language-Action (VLA) models that encode actions using a discrete tokenization scheme are increasingly adopted for robotic manipulation, but existing decoding paradigms remain fundamentally limited. Whether actions are decoded sequentially by autoregressive VLAs or in parallel by discrete d...

[PDF](https://arxiv.org/pdf/2603.26320v1) | [arXiv](http://arxiv.org/abs/2603.26320v1)

---

## 10. [SoftMimicGen: A Data Generation System for Scalable Robot Learning in Deformable Object Manipulation](http://arxiv.org/abs/2603.25725v1)

**Authors:** Masoud Moghani, Mahdi Azizian, Animesh Garg, Yuke Zhu, Sean Huver, et al. (+1 more)

**Published:** 2026-03-26

**Categories:** cs.RO

**Relevance Score:** 0.585 (Keyword matches: 1)

**Topic:** Robotics Manipulation

**Abstract:** Large-scale robot datasets have facilitated the learning of a wide range of robot manipulation skills, but these datasets remain difficult to collect and scale further, owing to the intractable amount of human time, effort, and cost required. Simulation and synthetic data generation have proven to b...

[PDF](https://arxiv.org/pdf/2603.25725v1) | [arXiv](http://arxiv.org/abs/2603.25725v1)

---
