About Me

Hi! I'm Wei Xiao (肖巍). This fall I will join Fudan University and Shanghai Innovation Institute as a PhD student advised by Li Zhang. I am also collaborating with Yao (Mark) Mu. Previously, I gained research experience in reinforcement learning and embodied intelligence at ScaleLab@SJTU and MiLab@Westlake. I received my B.Eng. in Automation from Xiamen University, where I was advised by Qifeng Zhou.

My research goal is to build an end-to-end general robotic system that surpasses human abilities. I think this requires reinforcement learning and a scalable learning paradigm (e.g., world models, unsupervised/self-supervised learning) that does not rely mainly on human priors (e.g., teleoperation).

I'm currently focusing on reinforcement learning for robotic foundation models (VLA) and interning at XPENG Robotics. Please drop me an email if you are interested in my research or just want to chat! Email: xiaowei2002103@foxmail.com

News

2026.6: 🎉 Our work ROVE is available on arXiv.
2026.5: 🎉 LIT is accepted by CASE 2026!
2026.1: 🎉 KORR and TrajBooster are accepted by ICRA 2026. Congrats to Jeffrey and Jiacheng!
2025.7: ⭐ Awesome-VLA-RL, a summary of VLA+RL, is available on GitHub.
2025.5: 🎉 Our works PORL and LIT are available on arXiv.
2024.11: 🎉 Homepage has been set up.
2024.09: 🎉 Our paper PT4Rec is accepted by ACML 2024 and the Machine Learning journal.

Publications

Embodied AI & RL

ROVE: Unlocking Human Interventions for Humanoid Manipulation via Reinforcement Learning

Wei Xiao*, Weiliang Tang*, Yuying Ge†, Hui Zhou, Yao Mu, Li Zhang, Yixiao Ge
Preprint
arxiv / project

Efficient Online RL Fine-Tuning with Offline Pre-trained Policy Only

Wei Xiao*, Jiacheng Liu, Zifeng Zhuang, Runze Suo, Shangke Lyu†, Donglin Wang†
Preprint
arxiv

Learning Robotic Policy with Imagined Transition: Mitigating the Trade-off between Robustness and Optimality

Wei Xiao*, Shangke Lyu†, Zhefei Gong, Renjie Wang, Donglin Wang†
IEEE International Conference on Automation Science and Engineering (CASE), 2026
arxiv / project / video

Robust Online Residual Refinement via Koopman-Guided Dynamics Modeling

Zhefei Gong, Shangke Lyu, Pengxiang Ding, Wei Xiao, Donglin Wang†
IEEE International Conference on Robotics and Automation (ICRA), 2026
arxiv / project / code

TrajBooster: Boosting Humanoid Whole-Body Manipulation via Trajectory-Centric Learning

Jiacheng Liu*, Pengxiang Ding*, Qihang Zhou, Yuxuan Wu, Da Huang, Zimian Peng, Wei Xiao, Weinan Zhang, Lixin Yang, Cewu Lu†, Donglin Wang†
IEEE International Conference on Robotics and Automation (ICRA), 2026
arxiv / project / code

Integrating Trajectory Optimization and Reinforcement Learning for Quadrupedal Jumping with Terrain-Adaptive Landing

Renjie Wang*, Shangke Lyu†, Xin Lang, Wei Xiao, Donglin Wang†
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2025 (Oral)
arxiv

Recommender System

Continuous-Time Sequential Recommendation with State Space Models

Wei Xiao*, Huiying Wang*, Qifeng Zhou†, Qing Wang
Preprint
arxiv / code

PT4Rec: A Universal Prompt-Tuning Framework for Graph Contrastive Learning-Based Recommendations

Wei Xiao*, Qifeng Zhou†
Asian Conference on Machine Learning (ACML), 2024; Machine Learning journal
paper / code

Honors and Awards

Tencent Kaiwu Reinforcement Learning Competition

Team: 南强至善 — Wei Xiao*, Yifan Lin, Jinyang Lai, Huaming Xu, Zejie Jiang, Yunlong Liu†
Fourth Place (with Bonus ￥20,000) — 2023.12
leaderboard

The 17th National Smart Car Competition for University Students

Team: 南强至善 — Wei Xiao*, Tianhao Hu, Yuhang Liu, Jincai Luo†
The Second Prize in South Region — 2022.07
blog / video1 / video2

The 13th Mathorcup Mathematical Modelling Competition, Third Prize.
Huawei Software Elite Challenge, Third Prize.
National Mathematical Modelling Competition for College Students, Second Prize in Fujian Province.
National Algorithm Competition for College Students, Excellence Award.
and so on.

Academic Service

Conference Reviewer: IEEE International Conference on Robotics and Automation (ICRA), Conference on Robot Learning (CoRL).
