🤖 Embodied Intelligence Research Gallery

A comprehensive showcase of my research projects in embodied intelligence, including embodied navigation, foundation models, robotic manipulation, and vision-language-action systems.
Click each project to explore publications, code repositories, and detailed information.


🧠 Embodied Foundation Models

MiMo-Embodied

Technical Report 2025

An X-Embodied foundation model for general-purpose embodied intelligence, enabling robots to understand, navigate, and interact with the world.

RoboBrain 2.0

Technical Report 2025

A large-scale embodied foundation model developed by BAAI, integrating vision, language, and action for comprehensive robotic intelligence.


🗺️ Embodied Navigation

NavA³

Under Review 2025

Understanding Any Instruction, Navigating Anywhere, Finding Anything. A unified framework for versatile navigation tasks.

MapNav

ACL 2025

A novel memory representation via annotated semantic maps for VLM-based vision-and-language navigation.
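
As a rough illustration of the idea, the sketch below shows what an annotated semantic map used as textual memory for a VLM navigator could look like. This is a minimal Python sketch; all class and field names are hypothetical and do not reflect MapNav's actual implementation.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of an annotated semantic map used as navigation memory.
@dataclass
class Annotation:
    label: str                    # e.g. "sofa", "doorway"
    grid_xy: tuple[int, int]      # map cell carrying the text label

@dataclass
class AnnotatedSemanticMap:
    resolution_m: float = 0.05                       # meters per grid cell
    occupancy: dict = field(default_factory=dict)    # (x, y) -> True if occupied
    annotations: list[Annotation] = field(default_factory=list)

    def add_observation(self, grid_xy: tuple[int, int], occupied: bool,
                        label: str | None = None) -> None:
        """Fuse one egocentric observation into the top-down map."""
        self.occupancy[grid_xy] = occupied
        if label is not None:
            self.annotations.append(Annotation(label, grid_xy))

    def render_prompt(self) -> str:
        """Serialize the annotations so a VLM can read the map as text."""
        return "; ".join(f"{a.label} at {a.grid_xy}" for a in self.annotations)
```

The key design point is that the map doubles as a prompt: instead of feeding raw grids to the model, labeled landmarks are serialized into text the VLM can reason over.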

SpatialNav

AAAI 2026

What You See is What You Reach: Towards Spatial Navigation with High-Level Human Instructions.

SpatialSky

Under Review 2025

Is your VLM Sky-Ready? A Comprehensive Spatial Intelligence Benchmark for UAV Navigation.
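
To make the benchmark idea concrete, a minimal evaluation loop might look like the sketch below. The sample format, the `vlm.answer` call, and the accuracy metric are assumptions for illustration, not the actual SpatialSky protocol.

```python
def evaluate(vlm, samples):
    """Score a model on multiple-choice spatial questions.

    Each sample is assumed to be a dict with "image", "question", and a
    single-letter "answer" key; `vlm.answer` is a stand-in for whatever
    inference call the evaluated model exposes.
    """
    correct = 0
    for s in samples:
        pred = vlm.answer(s["image"], s["question"])
        correct += int(pred.strip().upper() == s["answer"])
    return correct / len(samples)
```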


🤲 Robotic Manipulation & Affordance

RoboAfford++

🏆 Best Paper 2025

A Generative AI-Enhanced Dataset for Multimodal Affordance Learning in Robotic Manipulation and Navigation.

RoboAfford

ACM MM 2025

A dataset and benchmark for enhancing object and spatial affordance learning in robot manipulation.
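
For intuition, the sketch below shows what a single affordance annotation and its conversion into a VLM training pair could look like. The schema and field names are hypothetical, not the actual RoboAfford format.

```python
from dataclasses import dataclass

# Hypothetical affordance annotation record; not the actual RoboAfford schema.
@dataclass
class AffordanceAnnotation:
    image_id: str
    object_label: str                                # e.g. "mug"
    object_bbox: tuple[float, float, float, float]   # x1, y1, x2, y2
    affordance: str                                  # e.g. "graspable"
    point_xy: tuple[float, float]                    # pixel where the action applies

def to_qa_pair(ann: AffordanceAnnotation) -> tuple[str, str]:
    """Convert one annotation into a question/answer pair for VLM fine-tuning."""
    question = f"Point to a location where the {ann.object_label} is {ann.affordance}."
    answer = f"({ann.point_xy[0]:.0f}, {ann.point_xy[1]:.0f})"
    return question, answer
```

Pairing object-level labels with point-level targets is what lets one dataset supervise both object affordance ("what") and spatial affordance ("where").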


🔒 Security & Safety

Typographic Visual Prompts

🏆 Best Student Paper 2025

Exploring typographic visual prompt injection threats in cross-modality generation models.


Research Highlights

Our research in embodied intelligence focuses on:

  • Foundation Models: Developing large-scale models that unify vision, language, and action for general-purpose robotic intelligence (a minimal control-loop sketch follows this list)
  • Navigation: Enabling robots to navigate complex environments using natural language instructions and visual understanding
  • Manipulation: Learning object affordances and manipulation skills for real-world robotic tasks
  • Zero-Shot Learning: Building systems that can generalize to new tasks and environments without task-specific training
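
A minimal vision-language-action control loop, tying these themes together, might look like the sketch below. The `env` and `model` interfaces are generic stand-ins assumed for illustration; they do not correspond to any specific project's API.

```python
def run_episode(env, model, instruction: str, max_steps: int = 200):
    """Roll out one instruction-following episode.

    Assumes the environment returns an observation dict with an "rgb"
    frame and that the model maps the current frame plus the language
    instruction to a low-level action (e.g. an end-effector delta or a
    discrete navigation command).
    """
    obs = env.reset()
    for _ in range(max_steps):
        action = model.predict_action(obs["rgb"], instruction)
        obs, done = env.step(action)
        if done:
            break
    return obs
```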

For more details about specific projects or collaboration opportunities, please contact me or visit the Publications page.