Lingfeng Zhang | 张凌峰

Tsinghua University Ph.D. Student.


Welcome to my homepage! 👋

I am a first-year Ph.D. student at Tsinghua University (Guangdong) in the SSR group, supervised by Prof. Wenbo Ding and Prof. Xiaojun Liang.
My research focuses on embodied intelligence and multimodal large language models (MLLMs), with particular emphasis on embodied navigation and embodied foundation models.

Before joining Tsinghua University, I received my M.Phil. in Microelectronics in 2025 from The Hong Kong University of Science and Technology (Guangzhou), supervised by Prof. Renjing Xu and Prof. Xinyu Chen. I obtained my B.Eng. in Electronic Information in 2023 from Beijing Institute of Technology, and in 2023 I completed a research internship at Tsinghua University under the supervision of Prof. Fangwen Yu and Prof. Luping Shi. I was also a research intern at BAAI, supervised by Dr. Xiaoshuai Hao.

My research interests include:

  • Embodied Intelligence
  • Embodied Foundation Models
  • Embodied Navigation
  • Embodied Manipulation
  • Vision-Language-Action (VLA) Models

I am always open to discussions and collaborations — feel free to reach out!

news

Feb 2026 One paper was accepted by CVPR 2026 (CCF-A).
Jan 2026 Two papers were accepted by ICRA 2026 (CCF-B).
Dec 2025 One paper was accepted by IEEE RA-L (2025).
Nov 2025 The MiMo-Embodied: X-Embodied Foundation Model Technical Report is now available!
Nov 2025 One paper was accepted by AAAI 2026 (CCF-A).
Oct 2025 Our paper RoboAfford++: A Generative AI-Enhanced Dataset for Multimodal Affordance Learning in Robotic Manipulation and Navigation won the Best Paper and Best Poster Award at the IROS 2025 RoDGE Workshop.
Oct 2025 Our team achieved excellent results at the IROS 2025 RoboSense Challenge, placing second in Track #2: Social Navigation and third in Track #4: Cross-Modal Drone Navigation!
Aug 2025 Our paper Exploring Typographic Visual Prompts Injection Threats in Cross-Modality Generation Models won the Best Student Paper Award at the IJCAI Workshop on Deepfake Detection, Localization and Interpretability.
Aug 2025 I commenced my Ph.D. journey at Tsinghua University! ✨ 😄
Aug 2025 We won third place in the ICCV EVQA-SnapUGC Challenge, with our model achieving the best single-modality performance.
Jul 2025 Two papers were accepted by ACM MM 2025 (CCF-A).
May 2025 One paper was accepted by ACL 2025 Main (CCF-A).
Mar 2025 I graduated from HKUST(GZ)!
Jan 2025 One paper was accepted by ICRA 2025 (CCF-B).
Nov 2024 I started my internship at the Beijing Academy of Artificial Intelligence (BAAI), supervised by Dr. Xiaoshuai Hao.
Aug 2024 One paper was accepted by ECCV 2024 (CCF-B).
Jun 2024 One paper was accepted by IROS 2024 (CCF-C).
Jun 2023 I graduated from BIT!

selected publications

  1. Technical Report
    MiMo-Embodied: X-Embodied Foundation Model Technical Report
    Xiaoshuai Hao*, Lei Zhou*, Zhijian Huang*, Zhiwen Hou*, Yingbo Tang*, Lingfeng Zhang*, Guang Li*, Zheng Lu*, Shuhuai Ren, Xianhui Meng, and others
    arXiv preprint arXiv:2511.16518, 2025
  2. CVPR
    Is your VLM Sky-Ready? A Comprehensive Spatial Intelligence Benchmark for UAV Navigation
    Lingfeng Zhang*, Yuchen Zhang*, Hongsheng Li, Haoxiang Fu, Yingbo Tang, Hangjun Ye, Long Chen, Xiaojun Liang, Xiaoshuai Hao, and Wenbo Ding
    arXiv preprint arXiv:2511.13269, 2025
  3. AAAI
    What You See is What You Reach: Towards Spatial Navigation with High-Level Human Instructions
    Lingfeng Zhang*, Haoxiang Fu*, Xiaoshuai Hao, Shuyi Zhang, Qiang Zhang, Rui Liu, Long Chen, and Wenbo Ding
    2026
  4. 🏆Best Paper Award
    RoboAfford++: A Generative AI-Enhanced Dataset for Multimodal Affordance Learning in Robotic Manipulation and Navigation
    Xiaoshuai Hao, Yingbo Tang, Lingfeng Zhang, Yanbiao Ma, Yunfeng Diao, Ziyu Jia, Wenbo Ding, Hangjun Ye, and Long Chen
    arXiv preprint arXiv:2511.12436, 2025
  5. 🏆Best Student Paper Award
    Exploring Typographic Visual Prompts Injection Threats in Cross-Modality Generation Models
    Hao Cheng, Erjia Xiao, Yichi Wang, Lingfeng Zhang, Qiang Zhang, Jiahang Cao, Kaidi Xu, Mengshu Sun, Xiaoshuai Hao, Jindong Gu, and others
    arXiv preprint arXiv:2503.11519, 2025
  6. ACM MM
    RoboAfford: A Dataset and Benchmark for Enhancing Object and Spatial Affordance Learning in Robot Manipulation
    Yingbo Tang*, Lingfeng Zhang*, Shuyi Zhang, Yinuo Zhao, and Xiaoshuai Hao
    In Proceedings of the 33rd ACM International Conference on Multimedia, 2025
  7. Under Review
    NavA³: Understanding Any Instruction, Navigating Anywhere, Finding Anything
    Lingfeng Zhang*, Xiaoshuai Hao*, Yingbo Tang, Haoxiang Fu, Xinyu Zheng, Pengwei Wang, Zhongyuan Wang, Wenbo Ding, and Shanghang Zhang
    arXiv preprint arXiv:2508.04598, 2025
  8. Technical Report
    RoboBrain 2.0 Technical Report
    BAAI RoboBrain Team, Mingyu Cao, Huajie Tan, Yuheng Ji, Xiansheng Chen, Minglan Lin, Zhiyu Li, Zhou Cao, Pengwei Wang, Enshen Zhou, and others
    arXiv preprint arXiv:2507.02029, 2025
  9. ACL
    MapNav: A Novel Memory Representation via Annotated Semantic Maps for VLM-based Vision-and-Language Navigation
    Lingfeng Zhang*, Xiaoshuai Hao*, Qinwen Xu, Qiang Zhang, Xinyao Zhang, Pengwei Wang, Jing Zhang, Zhongyuan Wang, Shanghang Zhang, and Renjing Xu
    In the 63rd Annual Meeting of the Association for Computational Linguistics (ACL), 2025