From Objects to Anywhere: A Holistic Benchmark for Multi-level Visual Grounding in 3D Scenes. Tianxu Wang*, Zhuofan Zhang*, Ziyu Zhu, Yue Fan, Jing Xiong, Pengxiang Li, Xiaojian Ma, and Qing Li✉. NeurIPS Datasets and Benchmarks 2025
Iterative Tool Usage Exploration for Multimodal Agents via Step-wise Preference Tuning. Pengxiang Li*, Zhi Gao*, Bofei Zhang, Yapeng Mi, Xiaojian Ma, Chenrui Shi, Tao Yuan, Yuwei Wu✉, Yunde Jia, Song-Chun Zhu, and Qing Li✉. NeurIPS 2025
NEP: Autoregressive Image Editing via Next Editing Token Prediction. Huimin Wu, Xiaojian Ma, Haozhe Zhao, Yanpeng Zhao, Qing Li†. NeurIPS 2025
Embodied VideoAgent: Persistent Memory from Egocentric Videos and Embodied Sensors Enables Dynamic Scene Understanding. Yue Fan*, Xiaojian Ma*✉, Rongpeng Su, Jun Guo, Rujie Wu, Xi Chen, and Qing Li✉. ICCV 2025
CLOVER: Cross-Layer Orthogonal Vectors Pruning and Fine-Tuning. Fanxu Meng*, Pingzhi Tang, Fan Jiang, Muhan Zhang✉. ICML 2025
Falcon: Fast Visuomotor Policy via Partial Denoising. Haojun Chen, Minghao Liu, Xiaojian Ma, Zailin Ma, Huimin Wu, Chengdong Ma, Yuanpei Chen, Yifan Zhong, Mingzhi Wang, Qing Li✉, Yaodong Yang✉. ICML 2025