STVG-R1: Incentivizing Instance-Level Reasoning and Grounding in Videos via Reinforcement Learning Xiaowen Zhang, Zhi Gao, Licheng Jiao, Lingling Li, Qing Li ICLR 2026
MVR: Multi-view Video Reward Shaping for Reinforcement Learning Lirui Luo*, Guoxi Zhang*, Hongming Xu, Yaodong Yang, Cong Fang†, Qing Li† ICLR 2026
MILR: Improving Multimodal Image Generation via Test-Time Latent Reasoning Yapeng Mi, Hengli Li, Yanpeng Zhao†, Chenxi Li, Huimin Wu, Xianjian Ma, Song-Chun Zhu, YingNian Wu, Qing Li† ICLR 2026
When Large Multimodal Models Confront Evolving Knowledge: Challenges and Explorations Kailin Jiang*, Yuntao Du*, Yukai Ding, Yuchen Ren, Ning Jiang, Zhi Gao, Zilong Zheng, Lei Liu†, Bin Li, Qing Li† ICLR 2026
TongUI: Building Generalized GUI Agents by Learning from Multimodal Web Tutorials Bofei Zhang*, Zirui Shang*, Zhi Gao*, Wang Zhang, Rui Xie, Xiaojian Ma, Tao Yuan, Xinxiao Wu, Song-Chun Zhu, Qing Li† AAAI 2026
From Objects to Anywhere: A Holistic Benchmark for Multi-level Visual Grounding in 3D Scenes Tianxu Wang*, Zhuofan Zhang*, Ziyu Zhu, Yue Fan, Jing Xiong, Pengxiang Li, Xiaojian Ma, and Qing Li✉ NeurIPS Datasets and Benchmarks 2025