Empowering Embodied Visual Tracking with Visual Foundation Models and Offline RL Fangwei Zhong*†, Kui Wu, Hai Ci, Churan Wang, Hao Chen ECCV 2024
SlotLifter: Slot-guided Feature Lifting for Learning Object-centric Radiance Fields Yu Liu*, Baoxiong Jia*, Yixin Chen, and Siyuan Huang† ECCV 2024
F-HOI: Toward Fine-grained Semantic-Aligned 3D Human-Object Interactions Jie Yang⋆, Xuesong Niu⋆, Nan Jiang⋆, Ruimao Zhang†, and Siyuan Huang† ECCV 2024
VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding Yue Fan*, Xiaojian Ma*†, Rujie Wu, Yuntao Du, Jiaqi Li, Zhi Gao, Qing Li ECCV 2024
SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding Baoxiong Jia*, Yixin Chen*, Huangyue Yu, Yan Wang, Xuesong Niu, Tengyu Liu, Qing Li, Siyuan Huang ECCV 2024
Unifying 3D Vision-Language Understanding via Promptable Queries Ziyu Zhu†⋆, Zhuofan Zhang, Baoxiong Jia, Xiaojian Ma, Zhidong Deng†, Xuesong Niu, Siyuan Huang†, Yixin Chen, Qing Li† ECCV 2024