Beijing Institute for General Artificial Intelligence (BIGAI)

Research Achievements

From Objects to Anywhere: A Holistic Benchmark for Multi-level Visual Grounding in 3D Scenes

Iterative Tool Usage Exploration for Multimodal Agents via Step-wise Preference Tuning

NEP: Autoregressive Image Editing via Next Editing Token Prediction

Embodied VideoAgent: Persistent Memory from Egocentric Videos and Embodied Sensors Enables Dynamic Scene Understanding

CLOVER: Cross-Layer Orthogonal Vectors Pruning and Fine-Tuning

Falcon: Fast visuomotor policy via partial denoising