Research Achievements

Mind the Gap: The Divergence Between Human and LLM-Generated Tasks

TongUI: Building Generalized GUI Agents by Learning from Multimodal Web Tutorials

Reasoning with Exploration: An Entropy Perspective

ADAPT: Adaptive Decentralized Architecture with Perception-aligned Training for Structural Generalization in Multi-Agent RL

DiveR-CT: Diversity-enhanced Red Teaming Large Language Model Assistants with Relaxing Constraints

STAS: Spatial-Temporal Return Decomposition for Multi-agent Reinforcement Learning