Research Output

Linking Process to Outcome: Conditional Reward Modeling for LLM Reasoning

Aegis: Automated Error Generation and Identification for Multi-Agent Systems

ADAPT: Adaptive Decentralized Architecture with Perception-aligned Training for Structural Generalization in Multi-Agent RL

World Models Should Prioritize the Unification of Physical and Social Dynamics

Social World Model-Augmented Mechanism Design Policy Learning

Simulating Human-like Daily Activities with Desire-driven Autonomy