Research Achievements

Constrained Update Projection Approach to Safe Policy Optimization

Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning

MATE: Benchmarking Multi-Agent Reinforcement Learning in Distributed Target Coverage Control

TarGF: Learning Target Gradient Field for Object Rearrangement

LIGS: Learning Intrinsic-reward Generation Selection for Multi-Agent Learning

ToM2C: Target-oriented Multi-agent Communication and Cooperation with Theory of Mind