Research Output

Adaptive Preference Optimization with Uncertainty-aware Utility Anchor

Reinforced Query Reasoners for Reasoning-intensive Retrieval Tasks

Understanding and Leveraging the Expert Specialization of Context Faithfulness in Mixture-of-Experts LLMs

Enhancing LLM-Based Social Bot via an Adversarial Learning Framework

Absolute Zero: Reinforced Self-play Reasoning with Zero Data

VideoLLaMB: Long-context Video Understanding with Recurrent Memory Bridges