Research Output

RuleReasoner: Reinforced Rule-based Reasoning via Domain-aware Dynamic Sampling

PoliCon: Evaluating LLMs on Achieving Diverse Political Consensus Objectives

How to Synthesize Text Data without Model Collapse?

Lossless Acceleration of Ultra Long Sequence Generation

Are the Values of LLMs Structurally Aligned with Humans? A Causal Perspective

Look Both Ways and No Sink: Converting LLMs into Text Encoders without Training