Extreme Self-Preference in Language Models
The Illusion of Diminishing Returns: Measuring Long Horizon Execution in LLMs
AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo
Chain-in-Tree: Back to Sequential Reasoning in LLM Tree Search
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models
'Too much alignment; not enough culture': Re-balancing cultural alignment practices in LLMs
The NazoNazo Benchmark: A Cost-Effective and Extensible Test of Insight-Based Reasoning in LLMs
Beyond Interpretability: The Gains of Feature Monosemanticity on Model Robustness
The Thinking Spectrum: An Empirical Study of Tunable Reasoning in LLMs through Model Merging
Successful Misunderstandings: Learning to Coordinate Without Being Understood
Reasoning or Retrieval? A Study of Answer Attribution on Large Reasoning Models
On the Self-awareness of Large Reasoning Models' Capability Boundaries
ELHPlan: Efficient Long-Horizon Task Planning for Multi-Agent Collaboration
Cogito, Ergo Ludo: An Agent that Learns to Play by Reasoning and Planning
MIXRAG: Mixture-of-Experts Retrieval-Augmented Generation for Textual Graph Understanding and Questio
Learning More with Less: A Dynamic Dual-Level Down-Sampling Framework for Efficient Policy Optimizati
WebWeaver: Structuring Web-Scale Evidence with Dynamic Outlines for Open-Ended Deep Research
Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinfor
Fairness-Aware Reinforcement Learning (FAReL): A Framework for Transparent and Balanced Sequential De
Do LLM Agents Know How to Ground, Recover, and Assess? A Benchmark for Epistemic Competence in Inform
Benefits and Pitfalls of Reinforcement Learning for Language Model Planning: A Theoretical Perspectiv
The Rogue Scalpel: Activation Steering Compromises LLM Safety
The Emergence of Altruism in Large-Language-Model Agents Society
Rationality Check! Benchmarking the Rationality of Large Language Models
Scaling Laws for Neural Language Models
Retrieval-of-Thought: Efficient Reasoning via Reusing Thoughts
Learning to Reason with Mixture of Tokens
Understanding Thinking Process of Reasoning Models: A Perspective from Schoenfeld's Episode Theory
LANCE: Low Rank Activation Compression for Efficient On-Device Continual Learning