
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models

Published 20 Mar 2025 in cs.CL (arXiv:2503.16419v3)

Abstract: LLMs have demonstrated remarkable capabilities in complex tasks. Recent advancements in Large Reasoning Models (LRMs), such as OpenAI o1 and DeepSeek-R1, have further improved performance in System-2 reasoning domains like mathematics and programming by harnessing supervised fine-tuning (SFT) and reinforcement learning (RL) techniques to enhance the Chain-of-Thought (CoT) reasoning. However, while longer CoT reasoning sequences improve performance, they also introduce significant computational overhead due to verbose and redundant outputs, known as the "overthinking phenomenon". In this paper, we provide the first structured survey to systematically investigate and explore the current progress toward achieving efficient reasoning in LLMs. Overall, relying on the inherent mechanism of LLMs, we categorize existing works into several key directions: (1) model-based efficient reasoning, which considers optimizing full-length reasoning models into more concise reasoning models or directly training efficient reasoning models; (2) reasoning output-based efficient reasoning, which aims to dynamically reduce reasoning steps and length during inference; (3) input prompts-based efficient reasoning, which seeks to enhance reasoning efficiency based on input prompt properties such as difficulty or length control. Additionally, we introduce the use of efficient data for training reasoning models, explore the reasoning capabilities of small LLMs, and discuss evaluation methods and benchmarking.

Summary

  • The paper introduces a categorization of efficient reasoning techniques in LLMs to mitigate overthinking and lower inference costs.
  • It evaluates model-based, reasoning output-based, and input prompt-based methods, emphasizing RL reward designs and CoT compression.
  • The survey highlights practical applications and benchmarks that balance computational efficiency with reasoning accuracy in real-world deployments.

This paper, "Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models" (2503.16419), provides a comprehensive overview of techniques aimed at making the reasoning processes of LLMs more computationally efficient without sacrificing accuracy. It addresses the "overthinking phenomenon", in which models like OpenAI o1 and DeepSeek-R1, while capable of complex reasoning using Chain-of-Thought (CoT), generate excessively long and redundant reasoning steps, leading to high inference costs and latency.

The survey categorizes efficient reasoning methods into three main areas:

  1. Model-based Efficient Reasoning: Modifies the model itself, either by optimizing full-length reasoning models into more concise ones or by directly training efficient reasoning models.
  2. Reasoning Output-based Efficient Reasoning: Dynamically reduces the number and length of reasoning steps during inference.
  3. Input Prompts-based Efficient Reasoning: Leverages properties of the input prompt, such as difficulty or explicit length control.
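For the RL-based methods in the first category, the summary highlights reward designs that trade accuracy against chain-of-thought length. A minimal illustrative sketch of such a length-penalized reward is below; the function name, token budget, and penalty weight are assumptions for illustration, not details taken from the paper:

```python
def length_penalized_reward(is_correct: bool, num_tokens: int,
                            budget: int = 1024, penalty: float = 0.5) -> float:
    """Reward a correct answer, minus a cost proportional to the fraction
    of the token budget the chain-of-thought consumed (clipped at 1.0).

    A shorter correct CoT therefore earns a strictly higher reward than a
    longer correct one, discouraging the overthinking phenomenon.
    """
    correctness = 1.0 if is_correct else 0.0
    length_cost = penalty * min(num_tokens / budget, 1.0)
    return correctness - length_cost
```

Under this kind of design, a correct answer with a 256-token CoT scores higher than an equally correct answer that uses the full 1024-token budget, so the policy is pushed toward concise reasoning.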

The survey also covers related topics:

  • The use of efficient data for training reasoning models.
  • The reasoning capabilities of small LLMs.
  • Evaluation methods and benchmarking for efficient reasoning.

Finally, the paper touches upon applications in areas like autonomous driving, embodied AI, and healthcare, and discusses broader challenges such as the trade-off between safety and efficiency, and the relative merits of RL versus SFT for achieving efficient reasoning. It concludes by emphasizing the practical importance and economic value of developing efficient reasoning capabilities in LLMs.
