
Modularization is Better: Effective Code Generation with Modular Prompting

Published 16 Mar 2025 in cs.SE (arXiv:2503.12483v1)

Abstract: LLMs are transforming software development by automatically generating code. Current prompting techniques such as Chain-of-Thought (CoT) decompose tasks into step-by-step reasoning that follows a linear structure, which hampers the understanding of complex programming problems, particularly those requiring hierarchical solutions. Inspired by the principle of modularization in software development, we propose a novel prompting technique, called MoT, to enhance the code generation performance of LLMs. First, MoT exploits modularization principles to decompose complex programming problems into smaller, independent reasoning steps, enabling a more structured and interpretable problem-solving process; this hierarchical structure improves the LLM's ability to comprehend complex programming problems. It then organizes the reasoning process with a Multi-Level Reasoning (MLR) Graph, which arranges reasoning steps hierarchically. This approach enhances modular understanding and ensures better alignment between reasoning steps and the generated code, significantly improving code generation performance. Our experiments on two advanced LLMs (GPT-4o-mini and DeepSeek-R1), comparing MoT to six baseline prompting techniques across six widely used datasets (HumanEval, HumanEval-ET, HumanEval+, MBPP, MBPP-ET, and MBPP+), show that MoT significantly outperforms existing baselines (e.g., CoT and SCoT), achieving Pass@1 scores ranging from 58.1% to 95.1%. The experimental results confirm that MoT significantly enhances the performance of LLM-based code generation.

Summary

  • The paper introduces a novel MoT prompting method that decomposes complex code generation tasks into manageable modules using hierarchical MLR Graphs.
  • The methodology significantly boosts performance with Pass@1 scores ranging from 58.1% to 95.1%, enhancing code accuracy and maintainability.
  • The approach aligns with modular software practices, reducing computational costs while improving efficiency and scalability in code generation.

Introduction

The paper "Modularization is Better: Effective Code Generation with Modular Prompting" introduces a novel approach, Modularization-of-Thought (MoT) prompting, to enhance code generation in LLMs. Traditional prompting techniques for code generation, such as Chain-of-Thought (CoT), suffer from limitations due to their linear, monolithic structure, which hampers the understanding of complex problems that require hierarchical solutions. MoT aims to address these limitations by incorporating modularization principles, thereby enabling a structured, interpretable problem-solving process through the construction of Multi-Level Reasoning (MLR) Graphs.

Approach

Modularization-of-Thought (MoT) Prompting

MoT adopts a hierarchical task decomposition approach inspired by software modularization principles. It breaks complex programming problems down into smaller, manageable tasks using MLR Graphs, which capture the modular nature of programming problems and thereby enhance problem comprehension and code generation quality.

  • MLR Graph Generation Phase: The model constructs an MLR Graph during this phase, which represents hierarchical task decomposition and modular understanding. It captures high-level, intermediate-level, and detailed-level task designs, thereby helping LLMs understand and organize reasoning steps hierarchically.
  • Code Generation Phase: Guided by the hierarchical MLR Graph, the LLM progressively generates code that aligns with the modular understanding captured in the graph, significantly improving code accuracy (Figure 1).

    Figure 1: The Overview of MoT.
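The two phases above can be sketched as a small pipeline: build a hierarchical graph of reasoning steps, then linearize it into a structured prompt that guides code generation. This is a minimal illustration, not the paper's implementation; the `MLRNode` structure, level names, and the example decomposition are assumptions for demonstration.

```python
from dataclasses import dataclass, field

@dataclass
class MLRNode:
    """One reasoning step in an illustrative Multi-Level Reasoning (MLR) Graph."""
    name: str
    level: str                       # "high" | "intermediate" | "detailed" (assumed labels)
    description: str
    children: list["MLRNode"] = field(default_factory=list)

def build_mlr_graph() -> MLRNode:
    """Phase 1 (illustrative): decompose a problem into a module hierarchy."""
    root = MLRNode("find_closest_pair", "high",
                   "Return the two values in a list closest to each other")
    pairs = MLRNode("enumerate_pairs", "intermediate", "Generate all candidate pairs")
    pairs.children.append(
        MLRNode("pair_distance", "detailed", "Absolute difference of a pair"))
    root.children.append(pairs)
    root.children.append(
        MLRNode("select_min", "intermediate", "Pick the pair with minimal distance"))
    return root

def graph_to_prompt(node: MLRNode, depth: int = 0) -> str:
    """Phase 2 (illustrative): linearize the graph into a prompt section
    so the LLM generates code module by module, top-down."""
    lines = [f"{'  ' * depth}- [{node.level}] {node.name}: {node.description}"]
    for child in node.children:
        lines.append(graph_to_prompt(child, depth + 1))
    return "\n".join(lines)

print(graph_to_prompt(build_mlr_graph()))
```

The indented, level-tagged listing stands in for the hierarchical structure that, per the paper, keeps each reasoning step aligned with a concrete module in the generated code.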

Evaluation

The paper evaluates the MoT technique using two state-of-the-art LLMs, GPT-4o-mini and DeepSeek-R1, on six datasets: HumanEval, HumanEval+, HumanEval-ET, MBPP, MBPP+, and MBPP-ET. The results indicate a significant improvement over existing prompting techniques, such as CoT and SCoT, achieving Pass@1 scores between 58.1% and 95.1%.

  • Performance Metrics: Pass@1 and Average Pass Ratio (APR) were used to assess performance. MoT outperformed the baselines on both metrics across all datasets, highlighting its effectiveness (Figure 2).

    Figure 2: Comparison of Average Generation Time for Different Prompt Techniques on the HumanEval Dataset.
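For reference, Pass@k is commonly computed with the unbiased estimator over n sampled completions of which c pass the tests; Pass@1 then reduces to the fraction of passing samples. This sketch shows the standard formulation; the paper's exact evaluation harness is not specified here.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k completions,
    drawn from n generated samples of which c are correct, passes."""
    if n - c < k:
        return 1.0  # every size-k draw must contain a correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# Pass@1 is simply the fraction of correct samples, c / n:
assert abs(pass_at_k(10, 3, 1) - 0.3) < 1e-9
```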

Discussion

Effectiveness of MLR Graphs and Modularization

The MoT technique's effectiveness lies in its modular approach, allowing complex problems to be broken down into manageable parts, thus facilitating structured problem-solving. The MLR Graphs guide LLMs in understanding task logic and maintaining coherence during code generation.

  • Code Maintainability: MoT aligns with software engineering practices such as modular decomposition, promoting code that is not only higher in quality but also more maintainable and scalable (Figure 3).

    Figure 3: An example of the effectiveness of the MLR Graph.
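To illustrate why modular decomposition aids maintainability, a solution structured along MLR-style modules might look like the following. The task and function names are hypothetical, not taken from the paper's examples: each function corresponds to one node of the decomposition, so any module can be tested or revised in isolation.

```python
def pair_distance(a: float, b: float) -> float:
    """Detailed-level module: distance between two values."""
    return abs(a - b)

def enumerate_pairs(values: list[float]) -> list[tuple[float, float]]:
    """Intermediate-level module: all unordered pairs of the input."""
    return [(values[i], values[j])
            for i in range(len(values))
            for j in range(i + 1, len(values))]

def find_closest_pair(values: list[float]) -> tuple[float, float]:
    """High-level module: composes the submodules above."""
    return min(enumerate_pairs(values), key=lambda p: pair_distance(*p))

assert find_closest_pair([1.0, 2.0, 3.9, 4.0]) == (3.9, 4.0)
```

A monolithic version would inline all three concerns into one function; the modular form mirrors the hierarchical reasoning that MoT asks the LLM to follow.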

Cost Implications

The study finds MoT to be efficient, with lower computational costs compared to other advanced prompting techniques. Its design reduces unnecessary steps in reasoning processes, leading to cost-effective yet high-quality code generation.

Conclusion

The study concludes that MoT significantly enhances LLM performance in code generation by leveraging modularization principles. The approach offers a promising pathway not only for improving accuracy and maintainability but also for reducing computational expenses in solving complex programming tasks. As future work, expanding MoT's application across various programming languages and exploring additional programming paradigms could further validate its versatility and robustness.
