
SLM-SQL: An Exploration of Small Language Models for Text-to-SQL

Published 30 Jul 2025 in cs.CL | (2507.22478v1)

Abstract: LLMs have demonstrated strong performance in translating natural language questions into SQL queries (Text-to-SQL). In contrast, small language models (SLMs) ranging from 0.5B to 1.5B parameters currently underperform on Text-to-SQL tasks due to their limited logical reasoning capabilities. However, SLMs offer inherent advantages in inference speed and suitability for edge deployment. To explore their potential in Text-to-SQL applications, we leverage recent advancements in post-training techniques. Specifically, we used the open-source SynSQL-2.5M dataset to construct two derived datasets: SynSQL-Think-916K for SQL generation and SynSQL-Merge-Think-310K for SQL merge revision. We then applied supervised fine-tuning and reinforcement learning-based post-training to the SLM, followed by inference using a corrective self-consistency approach. Experimental results validate the effectiveness and generalizability of our method, SLM-SQL. On the BIRD development set, the five evaluated models achieved an average improvement of 31.4 points. Notably, the 0.5B model reached 56.87% execution accuracy (EX), while the 1.5B model achieved 67.08% EX. We will release our dataset, model, and code on GitHub: https://github.com/CycloneBoy/slm_sql.

Summary

  • The paper presents the SLM-SQL framework that leverages small language models (0.5B-1.5B parameters) to enhance Text-to-SQL generation.
  • It employs specialized datasets along with supervised and reinforcement learning (using GRPO) to improve logical reasoning and SQL query accuracy.
  • Experimental results show an average improvement of 31.4 points on the BIRD development set, demonstrating practical efficiency in resource-constrained environments.

Exploring SLM-SQL: Small Language Models in Text-to-SQL

The paper "SLM-SQL: An Exploration of Small Language Models for Text-to-SQL" (2507.22478) presents an approach to enhancing Text-to-SQL capabilities using small language models (SLMs), specifically targeting models with parameter counts between 0.5B and 1.5B. While LLMs dominate this domain thanks to their scale and reasoning capabilities, SLMs offer notable advantages, including faster inference and suitability for deployment in resource-constrained environments such as edge devices. This research addresses the performance gap SLMs exhibit on logical reasoning tasks by combining purpose-built datasets with modern post-training techniques.

Methodology and Framework

The SLM-SQL framework comprises three primary components: dataset construction, model training, and inference.

Dataset Construction: The study starts with the SynSQL-2.5M dataset as a foundation. From it, two specialized datasets are derived: SynSQL-Think-916K, tailored for SQL generation, and SynSQL-Merge-Think-310K, tailored for SQL merge revision. Preprocessing steps, including heuristic filtering rules and structured chain-of-thought annotations, enhance the utility of these datasets for precise SQL query generation.

Figure 1: Overview of the proposed SLM-SQL framework.
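The chain-of-thought style of these derived datasets can be sketched roughly as follows. This is a hypothetical illustration: the field names, the `<think>` tag format, and the prompt wording are assumptions, not the paper's exact schema.

```python
def build_think_example(record):
    """Pair a question (plus database schema) with a reasoning trace and the
    gold SQL, so supervised fine-tuning teaches the model to reason before
    emitting the query. Field names here are illustrative assumptions."""
    prompt = (
        f"Database schema:\n{record['schema']}\n\n"
        f"Question: {record['question']}\n"
        "Reason step by step, then write the SQL query."
    )
    response = (
        f"<think>{record['reasoning']}</think>\n"
        f"```sql\n{record['sql']}\n```"
    )
    return {"prompt": prompt, "response": response}

example = build_think_example({
    "schema": "CREATE TABLE users(id INT, name TEXT);",
    "question": "How many users are there?",
    "reasoning": "The question asks for a count of rows in users.",
    "sql": "SELECT COUNT(*) FROM users;",
})
```

Pairing each query with an explicit reasoning trace is what distinguishes the "Think" datasets from plain question-to-SQL pairs.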

Model Training: The training regimen combines supervised fine-tuning (SFT) on the derived datasets with reinforcement learning (RL) post-training using the Group Relative Policy Optimization (GRPO) algorithm. The RL stage notably strengthens the logical reasoning of SLMs, a crucial area where smaller models generally underperform relative to their larger counterparts.
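The core idea of GRPO can be illustrated with its advantage computation: rewards for a group of completions sampled from the same prompt are normalized against the group's own mean and standard deviation, so no learned value function (critic) is needed. The execution-based 0/1 reward shown in the usage comment is an assumption about the reward design, not a detail confirmed by the summary above.

```python
from statistics import mean, pstdev

def grpo_advantages(rewards, eps=1e-6):
    """Group-relative advantages as in GRPO: each sampled completion's reward
    is normalized by the mean and std of its own sampling group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# e.g. four SQL candidates for one question, rewarded 1.0 if executing the
# candidate reproduces the gold result and 0.0 otherwise (an assumed reward):
adv = grpo_advantages([1.0, 0.0, 0.0, 1.0])
```

Candidates above the group mean get positive advantages and are reinforced; those below are pushed down, which is what drives the policy toward executions that match the gold answer.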

Inference: For SQL query prediction, SLM-SQL uses a corrective self-consistency technique. Multiple SQL candidates are sampled in parallel and aggregated by a voting mechanism; when the candidates disagree, an additional merge-revision stage reconciles them. This improves the correctness of the final query while filtering out redundant or inconsistent candidates.

Figure 2: Trend chart of various metrics of SLM-SQL-1.5B under different sampling numbers on the BIRD development set.
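The corrective self-consistency loop might be sketched as below. The helper signatures (`execute`, `merge_revise`) and the use of execution results as the vote key are assumptions for illustration; the stand-in helpers at the bottom exist only to make the sketch runnable.

```python
def corrective_self_consistency(candidates, execute, merge_revise):
    """Group sampled SQL candidates by their execution result and vote.
    If all candidates agree, return the consensus query; otherwise hand
    representatives of each group to a merge-revision step."""
    groups = {}
    for sql in candidates:
        key = execute(sql)                  # execution result as vote key
        groups.setdefault(key, []).append(sql)
    if len(groups) == 1:                    # unanimous: no revision needed
        return candidates[0]
    ranked = sorted(groups.values(), key=len, reverse=True)
    return merge_revise([g[0] for g in ranked])

def _execute(sql):
    # stand-in executor for illustration only
    return "one_row" if "COUNT" in sql else "many_rows"

def _merge(sqls):
    # stand-in for the fine-tuned merge-revision model
    return sqls[0] + " -- merged"

agreed = corrective_self_consistency(
    ["SELECT COUNT(*) FROM t;", "SELECT COUNT(*) FROM t;"], _execute, _merge)
merged = corrective_self_consistency(
    ["SELECT COUNT(*) FROM t;", "SELECT * FROM t;"], _execute, _merge)
```

Voting on execution results rather than SQL strings treats syntactically different but semantically equivalent queries as the same candidate, which is the usual motivation for execution-based self-consistency.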

Experimental Results and Performance

The experimental evaluation is conducted on the BIRD and Spider benchmarks, comparing SLM-SQL against several baseline models. The evaluated models improve by an average of 31.4 points, with the 1.5B model achieving 67.08% execution accuracy on the BIRD development set. This demonstrates that the SLM-SQL framework can lift SLMs to a level comparable or superior to some larger, high-powered models currently used in Text-to-SQL tasks.

Figure 3: The execution accuracy of different models on the BIRD development set.

Implications and Future Directions

This exploration into SLM-SQL signifies an important step forward in the practical feasibility of deploying smaller models for complex tasks like Text-to-SQL, especially in edge computing scenarios where efficiency and resource consumption are critical constraints. The success of SLM-SQL points toward the potential of SLMs to handle structured query language tasks effectively, offering broader deployment options without the overhead of large-scale computation tied to LLMs.

Moving forward, research could explore the integration of adaptive sampling strategies and more sophisticated RL-based training paradigms. Additionally, investigating cross-domain generalization capabilities of refined SLM models could further cement their usability across diverse applications beyond database querying.

Conclusion

The SLM-SQL framework provides a robust method for enhancing the capabilities of small language models on the Text-to-SQL task. By constructing specialized datasets and leveraging advanced training methodologies, the research overcomes key obstacles in logical reasoning and output correctness for smaller-scale models. This work paves the way for increased model efficiency, offering compelling performance gains that could significantly broaden the application scope of SLMs in real-world database interactions and other structured text-based tasks.
