
Exploring Data-Efficient Adaptation of Large Language Models for Code Generation

Published 29 Feb 2024 in cs.SE, cs.AI, and cs.CL (arXiv:2403.00046v3)

Abstract: Although LLMs have made significant progress in code generation, they still struggle with code generation tasks in specific scenarios. These scenarios usually necessitate the adaptation of LLMs to fulfill specific needs, but the limited training data available in practice leads to poor code generation performance. Therefore, how to effectively adapt LLMs to new scenarios with limited training data is a major challenge for current code generation. In this paper, we propose a novel adaptation approach named DEED, which stands for Data-Efficient adaptation with Error-Driven learning for code generation. DEED leverages the errors made by LLMs as learning opportunities, using error revision to overcome their own shortcomings, thus achieving efficient learning. Specifically, DEED involves identifying erroneous code generated by LLMs, employing Self-Revise for code revision, optimizing the model with revised code, and iterating the process for continuous improvement. Experimental results show that, compared to other mainstream fine-tuning approaches, DEED achieves superior performance with limited training data, showing an average relative improvement of 46.2% in Pass@1 on multiple code generation benchmarks. We also validate the effectiveness of Self-Revise, which generates revised code that optimizes the model more efficiently than the code samples from datasets. Moreover, DEED consistently demonstrates strong performance across various LLMs, underscoring its applicability.


Summary

  • The paper's main contribution is the DEED approach, which employs error-driven learning to enhance LLM performance with limited training data.
  • The method iteratively collects error code, automatically revises it, and fine-tunes the model, achieving superior results on benchmarks such as MBPP and HumanEval.
  • The approach outperforms traditional fine-tuning techniques, offering a promising trajectory for efficient model adaptation in data-scarce environments.

Exploring Data-Efficient Adaptation of LLMs for Code Generation

Introduction

The field of code generation, which leverages LLMs to translate human requirements expressed in natural language into executable code, has witnessed substantial advancements. Despite these advancements, LLMs often face challenges when dealing with specific scenarios, particularly when training data is limited due to industry constraints or resource scarcity. This limitation leads to suboptimal performance, highlighting the necessity for effective adaptation techniques. The paper "Exploring Data-Efficient Adaptation of LLMs for Code Generation" (2403.00046) introduces a novel adaptation method termed Data-Efficient adaptation with Error-Driven learning (hereafter referred to as DEED) aimed at enhancing LLM performance even with scarce training data.

Methodology

DEED leverages an error-driven learning approach to optimize LLMs through four iterative steps: Error Code Collection, Automatic Code Revision, Model Optimization, and Iterative Adaptation. The process begins by identifying erroneous outputs from LLMs and using these errors as learning opportunities to refine the model, ultimately improving its performance with minimal data. This strategy differs from traditional fine-tuning methods by focusing on revising critical errors rather than exhaustively learning from complete datasets.

Figure 1: An overview of the proposed DEED and its differences from traditional fine-tuning methods.

Error Code Collection

The initial step collects erroneous code generated by LLMs, using rejection sampling guided by test criteria. Model-generated outputs that fail the specified tests are identified as error codes, providing insight into the model's weaknesses.
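The collection step described above can be sketched as a rejection-sampling filter over candidate programs. The sampler below is a stub standing in for real LLM generation (a hypothetical helper); only the filter logic mirrors the step in the text.

```python
def sample_candidates(requirement):
    # Stand-in for LLM sampling; a real system would query the model
    # with the requirement and draw multiple completions.
    return [
        "def add(a, b):\n    return a - b",  # buggy candidate
        "def add(a, b):\n    return a + b",  # correct candidate
    ]

def passes_tests(code, tests):
    """Run a candidate and its unit tests in a fresh namespace."""
    env = {}
    try:
        exec(code, env)
        for test in tests:
            exec(test, env)
        return True
    except Exception:
        return False

def collect_error_codes(requirement, tests):
    """Rejection sampling: keep only candidates that fail the tests.
    These failures become the error codes driving the revision step."""
    return [c for c in sample_candidates(requirement)
            if not passes_tests(c, tests)]

errors = collect_error_codes("Add two numbers.", ["assert add(1, 2) == 3"])
```

Here only the buggy candidate is retained, since the correct one passes the test and is rejected as a learning signal.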

Automatic Code Revision

Automatic Code Revision is central to DEED's approach, wherein erroneous code is revised using a method named Self-Revise. Self-Revise combines various inputs, including the requirement, the error code, test feedback, and correct solutions from the dataset, to generate revised code that overcomes the identified errors.

Figure 2: Illustration of automatic code revision.
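Combining those inputs amounts to assembling a revision prompt. The field names and layout below are illustrative assumptions, not the paper's exact Self-Revise template; only the set of inputs (requirement, error code, test feedback, reference solution) comes from the description above.

```python
def build_revision_prompt(requirement, error_code, feedback, solution):
    """Assemble the revision inputs into a single prompt for the model.
    The section labels and ordering here are assumptions."""
    return (
        f"Requirement:\n{requirement}\n\n"
        f"Faulty code:\n{error_code}\n\n"
        f"Test feedback:\n{feedback}\n\n"
        f"Reference solution:\n{solution}\n\n"
        "Revise the faulty code so that it meets the requirement."
    )

prompt = build_revision_prompt(
    "Return the sum of two numbers.",
    "def add(a, b):\n    return a - b",
    "assert add(1, 2) == 3 failed: got -1",
    "def add(a, b):\n    return a + b",
)
```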

Model Optimization

The revised code is then used to fine-tune the base model, enabling it to focus on learning corrections from critical errors. This iterative adaptation enhances the model's proficiency in specific scenarios with limited data.
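Before fine-tuning, the revised code must be cast as supervised training examples. A minimal sketch, assuming a simple prompt/completion layout (the paper's exact formatting may differ):

```python
def to_training_examples(revised_pairs):
    """Pair each requirement with its revised (now correct) code to
    form supervised fine-tuning examples. The '# Task:' prefix is an
    illustrative convention, not the paper's template."""
    return [
        {"prompt": f"# Task: {req}\n", "completion": code}
        for req, code in revised_pairs
    ]

examples = to_training_examples([
    ("Return the sum of two numbers.", "def add(a, b):\n    return a + b"),
])
```

Training on these pairs concentrates the loss on exactly the corrections the model previously failed to make, which is the source of DEED's data efficiency.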

Iterative Adaptation

The iterative process continues until successive rounds yield diminishing improvements, ensuring a stable model training process by leveraging data from previous iterations.
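The stopping rule can be sketched as an outer loop with a diminishing-returns threshold. Here `evaluate` and `train_round` are injected callables standing in for the real collect-revise-optimize pipeline, and `min_gain` is an illustrative threshold, not a value from the paper.

```python
def adapt_iteratively(evaluate, train_round, min_gain=0.01, max_rounds=10):
    """Repeat adaptation rounds until the score gain falls below
    min_gain, keeping the full score history for inspection."""
    score = evaluate()
    history = [score]
    for _ in range(max_rounds):
        train_round()
        new_score = evaluate()
        history.append(new_score)
        if new_score - score < min_gain:
            break  # improvement has plateaued; stop adapting
        score = new_score
    return history

# Toy run: scores improve, then plateau, so the loop stops early
# instead of exhausting all rounds.
scores = iter([0.30, 0.40, 0.45, 0.452, 0.46])
history = adapt_iteratively(lambda: next(scores), lambda: None)
```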

Evaluation

The paper presents extensive evaluations across multiple public code generation benchmarks, including HumanEval, MBPP, and DS-1000. DEED demonstrates considerable relative improvements across these datasets, outperforming mainstream adaptation methods such as full-parameter and LoRA fine-tuning, as well as prompting techniques.

Figure 3: Performance of direct generation, fine-tuning, and DEED on the MBPP dataset under limited-data conditions. The numbers on the bars indicate the amount of training data used by each method.
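The Pass@1 numbers reported above follow the standard unbiased pass@k estimator from the Codex evaluation protocol (Chen et al., 2021), of which pass@1 is the special case c/n:

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased estimate of the probability that at least one of k
    samples drawn from n generations (c of them correct) passes all
    tests: 1 - C(n-c, k) / C(n, k)."""
    if n - c < k:
        return 1.0  # too few failures for any k-subset to be all-wrong
    return 1.0 - comb(n - c, k) / comb(n, k)

score = pass_at_k(10, 3, 1)  # pass@1 with 3 of 10 samples correct: 0.3
```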

Implications and Future Work

DEED's approach of using error revision offers a promising avenue for efficiently adapting LLMs to new scenarios without the need for large datasets. The results indicate potential applications in industries where training samples are scarce. Additionally, incorporating LLMs' revisions into training processes could redefine adaptation techniques across various domains.

Future work could explore DEED's applicability to project-level code generation with domain-specific evaluations, further enhancing its robustness and expanding its utility in real-world applications. Improved methods for test case generation to support the error-driven learning process could also strengthen model adaptation.

Figure 4: Performance analysis with varying sizes of training data on the MBPP dataset.

Conclusion

The DEED approach significantly refines code generation performance under limited data conditions, offering an effective and efficient trajectory for model adaptation. By harnessing the potential of error-driven learning, DEED sets a precedent for future explorations in enhancing LLM efficacy, particularly in data-constrained environments. Its ability to consistently improve performance across different LLMs underscores the versatility and applicability of this method in advancing artificial intelligence capabilities.
