Combining LLM Code Generation with Formal Specifications and Reactive Program Synthesis

Published 18 Sep 2024 in cs.SE, cs.LG, and cs.LO | (2410.19736v1)

Abstract: In the past few years, large language models (LLMs) have exploded in usefulness and popularity for code generation tasks. However, LLMs still struggle with accuracy and are unsuitable for high-risk applications without additional oversight and verification. In particular, they perform poorly at generating code for highly complex systems, especially those with unusual or out-of-sample logic. For such systems, verifying the code generated by the LLM may take longer than writing it by hand. We introduce a solution that divides code generation into two parts: one handled by an LLM and one handled by formal-methods-based program synthesis. We develop a benchmark to test our solution and show that our method allows the pipeline to solve problems previously intractable for LLM code generation.
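
The abstract does not say how the two parts fit together, so the following is only an illustrative sketch and not the authors' implementation. It assumes the formally synthesized part takes the form of a finite-state controller (a Mealy-style transition table, a common output of reactive synthesis) that decides when an action fires, while the LLM-generated part is an ordinary function that computes the data update. Every name here (`TRANSITIONS`, `llm_generated_step`, `run`) and the toy "press/release" specification are hypothetical.

```python
from typing import Dict, List, Tuple

# ASSUMPTION: stand-in for the LLM-generated part. In a real pipeline the
# body of this function would be produced by an LLM; here it is hand-written.
def llm_generated_step(counter: int) -> int:
    """Data-level update with no control logic: increment the counter."""
    return counter + 1

# ASSUMPTION: stand-in for the formally synthesized part. A reactive
# synthesis tool would derive a controller like this from a temporal
# specification; this table is hand-written for illustration.
# (state, input symbol) -> (next state, action to perform)
TRANSITIONS: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("idle", "press"): ("running", "increment"),
    ("idle", "release"): ("idle", "hold"),
    ("running", "press"): ("running", "increment"),
    ("running", "release"): ("idle", "hold"),
}

def run(inputs: List[str]) -> List[Tuple[str, int]]:
    """Drive the controller over the inputs, calling the data function
    only when the controller emits the 'increment' action."""
    state, counter = "idle", 0
    trace = []
    for symbol in inputs:
        state, action = TRANSITIONS[(state, symbol)]
        if action == "increment":
            counter = llm_generated_step(counter)
        trace.append((state, counter))
    return trace

if __name__ == "__main__":
    print(run(["press", "press", "release", "press"]))
```

In this sketch the control flow lives entirely in the transition table, so only the small data function would need review as LLM output. Whether the paper draws the boundary in this way is an assumption.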
