Redefining Developer Assistance: Through Large Language Models in Software Ecosystem

Published 9 Dec 2023 in cs.SE and cs.AI (arXiv:2312.05626v3)

Abstract: In this paper, we examine the advancement of domain-specific LLMs, focusing on their application in software development. We introduce DevAssistLlama, a model developed through instruction tuning, to assist developers in processing software-related natural language queries. This instruction-tuned LLM variant is particularly adept at handling intricate technical documentation, enhancing developer capability in software-specific tasks. Creating DevAssistLlama involved constructing an extensive instruction dataset from various software systems, enabling effective handling of Named Entity Recognition (NER), Relation Extraction (RE), and Link Prediction (LP). Our results demonstrate DevAssistLlama's superior performance on these tasks compared with other models, including ChatGPT. This research not only highlights the potential of specialized LLMs in software development but also presents a pioneering LLM for this domain.
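The abstract describes building an instruction dataset covering NER, RE, and LP over software text. As a minimal sketch, the records in such a dataset might look like the following; the field names, schema, and example texts here are assumptions for illustration, not the authors' actual format:

```python
# Hypothetical instruction-tuning records for the three tasks named in the
# abstract (NER, RE, LP). The JSON-like schema below is an assumption.

def make_record(task, instruction, text, answer):
    """Bundle one supervised example in a generic instruction format."""
    return {
        "task": task,
        "instruction": instruction,
        "input": text,
        "output": answer,
    }

dataset = [
    make_record(
        "NER",
        "List the software entities mentioned in the text.",
        "Install numpy before running the script on Ubuntu.",
        ["numpy", "Ubuntu"],
    ),
    make_record(
        "RE",
        "State the relation between the two bracketed entities.",
        "[scipy] declares [numpy] as a dependency.",
        "depends-on",
    ),
    make_record(
        "LP",
        "Predict the missing entity in the link.",
        "pandas -- depends-on -- ?",
        "numpy",
    ),
]

print(len(dataset))  # one record per task: 3
```

Each record pairs a natural-language instruction with an input span and a target output, which is the general shape of data used to instruction-tune a base LLM such as Llama.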

