Papers
Topics
Authors
Recent
Search
2000 character limit reached

AIGCodeSet: A New Annotated Dataset for AI Generated Code Detection

Published 21 Dec 2024 in cs.SE and cs.AI | (2412.16594v3)

Abstract: While LLMs provide significant convenience for software development, they can lead to ethical issues in job interviews and student assignments. Therefore, determining whether a piece of code is written by a human or generated by an AI model is a critical issue. In this study, we present AIGCodeSet, which consists of 2.828 AI-generated and 4.755 human-written Python codes, created using CodeLlama 34B, Codestral 22B, and Gemini 1.5 Flash. In addition, we share the results of our experiments conducted with baseline detection methods. Our experiments show that a Bayesian classifier outperforms the other models.

Summary

No one has generated a summary of this paper yet.

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Authors (2)

Collections

Sign up for free to add this paper to one or more collections.