Profiling Software Developers with Process Mining and N-Gram Language Models
Abstract: Context: Profiling developers is challenging since many factors, such as their skills, experience, development environment and behaviors, may influence a detailed analysis and the delivery of coherent interpretations. Objective: We aim at profiling software developers by mining their software development process. To do so, we performed a controlled experiment where, in the realm of a Python programming contest, a group of developers had the same well-defined set of requirements specifications and a well-defined sprint schedule. Events were collected from the PyCharm IDE, and from the Mooshak automatic jury where subjects checked-in their code. Method: We used n-gram LLMs and text mining to characterize developers' profiles, and process mining algorithms to discover their overall workflows and extract the correspondent metrics for further evaluation. Results: Findings show that we can clearly characterize with a coherent rationale most developers, and distinguish the top performers from the ones with more challenging behaviors. This approach may lead ultimately to the creation of a catalog of software development process smells. Conclusions: The profile of a developer provides a software project manager a clue for the selection of appropriate tasks he/she should be assigned. With the increasing usage of low and no-code platforms, where coding is automatically generated from an upper abstraction layer, mining developer's actions in the development platforms is a promising approach to early detect not only behaviors but also assess project complexity and model effort.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.