- The paper presents an empirical survey of GenAI adoption among 457 SE researchers, finding that nearly 75% use GenAI in their research and that many feel systemic pressure to integrate AI into their work.
- The study develops a comprehensive taxonomy of GenAI use cases across the research pipeline, noting higher adoption in writing and dissemination while core methods remain human-driven.
- The findings underline critical tensions between efficiency gains and quality concerns, emphasizing risks like inaccuracy and erosion of competency that necessitate robust human oversight and regulation.
Generative AI’s Influence on Software Engineering Research: An Empirical Survey
Study Motivation and Design
Trinkenreich et al. present an empirical investigation of Generative AI (GenAI) adoption and perceptions among researchers in software engineering (SE), motivated by the rapid proliferation of GenAI tools and the absence of discipline-specific longitudinal or large-scale studies of their effects and of the practices around them. The authors administered an extensive survey to 457 SE researchers drawn from top-tier publication venues (2023–2025), covering usage patterns, motivations, trust, perceived risks, and governance perspectives.
The survey design covers the full SE research life-cycle: it quantifies GenAI utilization by methodological strategy and pipeline stage, and it collects open-ended researcher perspectives on challenges, regulation, and education. This combination of quantitative and qualitative data supports a broad characterization of how GenAI is reshaping both SE research artifacts and processes.
Patterns of GenAI Adoption
The study finds widespread adoption, with nearly 75% of respondents reporting the use of GenAI for research functions. Adoption is especially pronounced among early-career researchers and those based in Asia and North America. Over half of active researchers feel systemic pressure to link their work to GenAI or collaborate with AI researchers. There is a clear consensus that GenAI’s influence will intensify: 85% expect substantial impact within five years.
Figure 1: Distribution of SE researchers’ GenAI activities, showing strong adoption across both research and non-research tasks.
Figure 2: Perceived past, near-term, and long-term impact of GenAI on SE research, with expectations peaking around the 5-year horizon.
Figure 3: Levels of felt pressure among SE researchers to adopt and adapt research in response to GenAI trends.
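When reading survey proportions such as these, it can help to see how much sampling uncertainty a sample of this size carries. The sketch below computes a Wilson score interval for a proportion. It is an illustration only, not an analysis reported by the paper: it assumes the roughly 75% figure is computed over all 457 respondents and that respondents behave like a simple random sample, neither of which the summary confirms. The function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(p_hat, n, z=1.96):
    """Wilson score interval for a proportion p_hat observed in n responses.

    z=1.96 corresponds to an approximate 95% confidence level.
    """
    z2_over_n = z * z / n
    denominator = 1.0 + z2_over_n
    center = (p_hat + z2_over_n / 2.0) / denominator
    half_width = (
        z * math.sqrt(p_hat * (1.0 - p_hat) / n + z * z / (4.0 * n * n))
    ) / denominator
    return center - half_width, center + half_width


# Illustrative inputs (assumptions): 0.75 stands in for "nearly 75%",
# and 457 is the number of surveyed researchers.
low, high = wilson_interval(0.75, 457)
print(f"approximate 95% interval: {low:.3f} to {high:.3f}")
```

The Wilson interval is used here rather than the simpler normal approximation because it behaves better for proportions far from 50%; any subgroup percentage would need its own subgroup size as `n`.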
Usage Concentrated in Early and End-Stage Research Activities
The data reveal an uneven distribution of GenAI use across the research process. Adoption is highest in writing, summarization, and coding tasks, where it frequently supports manuscript refinement, translation, and report generation. Early-stage tasks such as brainstorming and literature review are also frequently augmented, whereas core methodological phases, including empirical research design and advanced data analysis, remain predominantly human-driven.
Figure 4: GenAI adoption rates by research strategy and pipeline stage, highlighting concentration in writing and dissemination phases.
Although SE researchers use LLMs for annotation and exploratory analysis, the study finds that critical interpretation, design, and data collection still rely largely on human expertise. This suggests that the epistemic core and methodological validity of SE research remain under human control.
Taxonomy of GenAI Use Cases, Opportunities, and Peer Review
A comprehensive taxonomy of SE-specific GenAI use cases is established, structured along the research pipeline and encompassing both deductive categories (after Andersen et al.) and user-emergent categories such as “GenAI as Research Topic,” cross-cutting consultative roles, and peer review. Notably, GenAI is frequently positioned as an assistant for ideation and feedback rather than an autonomous decision-maker.
Figure 5: Taxonomy of GenAI use cases in SE research, distinguishing traditional and emergent roles.
Figure 6: Detailed breakdown of GenAI utilization in peer review, including support in review writing, verification, and content summarization.
The potential for GenAI to support fairness in reviewer assignment and metareview is acknowledged, but most respondents are skeptical about LLMs autonomously evaluating scientific merit. There is a strong consensus on restricting GenAI’s reviewing role to language, tonality, and summary support, explicitly prohibiting its use for final judgments or core evaluation.
Benefits and Contradictions: Efficiency Gains vs. Quality Concerns
GenAI is widely perceived as a productivity amplifier, especially for aiding non-native English speakers (91% agreement) and for automating routine research tasks such as tool/script generation and data labeling. Respondents also report faster manuscript preparation and administrative processing. However, strong ambivalence emerges around its value for creativity, hypothesis generation, and peer review, where most respondents doubt its reliability and epistemic safety.
Figure 7: Perceived benefits, with strong support for linguistic and low-level technical utility and skepticism toward creative and evaluative tasks.
Central Risks: Inaccuracy, Integrity, and Erosion of Competency
Researchers uniformly express concerns about accuracy, bias, hallucinated information, and the risk of quality degradation in GenAI-augmented SE research. A further concern is the “competence pipeline” risk: overreliance on GenAI could erode fundamental research competencies, especially among trainees, undermining both the reliability of published work and the skill development of future researchers.
Figure 8: Challenges reported with GenAI, dominated by issues of trust, uncertainty, and unclear regulatory environments.
Figure 9: Risks perceived in the use of GenAI for SE research, with mistakes, hallucinations, and transparency dominating.
Trust, Mitigation, and Human Oversight
Trust in GenAI tools is generally low for methodological and analytic functions (e.g., only 12% express confidence in GenAI’s reliability for research), and is highest for language-centric, late-stage tasks. The dominant mitigation strategy is sustained “human in the loop” oversight and manual verification for all critical tasks. Researchers also call for clear task-boundary guidelines, transparency about usage, and improved community education.
Figure 10: Multi-dimensional model of trust in GenAI among researchers, with highest trust in language support roles.
Figure 11: Activity-based trust distribution: trust in GenAI is highest in writing, lowest in conception/methodology.
Figure 12: Distribution of risk mitigation approaches: human oversight is paramount, with transparency and education as reinforcing mechanisms.
Governance, Policy, and the Path Forward
The majority of researchers advocate for explicit regulation that is context-sensitive, principle-based, and not overly prescriptive, with an emphasis on transparency, accountability, and institutional alignment with evolving social and technological norms. Peer review is identified as the most contentious domain for GenAI governance, with positions ranging from outright prohibition to conditional acceptance under documented, circumscribed use.
Tensions in Practice and Key Implications
The paper synthesizes a set of tensions that are now structural in SE research with GenAI: the productivity-effort tension (efficiency gains require significant critical engagement and training), the productivity-quality tension (rapid output risks research integrity), and the competence pipeline risk (overuse by early-career researchers may erode foundational skills). Overreliance on GenAI could destabilize trust in SE research outputs and undermine long-term community standards.
Outlook: Recommendations and Future Developments
The research highlights several actionable imperatives:
- Evolving guidelines and standards for responsible GenAI use—grounded in transparency, verification, and accountability—are needed at the SE community level.
- Graduate programs should explicitly pair GenAI literacy with foundational research practice and critical thinking, to prevent premature overdelegation of intellectual labor.
- Longitudinal, cross-disciplinary studies are recommended to monitor adoption, risk, and governance effects as GenAI capabilities and research norms co-evolve.
Conclusion
This paper delivers a comprehensive, evidence-driven baseline for understanding GenAI’s heterogeneous and evolving impact on SE research. Its quantitative and qualitative findings underpin empirically derived taxonomies of use, benefit, risk, and regulation, and clarify the necessity for human oversight and community-driven governance. As GenAI continues to advance, ongoing vigilance is required to preserve research integrity, foster equitable adoption, and maintain the epistemic foundations of the SE field.
Citation: "Taking a Pulse on How Generative AI is Reshaping the Software Engineering Research Landscape" (2604.11184)