- The paper demonstrates that AI-enabled projects yield only modest short-term output gains, with higher maximum impact metrics confined to the upper tail of cases.
- A novel LLM-driven pipeline distinguishes AI types and functional roles; AI-enabled projects reallocate resources toward human capital.
- The study reveals that AI integration expands team sizes and task breadth, reconfiguring research processes without immediate efficiency gains.
Artificial Intelligence in Science: Returns, Reallocation, and Reorganization
Overview and Research Objectives
The paper "Artificial Intelligence in Science: Returns, Reallocation, and Reorganization" (2603.27956) presents a large-scale, empirical study quantifying the short-run impacts of AI adoption on the scientific research enterprise. Utilizing a comprehensive dataset of both funded and unfunded research proposals submitted to a major international medical and bioscience funding agency, the authors employ advanced NLP to systematically identify AI-related content, specify types and functional roles of AI, and link these to project structure, resource allocation, and output metrics. Core to the investigation is the tension between expectations of AI as a general-purpose technology (GPT)—promising both efficiency and revolutionary restructuring of R&D workflows—and the actual empirical outcomes in diverse, real-world settings.
Methodology: Detection of AI Technologies and Roles in Research Proposals
A critical methodological contribution is the multi-stage LLM-driven pipeline, which combines dictionary-based keyword extraction, manually curated regular expressions, and LLM classification (Meta-Llama-3.1-70B-Instruct; Qwen2.5-32B-Instruct) to disambiguate AI mentions by algorithmic class and functional role. This approach differentiates modern AI (deep learning, generative models) from statistical ML, analytics, and domain-specific computational methods. The system assigns proposed uses to eleven well-defined workflow roles, including ideation, data collection, analysis, experimentation, inference, automation, and productization, filtering out incidental or referential mentions. The pipeline demonstrates moderate reliability (Jaccard similarity = 0.41; average Cohen's kappa = 0.31), underscoring the challenge of interpreting opaque, proposal-specific expressions.
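The two reliability metrics quoted above can be computed from first principles. The sketch below, using hypothetical role labels (not the paper's data), shows Jaccard similarity over multi-label role sets and Cohen's kappa over paired single-label judgments:

```python
from collections import Counter

def jaccard(a: set, b: set) -> float:
    """Jaccard similarity between two label sets: |A ∩ B| / |A ∪ B|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def cohens_kappa(y1: list, y2: list) -> float:
    """Cohen's kappa: chance-corrected agreement between two annotators."""
    n = len(y1)
    observed = sum(x == y for x, y in zip(y1, y2)) / n
    c1, c2 = Counter(y1), Counter(y2)
    # Expected agreement under independent labeling with each marginal.
    expected = sum(c1[label] * c2[label] for label in set(y1) | set(y2)) / n**2
    return (observed - expected) / (1 - expected)

# Hypothetical workflow-role annotations: pipeline output vs. a human coder.
pipeline = ["analysis", "ideation", "analysis", "automation", "analysis"]
human    = ["analysis", "analysis", "analysis", "automation", "ideation"]

print(jaccard({"analysis", "ideation"}, {"analysis", "inference"}))  # 0.333...
print(round(cohens_kappa(pipeline, human), 3))
```

A kappa of 0.31 sits in the conventional "fair agreement" band, which is why the authors flag ambiguous, proposal-specific language as the binding constraint on classification quality.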
Budgetary records undergo semantic classification via pretrained sentence-transformer embeddings, aligning line items with a fine-grained taxonomy (personnel, equipment, operational, overhead). Scientific outcomes are then measured through post-grant bibliometric matches (publications, citations, journal impact factors, authorship). Task-level analysis leverages O*NET descriptors extracted with the JAAT classifier, enabling systematic, interpretable comparisons of task breadth and composition.
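The budget-classification step amounts to nearest-prototype matching in embedding space. The minimal sketch below illustrates the mechanics; the category prototype strings and the bag-of-words `embed()` are toy stand-ins for the paper's taxonomy and pretrained sentence-transformer embeddings:

```python
import math
from collections import Counter

# Toy category prototypes; the paper uses a fine-grained curated taxonomy.
CATEGORIES = {
    "personnel": "salary wages postdoc staff researcher",
    "equipment": "instrument hardware microscope server gpu",
    "operational": "consumables travel reagents maintenance",
    "overhead": "indirect administration facilities overhead",
}

def embed(text: str) -> Counter:
    # Bag-of-words stand-in for a pretrained sentence-transformer encoder.
    return Counter(text.lower().split())

def cosine(u: Counter, v: Counter) -> float:
    dot = sum(u[w] * v[w] for w in u)
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def classify(line_item: str) -> str:
    # Assign the budget line to the most similar category prototype.
    vec = embed(line_item)
    return max(CATEGORIES, key=lambda c: cosine(vec, embed(CATEGORIES[c])))

print(classify("postdoc salary year 1"))    # personnel
print(classify("gpu server for training"))  # equipment
```

With real sentence embeddings the same argmax-over-cosine logic applies; only `embed()` changes, which is what makes the approach robust to paraphrased line-item wording.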
Empirical Findings: Scientific Returns and Resource Reallocation
Scientific Output
AI-enabled projects (those employing modern AI methods) are associated with only modest improvements in short-term scientific outputs, measured in publications, citations, and journal impact, and only at the extreme upper tail. Regression analysis reveals that, conditional on project and applicant characteristics, funded AI-enabled proposals achieve significantly higher maximum journal impact factor (JIF) and maximum citation counts, but the bulk of AI-enabled proposals do not outperform their semantically matched non-AI controls in overall productivity. Furthermore, funding rates and requested budgets do not differ systematically between AI and non-AI proposals. This contrasts with prominent narratives exemplified by cases such as AlphaFold, where AI has transformed productivity in targeted domains.
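The distinction between bulk and tail effects is easy to see numerically. The toy example below (synthetic values, not the paper's data) shows two matched groups whose typical outputs are identical while their maxima diverge:

```python
import statistics

# Hypothetical citation counts for ten matched projects per group.
# Only the single top AI-enabled project diverges (synthetic illustration).
non_ai = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6]
ai     = [1, 2, 2, 3, 3, 3, 4, 4, 5, 14]

print(statistics.median(non_ai), statistics.median(ai))  # 3.0 3.0 (bulk identical)
print(max(non_ai), max(ai))                              # 6 14 (upper tail diverges)
```

This is why mean- or median-based productivity comparisons can show no effect even when maximum-impact metrics differ significantly, exactly the pattern the regressions recover.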
A primary and robust effect observed is systematic reallocation of resources within the research process rather than immediate efficiency gains:
- Budgetary Shifts: AI-enabled projects allocate a larger fraction of funds to human capital (predominantly salaries) while proportionally reducing equipment and operational costs. This points to AI complementing skilled labor rather than substituting for it, and suggests increased complexity in project staffing and coordination.
- Team Structure: Team sizes are significantly larger in AI-enabled projects, with downstream effects on co-authorship and collaborative network density.
- Task Breadth and Project Duration: AI adoption expands the set of tasks per project and lengthens project durations. Unlike the AlphaFold paradigm, in which experimental phases are replaced by computational inference, AI-enabled projects generally supplement existing workflows with computational tasks, adding to total activity rather than substituting for it.
- Functional Usage: Modern AI is leveraged beyond data analysis and pattern recognition, with sharp upticks observed in ideation, experimentation, and model/product development workflows. These categories are increasingly aligned with tasks strongly exposed to LLM capabilities as quantified by independent exposure metrics (Anthropic LLM exposure scores).
Task-level Decomposition
Task extraction underscores an expansion effect: AI enables new computationally intensive activities without eliminating traditional research components, particularly in a high-constraint domain such as biomedicine. Activities most likely to benefit from maturing LLMs—ideation, model development, advanced experimentation—are precisely those growing in AI proposals. This suggests future productivity gains may be concentrated in these task clusters as AI systems evolve.
Theoretical and Practical Implications
General-Purpose Technology and Transitional Latency
Findings are consistent with GPT-derived "J-curve" models of innovation, where the introduction of a transformative technology induces a transitional phase of reorganization, process redesign, and increased coordination costs before net productivity increases accumulate. The empirical absence of broad, short-run output gains for AI in science mirrors historical experiences with other GPTs, such as electrification and IT, where learning, adaptation, and reconfiguration precede measurable returns [see Brynjolfsson et al., 2021].
Labor and Institutional Impacts
The evidence for reallocation toward human capital underscores persistent complementarities between AI and scientific labor. Rather than hollowing out expertise, current AI trajectories amplify demand for skilled coordination, integration, and interpretation, even as more routine tasks become susceptible to automation. The clustering of output improvements at the upper tail further suggests that AI, at this stage, acts more as a force multiplier for highly resourced, larger teams and may reinforce winner-take-all dynamics in access and productivity. This may have far-reaching consequences for team assembly, research incentives, and equity.
Speculations on Future Developments
The alignment between AI-enabled project activities and high-LLM-exposure tasks indicates that as LLMs mature—expanding capacity in ideation, design, and sophisticated experimental planning—latent productivity accumulations may manifest as step-changes in research throughput and novelty. However, real-world scientific production is constrained by institutional inertia, risk aversion, and domain-specific operational requirements. Therefore, ultimate efficiency gains from AI integration will hinge not just on technological advances, but also on parallel organizational and institutional adaptation.
Conclusion
This study contributes a rigorous, data-driven assessment of AI's role in scientific practice, emphasizing a reorganization of research processes and resource allocation rather than immediate productivity gains. AI-enabled science, in its present mode, is characterized by increased task complexity, larger teams, and labor-intensive integration instead of displacement. These findings reinforce the view that AI is best conceptualized as a general-purpose technology in the midst of a transitional learning phase, with second-order effects on the organization, collaboration, and incentives underlying knowledge production. Realization of latent productivity benefits from AI in science will depend critically on both technological trajectories and adaptive social-institutional processes.