Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task

Published 10 Jun 2025 in cs.AI | (2506.08872v1)

Abstract: This study explores the neural and behavioral consequences of LLM-assisted essay writing. Participants were divided into three groups: LLM, Search Engine, and Brain-only (no tools). Each completed three sessions under the same condition. In a fourth session, LLM users were reassigned to Brain-only group (LLM-to-Brain), and Brain-only users were reassigned to LLM condition (Brain-to-LLM). A total of 54 participants took part in Sessions 1-3, with 18 completing session 4. We used electroencephalography (EEG) to assess cognitive load during essay writing, and analyzed essays using NLP, as well as scoring essays with the help from human teachers and an AI judge. Across groups, NERs, n-gram patterns, and topic ontology showed within-group homogeneity. EEG revealed significant differences in brain connectivity: Brain-only participants exhibited the strongest, most distributed networks; Search Engine users showed moderate engagement; and LLM users displayed the weakest connectivity. Cognitive activity scaled down in relation to external tool use. In session 4, LLM-to-Brain participants showed reduced alpha and beta connectivity, indicating under-engagement. Brain-to-LLM users exhibited higher memory recall and activation of occipito-parietal and prefrontal areas, similar to Search Engine users. Self-reported ownership of essays was the lowest in the LLM group and the highest in the Brain-only group. LLM users also struggled to accurately quote their own work. While LLMs offer immediate convenience, our findings highlight potential cognitive costs. Over four months, LLM users consistently underperformed at neural, linguistic, and behavioral levels. These results raise concerns about the long-term educational implications of LLM reliance and underscore the need for deeper inquiry into AI's role in learning.

Citations (15)

Summary

  • The paper demonstrates that reliance on ChatGPT significantly reduces neural connectivity, indicating measurable cognitive offloading during essay writing tasks.
  • Using EEG and NLP analyses, the study shows lower memory encoding and reduced critical thinking in ChatGPT users compared to brain-only and search engine groups.
  • The findings highlight concerns over diminished essay ownership and potential skill atrophy, emphasizing the need for balanced AI integration in education.

This paper investigates the cognitive impact of using an LLM like ChatGPT compared to a traditional web search engine or no external tools ("Brain-only") for essay writing. The study involved 54 participants across three groups (LLM, Search Engine, Brain-only) completing essay writing tasks over four sessions. Data were collected using electroencephalography (EEG) to measure brain activity, NLP to analyze essays, and post-task interviews. The research aimed to understand how different tools affect essay quality, cognitive load, brain activity patterns, memory, and perceived ownership of the written work.

Experimental Design

The study assigned participants to one of three groups:

  • LLM Group: Used only OpenAI's GPT-4o.
  • Search Engine Group: Used any website except LLMs (primarily Google).
  • Brain-only Group: Used no external tools, relying solely on their knowledge.

Participants completed three essay writing sessions using their assigned tool(s) on different SAT prompts. A subset of 18 participants completed a fourth session where the LLM and Brain-only groups switched conditions (LLM-to-Brain and Brain-to-LLM) and wrote on topics they had previously addressed. Each essay writing task was limited to 20 minutes.
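
The session schedule described above can be sketched as a small lookup. This is only an illustrative encoding of the design as summarized here; the names `GROUPS`, `CROSSOVER`, and `condition` are hypothetical and do not come from the paper.

```python
# Hypothetical encoding of the study design summarized above.
GROUPS = ("LLM", "Search Engine", "Brain-only")

# In Session 4 only the LLM and Brain-only groups switch conditions.
CROSSOVER = {"LLM": "Brain-only", "Brain-only": "LLM"}


def condition(group, session):
    """Return the writing condition a participant works under in a session."""
    if group not in GROUPS:
        raise ValueError(f"unknown group: {group}")
    if session in (1, 2, 3):
        # Sessions 1-3: every participant stays in the assigned condition.
        return group
    if session == 4:
        if group not in CROSSOVER:
            # Assumption: the summary describes Session 4 only for the two switched groups.
            raise ValueError("Session 4 is described only for LLM and Brain-only")
        return CROSSOVER[group]
    raise ValueError("sessions are numbered 1 to 4")


for group in ("LLM", "Brain-only"):
    print(group, [condition(group, s) for s in range(1, 5)])
```

The two switched sequences correspond to the labels LLM-to-Brain and Brain-to-LLM used in the findings.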

Data collection involved:

  • EEG: Recording brain activity using a 32-electrode headset during the essay writing task.
  • NLP Analysis: Analyzing the written essays for various linguistic features like Named Entity Recognition (NER), n-grams, topic ontology, and calculating similarities/distances between texts.
  • Interviews: Conducting post-session interviews to gather subjective feedback on tool usage, strategy, quoting ability, and essay ownership.
  • Scoring: Essays were scored by human teachers and an AI judge based on metrics like uniqueness, content, language, structure, and organization.
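
As a rough illustration of the n-gram part of the NLP analysis, the sketch below counts word bigrams across a set of essays. It is a minimal example under stated assumptions, not the authors' pipeline: tokenization is a plain lowercase regex, no stemming is applied (the paper's reported n-grams appear to be stemmed), and the helper names `ngrams` and `top_ngrams` are hypothetical.

```python
import re
from collections import Counter


def ngrams(text, n=2):
    """Return contiguous word n-grams from lowercased text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_ngrams(essays, n=2, k=3):
    """Return the k most frequent n-grams across a list of essays."""
    counts = Counter()
    for essay in essays:
        counts.update(ngrams(essay, n))
    return counts.most_common(k)


# Toy essays, invented for illustration only.
essays = ["True happiness is rare.", "I found true happiness."]
print(top_ngrams(essays, n=2, k=1))
```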

Key Findings

The study revealed significant differences across groups in neural activity, essay characteristics, and participant perceptions:

1. Neural Connectivity Patterns (EEG Analysis)

  • Overall Connectivity: The "Brain-only" group consistently showed the strongest and most widespread neural network connectivity across all measured frequency bands (Alpha, Beta, Theta, Delta). The "Search Engine" group exhibited intermediate connectivity, while the "LLM" group showed the weakest overall coupling. This suggests that relying less on external tools demanded greater internal cognitive coordination.
  • Band-Specific Differences:
    • Alpha (8-12 Hz): Higher in Brain-only, associated with internal attention and semantic processing. Lower in LLM, suggesting less reliance on internally generated ideas. Search Engine showed engagement related to visual attention.
    • Beta (12-30 Hz): Higher overall in Brain-only, reflecting sustained cognitive and motor engagement. Search Engine showed beta linked to visuo-spatial processing (e.g., scrolling). LLM showed some beta, possibly for procedural fluency (typing).
    • Theta (4-8 Hz): Significantly higher in Brain-only, strongly associated with working memory load and executive control. Lower in LLM, consistent with reduced working memory burden due to AI scaffolding. Search Engine showed less extensive theta networking than Brain-only.
    • Delta (0.1-4 Hz): Most pronounced difference, far higher in Brain-only, suggesting recruitment of broad, low-frequency networks for integrative processes, potentially including memory and emotional content. Much weaker in Search Engine and LLM, possibly reflecting a more externally oriented or shallow processing mode.
  • Information Flow: Brain-only showed greater "bottom-up" flow (posterior to frontal), potentially representing internal idea generation. LLM users showed more "top-down" flow (frontal to posterior), suggesting integration and filtering of external (AI) input.
  • Session 4 Insights:
    • LLM-to-Brain: When former LLM users wrote without tools (Session 4), their neural connectivity was lower than that of Brain-only participants in earlier sessions (Sessions 2 & 3), especially in the Alpha and Beta bands. This indicates reduced engagement in self-driven elaboration and critical scrutiny after prior LLM use, potentially suggesting "skill atrophy."
    • Brain-to-LLM: When former Brain-only users were introduced to LLMs (Session 4), they showed a significant increase in connectivity across all bands compared to their prior Brain-only sessions (especially Session 1), suggesting high cognitive load related to integrating the new tool's output.
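
To make the frequency bands above concrete, the following sketch sums spectral power per band for a single synthetic signal. It is not the paper's analysis, which is based on dDTF connectivity between electrodes; it only shows how a signal's energy is assigned to the Delta, Theta, Alpha, and Beta ranges listed above. The sampling rate, the synthetic signal, and the function names are assumptions made for illustration.

```python
import cmath
import math

# Band edges in Hz as listed in the summary above (half-open intervals [lo, hi)).
BANDS = {
    "delta": (0.1, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 12.0),
    "beta": (12.0, 30.0),
}


def band_powers(signal, fs):
    """Return total spectral power per band using a naive DFT."""
    n = len(signal)
    powers = {name: 0.0 for name in BANDS}
    for k in range(1, n // 2):  # skip the DC bin, stop below Nyquist
        freq = k * fs / n
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        power = abs(coeff) ** 2
        for name, (lo, hi) in BANDS.items():
            if lo <= freq < hi:
                powers[name] += power
    return powers


def dominant_band(signal, fs):
    """Return the name of the band holding the most power."""
    powers = band_powers(signal, fs)
    return max(powers, key=powers.get)


# Synthetic 2-second signal: strong 10 Hz component plus a weaker 6 Hz one.
fs = 128  # assumed sampling rate in Hz
signal = [math.sin(2 * math.pi * 10 * t / fs) + 0.3 * math.sin(2 * math.pi * 6 * t / fs)
          for t in range(2 * fs)]
print(dominant_band(signal, fs))
```

A real pipeline would use an FFT library and windowing; the naive DFT here keeps the example dependency-free.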

2. Linguistic Analysis (NLP Analysis)

  • Essay Homogeneity: Essays from the LLM group were the most homogeneous within topics, suggesting a convergence towards typical LLM-generated phrasing and structures. Brain-only essays were the most variable.
  • Named Entities (NER): The LLM group used significantly more named entities (people, places, dates), followed by Search Engine, then Brain-only.
  • N-grams: Analysis revealed distinct n-gram patterns per group and topic. For example, Brain-only frequently used more conceptual or introspective phrases ("true happi", "benefit other"), while Search Engine sometimes showed bias towards popular search terms ("homeless person"). LLM-generated text showed a bias towards third-person address. Session 4 analysis indicated that participants sometimes reused vocabulary from their previous tool usage.
  • Ontology: Ontological analysis of essay concepts showed that LLM and Search Engine groups had overlapping conceptual structures, distinct from the Brain-only group.
  • AI Judge vs. Human Teachers: An AI judge tended to give higher scores for uniqueness and content than human teachers, who were more skeptical of AI-generated uniformity and recognized patterns associated with LLM use (e.g., standard structures, lack of personal nuance).
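
One simple way to picture within-group homogeneity is the mean pairwise similarity of essays in a group. The sketch below uses Jaccard similarity over word sets as a stand-in; the summary does not specify which similarity or distance measures the authors used, so the metric, the toy essays, and the helper names are all assumptions.

```python
import re
from itertools import combinations


def word_set(text):
    """Return the set of lowercase word tokens in a text."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def homogeneity(essays):
    """Mean pairwise Jaccard similarity across a group of essays."""
    sets = [word_set(e) for e in essays]
    pairs = list(combinations(sets, 2))
    if not pairs:
        raise ValueError("need at least two essays")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Toy groups, invented for illustration only.
similar = [
    "happiness comes from helping others",
    "happiness comes from helping people",
    "happiness comes from serving others",
]
varied = [
    "my grandmother baked bread",
    "money cannot buy joy",
    "volunteering changed my outlook",
]
print(homogeneity(similar) > homogeneity(varied))
```

A higher score means the essays in a group share more vocabulary, which is the sense in which the LLM group's essays are described as the most homogeneous.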

3. Behavioral Insights (Interviews)

  • Quoting Ability: The most striking behavioral difference was in participants' ability to quote from their own essays. LLM users performed significantly worse, especially in early sessions, with many unable to provide any correct quote. This impairment persisted to some extent in later sessions. The Brain-only and Search Engine groups quoted their essays far more readily and accurately. This is consistent with the neural findings suggesting shallower encoding in the LLM group.
  • Essay Ownership: Brain-only participants reported the highest sense of ownership over their essays. LLM users often reported fragmented or low ownership, feeling dissociated from the tool-generated content. Search Engine users reported moderate ownership. This aligns with reduced engagement of self-monitoring and evaluation networks in the LLM group.
  • Reflections: LLM users sometimes found the output robotic and felt compelled to edit for personalization. Some questioned the need for AI for certain prompts or felt "analysis-paralysis." Search Engine users appreciated having diverse opinions but felt excluded from AI innovation. Brain-only users valued autonomy and focusing on their own thoughts/experiences. Ethical discomfort regarding AI use was also reported.
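
The quoting check can be pictured as a fuzzy match between a recalled quote and the essay text. The sketch below is a hypothetical scoring rule, not the procedure used in the study (quoting was assessed in interviews): it normalizes both strings and measures how much of the quote is covered by its longest common block with the essay. The threshold of 0.9 and the function names are assumptions.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and strip punctuation so minor formatting differences are ignored."""
    return " ".join(re.findall(r"[a-z0-9']+", text.lower()))


def quote_score(quote, essay):
    """Fraction of the quote covered by its longest common block with the essay."""
    q, e = normalize(quote), normalize(essay)
    if not q:
        return 0.0
    matcher = SequenceMatcher(None, q, e, autojunk=False)
    match = matcher.find_longest_match(0, len(q), 0, len(e))
    return match.size / len(q)


def is_correct_quote(quote, essay, threshold=0.9):
    """Treat a quote as correct when nearly all of it appears verbatim in the essay."""
    return quote_score(quote, essay) >= threshold


# Toy essay and quotes, invented for illustration only.
essay = "True happiness comes from helping others, not from wealth."
print(is_correct_quote("true happiness comes from helping others", essay))
print(is_correct_quote("money is the key to a good life", essay))
```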

Synthesis and Practical Implications

The study concludes that using LLMs for tasks like essay writing, while potentially increasing efficiency and content generation speed (as suggested by homogeneity and NER usage), may come at a significant cognitive cost, leading to "cognitive debt."

  • Cognitive Offloading: LLMs appear to facilitate cognitive offloading, reducing the immediate cognitive load (working memory, executive control) required for deep internal processing, planning, and idea generation, as evidenced by lower neural connectivity in LLM users.
  • Impact on Learning: This offloading may negatively impact key learning processes:
    • Memory: Reduced engagement of memory encoding networks may lead to poorer retention and recall (demonstrated by quoting difficulties).
    • Critical Thinking & Creativity: Lower connectivity in networks associated with self-driven ideation and critical evaluation might result in less unique or critically analyzed content. N-gram patterns and AI/human scoring discrepancies support this.
    • Ownership & Agency: The sense of psychological ownership and cognitive agency over the work appears diminished when relying heavily on external generation.
  • Tool Differences: Search engines promote a different cognitive mode, involving visual scanning and integration of diverse external sources, leading to intermediate cognitive engagement patterns compared to LLMs or Brain-only work.
  • Session 4 Implications: The findings from Session 4 suggest that prior LLM use may hinder subsequent performance on the same task without the tool, as participants show reduced neural engagement compared to those with prior unassisted practice. Conversely, introducing LLMs after initial unassisted practice may induce high cognitive integration, potentially a more beneficial sequence for learning.
  • Energy Cost: The paper also briefly highlights the significantly higher energy consumption of LLM queries compared to search queries, an important environmental and economic consideration.

Limitations and Future Work

The study's limitations include a relatively small sample size from a specific academic demographic, the use of a single LLM (ChatGPT), and a focus solely on the essay writing task in an educational context.

Future work should involve:

  • Larger, more diverse participant samples.
  • Comparison across multiple LLMs and multimodal AI tools.
  • Breaking down tasks into sub-components (e.g., idea generation, drafting, revising) for more granular analysis.
  • Including fMRI to capture deeper brain regions involved in memory and cognition.
  • Longitudinal studies to assess long-term impacts on skill development.
  • Exploring hybrid strategies that balance AI assistance with required self-driven cognitive effort.
  • Developing methods to identify AI-generated text based on stylistic "fingerprinting" of human writing.

Conclusion

The paper concludes that while LLMs offer efficiency benefits, their use in learning tasks like essay writing may lead to the accumulation of cognitive debt. This debt manifests as reduced engagement of neural networks crucial for deep processing, memory formation, and critical thinking, potentially impacting long-term skill development and a sense of ownership over one's work. A careful, balanced approach to integrating AI in education is necessary to leverage its benefits without compromising fundamental cognitive skills and intellectual autonomy.

Glossary

  • Alpha band: An EEG frequency range (about 8–13 Hz) associated with relaxed wakefulness and attentional processes; connectivity in this band can reflect engagement patterns. "EEG Alpha Band"
  • Anterior cingulate cortex (ACC): A medial frontal brain region involved in cognitive control, error monitoring, and motivation during decision-making. "ACC neurons predict the timing of information availability; they sustain motivation during uncertain outcomes and information seeking."
  • Automated Essay Scoring (AES): Computational methods that assign scores to essays by analyzing linguistic and structural features. "newer approaches combine feedback generation with automated essay scoring (AES)"
  • Beta band: An EEG frequency range (about 13–30 Hz) linked to active thinking, motor planning, and executive processes; connectivity changes can indicate task engagement. "Beta Band Connectivity"
  • Cognitive agency: The capacity of learners to initiate, control, and take ownership of their cognitive processes and outputs. "Essay Ownership and Cognitive Agency."
  • Cognitive Load Theory (CLT): A framework explaining how different types of mental effort (intrinsic, extraneous, germane) affect learning and schema acquisition. "Cognitive Load Theory (CLT), developed by John Sweller [30], provides a framework for understanding the mental effort required during learning and problem-solving."
  • Cognitive offloading: The delegation of memory or cognitive processing to external tools or systems, potentially reducing internal cognitive effort. "This cognitive offloading [113] phenomenon raises concerns about the long-term implications for human intellectual development and autonomy [5]."
  • Default Mode Network (DMN): A brain network active during rest and mind-wandering that is typically suppressed during goal-directed tasks. "suppression of the default mode network (DMN), which typically supports mind-wandering and is disengaged during goal-oriented tasks [54]."
  • Delta band: The lowest EEG frequency range (about 0.5–4 Hz), often associated with deep sleep but also examined for connectivity patterns in tasks. "Delta Band Connectivity"
  • Dynamic Directed Transfer Function (dDTF): A signal-processing method for estimating directional (causal) influences among brain regions from multivariate time series. "The dynamic Direct Transfer Function (dDTF) EEG analysis of Alpha Band for groups: LLM, Search Engine, Brain-only, including p-values to show significance from moderately significant (*) to highly significant (***)."
  • Dorsolateral prefrontal cortex (DLPFC): A frontal brain region central to executive functions like working memory, cognitive control, and planning. "This network includes the dorsolateral prefrontal cortex (DLPFC), dorsal anterior cingulate cortex (ACC), and lateral posterior parietal cortex, which are used for sustained attention and working memory."
  • Echo chamber: An information environment that reinforces existing beliefs by excluding or downplaying opposing views. "Echo chambers represent a significant phenomenon in both traditional search systems and LLMs, where users become trapped in self-reinforcing information bubbles that limit exposure to diverse perspectives."
  • Electroencephalography (EEG): A neurophysiological method that records electrical activity of the brain via scalp electrodes. "We used electroencephalography (EEG) to record participants' brain activity in order to assess their cognitive engagement and cognitive load"
  • Enobio headset: A wearable EEG device used to acquire brain signals in research and applied contexts. "Stage 2: Setup of the Enobio headset."
  • Executive Control Network (ECN): A brain network supporting sustained attention, working memory, and goal-directed control. "During high cognitive workload tasks, physiological responses such as increased heart rate and pupil dilation correlate with neural activity in the executive control network (ECN) [54]."
  • Extraneous cognitive load (ECL): Mental effort imposed by the way information is presented rather than by the content itself, often hindering learning. "extraneous cognitive load (ECL), which refers to the mental effort imposed by presentation of information;"
  • fMRI: Functional magnetic resonance imaging; a technique that measures brain activity by detecting changes associated with blood oxygenation. "Through fMRI, it was found that experienced web users, or 'Net Savvy' individuals, engage significantly broader neural networks compared to those less experienced, the 'Net Naïve' group [51]."
  • Germane cognitive load (GCL): Mental effort devoted to building and automating schemas that directly support learning. "germane cognitive load (GCL), which is the mental effort dedicated to constructing and automating schemas that support learning."
  • Google Effect: The tendency to remember where information can be found rather than the content itself, due to reliance on search engines. "known as the 'Google Effect,' can shift cognitive efforts from information retention to more externalized memory processes [37]."
  • Hippocampus: A medial temporal lobe structure essential for memory formation and spatial navigation. "including the dorsolateral prefrontal cortex, anterior cingulate cortex (ACC), and hippocampus."
  • Intrinsic cognitive load (ICL): Mental effort inherent to the complexity of the material and the learner’s prior knowledge. "intrinsic cognitive load (ICL), which is tied to the complexity of the material being learned and the learner's prior knowledge;"
  • Latent space embeddings: Vector representations learned by models that capture semantic structure in a lower-dimensional space, enabling clustering and similarity analyses. "Latent space embeddings clusters"
  • Multi Trait Specialization (MTS): A framework that improves LLM-based assessment by evaluating multiple writing traits independently. "Multi Trait Specialization (MTS), a framework designed to improve scoring accuracy by decomposing writing proficiency into distinct traits [59]."
  • Named Entity Recognition (NER): An NLP task that identifies and classifies proper names in text (e.g., people, locations, organizations). "Named Entities Recognition (NERs)"
  • N-gram: A contiguous sequence of n items (such as words) from a given text, used for statistical and linguistic analysis. "We discovered a consistent homogeneity across the Named Entities Recognition (NERs), n-grams, ontology of topics within each group."
  • Nominalization: Converting verbs or adjectives into nouns, often increasing syntactic density and formality in writing. "greater sentence depth and nominalization usage [56]."
  • Occipito-parietal: Referring to brain regions spanning occipital and parietal lobes, commonly implicated in visual-spatial processing and attention. "re-engagement of widespread occipito-parietal and prefrontal nodes,"
  • Orbitofrontal cortex: A prefrontal brain region involved in reward valuation, decision-making, and expectation. "activating dopaminergic pathways in regions like the ventral striatum and orbitofrontal cortex [52]."
  • Retrieval-Augmented Generation (RAG): A technique where LLM outputs are grounded by retrieving relevant sources, improving factuality and citation. "and 'infuse' it using Retrieval-Augmented Generation (RAG) to link to the sources it determined to be relevant based on the contextual embedding of each source"
  • Search as Learning (SAL): A framework highlighting web search as an active learning process involving understanding and knowledge construction. "The 'Search as Learning' (SAL) framework sheds light on how web searches can serve as powerful educational tools when approached strategically."
  • Self-Determination Theory: A psychological theory positing that autonomy, competence, and relatedness drive motivation and engagement. "Multi-role LLM frameworks... have been shown to enhance student engagement by aligning with Self-Determination Theory [48]."
  • STORM: An AI writing system that automates prewriting by retrieval and multi-perspective question asking for long-form synthesis. "STORM, 'a writing system for the Synthesis of Topic Outlines through Retrieval and Multi-perspective Question Asking'"
  • Theta band: An EEG frequency range (about 4–8 Hz) associated with memory processing, drowsiness, and certain cognitive control functions. "Theta Band Connectivity"
  • Ventral striatum: A basal ganglia structure central to reward processing, motivation, and reinforcement learning. "activating dopaminergic pathways in regions like the ventral striatum and orbitofrontal cortex [52]."

Reddit

  1. New MIT study shows that LLM users consistently underperform at neural, linguistic, and behavioral levels (1572 points, 79 comments) 
  2. Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task (135 points, 30 comments) 
  3. MIT Study Reveals Cognitive Decline in Students Using ChatGPT for Essay Writing (98 points, 13 comments) 
  4. MIT study on the effects of LLM use on cognitive abilities (33 points, 8 comments) 
  5. Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task (18 points, 4 comments) 
  6. Pretty strong evidence for Hasan's anti-AI stance! (14 points, 5 comments) 
  7. Would love Steve's take on this study. It is a small sample study of how LLMs maybe effecting learning and possibly critical thinking. (14 points, 10 comments) 
  8. Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task (7 points, 2 comments) 
  9. Take a look for yourself (1 point, 3 comments) 
  10. Accumulation of cognitive debt when using an AI assistant for essay writing task (1 point, 1 comment) 
  11. New MIT study shows that LLM users consistently underperform at neural, linguistic, and behavioral levels (1 point, 0 comments) 
  12. Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task (1 point, 0 comments) 
  13. Study Finds That Relying on ChatGPT for Writing May Diminish Brain Activity and Long-Term Memory (1 point, 1 comment) 
  14. Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task (1 point, 0 comments) 
  15. Is LLM making us better programmers or just more complacent? (0 points, 12 comments) 
  16. Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task (arXiv:2506.08872) (0 points, 4 comments) 
  17. Cognitive Debt with LLM adoption (0 points, 1 comment)