Brain-like Functional Organization within Large Language Models
Abstract: The human brain has long inspired the pursuit of AI. Recently, neuroimaging studies have provided compelling evidence of alignment between the computational representations of artificial neural networks (ANNs) and the neural responses of the human brain to the same stimuli, suggesting that ANNs may employ brain-like information-processing strategies. While such alignment has been observed across sensory modalities (visual, auditory, and linguistic), much of the focus has been on the behavior of artificial neurons (ANs) at the population level, leaving the functional organization of individual ANs that facilitates such brain-like processing largely unexplored. In this study, we bridge this gap by directly coupling sub-groups of artificial neurons with functional brain networks (FBNs), the foundational organizational structure of the human brain. Specifically, we extract representative patterns from the temporal responses of ANs in LLMs and use them as fixed regressors in voxel-wise encoding models that predict brain activity recorded by functional magnetic resonance imaging (fMRI). This framework links AN sub-groups to FBNs, enabling the delineation of brain-like functional organization within LLMs. Our findings reveal that LLMs (BERT and Llama 1-3) exhibit brain-like functional architecture, with sub-groups of artificial neurons mirroring the organizational patterns of well-established FBNs. Notably, this brain-like functional organization evolves with increasing model sophistication and capability, achieving an improved balance between the diversity of computational behaviors and the consistency of functional specializations. This research represents the first exploration of brain-like functional organization within LLMs, offering novel insights to inform the development of artificial general intelligence (AGI) guided by principles of the human brain.
Explain it Like I'm 14
Brain-like Functional Organization within LLMs — A Simple Guide
1. What is this paper about?
This paper explores a big question: Do LLMs like BERT and Llama organize their “artificial neurons” in a way that’s similar to how the human brain organizes itself into networks for different jobs (like understanding language or seeing images)? The authors compare patterns inside LLMs to patterns in people’s brains while listening to a story.
2. What questions were the researchers asking?
The researchers set out to find answers to a few simple questions:
- Do groups of artificial neurons in LLMs act like teams in the human brain (called functional brain networks) that handle specific tasks?
- Can we link specific groups of LLM neurons to specific brain networks?
- As LLMs get more advanced (from BERT to Llama 1–3), do they become more “brain-like” in how they organize their functions?
- Which brain networks are most involved when people listen to stories, and do similar patterns appear in LLMs reading the same story?
3. How did they study it? (Methods in everyday language)
Here’s the basic idea: The team showed the same story to both people and LLMs, then compared how each “responded” over time.
- People: Volunteers listened to an audio story inside an fMRI scanner. fMRI is like a camera that takes pictures of brain activity by measuring blood flow. Each tiny 3D unit the scanner sees is called a voxel (think of it as a 3D pixel).
- LLMs: The same story (as text) was fed into LLMs (BERT, Llama 1, Llama 2, and Llama 3). Inside these models are many “artificial neurons” that light up in different ways as they read the words.
To fairly compare the two:
- Timing: The brain scanner takes snapshots every few seconds, but words come faster. So the researchers averaged the model’s responses across the words that happened during each brain snapshot. They also adjusted for the delay in blood flow (a standard step called the hemodynamic response).
- Too many neurons? Summarize them: LLMs have thousands of artificial neurons, which is a lot to handle. So the researchers used a technique called “dictionary learning.” You can think of this like finding a small playlist of common “rhythms” or patterns that can be mixed to recreate many different songs. Here:
- Each “atom” in the dictionary is one typical time-pattern.
- “Sparse” means the model tries to explain each neuron’s behavior using only a few of these patterns, not all of them at once.
- Linking to the brain: They then asked, “Can these same patterns from the LLMs predict what the brain’s voxels are doing?” They built simple predictive models to see which patterns best matched activity in different brain areas. This produced “brain maps” showing where each pattern fits in the brain.
- Naming the brain networks: A tool matched each brain map to known brain networks (like language, visual, attention, and default mode networks). That way, the team could say which LLM patterns corresponded to which brain networks.
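The timing step above (averaging model features within each brain snapshot, then accounting for the sluggish blood-flow response) can be sketched in Python with NumPy. This is a minimal illustration, not the authors' code: the word timestamps, TR length, feature sizes, and the gamma-shaped HRF here are all toy assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
tr = 1.5          # seconds per fMRI snapshot ("TR"), illustrative
n_trs = 100       # number of snapshots
n_feats = 8       # per-word model features (toy size)

# Toy word stream: one feature vector per word, with a timestamp in seconds.
word_times = np.sort(rng.uniform(0, n_trs * tr, size=400))
word_feats = rng.standard_normal((400, n_feats))

# 1. Average the features of all words falling inside each TR window.
bins = np.floor(word_times / tr).astype(int)
per_tr = np.zeros((n_trs, n_feats))
for t in range(n_trs):
    in_tr = word_feats[bins == t]
    if len(in_tr):
        per_tr[t] = in_tr.mean(axis=0)

# 2. Convolve each feature's time course with a simple hemodynamic
#    response function (HRF) to model the delayed blood-flow signal.
hrf_t = np.arange(0, 20, tr)
hrf = (hrf_t / 5.0) ** 2 * np.exp(-hrf_t / 2.5)   # crude gamma-shaped HRF, peaks ~5 s
hrf /= hrf.sum()
aligned = np.apply_along_axis(lambda x: np.convolve(x, hrf)[:n_trs], 0, per_tr)

print(aligned.shape)
```

The result is one feature matrix sampled at the scanner's rate, directly comparable to the fMRI time series.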
Key details:
- Data: 59 people listening to one story (“Shapes”) from the Narratives fMRI dataset.
- Models: BERT and Llama 1–3.
- Patterns: They summarized LLM neuron activity into 64 common time-patterns.
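The core pipeline in the bullets above (summarize neuron time courses with dictionary learning, then use the patterns as fixed regressors in voxel-wise encoding models) can be sketched with scikit-learn. This is a hedged sketch, not the paper's implementation: the random toy data, the ridge penalty, and all sizes except the 64 atoms are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Toy stand-ins for the real recordings (shapes are illustrative):
n_trs = 200        # fMRI time points
n_neurons = 500    # artificial neurons from the LLM
n_voxels = 100     # brain voxels
an_responses = rng.standard_normal((n_neurons, n_trs))  # neuron activity per TR
bold = rng.standard_normal((n_trs, n_voxels))           # fMRI signal

# 1. Dictionary learning: explain each neuron's time course as a sparse
#    mix of a small set of shared temporal patterns ("atoms").
n_atoms = 64
dl = MiniBatchDictionaryLearning(n_components=n_atoms,
                                 transform_algorithm="lasso_lars",
                                 random_state=0)
codes = dl.fit_transform(an_responses)   # (n_neurons, n_atoms): sparse mixing weights
atoms = dl.components_                   # (n_atoms, n_trs): the temporal patterns

# 2. Voxel-wise encoding: use the temporal patterns as fixed regressors
#    to predict each voxel's activity; the fitted weights give one
#    "brain map" per atom.
X = atoms.T                              # (n_trs, n_atoms) design matrix
enc = Ridge(alpha=10.0).fit(X, bold)
weights = enc.coef_                      # (n_voxels, n_atoms)

print(codes.shape, atoms.shape, weights.shape)
```

The sparse codes group neurons by which patterns they use, while the encoding weights say where in the brain each pattern fits, which is what lets neuron sub-groups be matched to brain networks.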
4. What did they find?
Here are the main takeaways:
- The LLM patterns predicted brain activity well. Overall, the Llama models matched brain data better than BERT, which makes sense since Llama models are stronger LLMs.
- The brain areas that matched best included:
- Auditory cortex (hearing the story),
- Language areas,
- Visual areas,
- Attention networks,
- Frontoparietal and salience networks (which support cognitive control and flagging what's important),
- The default mode network (DMN), which helps connect new information to what you already know and think about big-picture meaning.
- Brain networks worked together and sometimes in opposition. Many brain maps showed multiple networks being active at once, or some turning up while others turned down. For example, sometimes visual areas increased while language areas decreased, or vice versa. This shows the brain—and the LLM patterns—coordinate across different teams, not just one at a time.
- As LLMs got more advanced, their organization looked more “brain-like.”
- In Llama 3, multiple patterns that were labeled with the same brain networks had more similar time-behavior, suggesting a cleaner, more consistent organization.
- Groups of artificial neurons tied to the same pattern were organized more consistently across the model’s layers and tended to appear more in deeper layers. This hints that deeper parts of the model may handle higher-level, more abstract processing, much as higher-order brain areas handle more complex meaning.
- Overall, Llama 3 seemed to strike a better balance: it kept useful diversity (different patterns for different needs) while also showing consistent specialization (reliable patterns for particular functions).
Why this matters scientifically:
- It supports the idea that strong AI models may develop internal structures that resemble how the brain organizes functions.
- It highlights the role of both specialized (language, auditory) and general-purpose (attention, control, DMN) brain networks during story understanding—something prior neuroscience has also shown.
5. Why does this matter? (Implications and impact)
- For AI: This work suggests we can design and interpret AI using ideas from the brain. If better-performing LLMs show clearer brain-like organization, then aiming for brain-inspired structures could help us build smarter, more understandable AI.
- For understanding models: Linking LLM neuron groups to specific brain networks makes these models less “black-box.” It gives clues about which parts do what, and how different parts work together.
- For neuroscience: If LLMs mirror brain network patterns during language, they could be used as tools to explore human cognition—especially how multiple brain networks cooperate and compete during complex tasks like story understanding.
- Looking ahead: The authors note some limits—they only used one story session, and they fixed the number of patterns (64) for all models. Testing more data and tuning the number of patterns per model could reveal even clearer connections. Applying this approach to vision or audio models could show whether brain-like organization is a general property of today’s advanced AI.
In short: As LLMs get smarter, their inner workings seem to organize more like the human brain—specialized yet coordinated—offering a path toward AI that learns and thinks in more human-like ways.