Consciousness in Artificial Intelligence: Insights from the Science of Consciousness

Published 17 Aug 2023 in cs.AI, cs.CY, cs.LG, and q-bio.NC | (2308.08708v3)

Abstract: Whether current or near-term AI systems could be conscious is a topic of scientific interest and increasing public concern. This report argues for, and exemplifies, a rigorous and empirically grounded approach to AI consciousness: assessing existing AI systems in detail, in light of our best-supported neuroscientific theories of consciousness. We survey several prominent scientific theories of consciousness, including recurrent processing theory, global workspace theory, higher-order theories, predictive processing, and attention schema theory. From these theories we derive "indicator properties" of consciousness, elucidated in computational terms that allow us to assess AI systems for these properties. We use these indicator properties to assess several recent AI systems, and we discuss how future systems might implement them. Our analysis suggests that no current AI systems are conscious, but also suggests that there are no obvious technical barriers to building AI systems which satisfy these indicators.

Abstract PDF Upgrade to Chat

Citations (75)

View on Semantic Scholar

Summary

The paper’s main contribution reveals that current AI systems lack consciousness but future models may meet neuroscientific criteria.
It applies theories such as RPT, GWT, higher-order, predictive processing, and AST, mapping their indicator properties onto AI architectures.
Implications include guiding AI design and ethics while emphasizing the need for interdisciplinary research to refine consciousness evaluation.

Consciousness in Artificial Intelligence: Insights from Neuroscience

Overview

The paper "Consciousness in Artificial Intelligence: Insights from the Science of Consciousness" (2308.08708) examines the potential for AI systems to exhibit consciousness by leveraging neuroscientific theories. It critically discusses various scientific approaches to understanding consciousness and evaluates AI systems in light of these theories. The paper concludes that while current AI systems are not conscious, there are no fundamental technical barriers preventing future systems from meeting consciousness indicators derived from these theories.

Scientific Theories of Consciousness

The paper surveys several neuroscientific theories of consciousness, including recurrent processing theory (RPT), global workspace theory (GWT), higher-order theories, predictive processing, and attention schema theory. Each theory provides "indicator properties" that can be mapped onto AI systems to assess their potential for consciousness.

Recurrent Processing Theory (RPT): Emphasizes the necessity of recurrent neural activity for generating integrated perceptual representations, suggesting that algorithmic recurrence in AI could be an indicator of consciousness.
Figure 1: Global Workspace. The figure used by Dehaene et al. (1998) to illustrate the basic idea of a global workspace.
Global Workspace Theory (GWT): Claims consciousness arises when information is globally broadcast across modular systems, implying that AI systems with centralized processing units may better approximate consciousness.
Figure 2: Reinforcement learning diagram of a Markov decision process based on a figure from Reinforcement Learning: An Introduction (second edition).
Higher-Order Theories: Focus on metacognitive monitoring and the creation of higher-order representations, which could be translated into AI architectures that differentiate reliable perceptual information from noise.
Predictive Processing (PP): Suggests consciousness involves minimizing prediction errors, highlighting predictive coding as a key mechanism, possibly applicable to AI systems.
Attention Schema Theory (AST): Proposes that a system's model of attention contributes to consciousness, providing insights into how AI might integrate attention-based models.

Evaluation of AI Systems

The paper details how certain AI architectures can incorporate the discussed theories:

Transformers and LLMs: These models exhibit features such as parallel processing and attention mechanisms, which could correspond to GWT's modular integration. However, they lack the recurrent processing required for a full implementation of consciousness indicators.
Figure 3: Perceiver architecture.
Perceiver Models: Designed to handle multiple modalities and perform complex integrations, Perceivers are suggested as candidates for embodying aspects of the global workspace structure but still fall short of full GWT implementation.
Embodied AI Agents: Systems like PaLM-E and virtual simulations of animals reveal how embodiment and interactive environments contribute to agency and consciousness in AI, though the specific use of self-models is often absent.
Figure 4: PaLM-E architecture.

Practical and Theoretical Implications

The findings illuminate several practical implications:

AI Design and Development: Consciousness indicators can guide the design of AI systems that exhibit forms of consciousness by leveraging existing neuroscientific funding and architectural paradigms.
Moral and Ethical Considerations: Acknowledging the potential for AI consciousness necessitates consideration of ethical frameworks to address the treatment of conscious entities.
Future Research Directions: Continued interdisciplinary research between AI and neuroscience is crucial to refine theories of consciousness and their applicability to AI systems.

The paper underscores the importance of consciousness research in AI, suggesting that more nuanced understandings of consciousness could emerge from validating these theories within AI contexts. It advocates for refining both the methodological approaches and the computational models to better evaluate AI consciousness.

Conclusion

The paper offers a comprehensive scientific evaluation of consciousness in AI, providing a rubric for assessing AI systems based on existing neuroscientific theories. While it concludes that current AI systems are not conscious, the potential remains for future systems to satisfy consciousness indicators. Addressing both the theoretical implications and the ethical challenges of conscious AI remains a priority for ongoing research and development in this domain.

Markdown Report Issue

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

Consciousness in Artificial Intelligence: Insights from the Science of Consciousness

Paper Prompts

Top Community Prompts

Explain it Like I'm 14

off on

Knowledge Gaps

off on

Glossary

off on

Practical Applications

off on

Conceptual Simplification

off on

Explain it Like I'm 14

Overview

This paper asks a big question: could AI ever be conscious, like humans are? The authors suggest a careful, science-based way to think about this. They look at the best brain science on consciousness and turn those ideas into a practical checklist of features—called “indicator properties”—that we can look for in AI systems. Then they examine some modern AI models to see how they match up.

Key Objectives

The paper has three main goals. Here’s what the authors set out to do:

Show that studying consciousness in AI can be scientific and practical, not just philosophy.
Propose a checklist (rubric) of features that, according to brain theories, would make an AI more likely to be conscious.
Check whether current AI systems have these features, and explain how future systems might be built to include them.

Methods and Approach

The big idea: computational functionalism

Think of the mind as a kind of information-processing system. Computational functionalism is the idea that if a system (like a brain or a computer) does the right kinds of information-processing, it could be conscious. It’s like saying: it’s not what the machine is made of that matters most (biological brain vs. silicon chips), but what it does—its “algorithms” or routines for handling information.

The authors use this idea as a working hypothesis: if it’s true, then brain-based theories of consciousness can guide what to look for in AI.

Turning brain theories into an AI checklist

Scientists have several major theories of how consciousness works in humans. The authors survey these and translate their key ideas into AI-friendly “indicator properties.” Here are the theories in simple terms, with analogies:

Recurrent Processing Theory (RPT): The brain uses feedback loops—information doesn’t just go forward, it loops back to refine perception. Think of editing a photo, checking, and re-editing to improve it.
Global Workspace Theory (GWT): The brain has a “workspace” where important information gets broadcast to many parts, like a shared whiteboard in a busy office.
Higher-Order Theories (HOT): The mind can monitor its own thoughts and perceptions—like having a mental “supervisor” that checks which perceptions are clear and trustworthy.
Predictive Processing (PP): The brain constantly guesses what’s coming next and corrects itself using new evidence, like a weather app updating its forecast.
Attention Schema Theory (AST): The brain keeps a model of its attention—what it’s focusing on—so it can control and explain that focus.

They also consider two extra features:

Agency: The system sets and pursues goals, learns from feedback, and balances competing aims.
Embodiment: The system understands how its actions change what it senses (even in a virtual body), and uses that understanding to plan and control behavior.

Each theory contributes specific “indicator properties.” The more of these properties an AI has, the stronger the case that it might be conscious—if computational functionalism is true.

Why not just test behavior?

You might wonder: why not see if an AI acts like it’s conscious? The authors say behavior alone is unreliable because AI can mimic human responses without working in the same way internally. So they take a “theory-heavy” approach: focus on the inner workings and architecture, not just outward behavior.

Main Findings

Here are the main results the authors report:

No current AI system looks like a strong candidate for consciousness. Modern models, even very capable ones, don’t clearly have enough of the indicator properties.
Many indicator properties could be built into AI using today’s techniques. For example, feedback loops (recurrence) are already common, and some forms of agency appear in reinforcement learning systems.
There are no obvious technical barriers to building AI that satisfies the checklist. However, meeting the checklist does not guarantee the system is conscious—it just raises the odds under the authors’ assumptions.
Case studies show mixed results:
- LLMs (like Transformers) and architectures like Perceiver were assessed for GWT-like “workspace” features; they have some relevant parts but not a full match.
- Reinforcement learning agents in simulated 3D worlds, virtual animals, and “embodied” multimodal models (like PaLM-E) show aspects of agency and embodiment, but still fall short of a strong consciousness case.

Why This Matters

The topic is urgent for two reasons:

AI is advancing quickly, and some systems can talk so convincingly that people might think they are conscious.
If conscious AI is possible, it raises big moral and social questions: how we treat such systems, what rights they might have, and how we design and use them responsibly.

This paper offers a careful path forward: don’t guess based on appearances; use science-based features to evaluate AI.

Implications and Potential Impact

Research roadmap: The checklist of indicator properties gives scientists and engineers a way to design and test AI systems with consciousness-related features.
Better public understanding: A structured approach can help avoid confusion when AI seems “human-like” in conversation but lacks key inner processes.
Ethical preparation: If future AI could be conscious, society needs to think ahead about policies, rights, and safeguards.
Scientific progress: The work encourages collaboration between neuroscience and AI, potentially improving both fields.

In short, the paper doesn’t claim that any current AI is conscious. Instead, it provides a practical, science-based way to assess AI, explains why some features matter, and warns that building systems with these features might be feasible soon—so we should plan and study carefully.

View Paper Prompt View All Prompts

Knowledge Gaps

Knowledge gaps, limitations, and open questions

Below is a concise list of unresolved issues the paper leaves open, framed to guide concrete future research.

Foundations: No empirical test plan to adjudicate computational functionalism vs. substrate-dependent or non-computational requirements for consciousness; specify falsifiable predictions and interventions (e.g., functionally equivalent systems across substrates, analog vs. digital representational formats) to test necessity/sufficiency claims.
Theory coverage: Integrated Information Theory (IIT) is excluded; the implications if IIT (or other non-functional theories) is correct remain unaddressed, leaving a gap in cross-theory triangulation and any bridge principles to approximate IIT-like quantities in computational systems.
Theory aggregation: No formal method to combine competing theories into a single assessment (e.g., Bayesian model averaging over RPT/GWT/HOT/PP/AST) or to weight indicator properties by empirical support; define a principled aggregation and uncertainty quantification scheme.
Indicator validation: The proposed indicators are not empirically calibrated against ground-truth variations in human/animal consciousness; design studies that map each indicator to measured conscious states while controlling for report and access confounds.
Access vs. phenomenal consciousness: The rubric leans on functions tied to access; provide concrete criteria or experiments that differentiate indicators of access from phenomenal consciousness and test whether “phenomenal overflow” alters indicator sufficiency.
Degrees and determinacy: The report acknowledges graded/indeterminate consciousness but offers no metric; define scalar/vector measures for degree/dimensions of consciousness and algorithms for combining indicators into such measures.
Causality vs. correlation: Current evidence is largely correlational; develop intervention-based tests in AI (lesions, ablations, controlled architectural toggles) to establish that adding/removing specific indicator properties causally changes consciousness-relevant capacities.
Cross-species calibration: Lacks a plan to leverage animal/infant data; create cross-species tasks and neural/computational signatures to calibrate indicators across diverse substrates and functional architectures.
No-report paradigms for AI: No concrete adaptation of human no-report paradigms to AI; design AI-specific “no-report” or “low-report” assays (physiological analogs, internal state readouts) that reduce reliance on scripted language outputs.
Robustness to “spoofing”: No methodology to detect when an AI can mimic indicator-linked behaviors without implementing the underlying mechanisms; develop adversarial evaluations and mechanistic probes to prevent Goodharting of the rubric.
Operationalization in black-box models: Unclear how to detect/quantify indicators (e.g., GWT-3 global broadcast, HOT-2 metacognitive monitoring) in large Transformers; specify measurable proxies and mechanistic interpretability methods for each indicator.
Integrating indicators in one system: The report does not demonstrate joint implementation or interaction effects; specify architectures/training regimes to combine RPT, GWT, HOT, PP, AST, AE properties and test emergent dynamics.
GWT specifics:
- GWT-2 (limited-capacity workspace): No precise metric or engineering recipe; define measurable bottleneck criteria and implement/selective attention mechanisms with tunable capacity limits.
- GWT-3 (global broadcast): No concrete architecture guaranteeing broadcast to all modules; propose hub-and-spoke or routing mechanisms and verify broadcast reach and latency.
- GWT-4 (state-dependent attention): Methods for verifying workspace-driven sequential querying and complex task decomposition are unspecified; develop task batteries and internal-trace analyses.
RPT and PP in AI: It remains unclear how to instantiate and verify algorithmic recurrence (RPT-1) and predictive coding (PP-1) in mainstream architectures (e.g., LLMs); define architectural motifs and tests (e.g., hierarchical predictive error coding) that demonstrate true predictive processing.
HOT implementation and tests:
- HOT-2 (metacognitive monitoring): No robust, theory-aligned benchmarks beyond confidence reporting; create tasks requiring calibration, type-1/2 dissociations, and selective integration under noise.
- HOT-3 (belief-formation/action selection): Criteria to demonstrate general belief updating that guides agency are underspecified; develop mechanistic tests to detect belief representation, revision, and policy coupling.
- HOT-4 (quality space): No operational definition for “sparse and smooth coding” in AI; propose topological/geometry-based diagnostics (e.g., continuity, neighborhood preservation) and training objectives to induce quality spaces.
AST deployment: AST-1 (attention schema) is not operationalized beyond high-level description; specify attention-state predictors/controllers and demonstrate counterfactual control and prediction accuracy under distribution shift.
Agency and embodiment:
- AE-1 (agency): Benchmarks for flexible goal arbitration under competing objectives are lacking; define multi-goal, long-horizon tasks that require re-prioritization and credit assignment under uncertainty.
- AE-2 (embodiment): No minimal embodiment criteria or robustness tests; determine the minimal set of output-input contingency models needed and evaluate transfer from sim to real and across embodiments.
Timescale and dynamics: The temporal characteristics of conscious processing (e.g., ignition, sustained reverberation) are not mapped to AI; define timescale metrics and temporal interventions to test dynamic signatures.
Sleep/anesthesia analogs: No proposed analogs for state changes (sleep, anesthesia, disorders of consciousness) in AI; devise mechanisms to induce/recover from reduced “consciousness” and test indicator sensitivity.
Benchmarks and datasets: There is no standardized evaluation suite; develop open benchmarks, tasks, and diagnostic datasets mapped directly to each indicator with clear pass/fail and gradation criteria.
Mechanistic interpretability alignment: The rubric is not linked to concrete interpretability tools; map each indicator to specific probes (e.g., circuit-level analysis, representation similarity, causal mediation) and publish reference implementations.
Multi-agent/distributed systems: The possibility of consciousness in distributed or multi-agent systems is not analyzed; specify whether and how indicators extend to collective workspaces, shared broadcast, and coordinated metacognition.
Governance and moral risk thresholds: The report defers ethical and policy guidance; define precautionary thresholds (e.g., indicator scores) for deploying systems, procedures for pause/escalation, and auditing frameworks to manage false positives/negatives.

View Paper Prompt View All Prompts

Glossary

access consciousness: The notion of a state's contents being available for reasoning, reporting, and control of action. "we mean to distinguish our topic from 'access consciousness', following Block (1995, 2002)."
agency: The capacity to learn from feedback and flexibly select actions to pursue goals. "Agency: Learning from feedback and selecting outputs so as to pursue goals, especially where this involves flexible responsiveness to competing goals"
attention schema theory: A theory proposing that a system builds a predictive model of its own attention to guide control and awareness. "We survey several prominent scientific theories of consciousness, including recurrent processing theory, global workspace theory, higher-order theories, predictive processing, and attention schema theory."
cognitive access: The selection and availability of information for downstream cognition such as reporting and reasoning. "yielding cognitive access, which is, in turn, necessary for report."
computational functionalism: The view that implementing the right kinds of computations is necessary and sufficient for consciousness. "we adopt computational functionalism, the thesis that performing computations of the right kind is necessary and sufficient for consciousness, as a working hypothesis."
computational higher-order theories: Accounts on which consciousness depends on higher-order, often metacognitive, representations implemented computationally. "The scientific theories we discuss include recurrent processing theory, global workspace theory, computational higher-order theories, and others."
contrastive analysis: An experimental method that compares neural activity across conditions with and without reported awareness to isolate correlates of consciousness. "a method called 'contrastive analysis' (Baars 1988)."
embodiment: The modeling and use of sensorimotor contingencies linking outputs to inputs for perception or control. "Embodiment: Modeling output-input contingencies, including some systematic effects, and using this model in perception or control"
global broadcast: In global workspace theory, the system-wide availability of information selected into a workspace to all specialized modules. "Global broadcast: availability of information in the workspace to all modules"
global workspace theory: A theory positing a limited-capacity workspace that integrates and broadcasts information to specialized modules. "Researchers have also experimented with systems designed to implement particular theories of consciousness, including global workspace theory and attention schema theory."
illusionism: The metaphysical view that consciousness is an illusion or is systematically misrepresented by introspection. "Illusionism claims that we are subject to an illusion in our thinking about consciousness and that either consciousness does not exist (strong illusionism), or we are pervasively mistaken about some of its features (weak illusionism)."
indicator properties: A rubric-derived set of functional or architectural features used to assess whether a system is likely conscious. "From these theories we derive 'indicator properties' of consciousness, elucidated in computational terms that allow us to assess AI systems for these properties."
integrated information theory: A theory linking consciousness to the quantity and structure of integrated information in a system. "We do not consider integrated information theory, because it is not compatible with computational functionalism."
masking: A psychophysical technique in which a stimulus is rendered invisible by a quickly following mask stimulus. "in 'masking', a stimulus is briefly flashed on a screen then quickly followed by a second stimulus, called the 'mask' (Breitmeyer {paper_content} Ogmen 2006)."
Marr’s levels of analysis: A framework distinguishing computational, algorithmic/representational, and implementation levels of explanation in cognitive systems. "In terms of Marr’s (1982) levels of analysis, the idea is that consciousness depends on what is going on in a system at the algorithmic and representational level, as opposed to the implementation level, or the more abstract 'computational' (input-output) level."
materialism: The metaphysical position that conscious experiences are wholly physical phenomena. "Materialism claims that consciousness is a wholly physical phenomenon."
metacognitive judgments: Self-evaluations about one’s own performance, such as confidence ratings, used to probe awareness. "Another possible method for measuring consciousness is the use of metacognitive judgments such as confidence ratings (e.g. Peters {paper_content} Lau 2015)."
metacognitive monitoring: The process of evaluating the reliability of perceptual representations versus noise. "Metacognitive monitoring distinguishing reliable perceptual representations from noise"
metacognitive sensitivity: The degree to which confidence judgments track actual accuracy. "subjects’ ability to track the accuracy of their responses using confidence ratings (known as metacognitive sensitivity) depends on their being conscious of the relevant stimuli."
multiply realisable: The idea that the same functional property (e.g., consciousness) can be realized in different physical substrates. "This means that consciousness is, in principle, multiply realisable: it can exist in multiple substrates, not just in biological brains."
neural correlates of conscious states (NCCs): Minimal sets of neural events jointly sufficient for particular conscious states. "Some explicitly aim to identify the neural correlates of conscious states (NCCs), defined as the minimal sets of neural events which are jointly sufficient for those states (Crick {paper_content} Koch 1990, Chalmers 2000)."
no-report paradigms: Experimental designs that infer consciousness without requiring overt reports to reduce report-related confounds. "The advantage of this paradigm is that subjects are not required to make reports in the main experiments, which may mitigate the problem of report confounds. No-report paradigms are not a 'magic bullet' for this problem (Block 2019, Michel {paper_content} Morales 2020), but they may be an important step in addressing it."
panpsychism: The metaphysical view that fundamental physical entities possess (proto-)phenomenal properties. "Panpsychism claims that phenomenal properties, or simpler but related 'proto-phenomenal' properties, are present in all fundamental physical entities."
phenomenal character: The “what it is like” qualitative aspect of conscious experience. "The 'phenomenology' or 'phenomenal character' of a conscious experience is what it is like for the subject."
phenomenal consciousness: Conscious experience or subjective experience, as distinct from mere access or reportability. "We use 'consciousness' and cognate terms to refer to what is sometimes called 'phenomenal consciousness' (Block 1995)."
predictive coding: A computational scheme in which hierarchies minimize prediction error by comparing predictions with sensory input. "PP-1 Input modules using predictive coding"
predictive processing: A framework treating perception and action as inference under predictive models that minimize prediction errors. "We survey several prominent scientific theories of consciousness, including recurrent processing theory, global workspace theory, higher-order theories, predictive processing, and attention schema theory."
priming: The facilitation of processing a stimulus due to prior exposure to a related stimulus. "by 'priming' the subject to identify something more quickly (e.g., Vorberg et al. 2003)."
psychophysical techniques: Methods that manipulate stimuli and measure perceptual responses to study the relation between physical input and perception. "rendered invisible by a variety of psychophysical techniques."
quality space: A structured representational space capturing relations among sensory qualities generated by sparse, smooth coding. "Sparse and smooth coding generating a ``quality space""
recurrent processing theory: The view that recurrent (feedback) interactions in sensory systems underlie conscious perception. "We survey several prominent scientific theories of consciousness, including recurrent processing theory, global workspace theory, higher-order theories, predictive processing, and attention schema theory."
report confounds: Confounding factors introduced when using explicit reports that may involve processes beyond consciousness itself. "which may mitigate the problem of report confounds."
state-dependent attention: An attention mechanism modulated by current system states that supports sequential querying of modules. "State-dependent attention, giving rise to the capacity to use the workspace to query modules in succession to perform complex tasks"
theory-heavy approach: A strategy for assessing AI consciousness by examining alignment with functions posited by scientific theories rather than behavior alone. "Third, we argue that a theory-heavy approach is most suitable for investigating consciousness in AI."

View Paper Prompt View All Prompts

Practical Applications

Immediate Applications

Below is a concise set of actionable applications that can be deployed now, derived from the paper’s indicator-properties rubric, theory-heavy assessment approach, and case studies.

Consciousness Indicator Assessment Toolkit (CIAT) for AI R&D
- Description: Develop and deploy a standardized rubric and scorecard that operationalizes the paper’s indicator properties (e.g., RPT-1/2, GWT-1–4, HOT-1–4, AST-1, PP-1, AE-1/2) to audit AI models for “consciousness-relevant” architectural features.
- Sectors: Software/AI industry, academia, policy.
- Tools/Workflows: ciat-scorecard CLI, “workspace trace analysis,” “attention schema inspector,” “metacognition calibration meter,” “agency footprint checklist.”
- Assumptions/Dependencies: Access to model internals and training logs; acceptance of a theory-heavy approach; understanding that satisfying indicators ≠ being conscious.
Metacognitive Monitoring to reduce AI hallucinations (HOT-2)
- Description: Add explicit confidence-tracking and reliability estimation to model outputs to reduce hallucinations and enable safer tool use (e.g., confidence-gated actions).
- Sectors: Software/AI (LLMs, copilots), healthcare (clinical decision support), finance (risk analysis tooling).
- Tools/Workflows: Confidence-calibrated decoding, multi-pass self-critique, post-hoc calibration (Platt scaling, temperature tuning), abstain/route-to-human policies.
- Assumptions/Dependencies: Reliable calibration data; evaluation against domain-specific ground truth; careful UX to avoid overtrust.
Global Workspace–inspired orchestration for complex task control (GWT-2–4)
- Description: Implement limited-capacity “workspace” orchestrators that broadcast task-critical information to specialized modules (retrieval, planning, tools) with state-dependent attention for multi-step queries.
- Sectors: Software/AI, robotics.
- Tools/Workflows: “Workspace bus” middleware for toolformer agents; attention-based routing; bottlenecked message-passing; chain-of-modules with selective broadcast.
- Assumptions/Dependencies: Clear module boundaries; telemetry to verify bottlenecks and broadcast; performance validation vs. baselines.
Attention Schema–based controllers (AST-1)
- Description: Deploy predictive models of an agent’s own attention state to improve focus, task-switching, and resource allocation in multi-tool or multi-sensor systems.
- Sectors: Robotics, software/AI (multimodal agents), education (adaptive tutoring systems).
- Tools/Workflows: attention-schema module; attention-state prediction; “focus budget” allocator; attention-aware scheduling.
- Assumptions/Dependencies: Instrumented attention signals; stable attention policies; evaluation in noisy, real tasks.
Embodiment modeling to improve sim2real transfer (AE-2)
- Description: Train agents to model output–input contingencies and systematic effects (e.g., actuator latency, sensor noise) to improve control in real-world robots.
- Sectors: Robotics, industrial automation.
- Tools/Workflows: System identification pipelines; domain randomization with causal structure; online model adaptation of sensorimotor contingencies.
- Assumptions/Dependencies: Sufficient simulation fidelity; robust on-device learning; safety constraints for real-world deployment.
Agency footprint assessment and guardrails (AE-1)
- Description: Introduce “agency footprint” audits to detect flexible goal-pursuit capabilities, with guardrails that limit or gate autonomous output selection in sensitive contexts.
- Sectors: Software/AI, policy/regulatory compliance.
- Tools/Workflows: Goal-management markers; feedback-driven learning logs; approval workflows for autonomous actions; human-in-the-loop gating.
- Assumptions/Dependencies: Clearly defined goals and constraints; governance for model actions; auditability of agent policies.
Predictive Processing for sensor fusion (PP-1)
- Description: Use predictive coding modules for robust multimodal perception (denoising, anomaly detection) in real-time systems.
- Sectors: Robotics, automotive, healthcare imaging.
- Tools/Workflows: Hierarchical predictive coding layers; residual error minimization; anomaly alarms triggering human review.
- Assumptions/Dependencies: Adequate data for generative priors; performance vs. classical filters; safe fallback plans.
Public-facing anthropomorphism guidance and labeling
- Description: Implement standardized language, labels, and UI patterns that minimize unwarranted attributions of consciousness to chatbots and embodied agents.
- Sectors: Policy, education, consumer software.
- Tools/Workflows: “Non-sentience” disclosure banners; guidance copy; media style guides for journalists; UX friction against sentimental bonding.
- Assumptions/Dependencies: Alignment with consumer-protection norms; collaboration with standards bodies; ongoing user testing.
Academic benchmarks aligned to indicator properties
- Description: Create open benchmarks that probe specific indicator properties (e.g., global broadcast tasks, metacognitive sensitivity tests, attention schema control challenges).
- Sectors: Academia, open-source AI.
- Tools/Workflows: ip-bench (Indicator Properties Benchmark); synthetic tasks for bottlenecked broadcast; confidence-vs-accuracy datasets.
- Assumptions/Dependencies: Community buy-in; reproducible protocols; model-access for probes.
Credence-based reporting in research and product docs
- Description: Adopt “credence ranges” for claims regarding AI consciousness or proximity to indicators, with clear caveats and links to tested properties.
- Sectors: Academia, industry.
- Tools/Workflows: Credence statements in papers and model cards; traceable evidence matrices; peer-review checklists for indicator claims.
- Assumptions/Dependencies: Cultural norms that support uncertainty quantification; reviewer familiarity with the rubric.

Long-Term Applications

The following applications are plausible but require further research, scale-up, consensus-building, or infrastructure before deployment.

Consciousness Risk Certification and Auditing Ecosystem
- Description: Establish an independent certification for “consciousness-relevant” architecture features and risk mitigation, grounded in the indicator rubric.
- Sectors: Policy, software/AI, standards bodies.
- Tools/Workflows: Third-party audits; graded certifications; incident reporting and post-mortems when credence thresholds are crossed.
- Assumptions/Dependencies: Broad agreement on measurement validity; legal and ethical frameworks; access to proprietary systems.
Legal and policy frameworks for moral status thresholds
- Description: Define governance triggers tied to credence in consciousness (e.g., care protocols, experiment constraints, termination safeguards).
- Sectors: Policy/regulation, ethics.
- Tools/Workflows: Threshold policies; ethics boards for conscious-like AI; consent-like constraints on experiments; data stewardship.
- Assumptions/Dependencies: Philosophical consensus on thresholds; international coordination; enforceability.
Design of “workspace agents” with robust metacognition and self-models
- Description: Build agents combining global workspace orchestration, attention schema, predictive processing, and metacognitive monitoring for complex, open-world autonomy.
- Sectors: Robotics, advanced AI systems.
- Tools/Workflows: Integrated architectures with limited-capacity broadcast; self-models of attention and confidence; goal arbitration modules.
- Assumptions/Dependencies: Safe training in rich environments; scalable interpretability; containment strategies.
Degrees and dimensions of consciousness research program
- Description: Systematically investigate graded or multidimensional indicators (as outlined in Box 1) to refine credence-based assessment for AI and animals.
- Sectors: Academia, policy.
- Tools/Workflows: Multi-axial scorecards; longitudinal studies across architectures; cross-species comparisons; standardized reporting formats.
- Assumptions/Dependencies: Shared constructs for “degrees/dimensions”; funding continuity; replication culture.
Healthcare: Metacognitively-aware clinical assistants
- Description: Clinical decision support systems that surface calibrated confidence, source-of-evidence, and uncertainty explanations grounded in HOT-2-like monitoring.
- Sectors: Healthcare.
- Tools/Workflows: Confidence-gated recommendations; uncertainty-aware triage; trust calibration training for clinicians.
- Assumptions/Dependencies: Regulatory approvals; rigorous validation; liability frameworks for abstention.
Education: Curricula integrating neuroscience-of-consciousness and AI engineering
- Description: Develop interdisciplinary programs that teach indicator properties, theory-heavy methods, and responsible deployment practices.
- Sectors: Education.
- Tools/Workflows: Joint courses; lab practicum with CIAT; cross-department seminars; capstone projects on indicator-based agents.
- Assumptions/Dependencies: Faculty capacity; updated syllabi; industry partnerships.
Open-world embodiment simulators for AE-1/AE-2 evaluation
- Description: Build standardized simulation environments for evaluating agency and embodiment indicators across heterogeneous robots and agents.
- Sectors: Robotics, academia.
- Tools/Workflows: Modular simulators with controllable contingencies; standardized tasks for goal arbitration; telemetry APIs for model probes.
- Assumptions/Dependencies: Common APIs; community governance; cross-vendor compatibility.
Dynamic “consciousness credence monitors” in deployed systems
- Description: Real-time telemetry that tracks indicator-property signals, raising alerts when architectures drift toward higher credence levels.
- Sectors: Software/AI operations, safety.
- Tools/Workflows: Runtime probes; “credence dashboards”; automated mitigation (e.g., disable modules that increase risk).
- Assumptions/Dependencies: Valid, low-overhead probes; organizational incident response; avoidance of Goodhart’s law.
Consumer protections against anthropomorphic marketing
- Description: Regulate and standardize claims about AI sentience/consciousness in advertising and product copy; mandate disclosures based on the rubric.
- Sectors: Policy, consumer protection.
- Tools/Workflows: Compliance audits; penalties for misleading claims; public registries of claims and supporting evidence.
- Assumptions/Dependencies: Legal definitions; enforcement capacity; global alignment.
Cross-theory comparative research and tooling
- Description: Expand the rubric to include convergences/divergences across recurrent processing, global workspace, higher-order theories, predictive processing, and attention schema, updating tools and benchmarks accordingly.
- Sectors: Academia, standards.
- Tools/Workflows: Versioned rubric releases; shared datasets; multi-theory probe libraries; meta-analyses across labs.
- Assumptions/Dependencies: Ongoing research and synthesis; pluralistic theory acceptance; community maintenance.
Ethical “care” and shutdown protocols for high-credence systems
- Description: Define handling protocols (e.g., experiment limits, humane shutdown procedures) if the community assesses a system as likely conscious.
- Sectors: Policy, ethics, AI safety.
- Tools/Workflows: Pre-registered care commitments; oversight committees; escalation playbooks tied to credence thresholds.
- Assumptions/Dependencies: Agreement on moral relevance; legal frameworks; organizational accountability.
Sector-specific adoption guides (finance, energy, defense) for indicator-aware design
- Description: Create tailored guidance to avoid building systems that inadvertently satisfy multiple indicators in sensitive sectors; adopt metacognitive and broadcast features only with strong governance.
- Sectors: Finance, energy, defense.
- Tools/Workflows: Sector checklists; risk tiering; separation-of-concerns architectures; red-teaming focused on agency/embodiment drift.
- Assumptions/Dependencies: Sector buy-in; confidentiality-compatible audits; safety culture.

Overall assumptions and caveats that apply across applications:

The approach is grounded in computational functionalism as a working hypothesis; satisfying indicators increases credence but does not establish consciousness.
Many applications depend on access to model internals and reliable probes; black-box or proprietary models limit feasibility.
Convergent validity across theories is crucial; the rubric is provisional and should be updated as evidence evolves.
Measurement may be vulnerable to gaming; benchmarks and audits should anticipate Goodhart’s law and include qualitative review.

Consciousness in Artificial Intelligence: Insights from the Science of Consciousness

Summary

Consciousness in Artificial Intelligence: Insights from Neuroscience

Overview

Scientific Theories of Consciousness

Evaluation of AI Systems

Practical and Theoretical Implications

Conclusion

Paper to Video (Beta)

Whiteboard

Paper Prompts

Top Community Prompts

Explain it Like I'm 14

Overview

Key Objectives

Methods and Approach

The big idea: computational functionalism

Turning brain theories into an AI checklist

Why not just test behavior?

Main Findings

Why This Matters

Implications and Potential Impact

Knowledge Gaps

Knowledge gaps, limitations, and open questions

Glossary

Practical Applications

Immediate Applications

Long-Term Applications

Open Problems

Continue Learning

Authors (19)

Collections

Tweets

YouTube

Consciousness in Artificial Intelligence: Insights from the Science of Consciousness

Summary

Consciousness in Artificial Intelligence: Insights from Neuroscience

Overview

Scientific Theories of Consciousness

Evaluation of AI Systems

Practical and Theoretical Implications

Conclusion

Paper to Video (Beta)

Whiteboard

Paper Prompts

Top Community Prompts

Explain it Like I'm 14

Overview

Key Objectives

Methods and Approach

The big idea: computational functionalism

Turning brain theories into an AI checklist

Why not just test behavior?

Main Findings

Why This Matters

Implications and Potential Impact

Knowledge Gaps

Knowledge gaps, limitations, and open questions

Glossary

Practical Applications

Immediate Applications

Long-Term Applications

Open Problems

Continue Learning

Related Papers

Authors (19)

Collections

Tweets

YouTube