FPCA: Field-Programmable Crossbar Array

Updated 18 February 2026
  • FPCA is a reconfigurable computing fabric that integrates emerging non-volatile memories to perform both digital and analog processing.
  • It employs a dense crossbar architecture enabling in-place configuration for logic synthesis, arithmetic, and associative search.
  • FPCA research focuses on optimizing energy efficiency, parallelism, and scalability through advanced device technologies and mapping methodologies.

A Field-Programmable Crossbar Array (FPCA) is a reconfigurable computing substrate composed of a dense, regular crossbar of programmable logic or memory devices, typically built from emerging device technologies such as RRAM/memristors, FeFETs, or quantum-dot cellular automata (QCA) cells. Unlike conventional von Neumann architectures, which strictly separate storage and compute, FPCAs exploit in-place configuration and multi-modal operation within a single array fabric to achieve high parallelism, low energy, and unified memory-compute functionality. These arrays are field-programmable: their connectivity and device states can be dynamically reconfigured in situ to realize arbitrary Boolean logic, in-memory arithmetic, neuromorphic processing, or associative search, enabling both digital and analog computation on a single platform (Zidan et al., 2016, Meihar et al., 2023).

1. Device Technologies and Physical Principles

The FPCA concept supports a range of device technologies, each leveraging different physical mechanisms to enable programmability:

  • Resistive RAM (RRAM) and ReRAM FPCAs: Each cell is a non-volatile two-terminal resistor with multiple programmable states (e.g., HRS/LRS or multi-level). Binary or multi-level operation is supported, and arrays are fabricated above CMOS logic for dense integration (Zidan et al., 2016, Vasileiadis et al., 5 Feb 2025).
  • Memristive Threshold Logic FPCAs: Arrays deploy programmable threshold logic gates (TLGs) at each crosspoint, using memristors as pairwise-tunable resistive elements to implement various Boolean functions, e.g., NAND, NOR, XNOR, with weights and thresholds field-programmed and latched in place (Krestinskaya et al., 2018).
  • Quantum-dot Cellular Automata (QCA) FPCAs: QCA cells utilize four quantum dots per cell, with two mobile electrons tunneling among sites to encode binary logic by charge polarization, driven by adiabatic clocking for signal propagation (Kalogeiton et al., 2016).
  • Ferroelectric MirrorBit (FeFET) FPCAs: MirrorBit-FeFET devices extend binary FeFETs to four polarization states by imposing a transverse polarization gradient, enabling dense 2-bit storage and diode-like behavior for crossbar operations in both memory and associative computing tasks (Meihar et al., 2023).
  • Silicon Nitride Memristor FPCAs: SOI integration permits multi-level memristors (12 resistance states) in crossbar configuration, suitable for memristor ratioed logic (MRL) and energy-efficient parallel logic evaluation (Vasileiadis et al., 5 Feb 2025).

Each technology comes with device-specific programming protocols (e.g., precise voltage pulses, field gradients), I–V characteristics, and endurance properties, and supports unique forms of logic and memory configurability.
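The threshold logic gates mentioned above can be sketched behaviorally: a single programmable cell realizes different Boolean functions purely by field-programming its weights and threshold. The weight and threshold values below are illustrative choices, not measured device parameters.

```python
# Behavioral sketch of a programmable threshold logic gate (TLG):
# output = 1 if the weighted input sum meets the threshold, else 0.
# In a memristive TLG, the weights are set by pairwise-tuned memristor
# conductances; here they are plain numbers for illustration.

def tlg(inputs, weights, threshold):
    """Evaluate a threshold logic gate on binary inputs."""
    return int(sum(w * x for w, x in zip(weights, inputs)) >= threshold)

# The same cell realizes different gates by reprogramming weights/threshold:
def nand2(a, b):
    return tlg([a, b], weights=[-1, -1], threshold=-1)

def nor2(a, b):
    return tlg([a, b], weights=[-1, -1], threshold=0)

for a in (0, 1):
    for b in (0, 1):
        assert nand2(a, b) == 1 - (a & b)
        assert nor2(a, b) == 1 - (a | b)
```

Note that XNOR is not linearly separable, so realizing it requires a small network of such gates rather than a single cell.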

2. Crossbar Architecture and Programmability

An FPCA employs a regular grid of horizontal (word line) and vertical (bit line) conductors, where each crosspoint hosts a programmable device or logic gate. Architectural variants span:

  • Pure crossbar (1R): Minimal selector elements; sneak-paths are managed by precise device engineering and crossbar segmentation (Vasileiadis et al., 5 Feb 2025).
  • 1S1R/CMOS-hybrid: Each cell paired with an access selector (e.g., transistor or diode) for precise read/write addressing and sneak current suppression (Eshraghian et al., 2021, Meihar et al., 2023).
  • Stacked (3D) crossbars: Multiple device planes with shared or isolated electrodes, enabling simultaneous multi-plane in-memory compute or pipelined read/write for enhanced throughput (Eshraghian et al., 2021).
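Why selector-less (1R) arrays need segmentation can be seen from a first-order sneak-path estimate. The sketch below uses the common textbook worst case (reading one HRS cell while every other cell is LRS, each sneak path crossing three LRS devices, wire resistance ignored); the resistance values are illustrative, not device data.

```python
# First-order sneak-path estimate for a selector-less (1R) N x M crossbar.
# Worst case: reading one HRS cell while every other cell is in LRS.
# Each sneak path traverses three LRS devices in series, and there are
# (N-1)*(M-1) such parallel paths. Wire resistance is ignored, so this
# is an idealized approximation, not a device-accurate model.

def read_resistance(r_cell, r_lrs, n, m):
    """Effective resistance seen at the selected crosspoint."""
    r_sneak = 3 * r_lrs / ((n - 1) * (m - 1))  # parallel 3-device paths
    return (r_cell * r_sneak) / (r_cell + r_sneak)

R_HRS, R_LRS = 1e6, 1e4  # illustrative resistances (ohms)
for size in (4, 16, 64, 256):
    r_eff = read_resistance(R_HRS, R_LRS, size, size)
    # Read margin collapses as the sneak network shunts the HRS cell,
    # which motivates array segmentation or access selectors.
    print(f"{size:>4} x {size:<4} effective read resistance: {r_eff:,.1f} ohms")
```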

Programmability encompasses:

  • Logic configuration: Crosspoints realize majority (QCA, ReRAM), threshold (memristive TLG), or ratioed logic gates (multi-level memristors), with functions specified through external programming lines or memory state initialization (Kalogeiton et al., 2016, Krestinskaya et al., 2018, Vasileiadis et al., 5 Feb 2025).
  • Arithmetic/analog in-memory compute: By mapping weights or logic functions as device conductances, feeding inputs as voltages, and accumulating output currents on columns, the crossbar natively performs vector-matrix multiplication, popcount, or analog accumulation (Zidan et al., 2016).
  • Associative/search mode: Some FPCAs support TCAM operations by leveraging programmable diode-like devices (e.g., MirrorBit in NOR configuration), providing field-programmable pattern matching or lookup (Meihar et al., 2023).
  • Dynamic reconfiguration: Field-programmable resources, such as selectors or control lines, allow on-the-fly partitioning into storage, computation, or analog-accumulate blocks (Zidan et al., 2016, Meihar et al., 2023).
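The analog in-memory compute mode above can be sketched numerically: weights are stored as conductances, inputs are applied as row voltages, and each column current accumulates a dot product via Ohm's law and Kirchhoff's current law. All values here are illustrative.

```python
import numpy as np

# Sketch of analog vector-matrix multiply on a crossbar: column current
# i_j = sum_i v_i * G_ij, i.e., the physics performs the dot product.

rng = np.random.default_rng(0)
G = rng.uniform(1e-6, 1e-4, size=(4, 3))   # device conductances (siemens)
v = rng.uniform(0.0, 0.2, size=4)          # input read voltages (volts)

i_col = v @ G   # column currents = vector-matrix product

# The same accumulation yields a popcount when inputs/devices are binary:
bits = np.array([1, 0, 1, 1])
g_on = np.where(bits == 1, 1.0, 0.0)       # ones stored as ON devices
popcount = int(np.ones(4) @ g_on)          # unit reads accumulate the count
assert popcount == bits.sum() == 3
```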

3. Programming Mechanisms and Mapping Methodologies

FPCA functional mapping requires algorithms that account for physical constraints—array size, granularity, device physics, and parallelism:

  • Logic Synthesis and Gate Mapping: QCA crossbars use majority and inverter crosspoints, mapping arbitrary Boolean networks by decomposing into AND/OR/NOT, assigning each gate to a crosspoint, and using program lines to set constant inputs. Timing (clock-phase assignment) is dynamically iterated to synchronize signal arrivals within clock-zone limits (Kalogeiton et al., 2016).
  • Technology Mapping for In-Memory Compute: In ReRAM FPCAs (e.g., ReVAMP), mapping flows translate AIG/MIG Boolean networks into LUT or majority-inverter forms, packed onto array words (rows), and produce instruction schedules (Read/Apply) that balance parallelism, area, and device utilization. Both area-focused and delay-optimized mappings are available, leveraging block packing and bin-fit strategies (Bhattacharjee et al., 2018).
  • Memristive TLG Programming: Two memristors and one control voltage per TLG cell determine logic function; once programmed, the cell functions statically without further memristor writes during normal operation (Krestinskaya et al., 2018).
  • State and Weight Encoding: Signed weights for neural networks are mapped to non-negative crossbar conductances via Adjacent Connection Matrix (ACM) encoding, creating a periphery matrix S and non-negative device matrix M such that S M = W. This method provides regularization, area/read energy reduction, and variation-robustness over standard double-element encodings (Kazemi et al., 2020).
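The constraint S M = W (signed periphery matrix times non-negative device matrix) can be illustrated with the standard double-element (differential-pair) encoding that the ACM method is contrasted against; ACM itself uses a different, adjacent-structured S, for which see Kazemi et al. (2020). The matrices below are illustrative.

```python
import numpy as np

# Signed weights cannot be stored directly as (non-negative) conductances.
# Any scheme of the form S @ M = W factors the signed matrix W into a small
# signed periphery matrix S and a non-negative device matrix M. The classic
# double-element encoding is the simplest instance:
#   S = [I | -I],  M = [G_plus; G_minus],  so  W = G_plus - G_minus.
# (ACM uses a different S with adjacent-connection structure.)

W = np.array([[0.5, -1.2],
              [-0.3, 0.8]])

G_plus = np.maximum(W, 0.0)    # positive parts -> one device per weight
G_minus = np.maximum(-W, 0.0)  # negative parts -> the paired device

S = np.hstack([np.eye(2), -np.eye(2)])   # periphery (sign) matrix
M = np.vstack([G_plus, G_minus])         # non-negative crossbar conductances

assert (M >= 0).all()
assert np.allclose(S @ M, W)
```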

4. Modes of Computation and Functional Flexibility

FPCAs are distinguished by their multi-modal operation:

  • Nonvolatile storage (S mode): addressed read/write of binary or multi-level data (all FPCAs).
  • In-place digital arithmetic: parallel popcount, logic, or arithmetic via current summation (RRAM, QCA, TLG, and MRL FPCAs).
  • Analog/neuromorphic compute: vector-matrix multiplication and dot products with analog inputs and current summation (RRAM and multi-level memristor FPCAs).
  • Associative/TCAM operation: pattern matching via diode configuration (MirrorBit-FeFET and ReRAM TCAMs).

FPCA arrays can dynamically partition tiles to serve these modes, sometimes even within a single computational epoch (Zidan et al., 2016, Meihar et al., 2023). 3D FPCAs (e.g., CrossStack) switch between "expansion" and "deep-net" modes to trade off vector width versus pipeline throughput, mitigating IR drop and enhancing computational density (Eshraghian et al., 2021).
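The associative/TCAM mode can be modeled functionally: each stored word is a pattern of 0, 1, and don't-care bits, and a search key matches a row when every non-wildcard bit agrees. In hardware this comparison happens in parallel across all rows via mismatch currents; the sketch below (with hypothetical patterns) only captures the matching semantics.

```python
# Behavioral sketch of a field-programmable TCAM (associative/search mode).
# 'x' marks a don't-care bit in a stored pattern.

def tcam_search(table, key):
    """Return indices of stored patterns matching the search key."""
    hits = []
    for row, pattern in enumerate(table):
        if all(p in ('x', k) for p, k in zip(pattern, key)):
            hits.append(row)
    return hits

table = ["10x1", "x0x0", "1111"]   # illustrative stored patterns
print(tcam_search(table, "1011"))  # row 0 matches via its don't-care bit
```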

5. Performance Metrics and Experimental Results

Reported performance metrics depend strongly on the underlying device technology and architecture; the individual sources cited throughout provide quantitative results for their respective prototypes.

6. Scalability, Limitations, and Future Directions

Scaling FPCAs faces architectural and device barriers:

  • Wire/clocking complexity: FPCA logic depth and fanout increase clock-zone and routing complexity, necessitating hierarchical tiling, more clock rails, and automated synthesis tool support (Kalogeiton et al., 2016, Zidan et al., 2016).
  • Sneak-paths and IR drop: Selector-less architectures must manage crosstalk and voltage attenuation; solutions include array segmentation and access transistors (Vasileiadis et al., 5 Feb 2025, Eshraghian et al., 2021).
  • Integration limits: Variability in memristor and line resistances, device-to-device mismatch, and endurance may limit large-scale implementation; device redundancy and defect-tolerant routing are proposed mitigations (Vasileiadis et al., 5 Feb 2025, Kalogeiton et al., 2016).
  • Programming overhead: Field-programming time can be significant for high-precision, multi-level arrays; in-field tuning and parallel update schemes are used to reduce downtime (Vasileiadis et al., 5 Feb 2025).
  • Multi-mode optimization: Future FPCAs may further exploit 3D stacking, adaptive clock schemes, or multi-layer vias to boost density and reconfigurability (Kalogeiton et al., 2016, Eshraghian et al., 2021).

Hierarchical organization, software-visible tiling, and algorithm-aware mapping strategies will likely be required to match FPCAs to diverse workloads and to balance density, speed, energy, and functional flexibility.

7. Application Domains

FPCAs target a diverse set of workloads:

  • General-purpose reconfigurable computing: Unified storage, arithmetic, and logic acceleration in a single platform, serving as a memory-centric alternative to CPU+DRAM hierarchies (Zidan et al., 2016).
  • In-memory AI/ML acceleration: In-situ vector-matrix multiply for neural inference, training with variation-aware mapping (e.g., ACM), and deployment as neuromorphic engines (Kazemi et al., 2020, Eshraghian et al., 2021).
  • Associative search and pattern matching: High-density, low-power TCAM for database search and network applications, leveraging field-programmable NOR crossbars (Meihar et al., 2023).
  • Edge and IoT devices: Low-power, high-density in-memory computing for local inference, feature extraction, and data logging (Zidan et al., 2016, Krestinskaya et al., 2018).
  • Scientific and big-data workloads: In-place linear algebra, histograms, and multi-operand operations for high-throughput analytics (Zidan et al., 2016).

By unifying diverse logic, memory, and analog primitives in a single, dynamically reconfigurable fabric, FPCAs represent a foundational architecture for post-von Neumann and beyond-CMOS computing paradigms, with ongoing work focused on scalability, automation, and device-circuit-algorithm co-design (Kalogeiton et al., 2016, Zidan et al., 2016, Bhattacharjee et al., 2018, Eshraghian et al., 2021, Meihar et al., 2023).
