Fully Recursive Perceptron Network (FRPN)
- FRPN is a recursive neural model that refines its hidden state via iterative updates until reaching a fixed-point equilibrium, simulating deep feedforward networks with shared parameters.
- The C-FRPN extension applies recursive computation to convolutional layers, yielding adaptive, multi-dimensional feature maps and outperforming standard CNNs in low-parameter scenarios.
- Training FRPNs involves backpropagation through time with convergence checks, combined with regularization techniques like L2 weight decay and dropout to ensure stability and efficiency.
A Fully Recursive Perceptron Network (FRPN) is a recursive neural architecture in which the hidden representation is refined via an iterative process, converging to a fixed-point equilibrium rather than propagating through a fixed-depth stack of distinct layers. This mechanism enables the simulation of arbitrarily deep feedforward networks within a compact, parameter-shared framework. The Convolutional Fully Recursive Perceptron Network (C-FRPN) extends this recursive formulation to convolutional neural networks (CNNs), enabling recursive computation over multi-dimensional feature maps and yielding variable-depth architectures adaptive to input and convergence dynamics. Empirical evaluation demonstrates that C-FRPNs consistently surpass standard CNNs in accuracy for a given parameter budget, especially in the small-network regime, indicating superior parameter efficiency and modeling capacity (Rossi et al., 2019).
1. Mathematical Formulation of FRPN
The FRPN core model is defined by an iterative update for the hidden state vector. For input vector $x$, the hidden state at iteration $t$ is $h^{(t)}$. Let $A$ denote the input-to-hidden weight matrix, $B$ the hidden-to-hidden weight matrix, $b$ the bias, and $\sigma$ an elementwise nonlinearity (e.g., ReLU). The recurrence is:

$$h^{(t)} = \sigma\!\left(A x + B h^{(t-1)} + b\right),$$

with $h^{(0)} = 0$, for $t = 1, \dots, T$.
The network output is computed by a standard feedforward head: $y = \phi\!\left(C h^{(T)} + c\right)$, where $C$ and $c$ are the output weight matrix and bias, and $\phi$ is the output nonlinearity.
At convergence, the hidden state satisfies the fixed-point condition:

$$h^{*} = \sigma\!\left(A x + B h^{*} + b\right).$$
Provided $\sigma$ is Lipschitz with constant $L$ and $L\,\|B\|_2 < 1$, the iteration yields a unique fixed point by the Banach contraction-mapping theorem.
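The fixed-point iteration can be sketched in a few lines of NumPy. In this sketch the symbols $A$, $B$, $b$ follow the definitions above; the tanh nonlinearity, tolerance, and matrix sizes are illustrative choices rather than the paper's exact settings, and $B$ is rescaled so the contraction condition holds:

```python
import numpy as np

def frpn_forward(x, A, B, b, sigma=np.tanh, eps=1e-6, t_max=100):
    """Iterate h <- sigma(A x + B h + b) from h = 0 until the update
    moves less than eps, or t_max iterations are reached."""
    h = np.zeros(B.shape[0])
    for _ in range(t_max):
        h_next = sigma(A @ x + B @ h + b)
        if np.linalg.norm(h_next - h) < eps:
            return h_next
        h = h_next
    return h

rng = np.random.default_rng(0)
x = rng.normal(size=4)
A = rng.normal(size=(8, 4))
B = rng.normal(size=(8, 8))
# tanh has Lipschitz constant 1, so rescaling the spectral norm of B
# below 1 guarantees a unique fixed point (contraction mapping).
B *= 0.5 / np.linalg.norm(B, 2)
b = rng.normal(size=8)

h_star = frpn_forward(x, A, B, b)
# h_star satisfies the fixed-point condition up to the tolerance:
residual = np.linalg.norm(h_star - np.tanh(A @ x + B @ h_star + b))
print(residual < 1e-5)  # True
```

Because the map is contractive, the number of iterations actually executed depends only on the tolerance and the contraction factor, not on a hand-chosen depth.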
2. Unfolding and Connection to Deep Feedforward Networks
Unrolling the iteration over $T$ steps recovers a $T$-layer feedforward network with parameter sharing across layers, since each transformation is identical in form. The unfolded representation is:

$$h^{(T)} = \sigma\!\left(A x + B\,\sigma\!\left(A x + B\,\sigma(\cdots) + b\right) + b\right).$$
Formally, any deep multilayer perceptron of a given depth can be exactly represented, for a suitable hidden dimension and parameter selection, by an FRPN unfolded for the same number of steps (Rossi et al., 2019). The equilibrium perspective eliminates the arbitrary selection of “depth” in favor of dynamically determined iterative computation.
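The unfolding view can be made concrete: an explicit tied-weight MLP whose every layer applies the same transformation is literally the FRPN iteration, and with contractive recurrent weights the per-layer change shrinks geometrically, which is the equilibrium behavior. A minimal sketch (assuming the update $h \leftarrow \sigma(Ax + Bh + b)$ described above; all sizes and the tanh nonlinearity are illustrative):

```python
import numpy as np

def tied_mlp(x, A, B, b, depth):
    # A 'depth'-layer feedforward net in which every layer applies the
    # same (A, B, b): this is the FRPN iteration unrolled 'depth' times.
    h = np.zeros(B.shape[0])
    for _ in range(depth):
        h = np.tanh(A @ x + B @ h + b)
    return h

rng = np.random.default_rng(1)
x = rng.normal(size=3)
A = rng.normal(size=(5, 3))
B = rng.normal(size=(5, 5))
B *= 0.5 / np.linalg.norm(B, 2)    # contractive recurrent weights
b = rng.normal(size=5)

# Successive depths change the representation less and less:
d1 = np.linalg.norm(tied_mlp(x, A, B, b, 5) - tied_mlp(x, A, B, b, 4))
d2 = np.linalg.norm(tied_mlp(x, A, B, b, 20) - tied_mlp(x, A, B, b, 19))
print(d2 < d1)  # True: the unrolled stack converges toward equilibrium
```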
3. Training Methodology
FRPNs are trained with objective functions standard to their task domain, e.g., cross-entropy loss for classification or MSE for regression, evaluated at the output layer. Training employs backpropagation through time (BPTT), potentially truncated at a maximum iteration count $T_{\max}$, to compute gradients, and uses Adam or SGD with momentum for parameter updates. Regularization techniques include weight decay, dropout (applied to output or recurrent links), and optional batch normalization interleaved with iteration steps. Convergence in practice is monitored by the criterion

$$\left\| h^{(t)} - h^{(t-1)} \right\| < \epsilon,$$

with iteration halted once $t$ reaches $T_{\max}$ (specific values of $\epsilon$ and $T_{\max}$ are fixed per reported experiment).
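BPTT over the shared-weight unrolling can be written out by hand in a few dozen lines: the gradient contributions of every iteration are accumulated into the same $A$, $B$, $b$. The sketch below assumes a linear output head $C$ with squared-error loss and the recurrence $h^{(t)} = \tanh(Ax + Bh^{(t-1)} + b)$; the sizes, truncation length, and the finite-difference check are illustrative, not the paper's training setup:

```python
import numpy as np

def frpn_loss_and_grads(x, y, A, B, b, C, T=15):
    """Run T FRPN iterations, evaluate a squared-error loss at a linear
    output head, and backpropagate through time over the unrolled
    iterations, accumulating gradients of the shared parameters."""
    hs = [np.zeros(B.shape[0])]
    for _ in range(T):
        hs.append(np.tanh(A @ x + B @ hs[-1] + b))
    out = C @ hs[-1]
    loss = 0.5 * np.sum((out - y) ** 2)

    dA = np.zeros_like(A)
    dB = np.zeros_like(B)
    db = np.zeros_like(b)
    dC = np.outer(out - y, hs[-1])
    dh = C.T @ (out - y)                  # dL/dh at the final iteration
    for t in range(T, 0, -1):
        da = dh * (1.0 - hs[t] ** 2)      # backprop through tanh
        dA += np.outer(da, x)             # shared weights: accumulate
        dB += np.outer(da, hs[t - 1])
        db += da
        dh = B.T @ da                     # pass back to iteration t - 1
    return loss, dA, dB, db, dC

rng = np.random.default_rng(2)
x, y = rng.normal(size=4), rng.normal(size=2)
A = 0.3 * rng.normal(size=(6, 4))
B = 0.3 * rng.normal(size=(6, 6))
b = 0.1 * rng.normal(size=6)
C = 0.3 * rng.normal(size=(2, 6))

loss, dA, dB, db, dC = frpn_loss_and_grads(x, y, A, B, b, C)

# Central finite-difference check on one entry of B.
eps = 1e-6
Bp = B.copy(); Bp[0, 0] += eps
Bm = B.copy(); Bm[0, 0] -= eps
lp = frpn_loss_and_grads(x, y, A, Bp, b, C)[0]
lm = frpn_loss_and_grads(x, y, A, Bm, b, C)[0]
print(abs((lp - lm) / (2 * eps) - dB[0, 0]) < 1e-6)  # True
```

Truncating the loop at $T_{\max}$ (here hard-coded as `T`) is exactly truncated BPTT; in a framework with autodiff the same gradients fall out of unrolling the forward loop.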
4. Convolutional Extension: C-FRPN
In C-FRPN, each FRPN hidden state and input are stacks of 2D feature maps, replacing vectorial operations with convolutions. For input feature maps $X$ and state maps $H^{(t)}$, the update is

$$H^{(t)} = \sigma\!\left(K_{x} * X + K_{h} * H^{(t-1)} + b\right),$$

where $K_{x}$ and $K_{h}$ are convolution kernels, $*$ denotes 2D convolution, and $b$ is a bias term. Each recursive block comprises one such C-FRPN layer followed by pooling.
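A single-map NumPy sketch of this update follows. Real C-FRPN layers operate on stacks of maps with multi-channel convolutions; here one input map and one state map, a ReLU nonlinearity, a small recurrent kernel (so the map is contractive in the sup norm), and the tolerance are all illustrative assumptions:

```python
import numpy as np

def conv2d_same(image, kernel):
    """Naive single-channel 2D convolution with zero padding,
    preserving the image size (odd kernel sizes assumed)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)))
    out = np.zeros_like(image)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def cfrpn_update(X, H, Kx, Kh, bias, t_max=50, eps=1e-5):
    """Iterate H <- relu(Kx * X + Kh * H + bias) on a single feature
    map until the map stops changing (or t_max is reached)."""
    relu = lambda z: np.maximum(z, 0.0)
    for _ in range(t_max):
        H_next = relu(conv2d_same(X, Kx) + conv2d_same(H, Kh) + bias)
        if np.abs(H_next - H).max() < eps:
            break
        H = H_next
    return H_next

rng = np.random.default_rng(3)
X = rng.normal(size=(8, 8))
H0 = np.zeros((8, 8))
Kx = 0.2 * rng.normal(size=(3, 3))
Kh = np.full((3, 3), 0.05)   # |Kh| sums to 0.45 < 1 -> contraction
H_star = cfrpn_update(X, H0, Kx, Kh, bias=0.1)
```

Since the absolute values of `Kh` sum to less than 1 and ReLU is 1-Lipschitz, the iteration contracts and `H_star` approximately satisfies the convolutional fixed-point condition.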
C-FRPN architectures typically stack four recursive blocks, each followed by max-pooling (stride 2) and dropout (omitted after the final block). Within each block, the first convolution and the subsequent recursive iterations use fixed kernel sizes, and local-response normalization is applied after each iteration.
5. Experimental Results and Performance Analysis
Evaluation on standard image benchmarks—including CIFAR-10, SVHN, and ISIC melanoma classification—demonstrates that C-FRPN consistently outperforms parameter-matched CNNs, with the clearest accuracy margins over the baseline CNNs in the low-parameter regime (under $100$ K parameters) on all three datasets.
For wider models ($100$ K–$500$ K parameters), the advantage narrows to $1$–$2$ percentage points but remains consistent. All experiments used the Adam optimizer with weight decay, medians over five trials, and standard augmentation.
The reported experiments also tabulate classification accuracy (mean ± std) for small CNN and C-FRPN models on CIFAR-10, SVHN, and ISIC.
The largest gains are found in small models (under $100$ K parameters), indicating increased expressive efficiency.
6. Implementation Considerations and Parameter Efficiency
The depth of computation in each FRPN or C-FRPN block is dynamically determined by a convergence test, rather than fixed architectural design. Typical networks cap the number of iterations per block at a fixed maximum $T_{\max}$. C-FRPN widths (feature maps per layer) in the reported benchmarks were chosen to match the parameter budgets of the baseline CNNs. The architecture exploits recurrence and parameter sharing to reduce redundancy, enabling smaller models to attain accuracies comparable to substantially larger CNNs.
Regularization is essential for training stability, with weight decay, batch normalization after each iterative step, and dropout after pooling. The contraction property of $\sigma$ and $B$ (i.e., $L\,\|B\|_2 < 1$) guarantees a stable equilibrium under standard settings.
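One common way to enforce the contraction condition in practice is spectral rescaling of the recurrent weights; this is a generic stabilization technique, not necessarily the authors' procedure. A minimal sketch, assuming a 1-Lipschitz nonlinearity such as ReLU or tanh so that $\|B\|_2 < 1$ suffices:

```python
import numpy as np

rng = np.random.default_rng(4)
B = rng.normal(size=(16, 16))   # recurrent weight matrix (illustrative)

# ReLU and tanh are 1-Lipschitz, so the recursion h <- sigma(Ax + Bh + b)
# is a contraction whenever the spectral norm of B is below 1.
spec = np.linalg.norm(B, 2)     # largest singular value
if spec >= 1.0:
    B *= 0.95 / spec            # project back inside the unit ball

assert np.linalg.norm(B, 2) < 1.0
```

Such a projection can be applied after each optimizer step to keep the equilibrium well-defined throughout training.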
A plausible implication is that the recursive mechanism inherent to FRPN and C-FRPN architectures induces a form of implicit depth adaptation and parameter reuse, leading to superior performance-to-parameter ratios relative to conventional feedforward designs.
7. Significance and Applications
FRPN and C-FRPN architectures generalize the deep neural network paradigm by replacing rigid, architecturally determined depth with a learned, recursive computation converging to equilibrium. This results in adaptive, parameter-efficient models. Their demonstrated advantages on image classification tasks, especially in low-parameter or resource-constrained settings, mark these models as strong alternatives for compact or embedded deployment.
These architectures also suggest a principled route to simulating arbitrarily deep networks without explicit depth design, and their underlying principles parallel recurrent equilibrium models and deep implicit networks, contributing to the broader understanding of recursive computation within neural architectures (Rossi et al., 2019).