IC-MLP: Input-Connected MLP
- IC-MLP is a feedforward neural network architecture that adds direct affine connections from the input to every hidden layer, yielding universal approximation under the minimal condition that the activation function is nonlinear.
- Its recursive formulation and explicit input connections enable algebraic closure and richer function spaces compared to standard MLPs.
- The design accommodates both scalar and vector inputs, promoting enhanced expressiveness and practical applications in network theory and approximation.
The Input-Connected Multilayer Perceptron (IC-MLP) is a feedforward neural network architecture distinguished by direct affine connections from the raw input to every hidden unit in all hidden layers, in addition to standard inter-layer connectivity. Each hidden neuron, rather than relying solely on the output of the preceding layer, also incorporates an affine transformation of the input vector. This architectural modification, studied in both univariate and multivariate formulations, gives rise to a network class with robust algebraic closure properties and a universal approximation theorem under the minimal criterion that the activation function is nonlinear (Ismailov, 20 Jan 2026).
1. Formal Definition of IC-MLP Architecture
For scalar input $x \in \mathbb{R}$ and a fixed continuous activation function $\sigma : \mathbb{R} \to \mathbb{R}$, the IC-MLP is defined recursively over $L$ hidden layers as follows:
- Let $h^{(0)} = x$ (input node output).
- For $\ell = 1, \dots, L$ (hidden layers), each hidden layer output is
$$h^{(\ell)} = \sigma\!\left(W^{(\ell)} h^{(\ell-1)} + v^{(\ell)} x + b^{(\ell)}\right),$$
with $W^{(\ell)}$ an $n_\ell \times n_{\ell-1}$ weight matrix, $v^{(\ell)} \in \mathbb{R}^{n_\ell}$ an input-weight vector, and $b^{(\ell)} \in \mathbb{R}^{n_\ell}$ a bias vector.
- The output layer computes
$$N(x) = c\, h^{(L)} + v x + b,$$
reducing to a scalar output when the output dimension is $1$. Here, $c$ is typically a row vector ($1 \times n_L$), $v \in \mathbb{R}$, $b \in \mathbb{R}$.
In the multivariate setting with $x \in \mathbb{R}^d$, the affine input term is replaced by $\langle v_i, x \rangle$, with $v_i \in \mathbb{R}^d$, for each neuron.
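The recursion above can be sketched directly in code. This is a minimal illustration, not the paper's implementation; the function name `ic_mlp_forward` and the shape conventions (each layer given as a triple `(W, v_l, b_l)`) are assumptions chosen to mirror the reconstructed formulas.

```python
import numpy as np

def ic_mlp_forward(x, layers, c, v, b, sigma=np.tanh):
    """Scalar-input IC-MLP: every hidden layer also sees an affine copy of x.

    layers: list of (W, v_l, b_l) with W of shape (n_l, n_{l-1}),
            v_l and b_l of shape (n_l,); the first W has shape (n_1, 1).
    """
    h = np.array([x])                        # h^(0) = x, as a length-1 vector
    for W, v_l, b_l in layers:
        h = sigma(W @ h + v_l * x + b_l)     # h^(l) = sigma(W h^(l-1) + v^(l) x + b^(l))
    return float(c @ h + v * x + b)          # output: c h^(L) + v x + b
```

With all hidden weights zeroed and `sigma=np.tanh`, the network collapses to the output layer's affine term `v * x + b`, which is a quick sanity check of the recursion.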
2. Layerwise and Iterated Formulas
The IC-MLP structure supports explicit, systematic descriptions of its function space for any finite depth:
- Depth 0: $f_0(x) = v x + b$ (no hidden layers; an affine map).
- Depth 1: With $n$ hidden units—parameters $(w_i, b_i)$, output weights $c_i$, input weight $v$, bias $b$:
$$f_1(x) = \sum_{i=1}^{n} c_i\, \sigma(w_i x + b_i) + v x + b.$$
- Depth 2: For $m$ second-layer units—weights $u_{ji}$, input weights $v_j$, biases $\beta_j$, output weights $c_j$:
$$f_2(x) = \sum_{j=1}^{m} c_j\, \sigma\!\left( \sum_{i=1}^{n} u_{ji}\, \sigma(w_i x + b_i) + v_j x + \beta_j \right) + v x + b.$$
In general, the $L$-layer functional form is obtained by iterating $h^{(\ell)}(x) = \sigma\!\left(W^{(\ell)} h^{(\ell-1)}(x) + v^{(\ell)} x + b^{(\ell)}\right)$ with $h^{(0)}(x) = x$, or in matrix notation, $N(x) = c\, h^{(L)}(x) + v x + b$.
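A quick numerical check, under the symbol conventions reconstructed above, that the explicit depth-2 formula and the layerwise recursion describe the same function (the parameter names and random values are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = np.tanh
n, m = 4, 3                                   # first/second hidden-layer widths

# Depth-2 parameters, named to mirror the explicit formula.
w, bi = rng.normal(size=n), rng.normal(size=n)           # first layer: w_i, b_i
U = rng.normal(size=(m, n))                              # second layer: u_ji
vj, bj = rng.normal(size=m), rng.normal(size=m)          # input weights v_j, biases beta_j
c, v, b = rng.normal(size=m), rng.normal(), rng.normal() # output layer

def explicit(x):
    # f(x) = sum_j c_j sigma( sum_i u_ji sigma(w_i x + b_i) + v_j x + beta_j ) + v x + b
    inner = sigma(w * x + bi)
    return c @ sigma(U @ inner + vj * x + bj) + v * x + b

def recursive(x):
    # Same parameters fed through h^(l) = sigma(W h^(l-1) + v^(l) x + b^(l)).
    # At layer 1, W h^(0) already carries the input, so v^(1) is set to zero.
    h = np.array([x])
    for W, vl, bl in [(w.reshape(n, 1), np.zeros(n), bi), (U, vj, bj)]:
        h = sigma(W @ h + vl * x + bl)
    return c @ h + v * x + b
```

Evaluating both at any point gives matching values up to floating-point error, which is exactly the claim that the depth formulas are unrollings of the recursion.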
3. Universal Approximation Theorem for IC-MLP
Let $\sigma : \mathbb{R} \to \mathbb{R}$ be continuous. The following are equivalent:
- $\sigma$ is nonlinear (i.e., not of the form $\sigma(t) = \alpha t + \beta$).
- For all closed intervals $[a, b]$, every $f \in C[a, b]$, and every $\varepsilon > 0$, there exist an $L$ and an $L$-layer IC-MLP $N$ such that
$$\max_{x \in [a, b]} |f(x) - N(x)| < \varepsilon.$$
Proof Outline: If $\sigma$ is affine, IC-MLPs realize only affine functions and cannot approximate arbitrary continuous functions. For nonlinear $\sigma$, one constructs smooth approximants via mollification and shows, by explicit symmetric differences, that $x^2$, and inductively all monomials $x^k$, lie in the closure of the function space realized by IC-MLPs; hence all polynomials do, and then all continuous functions by the Weierstrass theorem (Ismailov, 20 Jan 2026).
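The symmetric-difference step of the outline can be made concrete. For a smooth $\sigma$ with $\sigma''(t_0) \neq 0$, the second symmetric difference $[\sigma(t_0 + \delta x) - 2\sigma(t_0) + \sigma(t_0 - \delta x)]/\delta^2$ converges to $\sigma''(t_0)\, x^2$ as $\delta \to 0$, so a combination of three shifted $\sigma$-units approximates a multiple of $x^2$. The sketch below demonstrates this with the sigmoid; the choice of $\sigma$, $t_0$, and $\delta$ is illustrative, not taken from the paper.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def second_symmetric_difference(sigma, t0, x, delta):
    """Combination of three shifted sigma-units approximating sigma''(t0) * x**2."""
    return (sigma(t0 + delta * x) - 2 * sigma(t0) + sigma(t0 - delta * x)) / delta**2

t0 = 1.0                                    # a point where sigmoid'' is nonzero
s = sigmoid(t0)
sigma_pp = s * (1 - s) * (1 - 2 * s)        # closed form for sigmoid''(t0)

x = np.linspace(-1.0, 1.0, 201)
approx = second_symmetric_difference(sigmoid, t0, x, delta=1e-3)
target = sigma_pp * x**2                    # the monomial x^2, up to the factor sigma''(t0)
```

Dividing out the nonzero constant $\sigma''(t_0)$ places $x^2$ itself in the closure of the realized function space.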
4. Extension to Vector-Valued Inputs
For $x \in \mathbb{R}^d$, the IC-MLP maintains the direct affine input connection at every layer by incorporating terms of the form $\langle v_i, x \rangle$ in each hidden neuron. The universal approximation result also extends:
- For any compact $K \subset \mathbb{R}^d$, any $f \in C(K)$, and any $\varepsilon > 0$, there exists an IC-MLP $N$ such that
$$\max_{x \in K} |f(x) - N(x)| < \varepsilon$$
if and only if $\sigma$ is nonlinear.
The function class realized by IC-MLPs is closed under addition and under superposition with scalar IC-MLPs, contains all constants and coordinate projections $x \mapsto x_i$, and supports construction of all multivariate monomials. The Stone–Weierstrass theorem then ensures density in $C(K)$.
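A vector-input forward pass makes the $\langle v_i, x \rangle$ terms explicit, and shows how constants and coordinate projections are realized trivially through the output layer's affine term. The function name and shape conventions are illustrative assumptions, consistent with the scalar sketch above.

```python
import numpy as np

def ic_mlp_forward_vec(x, layers, c, v_out, b_out, sigma=np.tanh):
    """Vector-input IC-MLP: every layer adds the affine input term V_l @ x."""
    h = x                                   # h^(0) = x in R^d
    for W, Vl, bl in layers:                # rows of Vl are the per-neuron input weights v_i
        h = sigma(W @ h + Vl @ x + bl)      # h^(l) = sigma(W h^(l-1) + V_l x + b_l)
    return c @ h + v_out @ x + b_out        # output: c h^(L) + <v_out, x> + b_out

# A projection x -> x_1: zero the hidden contribution (c = 0) and put the
# standard basis vector e_1 in the output's affine input weight.
rng = np.random.default_rng(1)
d, n1 = 5, 4
x = rng.normal(size=d)
layers = [(rng.normal(size=(n1, d)), rng.normal(size=(n1, d)), rng.normal(size=n1))]
e1 = np.zeros(d); e1[1] = 1.0
proj = ic_mlp_forward_vec(x, layers, np.zeros(n1), e1, 0.0)
```

Because the output layer itself carries an affine input term, projections and constants require no hidden-layer computation at all, which is what lets the Stone–Weierstrass argument start from a function class that already separates points.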
5. Algebraic and Expressive Properties Compared to Standard MLPs
IC-MLPs differ from standard MLPs in key architectural and algebraic respects:
- In standard MLPs, only the first hidden layer receives the input directly; in IC-MLPs, every hidden layer and the output layer receive an independent affine transformation of the input.
- The IC-MLP function class is closed under finite linear combinations. Summing two IC-MLPs yields another IC-MLP, facilitated by concatenating their outputs at the final layer. In typical MLPs, closure under addition holds only under strong restrictions on $\sigma$.
- For shallow MLPs, universal approximation requires $\sigma$ to be non-polynomial; for deep MLPs, non-affinity plus smoothness is needed. IC-MLPs admit the sharp minimal condition that $\sigma$ is nonlinear, in both the scalar and vector-valued cases.
- The algebraic structure of the IC-MLP hypothesis class supports direct use of classical density arguments with less technical machinery.
| Property | IC-MLP | Standard MLP |
|---|---|---|
| Direct input per layer | Yes | Only first layer |
| Universality criterion | $\sigma$ nonlinear (non-affine) | Non-polynomial, or non-affine + smoothness |
| Algebraic closure | Addition, multiplication | Restricted |
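The closure-under-addition property from the comparison above can be exhibited constructively for depth 1: the sum of two depth-1 IC-MLPs is again a depth-1 IC-MLP, obtained by concatenating the hidden units and adding the affine input terms. The parameter layout below is an illustrative assumption matching the depth-1 formula reconstructed earlier.

```python
import numpy as np

sigma = np.tanh

def depth1(x, w, b, c, v, beta):
    # Depth-1 IC-MLP: f(x) = sum_i c_i sigma(w_i x + b_i) + v x + beta
    return c @ sigma(w * x + b) + v * x + beta

def add_ic_mlps(p1, p2):
    """Represent f1 + f2 as a single depth-1 IC-MLP: concatenate the
    hidden units of both networks and add their affine input terms."""
    (w1, b1, c1, v1, t1), (w2, b2, c2, v2, t2) = p1, p2
    return (np.concatenate([w1, w2]), np.concatenate([b1, b2]),
            np.concatenate([c1, c2]), v1 + v2, t1 + t2)
```

The same block-concatenation idea extends layerwise to deeper IC-MLPs (block-diagonal $W^{(\ell)}$, stacked $v^{(\ell)}$ and $b^{(\ell)}$), precisely because every layer has its own input connection to absorb the affine parts.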
6. Implications for Network Theory and Approximation
The introduction of direct input connections at every hidden layer enables IC-MLPs to generate a strictly richer function space than standard MLPs: one closed under addition and multiplication, containing all affine and polynomial functions as special cases. The simplicity of the universality condition and the recursive, transparent structure of the proofs position IC-MLPs as a theoretically robust model for exploring the topology and algebraic structure of neural network hypothesis classes, with implications for both the functional analysis of neural networks and the study of universal approximation in deep learning architectures (Ismailov, 20 Jan 2026).