
Set-Based MLPs

Updated 12 February 2026
  • Set-based Multi-Layer Perceptrons are neural networks that process unordered data by enforcing permutation invariance or equivariance using group-theoretic principles.
  • They employ parameter-tying with regular and high-order representations to achieve universal approximation while mitigating hidden layer size complexity.
  • Sparse and motif-based optimizations enhance computational efficiency, enabling practical applications in set classification, point cloud analysis, and modular network design.

Set-based Multi-Layer Perceptrons (MLPs) are a class of neural network architectures designed to process and learn functions on unordered data (sets) by leveraging group-theoretic invariance or equivariance, especially under the symmetric group $S_n$. These models formalize the requirement that outputs should not change (invariant) or should change predictably (equivariant) under permutations of set elements. Recent advances prove the universality of such architectures, showing that appropriately constructed MLPs can approximate any continuous invariant or equivariant function, and relate these structures to algebraic manipulations and computational gains achievable via architectural innovations and sparsity.

1. Group Actions and Set Symmetry in MLPs

Let $G = S_n$ denote the symmetric group over $n$ elements; its action on input vectors $x \in \mathbb{R}^n$ is given by $(g \cdot x)_i = x_{g^{-1}(i)}$, or equivalently $x \mapsto A_g x$ with the permutation matrix $A_g \in \{0,1\}^{n \times n}$. In set-based MLPs, hidden and output layers are endowed with compatible permutation representations: the hidden layer, indexed by $H$, carries the group action $h \mapsto g \cdot h$ via matrices $H_g$, and the output layer (of size $m$) carries matrices $B_g$. These structures force a linear layer $W : \mathbb{R}^n \to \mathbb{R}^m$ to be $S_n$-equivariant if $B_g W = W A_g$ for all $g \in S_n$, and $S_n$-invariant if $B_g = I$ (the identity), which requires $W A_g = W$ for all $g$ (Ravanbakhsh, 2020).
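These constraints are easy to check numerically. The sketch below is not taken from the article: it assumes the familiar DeepSets-style parameter-shared layer $\lambda I + \gamma \mathbf{1}\mathbf{1}^\top$ (a known permutation-equivariant form) and an all-ones invariant layer, then verifies $B_g W = W A_g$ with $B_g = A_g$ and $W A_g = W$ over all of $S_4$:

```python
import itertools
import numpy as np

def perm_matrix(g, n):
    """A_g with (A_g x)_i = x_{g^{-1}(i)}: column j is sent to row g(j)."""
    A = np.zeros((n, n))
    for j in range(n):
        A[g[j], j] = 1.0
    return A

n = 4
# Equivariant layer (B_g = A_g): DeepSets-style tied form lam*I + gam*ones.
lam, gam = 0.7, -0.3
W = lam * np.eye(n) + gam * np.ones((n, n))
# Invariant layer (B_g = I): rows must be insensitive to permutation, e.g. all-ones.
W_inv = np.ones((1, n))

for g in itertools.permutations(range(n)):
    A_g = perm_matrix(g, n)
    assert np.allclose(A_g @ W, W @ A_g)    # equivariance: B_g W = W A_g
    assert np.allclose(W_inv @ A_g, W_inv)  # invariance:   W A_g = W
```

Exhaustive checking over $S_n$ is only feasible for small $n$ (here $4! = 24$ permutations), but it makes the parameter-sharing pattern behind the conditions concrete.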

2. Universal Approximation Theorems for Set-Based MLPs

For the design of universal set-invariant and set-equivariant approximators, two key theorems emerge:

  • Theorem A (Invariant universality): Consider a one-hidden-layer MLP with hidden-layer size $H = |S_n| = n!$, where the group acts regularly on hidden units ($g \cdot h = h g^{-1}$). The network

$$\hat\psi(x) = \sum_{h \in S_n} w'_h\, \sigma(w_h^\top x + b_h)$$

with weights tied as $w_h = w_e A_h$ and a shared bias $b_h = b$ is a universal $S_n$-invariant approximator.

  • Theorem B (Equivariant universality): Under the same construction but allowing a general output action $B_g$, the network

$$\hat\Psi(x) = \sum_{h \in S_n} W'_h\, \sigma(W_h x + b_h)$$

with $W_h = H_h W_e$ and $W'_h = B_h W'_e$ yields a universal $S_n$-equivariant approximator (Ravanbakhsh, 2020).

The implementation relies on parameter-tying driven by the group action and uses the regular representation for parameter efficiency. Universality follows because these configurations can uniformly approximate any continuous $S_n$-invariant or $S_n$-equivariant function on compact sets.
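The invariant construction can be verified numerically. The sketch below follows Theorem A's tying $w_h = w_e A_h$ with a shared bias, but simplifies the output weights to one shared value per channel, which makes invariance transparent since the sum runs over the whole group (all weight values here are arbitrary):

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n = 4
perms = list(itertools.permutations(range(n)))

def A(g):
    """Permutation matrix with (A_g x)_i = x_{g^{-1}(i)}."""
    M = np.zeros((n, n))
    for j in range(n):
        M[g[j], j] = 1.0
    return M

# K channels: base weights W_e, shared bias b, shared per-channel output weight c.
K = 3
W_e = rng.normal(size=(K, n))
b = rng.normal(size=K)
c = rng.normal(size=K)

def psi(x):
    """psi(x) = sum over h in S_n of c . sigma(W_e A_h x + b)."""
    total = 0.0
    for h in perms:
        u_h = np.tanh(W_e @ A(h) @ x + b)  # tied hidden unit: w_h = w_e A_h
        total += c @ u_h
    return total

x = rng.normal(size=n)
for g in perms:
    assert np.isclose(psi(A(g) @ x), psi(x))  # S_n-invariance
```

Permuting the input only reindexes the hidden units within the sum over $S_n$, so the pooled output is unchanged.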

3. Hidden Layer Representation: Regular and High-Order Actions

While the regular representation yields a hidden-layer bottleneck of size $n!$, high-order set actions enable more tractable models. Denote by $[n]^D$ the set of $D$-tuples over $[n]$, on which $S_n$ acts diagonally. Proposition C states that a regular orbit exists for $D \geq \log_2((n-1)!)$, reducing the necessary hidden dimensionality. Corollary D quantifies the bound as $D \geq (n-1)\log_2(n-1) - (n-2)\log_2 e$. For the pure set action, $D = n$ suffices to guarantee a regular orbit, ensuring universality of the equivariant MLP (Ravanbakhsh, 2020). This insight allows for a polynomially rather than factorially sized hidden layer via $D$-tuples.
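The threshold from Proposition C is easy to tabulate. The helper below (its name is illustrative) computes the smallest integer tuple order $D$ satisfying $D \geq \log_2((n-1)!)$ and contrasts it with the $n!$ hidden units of the regular representation:

```python
import math

def min_tuple_order(n):
    """Smallest integer D with D >= log2((n-1)!), per Proposition C."""
    return math.ceil(math.log2(math.factorial(n - 1)))

for n in (3, 4, 5, 8):
    D = min_tuple_order(n)
    print(f"n={n}: regular representation needs n! = {math.factorial(n)} "
          f"hidden units; a regular orbit in [n]^D exists for D = {D}")
```

For instance, $n = 8$ gives $D = \lceil \log_2 5040 \rceil = 13$, far below the $8! = 40320$ hidden units of the full regular representation.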

4. Algebraic Structures on Set-Based MLPs

A formal algebraic framework on the universe $U$ of layered MLPs enables systematic construction of complex networks from simpler components. Operations include:

  • Complementation: The complement $\mathcal{N}^c$ inverts the output of a binary classifier.
  • Sum (Union): The sum $\mathcal{N}_1 + \mathcal{N}_2$ produces an $(L+1)$-layer network representing the logical union.
  • Difference: Represented by $\mathcal{N}_1 - \mathcal{N}_2 = \mathcal{N}_1 + (\mathcal{N}_2)^c$.
  • I-Product (Cartesian product): $\mathcal{N}_1 \times \mathcal{N}_2$ composes networks over direct-product domains.
  • O-Product (Output bundling): $\mathcal{N}_1 \otimes \mathcal{N}_2$ stacks outputs for multi-label or structured outputs.

These operations possess formal algebraic properties such as involution (complement is its own inverse), commutativity and associativity (sum, I-product), and existence of identity/inverse elements (Peng, 2017).
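A toy sketch of the complement and sum operations for binary threshold classifiers; the classifiers `N1` and `N2` and their weights are hypothetical, and a Heaviside step stands in for the fixed activation:

```python
def step(z):
    """Heaviside step used as the binary output nonlinearity."""
    return 1.0 if z > 0 else 0.0

# Two toy single-layer binary classifiers on R^2 (hypothetical weights).
def N1(x):
    return step(x[0])  # fires on the half-plane x0 > 0

def N2(x):
    return step(x[1])  # fires on the half-plane x1 > 0

def complement(N):
    """N^c: append an affine map plus step that flips the binary output."""
    return lambda x: step(0.5 - N(x))

def union(Na, Nb):
    """Na + Nb: one extra layer computing a logical OR of the two outputs."""
    return lambda x: step(Na(x) + Nb(x) - 0.5)

x = (1.0, -1.0)
assert union(N1, N2)(x) == 1.0   # in N1's region, so the union fires
assert complement(N1)(x) == 0.0  # complement of a firing classifier is 0
```

The union wrapper illustrates why the sum yields an $(L+1)$-layer network: the extra layer merges the two component outputs.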

5. Concrete Architectures and Implementation Recipes

For $G = S_n$ using the regular representation, a minimal $S_n$-equivariant MLP comprises:

  1. Hidden units indexed by $h \in S_n$
  2. Parameters: base input weights $w \in \mathbb{R}^n$, bias $b \in \mathbb{R}$, base output weights $U \in \mathbb{R}^{m \times |G|}$
  3. Parameter-tying: $w_h = w A_h$, $b_h = b$
  4. Forward pass:
    • $u_h = \sigma(w_h^\top x + b)$ for all $h$
    • $z = \sum_{h \in G} U e_h\, u_h$, where $e_h$ is the one-hot vector for $h$
    • The output $z$ transforms equivariantly: $z \mapsto B_g z$ when $x \mapsto A_g x$

Examples clarify both the $S_n$-equivariant case (vector-valued, e.g., $m = 3$, $B_g = A_g$) and the $S_n$-invariant case (scalar-valued, pooling over all hidden units) (Ravanbakhsh, 2020).
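A sketch of this recipe with $m = n$ and $B_g = A_g$. The output-weight tying below (column $h$ of $U$ set to $A_{h^{-1}}$ applied to a base vector) is one consistent choice that makes the equivariance check pass under the action $(g \cdot x)_i = x_{g^{-1}(i)}$; the paper's indexing convention may differ by an inverse:

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
n = 4
perms = list(itertools.permutations(range(n)))

def A(g):
    """Permutation matrix with (A_g x)_i = x_{g^{-1}(i)}."""
    M = np.zeros((n, n))
    for j in range(n):
        M[g[j], j] = 1.0
    return M

def inv(g):
    """Inverse permutation of g."""
    h = [0] * n
    for i, gi in enumerate(g):
        h[gi] = i
    return tuple(h)

w = rng.normal(size=n)       # base input weights
b = rng.normal()             # shared bias
u_base = rng.normal(size=n)  # base output weights (m = n, B_g = A_g)

def forward(x):
    z = np.zeros(n)
    for h in perms:
        u_h = np.tanh(w @ A(h) @ x + b)   # tied: w_h = w A_h, b_h = b
        z += u_h * (A(inv(h)) @ u_base)   # tied output column for unit h
    return z

x = rng.normal(size=n)
for g in perms:
    assert np.allclose(forward(A(g) @ x), A(g) @ forward(x))  # z -> B_g z
```

Permuting the input sends hidden unit $h$ to $hg$, and the tied output columns absorb that reindexing into the action $B_g$ on the output.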

6. Sparse and Motif-Based Optimization in Set-MLPs

Sparse Evolutionary Training (SET) introduces sparsity into MLPs through an Erdős–Rényi random initialization and periodic pruning/regrowth cycles. Motif-based structural optimization further imposes block-level structure by organizing neurons into blocks (motifs) and pruning or regrowing entire $m \times m$ submatrices, with block assignment guided by average weight magnitude. This approach reduces parameter count and computational cost by a factor of $1/m^2$ compared to standard SET while maintaining high accuracy (empirical results: on Fashion-MNIST, motif size $m = 2$ yields a 43.3% training-time reduction for a 3.7% accuracy drop) (Chen et al., 10 Jun 2025).

A concise comparison of SET and motif-based SET is shown below:

| Method | Param Savings | Accuracy Penalty | Training Time Savings |
|--------|---------------|------------------|-----------------------|
| SET ($m=1$) | Baseline | None | Baseline |
| Motif-SET ($m=2$) | ~1/4 | <4% | 30–43% |
| Motif-SET ($m=4$) | ~1/16 | 10% | >60% |

Motif-SET achieves the best accuracy/efficiency trade-off at $m = 2$ across tasks, with only minor loss in performance (Chen et al., 10 Jun 2025).
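The block-level prune/regrow cycle can be sketched as follows. This is a minimal illustration under stated assumptions: blocks are scored by average weight magnitude as described above, while the pruned fraction `zeta` and the random regrowth rule are illustrative rather than taken from the paper:

```python
import numpy as np

def motif_prune_regrow(W, mask, m, zeta=0.3, rng=None):
    """One motif-level prune/regrow step (sketch): score each m x m block by
    mean |weight|, drop the lowest zeta fraction of active blocks, then regrow
    the same number of randomly chosen inactive blocks."""
    rng = rng or np.random.default_rng()
    R, C = W.shape[0] // m, W.shape[1] // m
    scores_grid = np.abs(W * mask).reshape(R, m, C, m).mean(axis=(1, 3))
    active = mask.reshape(R, m, C, m).any(axis=(1, 3))

    # Prune the weakest active blocks.
    act_idx = np.argwhere(active)
    k = int(zeta * len(act_idx))
    for r, c in act_idx[np.argsort(scores_grid[active])[:k]]:
        mask[r*m:(r+1)*m, c*m:(c+1)*m] = False

    # Regrow k random inactive blocks (may re-enable just-pruned ones).
    inact = np.argwhere(~mask.reshape(R, m, C, m).any(axis=(1, 3)))
    grow = inact[rng.choice(len(inact), size=min(k, len(inact)), replace=False)]
    for r, c in grow:
        mask[r*m:(r+1)*m, c*m:(c+1)*m] = True
    return mask
```

Because every update flips whole $m \times m$ blocks, the connectivity mask stays block-structured, which is the source of the $1/m^2$ bookkeeping and compute savings.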

7. Applications, Limitations, and Outlook

Set-based MLPs are applicable to any domain with unordered or permutation-symmetric data, including set classification, point cloud analysis, and tasks demanding invariance or equivariance under data permutation. The algebraic approach enables construction of architectures tailored to data with decomposable or product structure, facilitating modular design and interpretation (Peng, 2017). Sparse and motif-based variants provide computationally efficient realizations suitable for high-dimensional feature selection and large-scale learning (Chen et al., 10 Jun 2025).

Limitations include factorial hidden-layer size in the worst case (the full regular representation); however, high-order set representations and polynomially sized hidden layers mitigate this. The underlying algebra is specific to MLPs with fixed activations; extensions to convolutional or recurrent structures and to the full range of group actions remain active research topics (Ravanbakhsh, 2020, Peng, 2017). Motif-based strategies open avenues for hardware-aware design and scaling, suggesting opportunities for further theoretical and empirical exploration (Chen et al., 10 Jun 2025).
