
EEG Frequency Clustering in Neonates

Updated 13 December 2025
  • The paper presents a methodology that combines canonical tail dependence and fuzzy c-means clustering to distinguish seizure from non-seizure neonates using EEG signals.
  • It integrates frequency-domain feature extraction with tail pairwise dependence matrices to capture rare, extreme joint activations in multichannel EEG data.
  • Empirical results show 100% accuracy on gamma band analysis and provide detailed spatial localization, underlining its potential for clinical seizure diagnosis.

Frequency-based soft clustering of neonates is a methodology that leverages tail dependence structures in multichannel EEG signals to identify diagnostically significant groupings, specifically distinguishing seizure from non-seizure brains. The approach integrates multivariate extreme value theory—namely, canonical tail dependence measures and tail pairwise dependence matrices (TPDM)—with frequency-domain feature extraction and fuzzy clustering. This pipeline enables interpretable, computationally efficient clustering in settings characterized by rare but extreme joint activity, as encountered in neonatal seizures (Talento et al., 6 Dec 2025).

1. Canonical Tail Dependence: Formal Definitions and Context

The foundational object is the canonical tail dependence (CTD) measure, $\lambda_T$, which quantifies the extremal dependence between two disjoint cortical-region feature vectors, $X\in\mathbb{R}^p$ and $Y\in\mathbb{R}^q$, drawn from a $D$-variate ($D=p+q$) process $Z=(X^\top,Y^\top)^\top$. The model assumes regular variation: $Z$ is regularly varying with tail index $\alpha>0$, typically set to $\alpha=2$ after marginal standardization to produce Pareto-like tails.
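The marginal standardization step can be illustrated with a rank-based transform. This is a minimal sketch, not the source's stated procedure: it assumes the empirical CDF is scaled by $n+1$ and mapped through the Pareto($\alpha$) quantile function $Q(u)=(1-u)^{-1/\alpha}$. The helper name `to_pareto` is hypothetical, and the input values are made-up numbers.

```python
def to_pareto(values, alpha=2.0):
    """Rank-transform one channel's features to approximately Pareto(alpha) margins."""
    n = len(values)
    # Rank of each observation (1 = smallest); ties are broken by position.
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    # Empirical CDF scaled by n + 1 so that it stays strictly below 1,
    # then the Pareto(alpha) quantile function Q(u) = (1 - u) ** (-1 / alpha).
    return [(1.0 - r / (n + 1)) ** (-1.0 / alpha) for r in ranks]


# Made-up band powers for a single channel (illustration only).
powers = [0.3, 12.0, 1.1, 0.7, 45.0]
standardized = to_pareto(powers)
print(standardized)
```

The transform preserves the ordering of the observations and discards their original scale, so every channel ends up on the same heavy-tailed footing before joint extremes are compared.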

For any linear projections $U=\gamma^\top X$ and $V=\beta^\top Y$, where $(\gamma,\beta)$ are arbitrary weight vectors, the extremal dependence measure (EDM) is defined as:

$$\sigma(U,V) := \lim_{r\to\infty} 2\,\mathbb{E}\left[\frac{U}{\|(U,V)\|}\,\frac{V}{\|(U,V)\|}\,\bigg|\,\|(U,V)\|>r\right] \in [0,1].$$

Canonical tail dependence is then the maximal squared EDM over all such projections, subject to variance-type identifiability constraints:

$$\lambda_T := \max_{\gamma\in\mathbb{R}^p,\,\beta\in\mathbb{R}^q}\left[\sigma(\gamma^\top X, \beta^\top Y)\right]^2.$$

An equivalent formulation, using the TPDMs (see Section 2), is:

$$\lambda_T = \max_{\substack{\gamma^\top\Gamma_{XX}\gamma=1\\ \beta^\top\Gamma_{YY}\beta=1}}(\gamma^\top\Gamma_{XY}\beta)^2.$$

$\lambda_T$ is thus the frequency-targeted analog of the squared canonical correlation, extended to model dependence in the joint extreme tails.

2. Tail Pairwise Dependence Matrix: Derivation and Estimation

For a partitioned multichannel signal, the TPDM $\Gamma$ is constructed from the radial-angular decomposition $Z=R\psi$ with $R=\|Z\|$, $\psi=Z/\|Z\|$, and $Z$ standardized as previously described. The $(j,k)$ entry is:

$$\Gamma_{jk} = D\lim_{r\to\infty}\mathbb{E}\left[\psi_j\psi_k \,\big|\, \|Z\|>r\right]=\int_{S^{D-1}}\psi_j\psi_k\,dH_Z(\psi).$$

Empirically, for a collection of $B$ features $\{Z_b\}_{b=1}^B$ and a high threshold $r$ (e.g., the empirical 95th percentile of $\|Z_b\|$), let $c = \sum_{b=1}^B\mathbf{1}_{\{\|Z_b\|>r\}}$ and $\psi_b=Z_b/\|Z_b\|$. The estimator is:

$$\hat{\Gamma}_{jk} = \frac{D}{c}\sum_{b=1}^B\psi_{b,j}\psi_{b,k}\,\mathbf{1}_{\{\|Z_b\|>r\}}.$$

This produces frequency-specific TPDMs when the $Z_b$ are block-periodograms for a given frequency band.
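The estimator above translates almost directly into NumPy. The sketch below is illustrative only: the function name `estimate_tpdm` is hypothetical, the Euclidean norm is assumed for $\|\cdot\|$, and the input is synthetic Pareto($\alpha=2$) noise standing in for standardized block features, not EEG data.

```python
import numpy as np


def estimate_tpdm(Z, quantile=0.95):
    """Empirical TPDM from a (B, D) array of standardized, nonnegative features."""
    Z = np.asarray(Z, dtype=float)
    B, D = Z.shape
    radius = np.linalg.norm(Z, axis=1)        # R_b = ||Z_b|| (Euclidean norm assumed)
    r = np.quantile(radius, quantile)         # high threshold r
    extreme = radius > r                      # indicator 1{||Z_b|| > r}
    c = int(extreme.sum())                    # number of exceedances
    if c == 0:
        raise ValueError("no exceedances above the threshold")
    psi = Z[extreme] / radius[extreme, None]  # angular components on the unit sphere
    return (D / c) * psi.T @ psi              # (D, D) matrix of the Gamma_hat_{jk}


rng = np.random.default_rng(0)
# Synthetic stand-in: independent Pareto(alpha=2) margins, 2000 blocks, 4 channels.
Z = (1.0 - rng.uniform(size=(2000, 4))) ** -0.5
gamma_hat = estimate_tpdm(Z)
print(np.round(gamma_hat, 2))
```

Because each $\psi_b$ has unit Euclidean norm, the trace of the estimate equals $D$ by construction, which makes a convenient sanity check.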

3. Frequency-Domain Feature Extraction and Integration

Each EEG channel $j$ is partitioned into nonoverlapping blocks of length $A$; for block $b$ and frequency $\omega_a$, the local Fourier transform $d_{j,b}(\omega_a)$ and its power $Z_{j,b}(\omega_a)=|d_{j,b}(\omega_a)|^2$ are computed. For standard EEG frequency bands $\Omega$:

$$Z_{j,b}(\Omega)=\frac{1}{|\Omega|}\sum_{\{a:\,\mathrm{SR}\cdot\omega_a\in\Omega\}}Z_{j,b}(\omega_a)$$

Feature vectors $Z_b(\Omega)$ are constructed by stacking over all $D$ channels. Each is marginally standardized to Pareto($\alpha=2$), and the TPDM $\hat{\Gamma}(\Omega)$ is estimated as above, yielding a frequency-resolved quantification of tail dependence.
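A band-averaged block periodogram can be sketched as follows. Several details are assumptions rather than facts from the source: $\mathrm{SR}$ is read as the sampling rate in Hz, $|\Omega|$ as the number of Fourier frequencies inside the band, and no periodogram normalization or tapering is applied. The function name `band_power_features` and the test signal (two pure sinusoids) are hypothetical.

```python
import numpy as np


def band_power_features(eeg, sampling_rate, block_len, band):
    """Band-averaged block periodograms for a (D, T) multichannel signal.

    Returns a (B, D) array: one row per block, one column per channel.
    """
    eeg = np.asarray(eeg, dtype=float)
    D, T = eeg.shape
    B = T // block_len                                 # number of full blocks
    blocks = eeg[:, :B * block_len].reshape(D, B, block_len)
    d = np.fft.rfft(blocks, axis=2)                    # local Fourier transform d_{j,b}
    power = np.abs(d) ** 2                             # Z_{j,b}(omega_a)
    freqs_hz = np.fft.rfftfreq(block_len, d=1.0 / sampling_rate)
    lo, hi = band
    in_band = (freqs_hz > lo) & (freqs_hz <= hi)       # half-open band (lo, hi] in Hz
    if not in_band.any():
        raise ValueError("no Fourier frequencies fall in the band")
    return power[:, :, in_band].mean(axis=2).T         # average over in-band frequencies


# Synthetic check: channel 0 oscillates at 40 Hz, channel 1 at 10 Hz.
sr = 256
t = np.arange(10 * sr) / sr
eeg = np.vstack([np.sin(2 * np.pi * 40 * t), np.sin(2 * np.pi * 10 * t)])
feats = band_power_features(eeg, sr, block_len=256, band=(30.0, 50.0))
print(feats.shape, feats.mean(axis=0))
```

Only the 40 Hz channel carries appreciable power in the $(30,50]$ Hz band, so its column dominates. In a full pipeline each column would then be standardized to Pareto margins before TPDM estimation.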

4. Frequency-Weighted Fuzzy C-Means Soft Clustering

The soft clustering is performed on the absolute tail-topology features $\Lambda_n(\Omega)=[|\gamma^*_n|;|\beta^*_n|]\in\mathbb{R}^D_+$, where $(\gamma^*,\beta^*)$ are obtained as follows:

  • Solve the generalized eigenproblem for $G_1=\Gamma_{XX}^{-1}\Gamma_{XY}\Gamma_{YY}^{-1}\Gamma_{YX}$ (and similarly $G_2$), extracting the eigenvector $u$ associated with the largest eigenvalue; compute $\gamma^* = \Gamma_{XX}^{-1/2}u$.
  • Likewise, derive $\beta^* = \Gamma_{YY}^{-1/2}v$ from the corresponding leading eigenvector $v$ for $G_2$.
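The two steps above can be sketched with a singular value decomposition. This is one possible reading, not necessarily the authors' implementation: it assumes $u$ and $v$ are the leading singular vectors of the whitened cross-block $K=\Gamma_{XX}^{-1/2}\Gamma_{XY}\Gamma_{YY}^{-1/2}$, in which case $\gamma^*=\Gamma_{XX}^{-1/2}u$ is an eigenvector of $G_1$ with eigenvalue $\lambda_T$ and satisfies $\gamma^{*\top}\Gamma_{XX}\gamma^*=1$. The function names and the small example matrix are hypothetical.

```python
import numpy as np


def inv_sqrt(S):
    """Inverse symmetric square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(S)
    return (V / np.sqrt(w)) @ V.T


def canonical_tail_dependence(tpdm, p):
    """Return (lambda_T, gamma_star, beta_star); X = first p channels, Y = the rest."""
    G_xx, G_xy, G_yy = tpdm[:p, :p], tpdm[:p, p:], tpdm[p:, p:]
    Wx, Wy = inv_sqrt(G_xx), inv_sqrt(G_yy)
    # Whitened cross-block; its top singular value squared is lambda_T.
    K = Wx @ G_xy @ Wy
    U, s, Vt = np.linalg.svd(K)
    gamma_star = Wx @ U[:, 0]     # satisfies gamma' G_xx gamma = 1
    beta_star = Wy @ Vt[0]        # satisfies beta' G_yy beta = 1
    return s[0] ** 2, gamma_star, beta_star


# Hypothetical 4-channel TPDM-like matrix (symmetric, positive definite).
tpdm = np.array([
    [1.0, 0.3, 0.4, 0.2],
    [0.3, 1.0, 0.1, 0.3],
    [0.4, 0.1, 1.0, 0.2],
    [0.2, 0.3, 0.2, 1.0],
])
lam, gamma_star, beta_star = canonical_tail_dependence(tpdm, p=2)
# Absolute weights stacked into the clustering feature Lambda_n.
features = np.concatenate([np.abs(gamma_star), np.abs(beta_star)])
print(lam, features)
```

Taking absolute values removes the sign ambiguity of singular vectors, so the feature depends only on how strongly each channel contributes to the maximal tail dependence.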

The fuzzy c-means objective for $S$ clusters (usually $S=2$ for seizure vs non-seizure) and cluster centroids $\{C_s\}_{s=1}^S$ is:

$$J_m(U, C) = \sum_{n=1}^N \sum_{s=1}^S u_{ns}^m\,\|\Lambda_n - C_s\|^2$$

with fuzzy memberships $u_{ns}\in[0,1]$, $\sum_s u_{ns}=1$, and $m>1$. Iterative updates proceed via closed-form membership and centroid updates until convergence.
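The closed-form updates are the standard fuzzy c-means ones: centroids are $u_{ns}^m$-weighted means, and memberships are proportional to $\|\Lambda_n-C_s\|^{-2/(m-1)}$. The sketch below implements that textbook scheme. The random initialization, stopping rule, and synthetic two-blob data are assumptions for illustration, and no frequency weighting is included.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters=2, m=1.2, n_iter=200, tol=1e-8, seed=0):
    """Plain fuzzy c-means; returns memberships U of shape (N, S) and centroids C."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(n_clusters), size=X.shape[0])  # rows sum to 1
    for _ in range(n_iter):
        W = U ** m
        C = (W.T @ X) / W.sum(axis=0)[:, None]               # centroid update
        d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)                           # guard against zero distance
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)         # membership update
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, C


# Synthetic stand-in for 14 feature vectors: two well-separated blobs (6 and 8 points).
rng = np.random.default_rng(1)
X_demo = np.vstack([
    rng.normal(0.0, 0.1, size=(6, 3)),
    rng.normal(3.0, 0.1, size=(8, 3)),
])
U, C = fuzzy_c_means(X_demo, n_clusters=2, m=1.2)
labels = U.argmax(axis=1)   # hard labels from the soft memberships
print(np.round(U, 3), labels)
```

Values of $m$ close to 1 give nearly hard assignments, while larger $m$ spreads membership more evenly across clusters.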

5. Algorithmic Pipeline: Stepwise Overview

A summary workflow is as follows:

| Step | Description | Output |
|---|---|---|
| Preprocessing | EEG artifact cleaning, channel removal | Cleaned EEG data |
| Feature Extraction | Block-periodograms, marginal standardization | $\{Z_b(\Omega)\}$ |
| Tail Dependence Estimation | Threshold selection, TPDM estimation, eigenproblem solution | $\Lambda_n(\Omega)$ |
| Soft Clustering | Fuzzy c-means on $\Lambda_n(\Omega)$, hard/fuzzy label assignment | Cluster assignments, $U$ |
| Model Selection | Iterate over $(\Omega, m)$, select best with respect to expert labels | Optimal $(\Omega^*, m^*)$ |

This pipeline is iteratively applied across EEG frequency bands and fuzziness parameters $m$, evaluating clustering accuracy on expert-labeled seizure/non-seizure ground truth.
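Because cluster indices are arbitrary, scoring against expert labels has to allow for relabeling. The sketch below shows one way the selection over $(\Omega, m)$ could be scored. The helper names, the band keys, and all label vectors are made-up toy values, not results from the study.

```python
from itertools import permutations


def clustering_accuracy(pred, truth, n_clusters=2):
    """Best agreement between cluster labels and expert labels over all relabelings."""
    best = 0.0
    for perm in permutations(range(n_clusters)):
        hits = sum(perm[p] == t for p, t in zip(pred, truth))
        best = max(best, hits / len(truth))
    return best


def select_best(results, truth):
    """results maps (band, m) to hard labels; returns the best key and its accuracy."""
    scored = {key: clustering_accuracy(labels, truth) for key, labels in results.items()}
    best_key = max(scored, key=scored.get)
    return best_key, scored[best_key]


# Toy expert labels and toy clustering outputs (purely illustrative).
truth = [0, 0, 0, 1, 1, 1, 1]
results = {
    ("toy_band_1", 1.2): [0, 1, 0, 1, 1, 0, 1],
    ("toy_band_2", 1.2): [1, 1, 1, 0, 0, 0, 0],  # perfect up to label swap
}
best_key, best_acc = select_best(results, truth)
print(best_key, best_acc)
```

Enumerating permutations is cheap for $S=2$. Note that selecting $(\Omega^*, m^*)$ against the same labels used for evaluation yields an in-sample accuracy.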

6. Discrimination of Seizure vs Non-Seizure Neonates: Empirical Results and Interpretation

On a dataset of $N=14$ neonates (6 non-epileptic, 8 epileptic), CTD-based extremal clustering at the gamma band ($\Omega=(30,50]$ Hz) with $m\approx 1.2$ achieved 100% accuracy, while classical canonical correlation analysis (CCA)-based fuzzy clustering on bulk features did not exceed approximately 80% accuracy. Tomographic visualization of $\Lambda_n$ showed that, for seizure subjects, maximal tail dependence between FrontoTemporal and OcciParietal lobes was driven by right-hemisphere electrodes (e.g., T6, P4, O2), consistent with expert localization.

The fuzzy-membership matrix $U$ sharply separated the clusters ($u_{n,\mathrm{seizure}}\approx 1$ or $0$), with the exception of two non-epileptic subjects that showed mild membership ambiguity due to borderline tail patterns. The approach's focus on joint extremes (rather than mean correlation structure) enabled identification of diagnostically critical, simultaneous high-amplitude events, with the highest discriminatory power in high-frequency gamma bursts.

A plausible implication is that the combination of frequency targeting, tail dependence modeling, and fuzzy clustering is more sensitive than traditional bulk-spectral approaches to the rare, pathologically meaningful network synchronizations that characterize seizure activity.

7. Generalizability and Applications

The described methodology synthesizes frequency-domain signal processing, multivariate extreme value analysis, and soft clustering. It is interpretable and computationally efficient, directly identifying channel-specific drivers of extreme joint activity, and achieves perfect separation of neonatal seizure states in the described case study. The pipeline can be readily adapted for other multichannel, extreme-event applications—such as in climate science or finance—where identifying drivers of extremal dependence is critical (Talento et al., 6 Dec 2025).
