
LookAroundNet EEG Seizure Detection

Updated 16 January 2026
  • The paper demonstrates that incorporating bidirectional temporal context via a transformer significantly enhances EEG seizure detection accuracy.
  • It employs a detailed architecture with patch embeddings, spatial attention, and ensembling of context variants to achieve robust cross-dataset performance.
  • The model maintains low computational requirements with over 10× real-time throughput and minimal memory usage, supporting practical clinical deployment.

LookAroundNet is a transformer-based model for automated EEG seizure detection that explicitly incorporates extended temporal context around events of interest. The architecture leverages both past and future EEG signal to emulate the clinical practice of interpreting events in relation to their surrounding context. Developed and evaluated across diverse datasets reflecting a wide range of clinical conditions, recording modalities, and patient populations, LookAroundNet demonstrates robust event detection, generalization, and computational efficiency compatible with real-world deployment (Sverrisson et al., 9 Jan 2026).

1. Model Architecture

The LookAroundNet architecture processes multichannel EEG signals sampled at 128 Hz and arranged in the standard longitudinal bipolar montage (18 channels). For each input instance, a multichannel window of total duration $T = T_{\rm pre} + T_{\rm tgt} + T_{\rm post}$ seconds is extracted, where $T_{\rm pre}$ and $T_{\rm post}$ are the durations of contextual signal before and after the central target window of duration $T_{\rm tgt}$.

Pre-processing involves bandpass filtering (0.5–64 Hz), notch filtering, reflective padding, and downsampling to 128 Hz.

Input traces for each channel $i$, denoted $x_i(t)$, are segmented into non-overlapping patches of $P = 48$ samples, forming $N = T f_s / P$ patches. Each patch $\mathbf{x}^p_i \in \mathbb{R}^P$ is mapped into an embedding $\mathbf{e}^p_i \in \mathbb{R}^d$ ($d = 96$) via a two-layer convolutional front-end. For each channel, patch embeddings with positional encodings are processed by a stack of $L = 3$ temporal transformer encoder layers (feedforward dimension 384, dropout 0.1, 3 attention heads) to yield contextualized representations $\mathbf{H}_i \in \mathbb{R}^{N \times d}$. Mean pooling over time produces a per-channel summary $\mathbf{h}_i$.

Channel summaries for all $C = 18$ channels are stacked as $\mathbf{H}_{\rm ch} \in \mathbb{R}^{C \times d}$, augmented with learnable channel positional encodings, and passed through a single multi-head spatial attention layer (3 heads, dimension $d$). The output passes through dropout (0.5) and a fully connected layer projecting to logits for the seizure versus non-seizure classes of the central target segment, followed by a softmax for classification.
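The per-channel temporal encoder followed by spatial attention can be sketched in PyTorch as below. This is a hypothetical reconstruction from the description, not the authors' implementation: temporal positional encodings are omitted for brevity, and the mean pooling over channels before the classifier head is an assumption.

```python
import torch
import torch.nn as nn

class LookAroundNetSketch(nn.Module):
    """Illustrative sketch of the described architecture (not the authors' code)."""

    def __init__(self, n_channels=18, patch=48, d=96, n_layers=3, n_heads=3):
        super().__init__()
        # Two-layer convolutional front-end: a 48-sample, stride-48 convolution
        # patchifies each single-channel trace, then a 1x1 conv projects it,
        # giving one d-dimensional embedding per patch.
        self.embed = nn.Sequential(
            nn.Conv1d(1, d, kernel_size=patch, stride=patch),
            nn.GELU(),
            nn.Conv1d(d, d, kernel_size=1),
        )
        layer = nn.TransformerEncoderLayer(
            d_model=d, nhead=n_heads, dim_feedforward=384,
            dropout=0.1, batch_first=True)
        self.temporal = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.chan_pos = nn.Parameter(torch.zeros(n_channels, d))  # channel pos-enc
        self.spatial = nn.MultiheadAttention(d, n_heads, batch_first=True)
        self.head = nn.Sequential(nn.Dropout(0.5), nn.Linear(d, 2))

    def forward(self, x):                        # x: (batch, channels, samples)
        b, c, s = x.shape
        z = self.embed(x.reshape(b * c, 1, s))   # (b*c, d, N patches)
        z = z.transpose(1, 2)                    # (b*c, N, d)
        h = self.temporal(z).mean(dim=1)         # mean-pool over time -> (b*c, d)
        h = h.reshape(b, c, -1) + self.chan_pos  # stacked per-channel summaries
        h, _ = self.spatial(h, h, h)             # single spatial attention layer
        return self.head(h.mean(dim=1))          # logits: seizure vs non-seizure
```

For the symmetric variant ($T = 32 + 16 + 32 = 80$ s at 128 Hz), each channel yields $\lfloor 80 \cdot 128 / 48 \rfloor = 213$ patches.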

2. Temporal Contextualization Mechanism

LookAroundNet encodes surrounding EEG context by concatenating signal before and after the target interval directly at the input. In the core offline (non-causal) mode, the temporal transformer attends bidirectionally across the full concatenated context, without causal attention masks, allowing the model to exploit both anticipatory and retrospective electrographic cues. For real-time or zero-latency online applications, a strict look-behind mode is possible via causal masking, at the cost of discarding the look-ahead segment.

Mathematically, the input for each channel is constructed as

$$\mathbf{x}_i = \left[\,x_i(-T_{\rm pre} f_s : 0),\ x_i(0 : T_{\rm tgt} f_s),\ x_i(T_{\rm tgt} f_s : (T_{\rm tgt} + T_{\rm post}) f_s)\,\right]$$

This construction accommodates varied context window sizes and alignments.
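Under the indexing convention above (time 0 at the start of the target segment), assembling one such window is plain slicing; `build_window` and its defaults are illustrative helpers, not from the paper:

```python
import numpy as np

fs = 128  # sampling rate in Hz

def build_window(trace, start, t_pre=32.0, t_tgt=16.0, t_post=32.0, fs=fs):
    """Slice pre-context, target, and post-context samples around a target
    segment beginning at `start` seconds (illustrative helper)."""
    i0 = int(round((start - t_pre) * fs))
    i1 = int(round((start + t_tgt + t_post) * fs))
    return trace[i0:i1]  # (t_pre + t_tgt + t_post) * fs samples
```

With the symmetric defaults this returns an 80 s window (10,240 samples) per channel.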

3. Training Methodology

The network is optimized using a label-smoothed cross-entropy loss with smoothing $\epsilon = 0.1$, computed as

$$\ell_{\rm CE} = -\sum_{k \in \{0,1\}} \left[(1-\epsilon)\,y_k + \frac{\epsilon}{2}\right] \log \hat p_k$$

where $y_k \in \{0,1\}$ is the one-hot ground-truth seizure label for the central segment and $\hat p_k$ is the model's softmax probability for class $k$.
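The loss above can be written directly as a small numpy function (an illustrative sketch, not the authors' code):

```python
import numpy as np

def smoothed_ce(p_hat, y, eps=0.1):
    """Label-smoothed cross-entropy for two classes: the one-hot target is
    softened to (1 - eps) * y_k + eps / 2 before taking the log-loss."""
    targets = np.full(2, eps / 2)
    targets[y] += 1.0 - eps
    return -np.sum(targets * np.log(p_hat))
```

For example, with $\hat p = (0.2, 0.8)$ and a seizure label ($y = 1$), the loss is about 0.292 rather than the unsmoothed $-\log 0.8 \approx 0.223$.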

Training is performed using AdamW (no learning-rate schedule) at a learning rate of $5 \times 10^{-4}$, batch size 512, for 200 epochs. Each training epoch samples 60,000 examples composed equally of fully seizure, fully non-seizure, and "mixed" (label majority) central segments. No explicit data augmentation is employed beyond randomization of start times.

The training regime leverages extensive clinical and home EEG datasets:

  • TUSZ: 579 subjects (208 with seizures), 2,421 seizures, totaling 37 days 22h of EEG;
  • Kvikna (proprietary): 254 subjects, 1,099 seizures, 589 days 7h (using only 1h sub-segments with ≥1 seizure);
  • Validation: TUSZ 53 subjects (1,081 seizures), Kvikna 36 (243 seizures);
  • Test: TUSZ 43 subjects (469 seizures), Kvikna 49 (514), Siena 14 (47), SeizeIT1 42 (182), including home EEG recordings.

Three independently trained context-variant models are ensembled: (1) look-behind only ($(-64, 0)$ s), (2) look-ahead only ($(0, 64)$ s), and (3) symmetric ($(-32, 32)$ s), each with $T_{\rm tgt} = 16$ s. Final probabilities are averaged.
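The ensembling step is a plain average of the three variants' output probabilities; a toy example with hypothetical per-segment values:

```python
import numpy as np

# Hypothetical seizure probabilities from the three context variants for
# three target segments (illustrative numbers only).
p_behind = np.array([0.9, 0.2, 0.6])   # look-behind (-64, 0) s
p_ahead  = np.array([0.7, 0.1, 0.8])   # look-ahead  (0, 64) s
p_sym    = np.array([0.8, 0.3, 0.7])   # symmetric   (-32, 32) s

# Final ensemble probability: unweighted mean across the three models.
p_ensemble = np.mean([p_behind, p_ahead, p_sym], axis=0)
```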

4. Evaluation Protocol and Empirical Results

Evaluation employs the SzCORE framework with the following metrics:

  • Event-based sensitivity (seizure detection rate):

$${\rm Sens} = \frac{\#\,\text{detected seizures}}{\#\,\text{true seizures}}$$

  • Event-based precision:

$${\rm Prec} = \frac{\#\,\text{correct events}}{\#\,\text{predicted events}}$$

  • Event-based F1:

$$F_1 = \frac{2\,{\rm Prec} \times {\rm Sens}}{{\rm Prec} + {\rm Sens}}$$

  • False predictions per day (FP/day);
  • Sample-based metrics: comparison of binarized (1 Hz) model outputs against ground-truth labels.
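The event-based metrics reduce to ratios over counted events. A sketch (hypothetical helper, assuming `tp` counts detected true seizures and `fp` counts false detections):

```python
def event_metrics(tp, fp, n_true, record_days):
    """Event-based sensitivity, precision, F1, and FP/day from event counts."""
    sens = tp / n_true                    # detected seizures / true seizures
    prec = tp / (tp + fp)                 # correct events / predicted events
    f1 = 2 * prec * sens / (prec + sens)  # harmonic mean of prec and sens
    fp_per_day = fp / record_days
    return sens, prec, f1, fp_per_day
```

For instance, 8 of 10 seizures detected with 2 false detections over 2 days gives sensitivity 0.8, precision 0.8, F1 0.8, and 1 FP/day.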
Test Set    F1 (single)  Sens (single)  FP/day (single)  F1 (ensemble)  Sens (ensemble)  FP/day (ensemble)
TUSZ        72.1%        74.2%          12.7             77.8%          69.2%            3.4
Siena       54.9%        69.1%          3.2              68.5%          62.7%            0.8
SeizeIT1    26.4%        58.5%          1.7              47.0%          49.0%            0.4
Kvikna      21.8%        47.4%          6.5              31.2%          38.4%            2.5

Ablation on context window size showed that symmetric context ($(-32, 32)$ s) outperformed models trained without context ($F_1$ 72.1% vs. 64.4% on TUSZ), with diminishing returns beyond ±32 s ($F_1$ 73.8% at $(-64, 64)$ s), and that the ensemble surpassed all single-window models (e.g., TUSZ $F_1$ 77.8%).

Training set diversity was assessed by combining TUSZ with additional datasets (Siena, SeizeIT1, Kvikna). Augmentation with the large, heterogeneous Kvikna cohort yielded superior cross-dataset sensitivity at comparable FP/day, establishing the impact of training data heterogeneity.

5. Computational Characteristics and Clinical Suitability

LookAroundNet maintains a small model size (0.5M parameters per single model, 1.4M for ensemble) and low inference cost (approximately 2 GFLOPs per second of EEG processed in 2-second sliding steps). Measured inference times are as follows:

Hardware                 1 h EEG inference + pre-processing time (s)
NVIDIA Quadro RTX 5000   5.3
DGX Spark                3.09
Apple M3 Pro             12.7
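From the measured times, the real-time factor (seconds of EEG processed per second of wall clock) follows directly:

```python
# 1 h of EEG = 3600 s of signal; per-hardware times in seconds from the table above.
times_s = {"NVIDIA Quadro RTX 5000": 5.3, "DGX Spark": 3.09, "Apple M3 Pro": 12.7}
realtime_factor = {hw: 3600.0 / t for hw, t in times_s.items()}
```

Every platform clears the 10× real-time bar by a wide margin.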

Real-time throughput exceeds 10× on all hardware tested, including commodity laptop CPUs. The memory footprint is under 10 MB of device RAM.

Latency can be tuned: the look-behind only variant allows immediate online detection (zero look-ahead latency), while symmetric contextualization introduces a detection lag equal to the look-ahead duration (e.g., 32 s for ±32 s context).

Real-world suitability is supported by low observed false alarm rates (2–6 FP/day) at clinically acceptable sensitivities, hardware-agnostic implementation, and a modular design accommodating different deployment scenarios (e.g., ICU alarms versus offline retrospective analysis). These attributes are critical for integration into routine EEG workflow and ambulatory or home-monitoring settings.

6. Contributions and Position in Automatic Seizure Detection

LookAroundNet advances EEG seizure detection by unifying extended bidirectional temporal context with efficient transformer-based modeling. Its explicit context representation aligns with neurologists’ interpretive strategies, while strong, cross-dataset-validated performance and computational feasibility bridge the gap toward practical clinical deployment. Innovations in context-window ensembling and the use of heterogeneous, large-scale EEG data directly address prior limitations in generalization and robustness. The approach demonstrates state-of-the-art efficacy on both controlled clinical and unconstrained ambulatory/home recordings, substantiating the central thesis that temporal context, training diversity, and model ensembling are key for clinically viable EEG event detection (Sverrisson et al., 9 Jan 2026).
