Mobile EEG Dataset Overview

Updated 23 January 2026

Mobile EEG datasets are collections of portable EEG recordings capturing brain activity in natural environments to support BCIs, cognitive research, and clinical applications.
They integrate diverse hardware and recording setups with advanced preprocessing pipelines to balance ecological validity with signal quality.
Evaluation metrics such as AUC, SNR, and regression measures are used to benchmark neural and behavioral insights across varied use cases.

Mobile EEG datasets are collections of electroencephalography (EEG) recordings acquired using portable or wearable EEG systems outside of traditional laboratory environments. Such datasets are increasingly pivotal for the development and validation of brain–computer interfaces (BCIs), assessment of cognitive and affective dynamics in real-world contexts, and the creation of clinically actionable insights with affordable hardware. This overview details leading mobile EEG datasets, emphasizing their acquisition protocols, technical architecture, preprocessing pipelines, primary use cases, and evaluation methodologies, with references to recent openly available resources.

1. Hardware Modalities and Recording Setups

Mobile EEG datasets span a spectrum from consumer-grade, single-channel devices to high-density, multi-channel systems adapted for field deployment. Representative configurations include:

Consumer-Grade Headsets: Example: Muse S 2 Headband (Afonso et al., 18 Mar 2025) with five dry electrodes (TP9, TP10, AF7, AF8, Fpz as reference), 256 Hz sampling, 16-bit resolution. Used in scenarios integrating EEG with webcam-based eye tracking for artifact-driven gaze estimation.
Clinical-Grade Wearables: E.g., Scan NuAmps Express with a 64-channel Quik-Cap (10–20 system, 1000 Hz, impedance < 10 kΩ), as in EEG-SVRec (Zhang et al., 2024), enabling fine-grained spatial coverage albeit with wired constraints.
Hybrid Systems: Mobile BCI datasets (Lee et al., 2021) utilize 32-channel scalp EEG, 14-channel around-ear cEEGrid arrays (SMARTING, 500 Hz), plus 4-channel EOG and multi-site inertial measurement units (IMUs) to capture body motion artifacts.
Minimalist Wearables: Single- or three-electrode designs such as NeuroSky MindWave Mobile 2 (Tabib et al., 22 Oct 2025) or UAIS-LAB BIAS v2.0 (Cai et al., 2020). These focus on prefrontal coverage (Fp1, Fpz, Fp2) with dry or passive electrodes, 24-bit resolution, and sampling rates between 250 and 512 Hz.

Synchronization strategies typically employ the Lab Streaming Layer (LSL) protocol for precise temporal alignment of EEG, stimulus, and behavioral streams (e.g., Afonso et al., 18 Mar 2025; Wilroth et al., 21 Jan 2026).
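
Once all streams carry timestamps on a shared clock, temporal alignment reduces to mapping each event time onto the nearest EEG sample. The following is a minimal sketch of that step using NumPy only. It does not call the LSL API, and the function name, sampling rate, and timestamps are illustrative assumptions rather than values from any of the datasets above.

```python
import numpy as np

def nearest_sample_indices(eeg_timestamps, event_timestamps):
    """Map each event time to the index of the closest EEG sample.

    Assumes eeg_timestamps is sorted, has at least two entries, and shares
    a clock with event_timestamps (hypothetical helper, not an LSL call).
    """
    eeg_t = np.asarray(eeg_timestamps, dtype=float)
    ev_t = np.asarray(event_timestamps, dtype=float)
    # Index of the first EEG timestamp that is >= each event time.
    right = np.searchsorted(eeg_t, ev_t, side="left")
    right = np.clip(right, 1, len(eeg_t) - 1)
    left = right - 1
    # Keep whichever neighbouring sample is closer in time.
    choose_left = (ev_t - eeg_t[left]) <= (eeg_t[right] - ev_t)
    return np.where(choose_left, left, right)

# Illustrative data: 4 s of samples at 256 Hz starting at clock time 100.0 s.
fs = 256.0
eeg_timestamps = 100.0 + np.arange(1024) / fs
event_timestamps = np.array([100.5, 101.25, 103.0])
event_samples = nearest_sample_indices(eeg_timestamps, event_timestamps)
```

Working from timestamps rather than from a nominal sampling rate keeps the mapping valid when a wireless headset drops or jitters samples, at the cost of requiring per-sample timestamps in the recording.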

2. Experimental Paradigms and Contextual Diversity

Mobile EEG datasets cover an array of ecological and cognitive tasks:

Attention and Conversation Tracking: Multi-talker audiovisual paradigms with sustained, switching, and conversation attention blocks, as in (Wilroth et al., 21 Jan 2026). Recordings include scalp and cEEGrid signals, behavioral markers (switch cues, question responses), and synchronized speech/eye tracking data.
Eye Movement Tracking: EEG-based eye-tracking via hybrid (EEG + webcam) artifacts during controlled smooth pursuit and saccadic gaze tasks (Afonso et al., 18 Mar 2025).
Affective and Recommendation Contexts: Short-video browsing, self-directed content engagement, and multidimensional affect labeling (valence, arousal, immersion, interest, visual/auditory quality) alongside behavioral events such as likes and swipes (Zhang et al., 2024).
Motor and Locomotion Studies: Standing, walking, and running on treadmill setups with ERP/SSVEP paradigms under variable movement speed for BCI benchmarking (Lee et al., 2021).
Clinical and Pathological Cohorts: Resting and active states for epilepsy patient stratification (Tabib et al., 22 Oct 2025) and depression diagnosis (resting-state only; Cai et al., 2020).

Trial structures, session lengths, and block organization are dataset-specific, with total experimental times per participant ranging from minutes (consumer-grade, clinical) to multiple hours (high-density, affective/cognitive studies).

3. Data Formats, Organization, and Access

File formats and access mechanisms vary by dataset:

Dataset	Primary File Formats	Access Mechanism
(Afonso et al., 18 Mar 2025)	CSV (denormalized, per-paradigm), XDF	Zenodo, GitHub, documented loader
(Lee et al., 2021)	BrainVision (.vhdr/.vmrk/.dat), CSV, MAT	OSF, BIDS-compatible
(Zhang et al., 2024)	BIOS/CNT (EEG), CSV (behavior, labels)	Project repository
(Tabib et al., 22 Oct 2025)	CSV (band-power vectors, metadata)	GitHub, PhysioNet (planned)
(Cai et al., 2020)	TXT (raw signals), XLSX (metadata)	University portal, EULA required
(Wilroth et al., 21 Jan 2026)	XDF, BrainVision, EDF, MAT, NPY	Institutional/OpenNeuro (planned)

Most datasets provide both raw and preprocessed signals, behavioral logs, and supplementary metadata (channel locations, participant demographics, event codes). Preprocessing code is typically available, with detailed documentation of variable names, train/test splits, and auxiliary feature descriptors.

4. Signal Preprocessing and Quality Control

Preprocessing protocols are systematically documented to enable reproducibility:

Filtering and Artifact Handling: Standard procedures include notch filters for line noise (50/60 Hz), band-pass filters (e.g., 0.5–40 Hz or 1–45 Hz), and artifact correction (amplitude thresholding, or EOG-derived adaptive noise canceling as in Cai et al., 2020). ICA-based artifact removal and re-referencing (average, bipolar) are used in multi-channel datasets (Lee et al., 2021, Wilroth et al., 21 Jan 2026).
Missing Value Imputation: For consumer-grade datasets, missing values (long sequences of zeros) are interpolated using state-space Kalman smoothers with SARIMA models (Afonso et al., 18 Mar 2025).
Segmentation: Data are segmented into epochs (variable window sizes: 1–8 s) appropriate for downstream analysis.
Normalization: Channel-wise z-scoring to counter inter-subject and inter-channel variability (Wilroth et al., 21 Jan 2026, Cai et al., 2020).
Feature Extraction: Band power estimation (e.g., Welch’s method), differential entropy, and higher-order spectral/temporal features for downstream modeling (Zhang et al., 2024, Tabib et al., 22 Oct 2025).
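
The normalization, segmentation, and band-power steps listed above can be sketched in a few lines of NumPy. This is an illustrative outline under stated assumptions, not the pipeline of any cited dataset: the function names, the 2 s epoch length, the band edges, and the synthetic 10 Hz test signal are all hypothetical. The spectral estimate is a simplified Welch-style average (symmetric Hann window, 50% overlap, even segment length).

```python
import numpy as np

def zscore_channels(data):
    """Z-score each channel (row) of a (n_channels, n_samples) array."""
    mean = data.mean(axis=1, keepdims=True)
    std = data.std(axis=1, keepdims=True)
    std[std == 0] = 1.0  # flat channels stay at zero instead of dividing by zero
    return (data - mean) / std

def segment_epochs(data, fs, epoch_s):
    """Cut continuous data into non-overlapping epochs.

    Returns an array of shape (n_epochs, n_channels, n_epoch_samples);
    trailing samples that do not fill a whole epoch are dropped.
    """
    n = int(round(fs * epoch_s))
    n_epochs = data.shape[1] // n
    trimmed = data[:, : n_epochs * n]
    return trimmed.reshape(data.shape[0], n_epochs, n).transpose(1, 0, 2)

def welch_psd(x, fs, nperseg):
    """Welch-style one-sided PSD of a 1-D signal (nperseg assumed even)."""
    step = nperseg // 2
    window = np.hanning(nperseg)
    scale = fs * np.sum(window ** 2)
    psds = []
    for start in range(0, len(x) - nperseg + 1, step):
        seg = x[start : start + nperseg]
        seg = (seg - seg.mean()) * window  # remove DC offset, then taper
        psds.append(np.abs(np.fft.rfft(seg)) ** 2 / scale)
    psd = np.mean(psds, axis=0)
    psd[1:-1] *= 2  # fold negative frequencies into the one-sided spectrum
    freqs = np.fft.rfftfreq(nperseg, d=1.0 / fs)
    return freqs, psd

def band_power(freqs, psd, low, high):
    """Integrate the PSD over [low, high) Hz with a rectangle rule."""
    mask = (freqs >= low) & (freqs < high)
    return float(np.sum(psd[mask]) * (freqs[1] - freqs[0]))

# Synthetic two-channel recording: a 10 Hz rhythm plus white noise.
rng = np.random.default_rng(0)
fs = 256.0
t = np.arange(int(8 * fs)) / fs
raw = np.vstack([np.sin(2 * np.pi * 10 * t)] * 2)
raw = raw + 0.5 * rng.standard_normal(raw.shape)

epochs = segment_epochs(zscore_channels(raw), fs, epoch_s=2.0)
freqs, psd = welch_psd(epochs[0, 0], fs, nperseg=256)
alpha = band_power(freqs, psd, 8.0, 13.0)
beta = band_power(freqs, psd, 13.0, 30.0)
```

In practice a tested routine such as `scipy.signal.welch` would normally replace the hand-written estimator; it is written out here only to make the averaging step explicit.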

5. Benchmarking Methodologies and Validation Metrics

Evaluation strategies and performance metrics differ with application:

BCI Performance: Area Under Curve (AUC) for ERP classification, SSVEP accuracy, and SNR measures (signal-to-baseline or target/neighbor band ratios) as primary quality checks (Lee et al., 2021). SNR in decibels is computed as $\mathrm{SNR}_{\mathrm{dB}} = 10 \log_{10}(P_{\mathrm{signal}}/P_{\mathrm{noise}})$.
Regression/Prediction Tasks: MSE (Mean Squared Error) and correlation metrics are encouraged for regression-based eye-tracking (Afonso et al., 18 Mar 2025). Models such as FM, DeepFM, Wide&Deep are used for affective label and rating prediction, with MSE and AUC as primary loss/validation functions (Zhang et al., 2024).
Patient Stratification: Accuracy and silhouette scores for clustering patients by EEG feature embeddings relative to clinical ground-truth labels, notably “seizure frequency change” for epilepsy (Tabib et al., 22 Oct 2025).
Attention Decoding: Attention reconstruction accuracy (percentage correct per decision window), correlation between predicted and actual attended speech envelope, and latency/amplitude measures of TRF components (e.g., P2-peak) (Wilroth et al., 21 Jan 2026).
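
As a worked illustration of the decibel SNR formula in its target/neighbor form, the sketch below compares spectral power at a stimulation frequency with the mean power of nearby bins. The neighbor count, the choice to skip the bins immediately adjacent to the target, and the synthetic 12 Hz signal are assumptions made for the example and are not taken from Lee et al. (2021).

```python
import numpy as np

def snr_db(p_signal, p_noise):
    """SNR in decibels: 10 * log10(P_signal / P_noise)."""
    return 10.0 * np.log10(p_signal / p_noise)

def target_neighbor_snr_db(x, fs, target_hz, n_neighbors=4):
    """SNR at target_hz relative to the mean power of neighbouring bins.

    n_neighbors bins are taken on each side; the two bins directly next to
    the target are skipped because they carry window leakage (assumption).
    """
    spectrum = np.abs(np.fft.rfft(x * np.hanning(len(x)))) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    k = int(np.argmin(np.abs(freqs - target_hz)))
    lo = np.arange(k - n_neighbors - 1, k - 1)
    hi = np.arange(k + 2, k + n_neighbors + 2)
    neighbors = np.concatenate([lo, hi])
    neighbors = neighbors[(neighbors >= 0) & (neighbors < len(spectrum))]
    return float(snr_db(spectrum[k], spectrum[neighbors].mean()))

# Synthetic 4 s trial at 500 Hz: a 12 Hz response buried in white noise.
rng = np.random.default_rng(1)
fs = 500.0
t = np.arange(2000) / fs
trial = np.sin(2 * np.pi * 12.0 * t) + rng.standard_normal(t.size)
ssvep_snr = target_neighbor_snr_db(trial, fs, target_hz=12.0)
```

A power ratio of 100 corresponds to 20 dB under this definition, which gives a quick sanity check when comparing values across recording conditions.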

Some datasets do not publish explicit regression or classification baselines; researchers are expected to define and report these using standard statistical criteria.
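
Where no baseline is published, the metrics named above are simple enough to compute directly. The sketch below gives minimal NumPy versions of MSE, Pearson correlation, and a rank-based AUC; the function names and the toy inputs are illustrative, and an established library implementation would usually be preferred for reporting.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between two 1-D sequences."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])

def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half. Requires both classes present."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    diff = pos[:, None] - neg[None, :]  # all positive/negative score pairs
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (len(pos) * len(neg)))

# Toy inputs for illustration only.
example_mse = mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
example_r = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
example_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise AUC is quadratic in the number of trials, which is acceptable for typical per-subject trial counts but not for very large pooled sets.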

6. Applications and Unique Features

Mobile EEG datasets have catalyzed progress in several domains:

Robust Brain–Computer Interfaces: Enable comparative evaluation of scalp vs. ear EEG, artifact-resilience analysis under real-world motion, and translational deployment in spelling, device control, and assistive robotics (Lee et al., 2021, Wilroth et al., 21 Jan 2026).
Low-Cost, Accessible Clinical Screening: Indicate that consumer-grade, single-channel data, despite limited spatial coverage, can support basic patient stratification and community health monitoring (Tabib et al., 22 Oct 2025).
Mobile Attention and Affective Research: Provide insights into neural mechanisms of attention switching, conversational engagement, and affect-informed services (video recommendation) under ecologically valid, multisensory conditions (Zhang et al., 2024, Wilroth et al., 21 Jan 2026).
Benchmarking and Fairness Audits: Openly accessible datasets with extensive contextual metadata support the evaluation of algorithmic bias, performance under diverse subject and environmental conditions, and integration of brain data with behavioral profiles (Tabib et al., 22 Oct 2025, Afonso et al., 18 Mar 2025).

7. Limitations, Best Practices, and Future Perspectives

Mobile EEG datasets are constrained by hardware characteristics (sampling rate, electrode density), artifact susceptibility, and, at times, modest participant counts:

Consumer-grade systems typically lack the spatial and spectral fidelity required for high-resolution mapping or diagnostic use (e.g., Tabib et al., 22 Oct 2025; Afonso et al., 18 Mar 2025).
Motion and muscle artifacts are pervasive, necessitating concurrent IMU data or stringent artifact rejection protocols.
Short recording durations may limit the detection of sporadic neurophysiological events (Tabib et al., 22 Oct 2025).
Ecological validity vs. signal quality trade-offs must be navigated, especially in naturalistic or self-directed paradigms.
Ethical constraints around privacy and data use agreements are strictly enforced, with comprehensive de-identification of subject metadata.

Researchers are advised to complement mobile EEG with auxiliary modalities (IMU, EOG, behavioral logs), maintain strict preprocessing pipelines, perform fairness and subgroup analysis, and adhere to relevant licensing and citation requirements. A plausible implication is that the combination of open-access resources, extensive contextual annotation, and community-driven methods can enable broader, interdisciplinary adoption and innovation.

References:

"Consumer-grade EEG-based Eye Tracking" (Afonso et al., 18 Mar 2025)
"Mobile BCI dataset of scalp- and ear-EEGs with ERP and SSVEP paradigms while standing, walking, and running" (Lee et al., 2021)
"EEG-SVRec: An EEG Dataset with User Multidimensional Affective Engagement Labels in Short Video Recommendation" (Zhang et al., 2024)
"MODMA dataset: a Multi-modal Open Dataset for Mental-disorder Analysis" (Cai et al., 2020)
"Affordable EEG, Actionable Insights: An Open Dataset and Evaluation Framework for Epilepsy Patient Stratification" (Tabib et al., 22 Oct 2025)
"Neural Tracking of Sustained Attention, Attention Switching, and Natural Conversation in Audiovisual Environments using Mobile EEG" (Wilroth et al., 21 Jan 2026)