
SEED-IV Dataset for EEG Emotion Recognition

Updated 5 January 2026
  • The SEED-IV dataset is a publicly available corpus designed for EEG-based emotion recognition, featuring four discrete emotions elicited by naturalistic film clips.
  • The dataset employs a standardized preprocessing pipeline with band-pass filtering, ICA, and differential entropy feature extraction to ensure data quality and robust model calibration.
  • Its integration of synchronous EEG and eye-tracking data supports advanced multimodal analysis and has enabled benchmark studies in cross-dataset deep learning applications.

The SEED-IV dataset is a publicly available corpus for EEG-based emotion recognition, situated within the SEED family of datasets. SEED-IV is characterized by four discrete emotion classes elicited by naturalistic film stimuli and captured via high-density 62-channel EEG. Its protocol and data structure are engineered for subject-independent modeling and cross-dataset generalization studies, as exemplified by its pivotal role in recent contrastive learning architectures for affective brain-computer interfaces (Liao et al., 2024).

1. Participant Pool and Recording Protocol

SEED-IV consists of recordings from S = 15 healthy university students (7 male, 8 female), each participating in three separate sessions. Informed consent protocols were followed in accordance with ethical standards. No further demographic stratification is specified beyond the original release [27]. Each session includes 24 trials, and subjects complete a total of 72 trials (24 × 3), where each trial corresponds to viewing an emotionally targeted film clip.

Recording was performed with a 62-channel ESI NeuroScan system using the international 10–20 montage. The initial reference electrode was Cz. The sampling rate is f_s = 200 Hz, yielding 200 samples per channel per second and matching the hardware configuration of SEED and SEED-V for cap type and amplifier bandwidth (Liao et al., 2024).

2. Stimulus Design and Emotion Categories

Emotion induction in SEED-IV is achieved through the presentation of 72 validated, Chinese-language film-scene clips. These clips are selected to maximize ecological validity and evoke specific emotional responses, forming four target categories:

  • Happiness
  • Sadness
  • Fear
  • Neutral

During each session, 24 clips are presented in randomized order, ensuring balanced exposure and statistical power for each emotional class. Only the last 30 s of each trial are retained for analysis, so that emotional states have stabilized after stimulus onset.

3. Data Structure and Feature Extraction

Raw SEED-IV data is organized as a four-dimensional tensor

\mathbf{X} \in \mathbb{R}^{S \times I \times C \times T} = \mathbb{R}^{15 \times 72 \times 62 \times 6000}

where S is subjects, I is trials, C is channels, and T is time points (T = 30 s × 200 Hz = 6000). Each element corresponds to a single EEG sample.
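As a concrete check, the shape arithmetic above can be verified with a few lines of bookkeeping (illustrative only; the actual release is distributed as per-subject session files rather than one monolithic array):

```python
# Shape bookkeeping for the raw SEED-IV tensor described above.
S, I, C = 15, 72, 62   # subjects, trials, channels
T = 30 * 200           # 30 s retained per trial at 200 Hz = 6000 samples

shape = (S, I, C, T)
assert shape == (15, 72, 62, 6000)

# Memory footprint if materialized as float32 (illustrative):
bytes_f32 = S * I * C * T * 4
print(f"{bytes_f32 / 1e9:.2f} GB")  # ~1.61 GB
```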

For downstream machine learning, the CLDTA architecture initially transforms this raw tensor via band-by-band differential entropy (DE) feature extraction. The DE for a signal segment X is given by:

\text{DE}(X) = -\int f(x)\, \log f(x)\, dx = \frac{1}{2}\log(2\pi e \sigma^2)

where f(x) is the probability density and σ² is the variance in the specified frequency band. DE features are computed for five canonical bands: δ (0.1–4 Hz), θ (4–8 Hz), α (8–13 Hz), β (13–31 Hz), γ (31–50 Hz).
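The per-band DE computation can be sketched as follows. The filter design (a fourth-order Butterworth, with the δ lower edge raised to 0.5 Hz for numerical stability) is an implementation choice for illustration, not part of the dataset specification:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 200  # Hz, SEED-IV sampling rate
BANDS = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 31), "gamma": (31, 50)}

def differential_entropy(x):
    # For an approximately Gaussian segment, DE = 0.5 * log(2*pi*e*variance)
    return 0.5 * np.log(2 * np.pi * np.e * np.var(x))

def band_de(signal, fs=FS):
    """Per-band DE features for a single 1-D EEG channel."""
    feats = {}
    for name, (lo, hi) in BANDS.items():
        sos = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
        feats[name] = differential_entropy(sosfiltfilt(sos, signal))
    return feats

rng = np.random.default_rng(0)
x = rng.standard_normal(6000)  # one 30 s channel at 200 Hz
de = band_de(x)                # five DE values, one per band
```

Because each band passes only a fraction of the broadband power, each band-limited DE is smaller than the DE of the unfiltered segment.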

4. Preprocessing Workflow

SEED-IV implements a standardized EEG preprocessing pipeline with EEGLAB:

a. Band-pass filter (0.01–48 Hz) and 50 Hz notch filter
b. Channel rejection for prolonged flat signals (> 5 s), excessive variance (> 4× overall std), or low correlation (< 0.6)
c. Trial (epoch) rejection if windowed variance exceeds 7× channel variance
d. Spherical spatial interpolation for rejected channels
e. ICA decomposition (up to 5 artifact components manually discarded)
f. Re-referencing to average
g. Truncation to last 30 s (6000 samples/trial, all channels)
h. DE feature extraction and temporal smoothing (linear dynamic system model, cf. [12])
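The channel-rejection criteria in step b can be sketched as below. The thresholds, and the use of the median channel std as the "overall" reference, are assumptions for illustration, not the pipeline's exact implementation:

```python
import numpy as np

FS = 200  # Hz

def bad_channels(eeg, fs=FS, flat_s=5.0, var_mult=4.0, corr_min=0.6):
    """eeg: (channels, samples) array; returns indices of channels to reject."""
    n_ch = eeg.shape[0]
    stds = eeg.std(axis=1)
    bad = set()
    # (i) excessive variance, relative to the median channel std
    bad |= {c for c in range(n_ch) if stds[c] > var_mult * np.median(stds)}
    # (ii) prolonged flat signal: no sample-to-sample change for flat_s seconds
    win = int(flat_s * fs)
    for c in range(n_ch):
        d = np.abs(np.diff(eeg[c]))
        if d.size >= win and np.convolve(d, np.ones(win), "valid").min() < 1e-12:
            bad.add(c)
    # (iii) low correlation with the across-channel mean
    # (near-constant channels are skipped; rule ii already catches them)
    mean_sig = eeg.mean(axis=0)
    for c in range(n_ch):
        if stds[c] > 1e-12 and np.corrcoef(eeg[c], mean_sig)[0, 1] < corr_min:
            bad.add(c)
    return sorted(bad)

# Demo: five channels sharing a 10 Hz rhythm, one dead (flat) channel
t = np.arange(30 * FS) / FS
rng = np.random.default_rng(1)
eeg = np.stack([np.sin(2 * np.pi * 10 * t) + 0.05 * rng.standard_normal(t.size)
                for _ in range(5)])
eeg[4] = 0.0
```

On the demo data, only the flat channel is flagged; the healthy channels pass all three criteria.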

All subjects are processed identically, ensuring consistency required for domain adaptation studies.

5. Data Composition and Unique Attributes

SEED-IV situates itself as an intermediate benchmark between SEED (3 emotions) and SEED-V (5 emotions). With four discrete emotional states and a total of 72 trials/subject, it affords robust within-subject and cross-subject statistical modeling while maintaining manageable data scale.

A distinguishing feature is synchronous EEG and eye-tracking acquisition; SEED-IV is the first in the series released with both modalities aligned per trial. While CLDTA uses only the EEG channels, the inclusion of gaze data supports future multimodal emotion modeling. SEED-IV thus offers richer class labels (happiness, sadness, fear, neutral) and a higher trial count per subject than its predecessors.

6. Benchmarking and Application in Cross-Dataset Research

SEED-IV is regularly employed in training and validation of deep learning models designed for cross-dataset domain adaptation, such as CLDTA ("Contrastive Learning based on Diagonal Transformer Autoencoder") (Liao et al., 2024). Its channel-dense, film-stimulus protocol enables robust transfer learning across domain shifts in BCI research.

Researchers leverage SEED-IV for:

  • Subject-independent emotion decoding
  • Model calibration on minimal samples for new subjects
  • Visualization and interpretability of brain network representations (via information separation mechanisms)
  • Band-wise feature ablation and frequency-specific emotional signature studies
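Subject-independent decoding on SEED-IV is typically evaluated with leave-one-subject-out cross-validation over the 15 subjects. A minimal sketch of that split (illustrative convention, not a protocol fixed by the dataset):

```python
S = 15  # SEED-IV subjects

def loso_splits(n_subjects=S):
    """Leave-one-subject-out folds: (train subject ids, held-out subject id)."""
    for test in range(n_subjects):
        yield [s for s in range(n_subjects) if s != test], test

folds = list(loso_splits())
# 15 folds; each trains on 14 subjects and tests on the remaining one
```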

Table: SEED-IV Summary

Property | Value | Notes
--- | --- | ---
Subjects (S) | 15 (7 M / 8 F) | University students
EEG channels (C) | 62 | 10–20 montage, ESI NeuroScan
Sampling rate (f_s) | 200 Hz | Standard for SEED family
Trials per session | 24 | 3 sessions per subject
Total trials per subject | 72 | Last 30 s of each retained
Emotions | Happiness, Sadness, Fear, Neutral | Validated film stimuli
Raw tensor shape | 15 × 72 × 62 × 6000 | Subjects × trials × channels × samples
Eye tracking | Yes | Synchronous with EEG

7. Context and Implications

SEED-IV’s design allows for testing deep-learning approaches targeting universality across acquisition devices, population samples, and stimulus formats. A plausible implication is its key role in the shift from controlled laboratory settings toward more generalizable emotion recognition pipelines.

Its moderate subject pool, granular emotional states, and dense EEG coverage make SEED-IV an essential dataset in modern affective BCI benchmarking, particularly for projects leveraging contrastive, transformer-based, or calibration-driven methods.

SEED-IV anchors the development of transferable models bridging laboratory and real-world emotional state decoding, and provides a robust, well-preprocessed platform for comparative studies across the SEED family and other canonical EEG emotion corpora (Liao et al., 2024).
