SEED-IV Dataset for EEG Emotion Recognition
- The SEED-IV dataset is a publicly available corpus designed for EEG-based emotion recognition, featuring four discrete emotions elicited by naturalistic film clips.
- The dataset employs a standardized preprocessing pipeline with band-pass filtering, ICA, and differential entropy feature extraction to ensure data quality and robust model calibration.
- Its integration of synchronous EEG and eye-tracking data supports advanced multimodal analysis and has enabled benchmark studies in cross-dataset deep learning applications.
The SEED-IV dataset is a publicly available corpus for EEG-based emotion recognition, situated within the SEED family of datasets. SEED-IV is characterized by four discrete emotion classes elicited by naturalistic film stimuli and captured via high-density 62-channel EEG. Its protocol and data structure are engineered for subject-independent modeling and cross-dataset generalization studies, as exemplified by its pivotal role in recent contrastive learning architectures for affective brain-computer interfaces (Liao et al., 2024).
1. Participant Pool and Recording Protocol
SEED-IV consists of recordings from 15 healthy university students (7 male, 8 female), each participating in three separate sessions. Informed consent protocols were followed in accordance with ethical standards. No further demographic stratification is specified beyond the original release [27]. Each session includes 24 trials, so each subject completes a total of 72 trials ($3 \times 24 = 72$), where each trial corresponds to viewing an emotionally targeted film clip.
Recording was performed with a 62-channel ESI NeuroScan system using the international 10–20 montage, with Cz as the initial reference electrode. The sampling rate is 200 Hz, yielding 200 samples per channel per second and matching the hardware configuration of SEED and SEED-V for cap type and amplifier bandwidth (Liao et al., 2024).
2. Stimulus Design and Emotion Categories
Emotion induction in SEED-IV is achieved through the presentation of 72 validated, Chinese-language film-scene clips. These clips are selected to maximize ecological validity and evoke specific emotional responses, forming four target categories:
- Joy
- Sorrow
- Neutrality
- Anxiety
During each session, 24 clips are presented in randomized order, ensuring balanced exposure and statistical power for each emotional class. Only the last 30 s of each trial are retained for analysis, allowing emotional states to stabilize after stimulus onset.
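The 30 s retention rule amounts to a simple slice along the time axis of each trial. A minimal NumPy sketch on synthetic data (the 2-minute clip length here is illustrative; actual clip durations vary):

```python
import numpy as np

FS = 200           # Hz, SEED-IV sampling rate
KEEP_SECONDS = 30  # only the final 30 s of each trial are analyzed

# Illustrative trial: 62 channels x 2 min of signal (clip lengths vary).
rng = np.random.default_rng(0)
trial = rng.standard_normal((62, 120 * FS))

# Keep only the last 30 s along the time axis.
trial_last30 = trial[:, -KEEP_SECONDS * FS:]
print(trial_last30.shape)  # (62, 6000)
```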
3. Data Structure and Feature Extraction
Raw SEED-IV data is organized as a four-dimensional tensor

$$X \in \mathbb{R}^{S \times T \times C \times N},$$

where $S = 15$ is the number of subjects, $T = 72$ is the number of trials, $C = 62$ is the number of channels, and $N = 6000$ is the number of time points per trial ($30\,\mathrm{s} \times 200\,\mathrm{Hz}$). Each element corresponds to a single EEG sample.
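The tensor layout above can be sketched directly in NumPy; the zero-filled array below is a placeholder standing in for the real recordings (dimension values taken from the dataset description):

```python
import numpy as np

# Dimensions from the text: S subjects, T trials, C channels, N time points.
S, T, C, N = 15, 72, 62, 6000  # 6000 = 30 s x 200 Hz

# Placeholder tensor standing in for the real recordings.
X = np.zeros((S, T, C, N), dtype=np.float32)

# One trial for one subject is a (channels x time) matrix.
trial = X[0, 0]
print(X.shape)      # (15, 72, 62, 6000)
print(trial.shape)  # (62, 6000)
```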
For downstream machine learning, the CLDTA architecture initially transforms this raw tensor via band-by-band differential entropy (DE) feature extraction. For a signal segment assumed to follow a Gaussian distribution, the DE is given by:

$$h(X) = -\int_{-\infty}^{\infty} p(x) \log p(x)\, dx = \frac{1}{2}\log\left(2\pi e \sigma^2\right),$$

where $p(x)$ is the probability density and $\sigma^2$ is the variance in the specified frequency band. DE features are computed for five canonical bands: $\delta$ (0.1–4 Hz), $\theta$ (4–8 Hz), $\alpha$ (8–13 Hz), $\beta$ (13–31 Hz), $\gamma$ (31–50 Hz).
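Under the Gaussian assumption, band-wise DE reduces to the closed form above: filter into the band, then take half the log of $2\pi e$ times the variance. A minimal Python sketch with SciPy (the function name `band_de` and the filter order are illustrative assumptions, not the CLDTA implementation):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 200  # Hz, SEED-IV sampling rate

# Canonical band edges from the text (delta through gamma).
BANDS = {"delta": (0.1, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 31), "gamma": (31, 50)}

def band_de(signal, low, high, fs=FS):
    """Differential entropy of one channel in one band, under the
    Gaussian assumption: 0.5 * ln(2 * pi * e * variance)."""
    sos = butter(4, [low, high], btype="bandpass", fs=fs, output="sos")
    filtered = sosfiltfilt(sos, signal)
    return 0.5 * np.log(2 * np.pi * np.e * np.var(filtered))

# Example on synthetic data: one 30 s channel yields 5 DE features.
rng = np.random.default_rng(0)
x = rng.standard_normal(30 * FS)
features = {name: band_de(x, lo, hi) for name, (lo, hi) in BANDS.items()}
```

Applied per channel and per band, this turns each 62-channel trial into a 62 × 5 DE feature matrix.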
4. Preprocessing Workflow
SEED-IV implements a standardized EEG preprocessing pipeline with EEGLAB:
a. Band-pass filter (0.01–48 Hz) and 50 Hz notch filter
b. Channel rejection for prolonged flat signals (5 s), excessive variance (4× overall std), or low correlation (0.6)
c. Trial (epoch) rejection if windowed variance exceeds channel variance
d. Spherical spatial interpolation for rejected channels
e. ICA decomposition (up to 5 artifact components manually discarded)
f. Re-referencing to average
g. Truncation to last 30 s (6000 samples/trial, all channels)
h. DE feature extraction and temporal smoothing (linear dynamic system model, cf. [12])
All subjects are processed identically, ensuring consistency required for domain adaptation studies.
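The filtering in step (a) can be sketched in Python with SciPy; this is an illustrative approximation (the original pipeline used EEGLAB, and the function name `basic_filters`, the filter order, and the notch quality factor are assumptions):

```python
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch, sosfiltfilt

FS = 200  # Hz, SEED-IV sampling rate

def basic_filters(eeg, fs=FS):
    """Step (a) of the pipeline: 0.01-48 Hz band-pass plus 50 Hz notch.
    `eeg` is a (channels x samples) array; filtering is along time."""
    # Band-pass: second-order sections keep the very low cutoff stable.
    sos = butter(4, [0.01, 48], btype="bandpass", fs=fs, output="sos")
    out = sosfiltfilt(sos, eeg, axis=-1)
    # Notch at the 50 Hz mains frequency (quality factor 30, an assumption).
    b, a = iirnotch(50, Q=30, fs=fs)
    return filtfilt(b, a, out, axis=-1)

# Example: filter one synthetic 30 s trial, shape preserved.
rng = np.random.default_rng(1)
eeg = rng.standard_normal((62, 6000))
clean = basic_filters(eeg)
print(clean.shape)  # (62, 6000)
```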
5. Data Composition and Unique Attributes
SEED-IV situates itself as an intermediate benchmark between SEED (3 emotions) and SEED-V (5 emotions). With four discrete emotional states and a total of 72 trials/subject, it affords robust within-subject and cross-subject statistical modeling while maintaining manageable data scale.
A distinguishing feature is synchronous EEG and eye-tracking acquisition; SEED-IV is the first in the series to be released with both modalities aligned per trial. While CLDTA uses only EEG channels, the inclusion of gaze data supports future multimodal emotion modeling. SEED-IV thus offers richer class labels (joy, sorrow, neutral, anxiety) and a higher trial count per subject compared to its predecessors.
6. Benchmarking and Application in Cross-Dataset Research
SEED-IV is regularly employed in training and validation of deep learning models designed for cross-dataset domain adaptation, such as CLDTA ("Contrastive Learning based on Diagonal Transformer Autoencoder") (Liao et al., 2024). Its channel-dense, film-stimulus protocol enables robust transfer learning across domain shifts in BCI research.
Researchers leverage SEED-IV for:
- Subject-independent emotion decoding
- Model calibration on minimal samples for new subjects
- Visualization and interpretability of brain network representations (via information separation mechanisms)
- Band-wise feature ablation and frequency-specific emotional signature studies
Table: SEED-IV Summary
| Property | Value | Notes |
|---|---|---|
| Subjects ($S$) | 15 (7 M / 8 F) | University students |
| EEG Channels ($C$) | 62 | 10–20 montage, ESI NeuroScan |
| Sampling Rate ($f_s$) | 200 Hz | Standard for SEED family |
| Trials/Session | 24 | 3 sessions per subject |
| Total Trials ($T$) | 72 | Last 30 s of each retained |
| Emotions | Joy, Sorrow, Neutral, Anxiety | Validated film stimuli |
| Data Tensor Shape | $15 \times 72 \times 62 \times 6000$ | Subjects × trials × channels × time points |
| Eye Tracking | Yes | Synchronous with EEG |
7. Context and Implications
SEED-IV’s design allows for testing deep-learning approaches targeting universality across acquisition devices, population samples, and stimulus formats, and it has plausibly played a key role in the shift from controlled laboratory settings toward more generalizable emotion recognition pipelines.
Its moderate subject pool, granular emotional states, and dense EEG coverage make SEED-IV an essential dataset in modern affective BCI benchmarking, particularly for projects leveraging contrastive, transformer-based, or calibration-driven methods.
SEED-IV anchors the development of transferable models bridging laboratory and real-world emotional state decoding, and provides a robust, well-preprocessed platform for comparative studies across the SEED family and other canonical EEG emotion corpora (Liao et al., 2024).