HICEM-15: Cross-Cultural Emotion Model
- HICEM-15 is a data-driven emotion model featuring 15 semantically orthogonal labels derived from unsupervised word embedding clustering.
- The methodology combines FastText embeddings, UMAP dimensionality reduction, and agglomerative clustering to capture cross-lingual affective semantics.
- Validation with large-scale human annotation datasets demonstrates HICEM-15’s high semantic coverage and efficient information recovery.
HICEM-15 is a data-driven, high-coverage model of discrete human emotions, optimized for cross-cultural artificial emotional intelligence (AEI) systems via unsupervised analysis of word embeddings. Designed to maximize semantic coverage while minimizing category count, HICEM-15 defines a set of 15 semantically orthogonal emotion labels, systematically validated for alignment across six major world languages and for fidelity to human-perceived affective distinctions in real-world affect annotation tasks (Wortman et al., 2022).
1. Model Definition and Taxonomy
HICEM-15 (HIgh-Coverage EMotion model, 15 components) is constructed to provide a minimal yet comprehensive basis for the annotation and computational modeling of human emotions. Its 15 summary categories—Neutral, Happiness, Sadness, Anger, Fear, Surprise, Pain, Pleasure, Annoyance, Confusion, Doubt, Discomfort, Awe, Enjoyment, and Bizarre—were determined by agglomerative clustering in word embedding space, with cross-lingual alignment ensuring that each concept is realized in Arabic, Chinese, English, French, Spanish, and Russian. Table 1 presents the full cross-lingual mapping:
| English | Arabic | Chinese | French | Spanish | Russian |
|---|---|---|---|---|---|
| Neutral | حيادي | 中性 | Neutre | Neutral | Нейтрально |
| Happiness | سعادة | 幸福 | Joie | Felicidad | Счастье |
| Sadness | حزن | 悲伤 | Tristesse | Tristeza | Печаль |
| Anger | غضب | 愤怒 | Colère | Ira | Гнев |
| Fear | خوف | 害怕 | Peur | Miedo | Страх |
| Surprise | دهشة | 惊讶 | Surprise | Sorpresa | Удивление |
| Pain | ألم | 疼痛 | Douleur | Dolor | Боль |
| Pleasure | متعة | 愉悦 | Plaisir | Placer | Удовольствие |
| Annoyance | إزعاج | 烦恼 | Agacement | Fastidio | Раздражение |
| Confusion | ارتباك | 困惑 | Confusion | Confusión | Замешательство |
| Doubt | شك | 怀疑 | Doute | Duda | Сомнение |
| Discomfort | انزعاج | 不适 | Malaise | Malestar | Дискомфорт |
| Awe | رهبة | 敬畏 | Émerveillement | Asombro | Трепет |
| Enjoyment | استمتاع | 享受 | Bonheur | Disfrute | Наслаждение |
| Bizarre | غريب | 奇怪 | Bizarre | Extraño | Странный |
These labels are not intended to represent a theory of emotion but function as empirically grounded semantic centroids in affective space (Wortman et al., 2022).
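For annotation tooling, the label set can be represented as a simple ordered structure. A minimal sketch (the ordering and the integer ids are illustrative conveniences, not part of the model):

```python
# The 15 HICEM-15 summary categories (English terms from Table 1).
HICEM_15_LABELS = (
    "Neutral", "Happiness", "Sadness", "Anger", "Fear",
    "Surprise", "Pain", "Pleasure", "Annoyance", "Confusion",
    "Doubt", "Discomfort", "Awe", "Enjoyment", "Bizarre",
)

# A convenience index for encoding annotations as integer ids.
LABEL_TO_ID = {label: i for i, label in enumerate(HICEM_15_LABELS)}
```

The per-language terms in Table 1 can be attached as parallel mappings keyed by the same English labels.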
2. Construction Pipeline: Embedding, Dimensionality Reduction, and Clustering
The HICEM-15 set is derived via the following unsupervised pipeline:
- Emotion-Concept List Construction: Begins with 1,720 English emotion-concept words (from sources such as Ekman, Plutchik, and online affective lexicons). Expansion uses pre-trained Word2Vec to identify semantic neighbors, followed by manual pruning to exclude adverbs, non-affective terms, and archaisms.
- Embedding and UMAP Reduction: All words are embedded with FastText (300d, trained on CommonCrawl and Wikipedia). The UMAP (Uniform Manifold Approximation and Projection) algorithm reduces the space to two dimensions using a cosine metric, with neighborhood-size and minimum-distance settings chosen to preserve semantic-similarity and antonymy distinctions.
- Agglomerative Clustering: Ward-linkage clustering in UMAP space identifies semantically compact clusters, with the distortion "elbow" method applied to select the number of clusters k. Individual-language results consistently indicate k = 15 as optimal.
- Cross-Lingual Aggregation: The entire process is repeated post-translation in each target language, followed by global reclustering of the top centroids (from the top-50 summary words per language) to k = 15. The resultant centroids yield the final HICEM-15 summary terms in each language (Wortman et al., 2022).
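The clustering stage of the pipeline above can be sketched as follows. This is a simplified illustration: it uses centroid linkage for brevity rather than the paper's Ward linkage, and random points stand in for UMAP-reduced FastText coordinates.

```python
import numpy as np

def agglomerative(points, k):
    """Repeatedly merge the two clusters with closest centroids until k remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        cents = [points[c].mean(axis=0) for c in clusters]
        best, pair = np.inf, (0, 1)
        for a in range(len(cents)):
            for b in range(a + 1, len(cents)):
                d = np.linalg.norm(cents[a] - cents[b])
                if d < best:
                    best, pair = d, (a, b)
        a, b = pair
        clusters[a] = clusters[a] + clusters[b]  # merge b into a
        del clusters[b]
    return clusters

rng = np.random.default_rng(0)
pts = rng.normal(size=(60, 2))        # stand-in for UMAP-reduced embeddings
clusters = agglomerative(pts, k=15)   # 15 clusters of word indices
```

In practice the cluster members would be emotion words, and each cluster's summary term is the word nearest its centroid.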
3. Semantic Coverage and Information Recovery Metrics
Model evaluation emphasizes two primary quantitative metrics:
- Average Coverage (AvgCov): For a model label set $L$ and the set $C$ of all embedded emotion concepts, the average cosine similarity of each concept to its nearest label is computed:

  $$\mathrm{AvgCov}(L) = \frac{1}{|C|} \sum_{c \in C} \max_{\ell \in L} \cos\big(e(c), e(\ell)\big)$$

  where $e(\cdot)$ denotes the FastText embedding.
Aggregated across the six languages, HICEM-15 achieves a total AvgCov of 0.416, comparable to the Cowen model (27 labels) and GoEmotions (28 labels) but with roughly half as many categories.
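The AvgCov computation can be sketched directly; the random vectors here are stand-ins for FastText embeddings of labels and concepts:

```python
import numpy as np

def avg_cov(label_vecs, concept_vecs):
    """Mean cosine similarity of each concept to its nearest label."""
    L = label_vecs / np.linalg.norm(label_vecs, axis=1, keepdims=True)
    C = concept_vecs / np.linalg.norm(concept_vecs, axis=1, keepdims=True)
    sims = C @ L.T                  # cosine-similarity matrix, |C| x |L|
    return sims.max(axis=1).mean()  # nearest-label similarity, averaged

rng = np.random.default_rng(1)
labels = rng.normal(size=(15, 300))     # stand-ins for 15 label embeddings
concepts = rng.normal(size=(500, 300))  # stand-ins for the concept lexicon
score = avg_cov(labels, concepts)
```

If every concept were itself a label, the score would be exactly 1; coverage measures how close the finite label set comes to that ideal.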
- Recoverable Information (AvgRec): Measures fidelity to human annotation by simulating an oracle that projects ground-truth instance embeddings $e_i$ onto the HICEM-15 label directions (yielding projections $p_i$), then reconstructs the original embedding via a learned regression function $f$. The metric is the average cosine similarity between ground-truth and reconstructed vectors:

  $$\mathrm{AvgRec} = \frac{1}{N} \sum_{i=1}^{N} \cos\big(e_i, f(p_i)\big)$$
HICEM-15 yields a total AvgRec of 0.464, outperforming random 15-category selections and approaching models with notably more labels. Plutchik-32 achieves the highest values, but at the cost of increased annotation complexity (see Table below).
| Model | # Labels | AvgCov (total) | AvgRec (total) |
|---|---|---|---|
| Ekman | 7 | 0.314 | 0.327 |
| Plutchik | 32 | 0.428 | 0.552 |
| HICEM-15 | 15 | 0.416 | 0.464 |
| HICEM-25 | 25 | 0.444 | 0.528 |
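A hedged sketch of the AvgRec oracle: project instance embeddings onto label directions, fit a ridge regression back to the original space, and score reconstruction by cosine similarity. All data here is synthetic, and the regularization strength `alpha` is an illustrative default, not the paper's setting.

```python
import numpy as np

def avg_rec(label_vecs, instance_vecs, alpha=1.0):
    """Project instances onto labels, reconstruct via ridge, score by cosine."""
    P = instance_vecs @ label_vecs.T                 # projections, N x k
    # Ridge regression: W = (P'P + alpha*I)^-1 P'E
    W = np.linalg.solve(P.T @ P + alpha * np.eye(P.shape[1]),
                        P.T @ instance_vecs)
    R = P @ W                                        # reconstructed embeddings
    cos = np.sum(R * instance_vecs, axis=1) / (
        np.linalg.norm(R, axis=1) * np.linalg.norm(instance_vecs, axis=1))
    return cos.mean()

rng = np.random.default_rng(2)
labels = rng.normal(size=(15, 50))      # stand-in label embeddings
instances = rng.normal(size=(200, 50))  # stand-in ground-truth embeddings
rec = avg_rec(labels, instances)
```

Random label directions give a baseline value; semantically chosen labels like HICEM-15's raise it toward the full-dimensional ceiling.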
4. Cross-Lingual and Cross-Cultural Alignment
Cross-lingual methodology ensures each centroid corresponds to a semantically stable affective direction in embedding space across the six target languages. The initial English master list is machine-translated, embedded with native-language FastText models, UMAP-reduced, and then clustered analogously to the English pipeline. The top-50 summary labels are pooled and globally reclustered to define k=15 centroids, yielding a final set of semantically equivalent affective categories attested in all six languages. Only post hoc manual filtering for rare or archaic terms is performed, with semantic directions otherwise algorithmically anchored (Wortman et al., 2022).
5. Validation with Large-Scale Human Annotation Datasets
Empirical validation uses two major corpora:
- BoLD (Body Language Dataset): ≈20,000 video clips with human annotations for 26 emotions plus Valence–Arousal–Dominance (VAD) dimensions.
- EMOTIC: ≈34,000 images with the same categorical and VAD labels.
- Recoverable information is computed as an upper bound, using ground-truth instance labels and ridge regression in the HICEM-15 embedding space; results indicate that 15 categories support information recovery well beyond random baselines, approaching that of much larger discrete label sets.
- Projecting dataset annotations onto HICEM-15 and then into VAD space reproduces expected psychological structure (e.g., Russell’s Circumplex), with valence as principal axis and dominance strongly correlated with valence.
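The projection of categorical annotations into VAD space can be sketched as a score-weighted average over per-category VAD anchors. The anchor coordinates below are illustrative assumptions, not values from BoLD or EMOTIC:

```python
import numpy as np

# Illustrative (valence, arousal, dominance) anchors in [-1, 1] for three
# HICEM-15 categories. These coordinates are assumptions for the sketch.
VAD_ANCHORS = {
    "Happiness": np.array([0.9, 0.6, 0.5]),
    "Sadness":   np.array([-0.8, -0.4, -0.5]),
    "Fear":      np.array([-0.7, 0.7, -0.6]),
}

def to_vad(category_scores):
    """Score-weighted average of per-category VAD anchors."""
    total = sum(category_scores.values())
    vad = sum(s * VAD_ANCHORS[c] for c, s in category_scores.items())
    return vad / total

# An instance annotated mostly as Happiness lands at positive valence.
point = to_vad({"Happiness": 0.7, "Sadness": 0.1, "Fear": 0.2})
```

Plotting such points for a whole dataset is what recovers the circumplex-like structure noted above.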
6. Implications and Application Domains
HICEM-15 is directly applicable to AEI (artificial emotional intelligence) for next-generation tasks requiring interpretability and efficiency:
- Annotation Efficiency and Disagreement Reduction: By identifying a minimal, non-redundant set of categories, HICEM-15 minimizes both labelling costs and inter-annotator disagreement.
- Cross-Cultural Deployment: Semantic centroids defined in six major languages support transfer learning and robust affect annotation in global systems.
- Modularity and Hierarchy: The underlying clustering admits extensions (e.g., HICEM-25, HICEM-30) for domains requiring additional granularity.
- Affective Computing Applications: Immediate uses include social robotics (affect tagging), human–machine dialogue systems (empathetic response), and digital phenotyping in mental healthcare.
- Model Complementarity: HICEM-15’s discrete labels are recommended to be paired with continuous valence–arousal embeddings, as dominance is highly correlated with valence in data.
7. Comparative Position among Emotion Models
HICEM-15 improves on older emotion models by optimizing the trade-off between semantic coverage and category count. Unlike Ekman's and Plutchik's paradigms, HICEM-15 categories are empirically derived, embedding-grounded, language-independent centroids. The model outperforms random category subsets and achieves competitive coverage and information recovery vis-à-vis much larger models, while retaining the interpretability and efficiency crucial for scalable affective annotation (Wortman et al., 2022).