MFRC: Moral Foundations Reddit Corpus

Updated 16 February 2026
  • MFRC is a comprehensive benchmark dataset featuring annotated Reddit comments for multi-label moral sentiment analysis and fairness evaluation.
  • The corpus spans multiple moral foundations like care, fairness, and authority, with detailed metrics on inter-annotator agreement and domain transfer.
  • Empirical studies using MFRC highlight key insights in NLP fairness, cross-domain generalization, and ethical AI alignment through robust evaluation protocols.

The Moral Foundations Reddit Corpus (MFRC) is a benchmark dataset for computational analysis of moral sentiment and foundation detection within social media, specifically covering user-generated content from Reddit. It is constructed to facilitate empirical evaluation and methodological development in NLP for subjective, multi-label moral classification tasks, with a focus on annotator agreement, domain transfer, and fairness-aware modeling. The MFRC is widely used in research on sentiment analysis, AI alignment, and cross-domain robustness, as documented in a range of studies (Naranbat et al., 13 Oct 2025, Skorski et al., 24 Jul 2025, Trager et al., 2022, Nguyen et al., 2023, Golazizian et al., 2024).

1. Data Collection, Structure, and Annotation Protocols

The MFRC was originally introduced by Trager et al. (2022) and is available via HuggingFace (USC-MOLA-Lab/MFRC). Corpus construction pooled comments from 12 Reddit subreddits into three major topical domains: US Politics (e.g., r/politics, r/conservative), French Politics (e.g., r/geopolitics, r/europe), and Everyday Morality (e.g., r/AmItheAsshole, r/relationship_advice). These subreddits were selected to maximize thematic and discursive diversity. Comments were required to meet a minimum popularity threshold (a Reddit score or up-vote count of at least 10, depending on the bucket), comprise at least 10 tokens, and, where relevant, mention candidate political figures (for French political content).

The largest available version contains approximately 17,885–18,000 comments, varying slightly by preprocessing and label filtering details across studies. Each comment is a single utterance (not a thread), with average length of approximately 42 tokens (Nguyen et al., 2023, Skorski et al., 24 Jul 2025).

Annotation follows the updated Moral Foundations Theory (MFT) taxonomy, labeling each comment for several moral concerns: Care/Harm, Equality, Proportionality, Loyalty/Betrayal, Authority/Subversion, Purity/Sanctity, and, in some releases, Thin Morality (undifferentiated moral language) and an implicit/explicit flag (Trager et al., 2022). Comments are multi-labeled when multiple foundations apply. Annotator pools ranged from 6 to 27 people, depending on the task or subsample; annotators were trained for several weeks, and each comment is coded by at least two (up to five) annotators, with post-annotation aggregation for modeling.

Preprocessing and harmonization steps include conversion to lowercase, whitespace and special character stripping, label mapping (merging Equality and Proportionality into Fairness, and exclusion of Thin Morality and Purity where alignment with the Twitter corpus is required), and explicit dataset splits for in-domain and cross-domain experiments (Naranbat et al., 13 Oct 2025).
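The harmonization steps above can be sketched as a small preprocessing routine. The label map and regular expressions below are illustrative assumptions for the five-foundation alignment described in the text, not the exact pipeline of the cited studies.

```python
import re

# Hypothetical label-mapping scheme following the harmonization described
# above: Equality and Proportionality merge into Fairness; Thin Morality
# and Purity are dropped when aligning with the Twitter corpus.
LABEL_MAP = {
    "Care": "Care",
    "Equality": "Fairness",
    "Proportionality": "Fairness",
    "Loyalty": "Loyalty",
    "Authority": "Authority",
    "Thin Morality": None,   # excluded under harmonization
    "Purity": None,          # excluded for cross-corpus alignment
    "Non-Moral": "Non-Moral",
}

def preprocess(text: str) -> str:
    """Lowercase, strip special characters, and collapse whitespace."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s']", " ", text)
    return re.sub(r"\s+", " ", text).strip()

def harmonize(labels: list[str]) -> list[str]:
    """Map raw MFRC labels onto the shared five-class scheme."""
    mapped = {LABEL_MAP[l] for l in labels if LABEL_MAP.get(l) is not None}
    return sorted(mapped)
```

For example, a comment annotated with both Equality and Proportionality collapses to the single harmonized label Fairness.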

2. Corpus Statistics and Label Distributions

Corpus size and class prevalence are contingent on the harmonization scheme:

After harmonization to five core foundations, the in-domain label proportions (rounded from (Naranbat et al., 13 Oct 2025)) are:

Label Proportion
Non-Moral ≈ 38%
Care ≈ 12%
Fairness ≈ 11%
Loyalty ≈ 9%
Authority ≈ 8%

When all six classical MFT foundations are retained (including Purity/Sanctity):

Foundation Prevalence (Skorski et al., 24 Jul 2025)
Authority 19.2%
Care 26.5%
Fairness 29.5%
Loyalty 11.1%
Sanctity 9.8%

Notably, foundation prevalence varies by subreddit and topic bucket, with some categories (e.g., Purity) being especially rare outside religious/moral discussion subreddits.

3. Annotation Quality, Agreement, and Subjectivity

Inter-annotator agreement is quantified with prevalence- and bias-adjusted κ (PABAK), with domain-wide values in the medium range (PABAK ≈ 0.42–0.47) (Nguyen et al., 2023). Observed agreement (raw κ) is lower due to label class imbalance and the low frequency of certain foundations—a common challenge in moral sentiment datasets. Labels for each foundation are aggregated so that a comment is treated as positive if any annotator selected the label (logical OR). Annotator confidence is also recorded.
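The logical-OR aggregation rule described above can be illustrated directly; this is a minimal sketch assuming each annotator's choices are represented as a set of foundation labels.

```python
def aggregate_or(annotations: list[set[str]]) -> set[str]:
    """A comment is positive for a foundation if ANY annotator chose it."""
    labels: set[str] = set()
    for ann in annotations:
        labels |= ann  # union across annotators (logical OR)
    return labels
```

So a comment tagged Care by one annotator and Care plus Authority by another is treated as positive for both Care and Authority.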

The MFRC's subjective variant (MFSC) uniquely offers post-level annotation by all annotators (24 undergraduate students, balanced across U.S. demographic characteristics), enabling research on annotator disagreement, sampling, and modeling personalized annotation policies (Golazizian et al., 2024). No classical κ or α inter-annotator reliability is reported for MFSC, but measures of item and annotator-level disagreement following Davani et al. (2023) are computed.

4. Experimental Protocols and Baseline Model Evaluation

Studies leveraging MFRC standardize on stratified or random 80/10/10 or 80/20 splits for train, dev, and test partitions in in-domain setups. For cross-domain evaluation, the full MFRC training set is used to test on the Moral Foundations Twitter Corpus (MFTC) and vice versa, with labels harmonized to the shared subset {Care, Fairness, Loyalty, Authority, Non-Moral} (Naranbat et al., 13 Oct 2025).

Modeling approaches include fine-tuning transformers (BERT-base, DistilBERT, DeBERTa-v3-base) in multi-label settings (problem_type="multi_label_classification") using binary cross-entropy with logits:

\mathcal{L}_{\mathrm{BCELogits}}(\mathbf{y}, \mathbf{z}) = -\frac{1}{L}\sum_{i=1}^{L}\left[y_i\log\sigma(z_i) + (1-y_i)\log\bigl(1-\sigma(z_i)\bigr)\right]

Hyperparameters include AdamW, learning rate 2×10⁻⁵, 5 epochs, batch size 16–32, and single NVIDIA A100 GPUs. No augmentation or sampling is applied; label distributions remain naturally imbalanced (Naranbat et al., 13 Oct 2025).
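The loss above can be written as a short, numerically stable reference implementation. This is a sketch for checking values by hand, not the PyTorch training code used in the cited work.

```python
import math

def bce_with_logits(y, z):
    """Mean binary cross-entropy over L foundation logits z with
    multi-hot targets y, matching the BCEWithLogits formula above."""
    assert len(y) == len(z)
    total = 0.0
    for yi, zi in zip(y, z):
        # numerically stable log sigma(z): avoids overflow for large |z|
        if zi >= 0:
            log_sig = -math.log1p(math.exp(-zi))
        else:
            log_sig = zi - math.log1p(math.exp(zi))
        # identity: log(1 - sigma(z)) = log sigma(z) - z
        log_1m = log_sig - zi
        total += yi * log_sig + (1 - yi) * log_1m
    return -total / len(y)
```

At zero logits the model predicts 0.5 for every foundation, so the loss equals log 2 regardless of the targets, a useful sanity check for a freshly initialized head.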

Alternative modeling uses LLMs (e.g., Llama 3 8B, Mistral 8B, GPT-4o-mini, Claude) with zero-shot and few-shot prompt engineering and parameter-efficient fine-tuning (PEFT). Baseline scoring employs micro-F1, precision, recall, ROC/PR AUC, and the Balanced Error Rate.
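As one example of the scoring above, micro-averaged F1 pools true/false positives across all comments and foundations before computing precision and recall; this sketch assumes gold and predicted labels are sets per comment.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over multi-label predictions (lists of label sets)."""
    tp = fp = fn = 0
    for gold, pred in zip(y_true, y_pred):
        tp += len(gold & pred)   # labels correctly predicted
        fp += len(pred - gold)   # spurious labels
        fn += len(gold - pred)   # missed labels
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Micro-averaging weights every label decision equally, so in an imbalanced corpus like MFRC it is dominated by the frequent foundations; per-foundation F1 remains necessary for diagnosing rare classes such as Sanctity.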

Fine-tuned transformers routinely outperform LLMs on MFRC, with higher per-foundation F1s and much lower false negative rates. For example, DeBERTa-v3-base achieves F1 0.38–0.80 per foundation, versus GPT-4o-mini's F1 0.12–0.55 (by foundation) (Skorski et al., 24 Jul 2025). Multi-label prediction remains challenging for all models, amplifying the need for domain-specific fine-tuning.

5. Fairness, Generalization, and Diagnostic Metrics

MFRC is foundational for fairness-aware model evaluation under domain shift. The core diagnostic metrics include:

  • Demographic Parity Difference (ΔDP), per foundation m:

\Delta_{\mathrm{DP}}^{(m)} = \left|P(\hat y_m=1\mid A=\text{Twitter}) - P(\hat y_m=1\mid A=\text{Reddit})\right|

  • Equalized Odds Difference (ΔEO):

\Delta_{\mathrm{EO}}^{(m)} = \max_{y\in\{0,1\}}\left|\Pr(\hat y_m=1\mid Y_m=y, A=\text{Twitter}) - \Pr(\hat y_m=1\mid Y_m=y, A=\text{Reddit})\right|

  • Moral Fairness Consistency (MFC):

\mathrm{Diff}^{(m)} = \left|\mathrm{Perf}_{\text{Reddit}\to\text{Twitter}}^{(m)} - \mathrm{Perf}_{\text{Twitter}\to\text{Reddit}}^{(m)}\right|,\quad \mathrm{MFC}^{(m)} = 1 - \mathrm{Diff}^{(m)}

Empirical results indicate pronounced asymmetry in generalization: transferring from Twitter to Reddit degrades micro-F1 by 14.9%, while Reddit to Twitter degrades by only 1.5% (Naranbat et al., 13 Oct 2025). Per-label cross-domain fairness disparities are greatest for Authority (ΔDP ≈ 0.22–0.23, ΔEO ≈ 0.40–0.41) and lowest for Loyalty and Fairness (ΔDP ≈ 0.03–0.05). MFC correlates perfectly negatively (ρ = –1.000, p < 0.001) with DPD, and remains statistically independent of F1, precision, and recall, establishing it as an orthogonal cross-domain stability metric.
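Under the definitions above, the gap and consistency metrics reduce to a few lines. These helpers are a minimal sketch assuming binary (0/1) prediction and label lists per platform; names and signatures are illustrative.

```python
def demographic_parity_gap(preds_a, preds_b):
    """Delta_DP for one foundation: |P(yhat=1 | A=a) - P(yhat=1 | A=b)|."""
    rate = lambda p: sum(p) / len(p)
    return abs(rate(preds_a) - rate(preds_b))

def equalized_odds_gap(y_a, p_a, y_b, p_b):
    """Delta_EO: max over y in {0,1} of the conditional positive-rate gap."""
    def cond_rate(y, p, val):
        sel = [pi for yi, pi in zip(y, p) if yi == val]
        return sum(sel) / len(sel) if sel else 0.0
    return max(abs(cond_rate(y_a, p_a, v) - cond_rate(y_b, p_b, v))
               for v in (0, 1))

def mfc(perf_ab, perf_ba):
    """Moral Fairness Consistency: 1 - |Perf_{a->b} - Perf_{b->a}|."""
    return 1.0 - abs(perf_ab - perf_ba)
```

For example, cross-domain micro-F1 of 0.80 (Reddit to Twitter) against 0.75 (Twitter to Reddit) yields an MFC of 0.95, close to the values reported for Care and Fairness in the table below.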

Label ΔDP (95%CI) ΔEO (95%CI) MFC (DistilBERT)
authority 0.22 (0.22–0.23) 0.40 (0.39–0.41) 0.7781 (0.7741–0.7822)
care 0.04 (0.04–0.05) 0.26 (0.25–0.28) 0.9556 (0.9537–0.9576)
fairness 0.05 (0.05–0.05) 0.22 (0.21–0.23) 0.9499 (0.9472–0.9524)
loyalty 0.03 (0.03–0.03) 0.20 (0.19–0.21) 0.9666 (0.9647–0.9684)
non-moral 0.08 (0.08–0.08) 0.34 (0.33–0.36) 0.9205 (0.9179–0.9234)

The authority dimension exhibits both the lowest MFC and the greatest domain-specificity, largely due to the rarity and platform-specificity of authority cues in Reddit language (Naranbat et al., 13 Oct 2025).

6. Extensions, Subjective Annotation Variants, and Model Personalization

A smaller, exhaustively annotated subset (the Moral Foundations Subjective Corpus, MFSC) comprises 2,000 Reddit posts labeled by 24 undergraduates, yielding 48,000 annotation decisions. Each annotator assigns one of six moral foundations (Purity, Harm, Loyalty, Authority, Proportionality, Equality) or "non-moral," along with a three-level confidence score (Golazizian et al., 2024). Annotator-level features (e.g., Big Five personality survey data) are included. The MFSC supports annotation-budget optimization, active/annotator-adaptive modeling, and personalized prediction. No standard inter-annotator reliability coefficients are published for this variant; performance disparity is instead managed through item and annotator disagreement metrics.

Known limitations include the demographic homogeneity of annotators, lack of thread or conversational context, class imbalance (certain foundations underrepresented), and English-language–only content. These factors must be considered in downstream modeling and generalization studies.

7. Applications, Implications, and Recommendations for Use

MFRC is a principal benchmark for multi-label moral sentiment analysis, model fairness diagnostics, subjective annotation research, and AI alignment evaluation. It is suited for benchmarking multi-label moral classifiers, auditing cross-domain fairness under platform shift, and studying annotator disagreement and personalized prediction.

Best practices include computing per-label ΔDP/ΔEO, tracking MFC to diagnose cross-domain gaps, using an 80/10/10 split for in-domain work, and fixing random seeds while reporting bootstrapped 95% confidence intervals (n = 1000) for all metrics. Open code, splits, and checkpoints are recommended for reproducibility (Naranbat et al., 13 Oct 2025).
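The recommended bootstrap confidence intervals can be obtained with a simple percentile resampling routine. The function below is an illustrative sketch of that protocol; the seed and percentile-indexing conventions are assumptions, not the cited studies' exact code.

```python
import random

def bootstrap_ci(values, metric, n=1000, alpha=0.05, seed=42):
    """Percentile bootstrap CI for a metric over per-item scores,
    mirroring the n=1000 resampling protocol described above."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    stats = []
    for _ in range(n):
        # resample items with replacement, same size as the original
        sample = [rng.choice(values) for _ in values]
        stats.append(metric(sample))
    stats.sort()
    lo = stats[int((alpha / 2) * n)]
    hi = stats[int((1 - alpha / 2) * n) - 1]
    return lo, hi
```

Resampling at the item level (comments) rather than the label level keeps the per-comment label correlations intact, which matters for multi-label metrics.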

In aggregate, the MFRC offers a high-quality, richly annotated resource for empirical study of moral sentiment, domain-sensitive fairness evaluation, and the development of equitable, generalizable moral reasoning models in NLP.
