MentaLLaMA-chat-13B: Mental Health Analysis LLM

Updated 27 January 2026

MentaLLaMA-chat-13B is a 13-billion-parameter LLM fine-tuned on a curated IMHI dataset to deliver transparent mental health predictions.
The model builds on Meta’s LLaMA2-chat-13B, employing multi-task tuning and explanation-driven training for clear, human-level rationales.
Benchmark results and detailed evaluations show MentaLLaMA-chat-13B achieves competitive performance and superior explanation quality in mental health informatics.

MentaLLaMA-chat-13B is a 13-billion-parameter open-source LLM fine-tuned to deliver interpretable mental health analysis on social media texts. Its foundation is Meta AI's LLaMA2-chat-13B, a decoder-only Transformer with extensive instruction-following capabilities. MentaLLaMA-chat-13B extends this base with multi-task, explanation-driven fine-tuning and evaluation on the curated Interpretable Mental Health Instruction (IMHI) dataset, which formalizes prediction and reasoning tasks central to mental health informatics. The model approaches leading discriminative pretrained language models in predictive correctness while generating detailed, human-level rationales, advancing the field's capacity for transparent and explainable AI in mental health contexts (Yang et al., 2023).

1. Model Architecture and Instructional Foundation

MentaLLaMA-chat-13B inherits its base from Meta’s LLaMA2-chat-13B, which comprises 40 stacked Transformer layers and a total of 13 billion parameters. Each layer operates with a hidden dimension of 5,120, features 40 self-attention heads, employs SwiGLU feedforward activations, and utilizes rotary position embeddings, supporting context sequences up to 4,096 tokens. The “chat” designation indicates initialization from a checkpoint that has already undergone instruction tuning and RLHF, adhering to the LLaMA2-chat [INST] prompt format with system and user segments. No modifications are made to the Transformer core; architectural changes are constrained to the output layer and instruction-following enhancements via mental health–specific data (Yang et al., 2023).
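As a sanity check on the figures above, the parameter total can be roughly reproduced from the layer count and hidden dimension. This is a minimal sketch: the vocabulary size (32,000) and SwiGLU inner width (13,824) are assumptions taken from the publicly released LLaMA-2-13B configuration, not from the text above, and the function name is illustrative.

```python
# Rough parameter count for a LLaMA-2-13B-style decoder-only Transformer.
N_LAYERS = 40        # from the text
HIDDEN = 5120        # from the text
VOCAB = 32_000       # assumption: public LLaMA-2 tokenizer vocabulary size
FFN_HIDDEN = 13_824  # assumption: public LLaMA-2-13B SwiGLU inner width


def llama_param_count(n_layers: int, hidden: int, vocab: int, ffn_hidden: int) -> int:
    """Count learned weights, assuming bias-free linear layers and untied embeddings."""
    attention = 4 * hidden * hidden    # Q, K, V and output projections
    swiglu = 3 * hidden * ffn_hidden   # gate, up and down projections
    norms = 2 * hidden                 # two RMSNorm weight vectors per layer
    per_layer = attention + swiglu + norms
    embeddings = 2 * vocab * hidden    # input embedding plus output head
    final_norm = hidden
    return n_layers * per_layer + embeddings + final_norm


TOTAL_PARAMS = llama_param_count(N_LAYERS, HIDDEN, VOCAB, FFN_HIDDEN)
print(f"{TOTAL_PARAMS:,}")  # 13,015,864,320
```

Rotary position embeddings contribute no learned weights, which is why they do not appear in the count.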

2. Multi-Task Tuning and Training Objectives

The fine-tuning protocol formalizes eight diverse mental health analysis tasks, including:

Binary depression and loneliness detection
Stress detection
Multi-class disorder classification
Cause/factor identification
Wellness-dimension tagging
Interpersonal risk factor detection

Each instance pairs a query of the form "Post: … Question: …" with a target response of the form "Answer: [Label]. Reasoning: [free-form explanation]."
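The query and response layout can be sketched as a pair of string builders. The class and function names below are hypothetical and only illustrate the layout described here; they are not taken from the IMHI release.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IMHIExample:
    """Hypothetical container for one instruction-response sample."""
    post: str
    question: str
    label: str
    reasoning: str


def build_query(ex: IMHIExample) -> str:
    # Query side: the post followed by the task question.
    return f"Post: {ex.post} Question: {ex.question}"


def build_response(ex: IMHIExample) -> str:
    # Response side: the label first, then the free-form explanation.
    label = ex.label.strip().rstrip(".")
    return f"Answer: {label}. Reasoning: {ex.reasoning.strip()}"


example = IMHIExample(
    post="I can't sleep and everything feels pointless.",
    question="Does the poster suffer from depression?",
    label="Yes",
    reasoning="The post describes insomnia and hopelessness.",
)
print(build_query(example))
print(build_response(example))
```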

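The training objective is next-token prediction over the combined answer and reasoning strings. Formally, for a dataset $D = \{(q, r)\}$ of query $q$ and response $r$ pairs, optimization maximizes the log-likelihood of the response tokens, which is equivalent to minimizing the token-level cross-entropy: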

$L(\varphi) = \sum_{(q, r)} \sum_{j=1}^{|r|} \log P_\varphi(r_j \mid q, r_{< j})$

No auxiliary or consistency regularization is imposed; alignment relies explicitly on sequence likelihood maximization to elicit high-quality, explanatory outputs (Yang et al., 2023).
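The objective can be illustrated for a single (q, r) pair in plain Python. This is a toy sketch, not the training code: the logits are supplied directly instead of coming from a model, and the function names are illustrative. Only response positions contribute terms; in practice this is often achieved by masking the query tokens out of the loss, which is an assumption about the implementation rather than something stated here.

```python
import math
from typing import List, Sequence


def log_softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable log-softmax over one vocabulary-sized logit vector."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def response_log_likelihood(
    logits_per_position: Sequence[Sequence[float]],
    response_token_ids: Sequence[int],
) -> float:
    """Sum of log P(r_j | q, r_<j) over the response tokens.

    logits_per_position[j] is assumed to hold the logits the model produced
    for response position j, already conditioned on q and r_<j.
    """
    if len(logits_per_position) != len(response_token_ids):
        raise ValueError("need one logit vector per response token")
    total = 0.0
    for logits, token_id in zip(logits_per_position, response_token_ids):
        total += log_softmax(logits)[token_id]
    return total


# Toy case: vocabulary of 4 tokens, uniform logits, 3 response tokens.
uniform = [[0.0, 0.0, 0.0, 0.0]] * 3
toy_ll = response_log_likelihood(uniform, [2, 0, 3])
print(toy_ll)  # equals 3 * log(1/4)
```

Summing this quantity over all pairs in D gives the objective above; the cross-entropy loss is its negative.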

3. IMHI Dataset Construction and Explanation Curation

The IMHI dataset consists of approximately 105,000 annotated instruction–response samples, aggregated from ten public social-media sources covering eight major analysis tasks. The tasks, source datasets, and split sizes are shown below:

Task	Training / Validation / Test samples
Depression detection (DR, CLP)	1,003 / 430 / 405 and 456 / 196 / 299
Stress detection (Dreaddit)	2,837 / 300 / 414
Multi-class disorders (SWMH, T-SID)	34,822 / 8,705 / 10,882 and 3,071 / 767 / 959
Stress-cause (SAD)	5,547 / 616 / 684
Depression/suicide-cause (CAMS)	2,207 / 320 / 625
Loneliness detection	2,463 / 527 / 531
Wellness-dimension detection (MultiWD)	15,744 / 1,500 / 2,441
Interpersonal risk factor (IRF)	3,943 / 985 / 2,113
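The split sizes in the table can be totalled to check them against the stated dataset size. The sketch below simply transcribes the table; the dictionary keys are the abbreviations used there, with "Loneliness" standing in for the source the table leaves unnamed.

```python
# (train, validation, test) sample counts transcribed from the table above.
IMHI_SPLITS = {
    "DR": (1003, 430, 405),
    "CLP": (456, 196, 299),
    "Dreaddit": (2837, 300, 414),
    "SWMH": (34822, 8705, 10882),
    "T-SID": (3071, 767, 959),
    "SAD": (5547, 616, 684),
    "CAMS": (2207, 320, 625),
    "Loneliness": (2463, 527, 531),
    "MultiWD": (15744, 1500, 2441),
    "IRF": (3943, 985, 2113),
}


def split_totals(splits):
    """Return (train, validation, test) totals across all source datasets."""
    train = sum(counts[0] for counts in splits.values())
    val = sum(counts[1] for counts in splits.values())
    test = sum(counts[2] for counts in splits.values())
    return train, val, test


TRAIN_TOTAL, VAL_TOTAL, TEST_TOTAL = split_totals(IMHI_SPLITS)
print(TRAIN_TOTAL, VAL_TOTAL, TEST_TOTAL)       # 72093 14346 19353
print(TRAIN_TOTAL + VAL_TOTAL + TEST_TOTAL)     # 105792
```

The ten rows sum to 105,792 samples, consistent with the rounded figure of about 105,000.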

To address the lack of explanatory rationales in the source datasets, responses were generated with ChatGPT, prompted with task-specific templates and expert-written few-shot examples (35 per source dataset, 350 gold exemplars in total). Explanation quality was vetted through automated checks (BART-Score) and human evaluation (criteria: correctness, consistency, overall quality) to support the reliability of the generated rationales (Yang et al., 2023).

4. Fine-Tuning Protocols and Implementation Details

All model variants (including MentaLLaMA-chat-13B) were fine-tuned for 10 epochs on the IMHI train split, with the following hyperparameter settings:

Optimizer: AdamW
Peak learning rate: $1 \times 10^{-5}$ (linear warmup over the first 3% of steps)
No weight decay or explicit regularization
Batch size: 32 sequences, with gradient accumulation to an effective batch of 256
Sequence length: truncated/padded to 2,048 tokens
Flash-Attention activated for accelerated self-attention
Hardware: four NVIDIA A100 GPUs (80 GB each)
Fine-tuning runtime: approximately 48 hours for chat-13B
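The settings above can be collected into a small configuration object together with the warmup schedule. This is a sketch under stated assumptions: the class and function names are hypothetical, the batch size of 32 is treated as the pre-accumulation batch across all devices, and the learning rate is held constant after warmup because no post-warmup decay is specified here.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hypothetical container for the listed hyperparameters."""
    epochs: int = 10
    peak_lr: float = 1e-5
    warmup_fraction: float = 0.03
    weight_decay: float = 0.0
    batch_size: int = 32             # assumption: batch before accumulation
    effective_batch_size: int = 256
    max_seq_len: int = 2048

    @property
    def grad_accum_steps(self) -> int:
        return self.effective_batch_size // self.batch_size


def total_steps(cfg: FineTuneConfig, num_train_examples: int) -> int:
    """Optimizer steps over all epochs, one step per effective batch."""
    return cfg.epochs * math.ceil(num_train_examples / cfg.effective_batch_size)


def learning_rate(step: int, cfg: FineTuneConfig, n_steps: int) -> float:
    """Linear warmup to the peak rate, then constant (the constant part is an assumption)."""
    warmup_steps = max(1, int(cfg.warmup_fraction * n_steps))
    if step < warmup_steps:
        return cfg.peak_lr * (step + 1) / warmup_steps
    return cfg.peak_lr


CONFIG = FineTuneConfig()
N_STEPS = total_steps(CONFIG, 72_093)  # 72,093 = sum of the training column in the section 3 table
print(CONFIG.grad_accum_steps, N_STEPS)
```

Under these assumptions the run uses 8 accumulation steps and 2,820 optimizer steps, of which the first 84 are warmup.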

Inference adopts the LLaMA2-chat prompt convention, e.g.,

<s>[INST] <<SYS>> You are an assistant for mental health analysis. <</SYS>>
User: Post: "..." Question: "…?" [/INST]
Assistant:

Greedy decoding (temperature 0.0, no beam search) is used to yield concise, deterministic explanations (Yang et al., 2023).
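A prompt in this layout can be assembled with a small helper. The function name and the `add_bos` switch are illustrative; many tokenizers insert the `<s>` token themselves, in which case it should be omitted from the string. The `GREEDY_KWARGS` dictionary lists keyword arguments that Hugging Face's `generate` method accepts for greedy decoding; using that library here is an assumption, not something the text specifies.

```python
SYSTEM_PROMPT = "You are an assistant for mental health analysis."


def build_chat_prompt(post: str, question: str,
                      system: str = SYSTEM_PROMPT,
                      add_bos: bool = True) -> str:
    """Assemble a prompt in the layout shown above."""
    bos = "<s>" if add_bos else ""
    return (
        f"{bos}[INST] <<SYS>> {system} <</SYS>>\n"
        f'User: Post: "{post}" Question: "{question}" [/INST]\n'
        "Assistant:"
    )


# Assumed decoding settings for greedy, deterministic generation:
# sampling disabled and a single beam.
GREEDY_KWARGS = {"do_sample": False, "num_beams": 1}

prompt = build_chat_prompt(
    "Some days I can't even drag myself out of bed.",
    "Does the poster suffer from depression?",
)
print(prompt)
```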

5. Benchmark Performance and Explanation Quality

The evaluation uses the IMHI benchmark (10 test sets) to assess both prediction correctness and explanation quality. MentaLLaMA-chat-13B achieves weighted F1 scores matching or exceeding leading discriminative pretrained language models (PLMs) on 7 of 10 test sets. Selected F1 values:

Task	Weighted F1 (%)
Depression Reddit (DR)	85.7
Stress detection	75.8
Interpersonal risk (IRF)	76.5
Wellness dimensions	75.1
CAMS cause detection	45.5
T-SID Twitter disorders	75.3

Explanation quality, as measured by automated BART-Score, consistently surpasses baselines including BART-large and T5-large, and MentaLLaMA-chat-13B outperforms its 7B counterpart by over 0.20 on eight sets. Human ratings (N=200) yield mean scores on a 0–3 scale of 2.6 for consistency, 2.4 for reliability, 1.8 for professionality, and 2.2 overall. Consistency and reliability are on par with ChatGPT, though professionality is lower, which suggests that further domain pretraining may be beneficial (Yang et al., 2023).

When four tasks were excluded from fine-tuning, MentaLLaMA-chat-13B surpassed zero-shot ChatGPT in F1 on three unseen tasks and delivered explanations superior (per BART-Score) to T5 and BART, indicating robust generalization within the mental health domain (Yang et al., 2023).

6. Task-Oriented Outputs and Interpretability

MentaLLaMA-chat-13B’s output format is designed for transparency. For example, given the prompt:

Post: I’ve been feeling like nothing I do matters. Some days I can’t even drag myself out of bed. Question: Does the poster suffer from depression?

The model returns:

Answer: Yes.
Reasoning: The user explicitly describes pervasive feelings of worthlessness (‘nothing I do matters’) and an inability to engage in normal daily activities (‘can’t even drag myself out of bed’). These are hallmark symptoms of major depressive episodes, indicating that the poster is experiencing clinical depression.

This structured output elucidates the clinical reasoning underlying predictions, enhancing interpretability for both research and applied settings (Yang et al., 2023).
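Because the label and the rationale appear in fixed positions, the output can be split mechanically for downstream scoring. The parser below is a minimal sketch with a hypothetical function name; it assumes the model follows the "Answer: … Reasoning: …" layout and returns `None` otherwise.

```python
import re
from typing import Optional, Tuple

# Lazily capture the label up to "Reasoning:", then everything after it.
_OUTPUT_PATTERN = re.compile(
    r"Answer:\s*(?P<answer>.*?)\s*Reasoning:\s*(?P<reasoning>.*)",
    re.DOTALL,
)


def parse_output(text: str) -> Optional[Tuple[str, str]]:
    """Split a generated response into (label, reasoning), or None if malformed."""
    match = _OUTPUT_PATTERN.search(text)
    if match is None:
        return None
    answer = match.group("answer").strip().rstrip(".").strip()
    reasoning = match.group("reasoning").strip()
    return answer, reasoning


generated = (
    "Answer: Yes.\n"
    "Reasoning: The user explicitly describes pervasive feelings of worthlessness."
)
parsed = parse_output(generated)
print(parsed)
```

The extracted label can then be compared with the gold label for F1, while the reasoning string is passed to explanation-quality metrics or human raters.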

7. Implications and Future Directions

MentaLLaMA-chat-13B is part of MentaLLaMA, presented as the first open-source LLM series for instruction-following, interpretable mental health analysis, and it balances predictive accuracy with rationale quality. The work addresses prior data and availability gaps by introducing the IMHI corpus and benchmark, facilitating the development and evaluation of domain-specific explainable models. A plausible implication is that its release lowers barriers for research into fair, transparent, and robust mental health informatics on social media. Its approach, which combines large-scale instruction tuning, careful explanation curation, and robust evaluation, offers a methodological blueprint for specializing LLMs in sensitive application domains (Yang et al., 2023).


References (1)

Yang et al. (2023). MentaLLaMA: Interpretable Mental Health Analysis on Social Media with Large Language Models.
