MentaLLaMA-chat-13B: Mental Health Analysis LLM
- MentaLLaMA-chat-13B is a 13-billion-parameter LLM fine-tuned on a curated IMHI dataset to deliver transparent mental health predictions.
- The model builds on Meta’s LLaMA2-chat-13B, employing multi-task tuning and explanation-driven training for clear, human-level rationales.
- Benchmark results and detailed evaluations show MentaLLaMA-chat-13B achieves competitive performance and superior explanation quality in mental health informatics.
MentaLLaMA-chat-13B is a 13-billion-parameter open-source LLM fine-tuned to deliver interpretable mental health analysis of social media texts. Its foundation is Meta AI's LLaMA2-chat-13B, a decoder-only Transformer with extensive instruction-following capabilities. MentaLLaMA-chat-13B extends this base with multi-task, explanation-driven fine-tuning on the curated Interpretable Mental Health Instruction (IMHI) dataset, which formalizes the prediction and reasoning tasks central to mental health informatics, and is evaluated on the corresponding benchmark. The model approaches leading discriminative pretrained language models in predictive correctness while generating detailed, human-level rationales, advancing transparent and explainable AI in mental health contexts (Yang et al., 2023).
1. Model Architecture and Instructional Foundation
MentaLLaMA-chat-13B inherits its base from Meta’s LLaMA2-chat-13B, which comprises 40 stacked Transformer layers and a total of 13 billion parameters. Each layer operates with a hidden dimension of 5,120, features 40 self-attention heads, employs SwiGLU feedforward activations, and utilizes rotary position embeddings, supporting context sequences up to 4,096 tokens. The “chat” designation indicates initialization from a checkpoint that has already undergone instruction tuning and RLHF, adhering to the LLaMA2-chat [INST] prompt format with system and user segments. No modifications are made to the Transformer core; architectural changes are constrained to the output layer and instruction-following enhancements via mental health–specific data (Yang et al., 2023).
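As a rough sanity check on the figures above, the parameter count can be tallied from the layer shapes. This is a sketch rather than anything taken from Yang et al.: the feed-forward width (13,824) and vocabulary size (32,000) are assumptions drawn from the public LLaMA-2 13B configuration, and the function name is illustrative.

```python
def llama_param_count(n_layers=40, d_model=5120, d_ff=13824, vocab=32000):
    """Approximate parameter count for a LLaMA-style decoder without biases.

    d_ff and vocab are assumptions (public LLaMA-2 13B configuration),
    not values stated in the text above.
    """
    attn = 4 * d_model * d_model        # Q, K, V, and output projections
    mlp = 3 * d_model * d_ff            # SwiGLU: gate, up, and down projections
    norms = 2 * d_model                 # two RMSNorm weight vectors per layer
    per_layer = attn + mlp + norms
    embeddings = 2 * vocab * d_model    # input embedding plus untied output head
    final_norm = d_model
    return n_layers * per_layer + embeddings + final_norm


total = llama_param_count()
print(f"{total:,} parameters (~{total / 1e9:.2f}B)")
```

Rotary position embeddings contribute no learned weights, so they do not appear in the tally; under these assumptions the total lands at roughly 13 billion.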
2. Multi-Task Tuning and Training Objectives
The fine-tuning protocol formalizes eight diverse mental health analysis tasks, including:
- Binary depression and loneliness detection
- Stress detection
- Multi-class disorder classification
- Cause/factor identification
- Wellness-dimension tagging
- Interpersonal risk factor detection
Each instance takes the format: "Post: … Question: …" with the model outputting: Answer: [Label]. Reasoning: [free-form explanation].
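The training objective is next-token prediction (cross-entropy) over the combined answer and reasoning strings. Formally, for a dataset $\mathcal{D} = \{(x_i, y_i)\}_{i=1}^{N}$, where $x_i$ is the prompt (post and question) and $y_i$ is the target string (answer followed by reasoning), optimization maximizes
$$\mathcal{L}(\theta) = \sum_{i=1}^{N} \sum_{t=1}^{|y_i|} \log p_\theta\left(y_{i,t} \mid x_i, y_{i,<t}\right)$$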
No auxiliary or consistency regularization is imposed; alignment relies explicitly on sequence likelihood maximization to elicit high-quality, explanatory outputs (Yang et al., 2023).
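The objective described above can be illustrated with a toy calculation. This is a minimal sketch under stated assumptions: the probabilities are made up, the function name is hypothetical, and the loss is averaged per token (a common convention; the text does not say whether a sum or a mean is used).

```python
import math


def response_nll(token_probs, prompt_len):
    """Mean negative log-likelihood over response tokens only.

    token_probs[t] is the probability the model assigned to the t-th
    reference token. The first prompt_len positions (the Post/Question
    prompt) are excluded, so only the answer and reasoning are scored.
    """
    response = token_probs[prompt_len:]
    if not response:
        raise ValueError("no response tokens to score")
    return -sum(math.log(p) for p in response) / len(response)


# Toy example: 3 prompt tokens followed by 4 answer/reasoning tokens.
probs = [0.9, 0.8, 0.7, 0.5, 0.25, 0.5, 1.0]
loss = response_nll(probs, prompt_len=3)
print(f"loss = {loss:.4f}")
```

Minimizing this quantity is equivalent to maximizing the sequence log-likelihood of the response given the prompt.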
3. IMHI Dataset Construction and Explanation Curation
The IMHI dataset consists of 105,000 annotated instruction–response samples, aggregated from ten public social-media sources covering eight major analysis tasks. The included topics and splits are shown below:
| Task | Training / Validation / Test samples |
|---|---|
| Depression detection (DR, CLP) | 1,003 / 430 / 405 and 456 / 196 / 299 |
| Stress detection (Dreaddit) | 2,837 / 300 / 414 |
| Multi-class disorders (SWMH, T-SID) | 34,822 / 8,705 / 10,882 and 3,071 / 767 / 959 |
| Stress-cause (SAD) | 5,547 / 616 / 684 |
| Depression/suicide-cause (CAMS) | 2,207 / 320 / 625 |
| Loneliness detection | 2,463 / 527 / 531 |
| Wellness-dimension detection (MultiWD) | 15,744 / 1,500 / 2,441 |
| Interpersonal risk factor (IRF) | 3,943 / 985 / 2,113 |
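
The split sizes in the table can be totaled to check them against the stated corpus size of about 105,000 samples. The sketch below only copies the table's numbers; the dictionary keys are shorthand labels, and assigning the two-part rows to DR/CLP and SWMH/T-SID in listed order is an assumption that does not affect the sums.

```python
# (train, validation, test) counts copied from the table above.
SPLITS = {
    "DR": (1003, 430, 405),
    "CLP": (456, 196, 299),
    "Dreaddit": (2837, 300, 414),
    "SWMH": (34822, 8705, 10882),
    "T-SID": (3071, 767, 959),
    "SAD": (5547, 616, 684),
    "CAMS": (2207, 320, 625),
    "Loneliness": (2463, 527, 531),
    "MultiWD": (15744, 1500, 2441),
    "IRF": (3943, 985, 2113),
}

# Sum each column across the ten source datasets.
n_train, n_val, n_test = (sum(col) for col in zip(*SPLITS.values()))
n_total = n_train + n_val + n_test
print(n_train, n_val, n_test, n_total)
```

The columns sum to 72,093 training, 14,346 validation, and 19,353 test samples, or 105,792 in total, which is consistent with the rounded figure quoted above.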
To address the lack of explanatory rationales in the source datasets, explanations were generated with ChatGPT, prompted with task-specific templates and expert-written few-shot examples (35 per source dataset, 350 gold exemplars in total). Explanation quality was vetted through automated checks (BART-Score) and human evaluation on correctness, consistency, and overall quality, supporting the reliability of the generated rationales (Yang et al., 2023).
4. Fine-Tuning Protocols and Implementation Details
All model variants (including MentaLLaMA-chat-13B) were fine-tuned for 10 epochs on the IMHI train split, with the following hyperparameter settings:
- Optimizer: AdamW
- Peak learning rate: (linear warmup over the first 3% of steps)
- No weight decay or explicit regularization
- Batch size: 32 sequences, with gradient accumulation to an effective batch of 256
- Sequence length: truncated/padded to 2,048 tokens
- Flash-Attention activated for accelerated self-attention
- Hardware: four NVIDIA A100 GPUs (80 GB each)
- Fine-tuning runtime: approximately 48 hours for chat-13B
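The settings above can be collected into a small configuration object to derive the quantities they imply. This is a sketch under assumptions: the class and field names are hypothetical, the batch of 32 is treated as the global batch per forward/backward pass, and the peak learning rate is left unset because its value is not given above.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class FineTuneConfig:
    epochs: int = 10
    per_step_batch: int = 32          # sequences per forward/backward pass (assumed global)
    effective_batch: int = 256        # sequences per optimizer update
    max_seq_len: int = 2048
    warmup_percent: int = 3           # linear warmup over the first 3% of steps
    weight_decay: float = 0.0
    peak_lr: Optional[float] = None   # not stated in the text above; left unset

    @property
    def grad_accum_steps(self) -> int:
        return self.effective_batch // self.per_step_batch

    def schedule(self, n_train: int):
        """Return (steps per epoch, total optimizer steps, warmup steps)."""
        steps_per_epoch = math.ceil(n_train / self.effective_batch)
        total_steps = steps_per_epoch * self.epochs
        warmup_steps = total_steps * self.warmup_percent // 100
        return steps_per_epoch, total_steps, warmup_steps


cfg = FineTuneConfig()
# 72,093 is the sum of the training counts in the Section 3 table.
steps_per_epoch, total_steps, warmup_steps = cfg.schedule(n_train=72093)
print(cfg.grad_accum_steps, steps_per_epoch, total_steps, warmup_steps)
```

Under these assumptions, the listed batch sizes imply 8 accumulation steps per update and a warmup of fewer than 100 optimizer steps.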
Inference adopts the LLaMA2-chat prompt convention, e.g.,
```
<s>[INST] <<SYS>> You are an assistant for mental health analysis. <</SYS>>
User: Post: "..." Question: "…?" [/INST]
Assistant:
```
Greedy decoding (temperature 0.0, no beam search) is used to yield concise, deterministic explanations (Yang et al., 2023).
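The prompt layout and greedy decoding can be sketched in a few lines. Everything here is illustrative: `build_prompt`, `greedy_decode`, and the toy scorer are hypothetical names, the exact whitespace of the prompt is an assumption, and the closing system tag is written `<</SYS>>` as in the LLaMA-2 chat format. A real model would supply logits in place of the toy scores.

```python
def build_prompt(post: str, question: str,
                 system: str = "You are an assistant for mental health analysis.") -> str:
    """Assemble a prompt in the layout shown above (whitespace is an assumption)."""
    return (
        f"<s>[INST] <<SYS>> {system} <</SYS>>\n"
        f'User: Post: "{post}" Question: "{question}" [/INST]\n'
        "Assistant:"
    )


def greedy_decode(next_token_scores, prompt_tokens, eos, max_new_tokens=32):
    """Pick the highest-scoring token at every step (no sampling, no beams)."""
    tokens = list(prompt_tokens)
    generated = []
    for _ in range(max_new_tokens):
        scores = next_token_scores(tokens)      # {token: score} for the next position
        best = max(scores, key=scores.get)      # argmax, so the output is deterministic
        if best == eos:
            break
        tokens.append(best)
        generated.append(best)
    return generated


# Toy scorer that walks through a fixed continuation, one token per step.
_CONTINUATION = ["Answer:", "Yes.", "</s>"]


def toy_scores(tokens):
    step = sum(1 for t in tokens if t in _CONTINUATION)
    return {tok: (1.0 if i == step else 0.0) for i, tok in enumerate(_CONTINUATION)}


prompt = build_prompt("...", "Does the poster suffer from depression?")
output = greedy_decode(toy_scores, prompt.split(), eos="</s>")
print(output)
```

Because the argmax is taken at every step, repeated runs on the same prompt produce the same continuation.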
5. Benchmark Performance and Explanation Quality
The evaluation uses the IMHI benchmark (10 test sets) to assess both prediction correctness and explanation quality. MentaLLaMA-chat-13B achieves weighted F1 scores matching or exceeding leading discriminative pretrained language models (PLMs) on 7 of 10 test sets. Selected F1 values:
| Task | Weighted F1 (%) |
|---|---|
| Depression Reddit (DR) | 85.7 |
| Stress detection | 75.8 |
| Interpersonal risk (IRF) | 76.5 |
| Wellness dimensions | 75.1 |
| CAMS cause detection | 45.5 |
| T-SID Twitter disorders | 75.3 |
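
Weighted F1, the metric reported in the table, averages per-class F1 scores with weights proportional to each class's share of the gold labels. The following is a minimal pure-Python sketch of that definition with made-up labels; it is not the evaluation code used by Yang et al.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights proportional to class support."""
    support = Counter(y_true)
    total = 0.0
    for label, n in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = n - tp                       # gold instances of this class that were missed
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += n * f1                   # weight each class by its support
    return total / len(y_true)


gold = ["yes", "yes", "no", "no", "no"]
pred = ["yes", "no", "no", "no", "yes"]
score = weighted_f1(gold, pred)
print(f"weighted F1 = {score:.3f}")
```

Weighting by support keeps majority classes from being under-counted on the imbalanced label distributions typical of these datasets.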
Explanation quality, as measured by automated BART-Score, consistently surpasses baselines including BART-large and T5-large, and MentaLLaMA-chat-13B outperforms its 7B counterpart by more than 0.20 BART-Score points on eight test sets. Human ratings (N=200) yield the following mean scores on a 0–3 scale: Consistency (2.6), Reliability (2.4), Professionality (1.8), Overall (2.2). Consistency and reliability are on par with ChatGPT, though professionality is lower, which suggests that further domain pretraining may be beneficial (Yang et al., 2023).
When four tasks were excluded from fine-tuning, MentaLLaMA-chat-13B surpassed zero-shot ChatGPT in F1 on three unseen tasks and delivered explanations superior (per BART-Score) to T5 and BART, indicating robust generalization within the mental health domain (Yang et al., 2023).
6. Task-Oriented Outputs and Interpretability
MentaLLaMA-chat-13B’s output format is designed for transparency. For example, given the prompt:
```
Post: I’ve been feeling like nothing I do matters. Some days I can’t even drag myself out of bed.
Question: Does the poster suffer from depression?
```
The model returns:
```
Answer: Yes.
Reasoning: The user explicitly describes pervasive feelings of worthlessness (‘nothing I do matters’) and an inability to engage in normal daily activities (‘can’t even drag myself out of bed’). These are hallmark symptoms of major depressive episodes, indicating that the poster is experiencing clinical depression.
```
This structured output elucidates the clinical reasoning underlying predictions, enhancing interpretability for both research and applied settings (Yang et al., 2023).
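Because the response follows a fixed "Answer / Reasoning" layout, the label can be separated from the rationale for downstream scoring. The parser below is a minimal sketch: the function name and regular expression are assumptions, and the example string is abbreviated from the output above.

```python
import re

# Label is everything between "Answer:" and "Reasoning:"; the rest is the rationale.
_PATTERN = re.compile(
    r"Answer:\s*(?P<label>.+?)\s*Reasoning:\s*(?P<reasoning>.+)", re.DOTALL
)


def parse_response(text: str):
    """Split a response into (label, reasoning); return None if malformed."""
    match = _PATTERN.search(text)
    if match is None:
        return None
    label = match.group("label").strip().rstrip(".")
    return label, match.group("reasoning").strip()


example = ("Answer: Yes. Reasoning: The user explicitly describes pervasive "
           "feelings of worthlessness.")
label, reasoning = parse_response(example)
print(label)
print(reasoning)
```

Returning `None` for malformed text lets a caller count unparseable generations separately instead of silently scoring them as wrong.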
7. Implications and Future Directions
MentaLLaMA-chat-13B is part of MentaLLaMA, the first open-source LLM series for instruction-following, interpretable mental health analysis, balancing predictive accuracy and rationale quality. It addresses prior data and availability gaps by introducing the IMHI corpus and benchmarking approach, facilitating the development and evaluation of domain-specific explainable models. A plausible implication is that its release lowers barriers for research into fair, transparent, and robust mental health informatics on social media. Its approach, which combines large-scale instruction tuning, careful explanation curation, and robust evaluation, offers a methodological blueprint for specializing LLMs in sensitive application domains (Yang et al., 2023).