CPI+DMC: Modular Early Risk Detection
- CPI+DMC is a modular framework that splits early risk detection into Classification with Partial Information (CPI) and a Decision-Making Component (DMC) to optimize prediction timing.
- It balances prediction accuracy with alert earliness by independently tuning the CPI's probability estimations and the DMC’s stopping rules.
- Empirical results from tasks like eRisk 2024 and MentalRiskES 2025 validate its state-of-the-art performance through metrics such as ERDE and Macro F1 scores.
The CPI+DMC approach is a modular two-stage framework for early detection over temporally unfolding data, originally motivated by Early Risk Detection (ERD) problems such as predicting the onset of mental health conditions from users’ social media streams. It partitions the detection process into a Classification with Partial Information (CPI) stage, which generates an updated probability of a target class as evidence accrues, and a Decision-Making Component (DMC), which determines the stopping time at which to issue an alert. The paradigm enables principled trade-offs between prediction accuracy and detection earliness and allows for independent optimization, deployment, and analysis of the classification and decision rules. This separation of concerns has been empirically validated on recent ERD benchmarks, and has led to state-of-the-art results in shared tasks including eRisk 2024 and MentalRiskES 2025 (Thompson et al., 2024, Thompson et al., 28 Nov 2025).
1. Conceptual Framework and Motivation
CPI+DMC decomposes the ERD pipeline into two distinct components:
- Classification with Partial Information (CPI): At each time step $t$ (after observing $t$ user posts), CPI outputs a score $p_{u,t}$ for user $u$. CPI is trained to maximize discriminative accuracy under incomplete, evolving input.
- Decision-Making Component (DMC): Receiving the stream of CPI scores, DMC applies an online policy, a (typically non-decreasing) function of the CPI score $p_{u,t}$ or other derived statistics, to decide at what time $t$ to make an irreversible prediction (issue an alert), explicitly balancing the trade-off between accuracy and earliness.
The approach is motivated by the structurally antagonistic objectives inherent in ERD: achieving a high true positive rate (CPI) and minimizing the delay before a correct alert is issued (DMC). The modular design enables independent design and tuning of each process to explore and navigate the Pareto frontier in the accuracy-time plane (Thompson et al., 28 Nov 2025).
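The division of labor between the two components can be sketched as a small streaming loop. This is a minimal illustration under stated assumptions, not the authors' implementation: `run_stream`, the toy keyword-ratio CPI, and the single-threshold DMC are hypothetical stand-ins chosen only to show that either component can be swapped without touching the other.

```python
from typing import Callable, Optional, Sequence

# Hypothetical interfaces (not taken from the source papers):
# a CPI maps the posts seen so far to a risk score in [0, 1];
# a DMC maps the history of scores to a stop/continue decision.
CPI = Callable[[Sequence[str]], float]
DMC = Callable[[Sequence[float]], bool]


def run_stream(posts: Sequence[str], cpi: CPI, dmc: DMC) -> Optional[int]:
    """Return the 1-based delay at which an alert fires, or None if it never does."""
    scores: list[float] = []
    for t in range(1, len(posts) + 1):
        scores.append(cpi(posts[:t]))  # CPI: classify with partial information
        if dmc(scores):                # DMC: irreversible decision to alert
            return t
    return None


# Toy stand-ins, for illustration only.
def keyword_cpi(seen: Sequence[str]) -> float:
    """Fraction of the posts seen so far that contain the substring 'bet'."""
    hits = sum("bet" in p.lower() for p in seen)
    return hits / len(seen)


def threshold_dmc(scores: Sequence[float]) -> bool:
    """Alert as soon as the latest score reaches 0.5."""
    return scores[-1] >= 0.5


posts = ["nice weather", "placed a bet", "another bet tonight", "lost the bet"]
alert_at = run_stream(posts, keyword_cpi, threshold_dmc)
print(alert_at)
```

Because `run_stream` only depends on the two callables, a stricter decision rule or a stronger classifier can be substituted independently, which is the property the modular design relies on.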
2. Mathematical Formulation of CPI+DMC
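Let $u$ index users and $x_{u,1}, x_{u,2}, \dots$ their chronologically ordered posts.
CPI: For each delay $t$, CPI computes $p_{u,t} = P(y_u = 1 \mid x_{u,1}, \dots, x_{u,t})$, where $y_u = 1$ indicates high risk.
DMC: DMC implements a stopping rule to decide the earliest delay $t^*$ at which to make a prediction $\hat{y}_u$. Common policy families include:
- Threshold-count policy (BERT/SBERT): Declare at the first $t$ where the CPI score $p_{u,t}$ has reached the threshold $\tau$ at least $k$ times, with fixed threshold $\tau$ and integer counter $k$.
- Adaptive global policy (SS3): Compute per-user global values $gv_u^{+}$ and $gv_u^{-}$ for the positive and negative classes, transform them to a softmax score $s_u$, and compare it to a distributional threshold $\mathrm{median}(s) + \gamma \cdot \mathrm{MAD}(s)$ based on the batch’s median and median absolute deviation (MAD) (Thompson et al., 28 Nov 2025).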
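The two policy families can be sketched in a few lines of Python. This is a hedged illustration rather than the published implementation: the names `tau`, `k`, and `gamma`, the cumulative (rather than consecutive) counting of high-confidence steps, and the `median + gamma * MAD` form of the cutoff are assumptions made for the example.

```python
from statistics import median
from typing import Optional, Sequence


def history_alert_time(scores: Sequence[float], tau: float, k: int) -> Optional[int]:
    """First 1-based step at which `k` scores >= `tau` have been observed.

    Assumption: hits are counted cumulatively; a consecutive-run variant
    would reset the counter whenever a score falls below `tau`.
    """
    hits = 0
    for t, p in enumerate(scores, start=1):
        if p >= tau:
            hits += 1
            if hits >= k:
                return t
    return None


def mad(values: Sequence[float]) -> float:
    """Median absolute deviation around the median."""
    m = median(values)
    return median(abs(v - m) for v in values)


def global_alerts(batch_scores: Sequence[float], gamma: float) -> list[bool]:
    """Flag users whose score exceeds median + gamma * MAD of the current batch."""
    cutoff = median(batch_scores) + gamma * mad(batch_scores)
    return [s > cutoff for s in batch_scores]


# One user's score stream under the history-based rule.
first_alert = history_alert_time([0.2, 0.8, 0.4, 0.9, 0.95], tau=0.7, k=2)
# One score per user at a fixed time step under the global rule.
flags = global_alerts([0.10, 0.12, 0.11, 0.13, 0.90], gamma=2.0)
print(first_alert, flags)
```

The history-based rule looks only at one user's own trajectory, whereas the global rule is relative to the cohort: the same score can trigger an alert in one batch and not in another.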
3. Implementation: Model Variants and Decision Policies
CPI Instantiations
| CPI Model | Feature Set | Training Objective |
|---|---|---|
| SS3 | Character trigram frequencies, global-value function | Static F1 |
| BERT (with domain tokens) | Sliding post windows, extended WordPiece vocabulary | Static F1 |
| SBERT-SetFit | Sentence-pair contrastive embeddings | Macro F1 |
DMC Policies
| Policy Name | Mechanism | Tuning Parameter(s) |
|---|---|---|
| Global (SS3) | Adaptive threshold via MAD | $\gamma$ (MAD multiplier) |
| History-based | Repeated high-confidence count | $\tau$ (score threshold), $k$ (count) |
DMC policies are selected for their responsiveness to different operational imperatives. The global policy exploits inter-user score distributions for cohort-relative detection; the history-based rule enforces temporal smoothing and consistency in evidence (Thompson et al., 28 Nov 2025).
4. Metrics and Training Regimes
CPI is evaluated by static or dynamic F1, accuracy, or macro/micro-averaged scores.
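DMC is evaluated using time-aware metrics that penalize late detection. The ERDE family is canonical. For a decision $d$ issued after observing $k$ posts, with cutoff parameter $o$:

$$
\mathrm{ERDE}_o(d, k) =
\begin{cases}
c_{fp} & \text{if } d \text{ is a false positive} \\
c_{fn} & \text{if } d \text{ is a false negative} \\
lc_o(k) \cdot c_{tp} & \text{if } d \text{ is a true positive} \\
0 & \text{if } d \text{ is a true negative}
\end{cases}
\qquad
lc_o(k) = 1 - \frac{1}{1 + e^{k - o}}
$$

The reported score is the mean of this cost over users. Related latency measures, such as the mean latency of true positives, are reported alongside it (Thompson et al., 2024).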
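To make the metric concrete, the following sketch computes ERDE under its standard definition: a false positive costs `c_fp`, a false negative costs `c_fn`, a true negative costs nothing, and a true positive issued after `k` posts costs `lc_o(k) * c_tp` with `lc_o(k) = 1 - 1 / (1 + exp(k - o))`. The function names, the use of `None` to encode "no alert issued", and the cost values in the example are assumptions made for illustration, not the shared tasks' official evaluation code.

```python
import math
from typing import Optional, Sequence


def latency_cost(k: int, o: int) -> float:
    """lc_o(k) = 1 - 1 / (1 + exp(k - o)); rises from near 0 to near 1 around k = o."""
    exponent = min(k - o, 700)  # clamp to avoid OverflowError in math.exp
    return 1.0 - 1.0 / (1.0 + math.exp(exponent))


def erde(
    labels: Sequence[int],
    alert_delays: Sequence[Optional[int]],
    o: int,
    c_fp: float,
    c_fn: float = 1.0,
    c_tp: float = 1.0,
) -> float:
    """Mean ERDE_o over users.

    labels[i] is 1 for an at-risk user and 0 otherwise; alert_delays[i] is the
    number of posts seen when the alert fired, or None if no alert was issued.
    """
    if not labels or len(labels) != len(alert_delays):
        raise ValueError("labels and alert_delays must be non-empty and equal in length")
    costs = []
    for y, k in zip(labels, alert_delays):
        if k is None:
            costs.append(c_fn if y == 1 else 0.0)    # false negative / true negative
        elif y == 1:
            costs.append(latency_cost(k, o) * c_tp)  # true positive, penalized by delay
        else:
            costs.append(c_fp)                       # false positive
    return sum(costs) / len(costs)


# Four hypothetical users: early true positive, miss, false alarm, true negative.
score = erde(labels=[1, 1, 0, 0], alert_delays=[3, None, 10, None], o=5, c_fp=0.5)
print(round(score, 4))
```

A correct alert raised well before `o` posts costs almost nothing, while the same alert raised well after `o` posts costs nearly as much as a miss, which is why the DMC's stopping rule matters as much as the CPI's accuracy under this metric.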
Some models (e.g., the time-aware BERT variant in eRisk 2024) directly optimize ERDE (with a weighting term alongside cross-entropy) via backpropagation, using explicit temporal tokens (“[TIME]”) in the input to encode progression through the stream (Thompson et al., 2024).
5. Empirical Results and Trade-offs
CPI+DMC was the foundation for the UNSL group’s submissions in eRisk 2024 (anorexia detection) and MentalRiskES 2025 (gambling disorder) shared tasks.
- In eRisk 2024, two-stage CPI+DMC architectures and a time-aware variant achieved second place in the official ERDE metric (system-wide mean: 0.07), and consistently placed at or near the top in ranking-based metrics (P@10, NDCG@10/100) (Thompson et al., 2024).
- In MentalRiskES 2025, two of three CPI+DMC instantiations (SS3+Global, SBERT+History) occupied the top positions for Macro F1 (0.567 and 0.563 respectively) out of 38 submissions, displaying complementary strengths: SS3 favored earliness at a cost of higher false positives, while SBERT achieved the highest overall F1 at slightly later alert times (Thompson et al., 28 Nov 2025).
These results support the practical value of optimizing CPI and DMC independently, enabling flexible adaptation to mission requirements (earliness vs. certainty).
6. Strengths, Limitations, and Extensions
Strengths:
- Explicit modularity enables independent optimization, plug-and-play combinations, and systematic tuning of the accuracy-timeliness trade-off (Thompson et al., 28 Nov 2025).
- Both simple (SS3) and parametric (transformer) classifiers can be paired with decision policies tailored for responsiveness or conservatism.
- Empirical evidence shows top-ranked performance on recent ERD shared tasks (eRisk 2024, MentalRiskES 2025).
Limitations:
- The strict separation may forgo optimal solutions in cases where joint end-to-end training would perform better; current variants do not jointly train CPI and DMC, except for specialized time-aware models that inject earliness into the loss (Thompson et al., 2024).
- Discriminative power is limited by the lexical and semantic overlap between class labels in some corpora, leading to ambiguous or noisy predictions (Thompson et al., 28 Nov 2025).
- Current metrics may inadequately capture “borderline” cases; adaptive and per-user policies are signaled as an important future direction.
Future work points toward adaptive metrics and policies (e.g., per-user ERDE), richer expert-grounded reasoning for model interpretability, and joint optimization of CPI and DMC objectives (Thompson et al., 28 Nov 2025).
7. Applications and Outlook
While originally targeted at sequential health risk detection in natural language user streams, the CPI+DMC approach is generalizable to any setting requiring online decision-making with streaming, incomplete data. The paradigm’s modularity, principled metrics, and success in competitive evaluations establish it as a standard protocol for early detection tasks in computational social science and clinical informatics. A plausible implication is that further extensions—incorporating more sophisticated policies, adaptive user-specific thresholds, or chain-of-thought reasoning—will incrementally advance both the interpretability and reliability of such early warning systems (Thompson et al., 2024, Thompson et al., 28 Nov 2025).