Generalization of MMM mediation parameters to unseen samples

Determine whether the mediation parameter estimates produced by the many-to-many-to-many (MMM) multivariate linear structural equation model, specifically the estimated exposure-to-mediator matrix (α̂) and mediator-to-outcome matrix (β̂), generalize to previously unseen samples, so that they can be reliably used for out-of-sample prediction of multivariate outcomes.

Background

The paper introduces a many-to-many-to-many (MMM) mediation framework in which exposures (X), mediators (M), and outcomes (Y) are all multivariate and potentially high-dimensional. Estimation proceeds via a regularized approach to obtain coefficient matrices linking exposures to mediators (α), mediators to outcomes (β), and direct effects (γ), with the indirect-effect matrix given by αβ.
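For orientation, the pathway structure described above can be written as a pair of linear equations. This is only a sketch under assumptions not stated in this summary: a row-vector convention (one subject per row of X, M, and Y), additive error terms E_M and E_Y, and no covariates or intercepts. The paper's exact specification may differ.

```latex
% Assumed form of the linear structural equations (not quoted from the paper).
% X: n x p exposures, M: n x q mediators, Y: n x r outcomes.
\begin{align}
  M &= X\alpha + E_M, \\
  Y &= M\beta + X\gamma + E_Y .
\end{align}
% Substituting the first equation into the second gives the reduced form,
% in which the indirect effect is \alpha\beta and the direct effect is \gamma:
\begin{equation}
  Y = X(\alpha\beta + \gamma) + E_M\beta + E_Y .
\end{equation}
```

Under this convention α is p × q, β is q × r, and γ is p × r, so the product αβ has the same shape as γ and the two can be added to give the total effect of exposures on outcomes.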

A core practical question is whether the estimated mediation parameters, obtained from training data that include exposures, mediators, and outcomes, will generalize to new subjects where only exposures (and covariates) may be available. This concerns the reproducibility and predictive utility of the estimated pathways (α̂, β̂) in out-of-sample settings, which the authors flag as initially unclear.
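One way to make the question concrete is a held-out evaluation: fit the coefficient matrices on training subjects with X, M, and Y observed, then predict Y for new subjects from X alone and score the predictions. The sketch below is illustrative only. It assumes the linear form M = Xα + noise and Y = Mβ + Xγ + noise, uses synthetic data, omits covariates, and substitutes plain ridge regression for the paper's regularized estimator, which is not specified in this summary. The function names `ridge`, `fit_mediation`, `predict_from_exposures`, and `r2` are hypothetical helpers, not part of any published implementation.

```python
import numpy as np

def ridge(A, B, lam):
    """Closed-form ridge solution W minimizing ||B - A W||^2 + lam ||W||^2."""
    p = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(p), A.T @ B)

def fit_mediation(X, M, Y, lam=1.0):
    """Estimate alpha (X -> M), beta (M -> Y), and gamma (X -> Y, direct).

    Ridge is a stand-in for the paper's regularized estimator (assumption).
    """
    alpha = ridge(X, M, lam)                  # shape (p, q)
    coef = ridge(np.hstack([M, X]), Y, lam)   # stacked [beta; gamma]
    q = M.shape[1]
    beta, gamma = coef[:q], coef[q:]          # shapes (q, r) and (p, r)
    return alpha, beta, gamma

def predict_from_exposures(X, alpha, beta, gamma):
    """Predict outcomes when only exposures are observed: X (alpha beta + gamma)."""
    return X @ (alpha @ beta + gamma)

def r2(Y, Y_hat):
    """Pooled coefficient of determination across all outcome columns."""
    ss_res = ((Y - Y_hat) ** 2).sum()
    ss_tot = ((Y - Y.mean(axis=0)) ** 2).sum()
    return 1.0 - ss_res / ss_tot

# Synthetic data generated from the assumed linear model.
rng = np.random.default_rng(0)
n, p, q, r = 400, 10, 8, 5
alpha_true = rng.normal(size=(p, q))
beta_true = rng.normal(size=(q, r))
gamma_true = rng.normal(size=(p, r))
X = rng.normal(size=(n, p))
M = X @ alpha_true + 0.5 * rng.normal(size=(n, q))
Y = M @ beta_true + X @ gamma_true + 0.5 * rng.normal(size=(n, r))

# Fit on the first 300 subjects; evaluate on the remaining 100 using X only.
train, test = slice(0, 300), slice(300, None)
alpha_hat, beta_hat, gamma_hat = fit_mediation(X[train], M[train], Y[train])
Y_pred = predict_from_exposures(X[test], alpha_hat, beta_hat, gamma_hat)
oos_r2 = r2(Y[test], Y_pred)
print(f"out-of-sample R^2 from exposures only: {oos_r2:.3f}")
```

In this low-dimensional, well-specified simulation the held-out fit is expected to be good by construction. The open question concerns real, high-dimensional data, where the same train/test protocol would be applied with the paper's own estimator.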

References

Whereas the estimated parameters of $\hat{\bm{\alpha}}$ and $\hat{\bm{\beta}}$ may provide insights into feature selection (given the high dimensions of both $x$ and $m$ are both high), it is unclear whether the estimated mediation parameters generalize to previously unseen samples.

High-dimensional Many-to-many-to-many Mediation Analysis (2604.02886 - Nguyen et al., 3 Apr 2026) in Section 2.2 (The model)