Representational Dissimilarity Matrices (RDMs)
- Representational Dissimilarity Matrices (RDMs) are summary statistics that capture all pairwise dissimilarities among high-dimensional activation patterns, using metrics such as Euclidean, correlation, or Mahalanobis distance.
- They are constructed through a systematic pipeline involving stimulus selection, activation extraction, normalization, and pairwise dissimilarity computation, often employing cross-validated estimators.
- RDMs play a crucial role in representational similarity analysis, facilitating quantitative comparisons between neural data and deep network models, and supporting dynamic visualization techniques like MDS.
A representational dissimilarity matrix (RDM) is a fundamental summary statistic in neuroscience and machine learning that captures the geometry of high-dimensional activation patterns by encoding all pairwise dissimilarities among a set of stimuli in a given representational space. RDMs provide a compact, symmetric summary that abstracts from channel identity, facilitating rigorous comparison of neural population codes, deep network layers, brain regions, and models.
1. Formal Definition and Core Properties
Given stimuli $s_1, \dots, s_n$ and their $p$-dimensional response vectors $\mathbf{r}_1, \dots, \mathbf{r}_n \in \mathbb{R}^p$ (from neurons, voxels, or DNN units), an RDM is the $n \times n$ matrix $D$ with entries
$D_{ij} = d(\mathbf{r}_i, \mathbf{r}_j)$,
where $d$ is a suitable metric or pseudo-metric. Common choices include:
- Euclidean distance: $d(\mathbf{r}_i, \mathbf{r}_j) = \lVert \mathbf{r}_i - \mathbf{r}_j \rVert_2$
- Squared Euclidean distance: $\lVert \mathbf{r}_i - \mathbf{r}_j \rVert_2^2$
- Correlation distance: $1 - \operatorname{corr}(\mathbf{r}_i, \mathbf{r}_j)$
- Mahalanobis distance $\sqrt{(\mathbf{r}_i - \mathbf{r}_j)^\top \Sigma^{-1} (\mathbf{r}_i - \mathbf{r}_j)}$, with noise covariance $\Sigma$, and its cross-validated variant ("crossnobis")
Key properties of RDMs:
- $D$ is symmetric ($D_{ij} = D_{ji}$)
- Zero diagonal: $D_{ii} = 0$
- Non-negative entries
- Invariance to orthogonal rotations of the response space (Euclidean) or to invertible linear transforms (Mahalanobis/crossnobis), depending on the chosen metric
- Channel-independence: abstracts from neuronal or voxel identity (McClure et al., 2015, Lin et al., 2023, Kriegeskorte et al., 2016)
RDMs encode the full pairwise discriminability structure, efficiently summarizing the representational geometry of the activation space.
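The metrics listed above can be written out directly. A minimal sketch in numpy, where `prec` denotes the precision matrix (inverse noise covariance) assumed for the Mahalanobis case:

```python
import numpy as np

# Per-pair dissimilarity functions corresponding to the metrics above.

def euclidean(r1, r2):
    # L2 norm of the difference between response patterns
    return float(np.linalg.norm(r1 - r2))

def sq_euclidean(r1, r2):
    # squared Euclidean distance
    return float(np.sum((r1 - r2) ** 2))

def correlation_dist(r1, r2):
    # 1 - Pearson correlation between the two patterns
    return 1.0 - np.corrcoef(r1, r2)[0, 1]

def mahalanobis(r1, r2, prec):
    # prec is the precision matrix (inverse noise covariance);
    # with prec = identity this reduces to the Euclidean distance
    d = r1 - r2
    return float(np.sqrt(d @ prec @ d))
```

Note that the correlation distance is only a pseudo-metric: it is invariant to shifting and scaling of each pattern, so distinct patterns can have distance zero.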
2. Construction and Computation Methodologies
The standard RDM construction pipeline involves:
- Stimulus selection: Enumerate a stimulus set $S = \{s_1, \dots, s_n\}$.
- Activation extraction: For each $s_i$, extract a response vector $\mathbf{r}_i$ (e.g., DNN activations, voxel patterns).
- Preprocessing: Often mean-centering per channel, optional variance normalization or L2-normalization (Bersch et al., 2022).
- Pairwise dissimilarity computation: For all pairs $(i, j)$, calculate $D_{ij} = d(\mathbf{r}_i, \mathbf{r}_j)$; enforce symmetry and set the diagonal to zero.
- Vectorization: For later comparison, RDMs are typically vectorized by extracting the upper triangle $\{D_{ij} : i < j\}$.
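The pipeline above can be sketched end-to-end in numpy; the data here are random placeholders standing in for real activations:

```python
import numpy as np

rng = np.random.default_rng(0)
n_stim, n_chan = 8, 50
X = rng.normal(size=(n_stim, n_chan))  # activation extraction: stimuli x channels

# Preprocessing: mean-center each channel across stimuli
Xc = X - X.mean(axis=0, keepdims=True)

# Pairwise Euclidean dissimilarities via the Gram-matrix identity
sq = np.sum(Xc ** 2, axis=1)
D = np.sqrt(np.maximum(sq[:, None] + sq[None, :] - 2 * Xc @ Xc.T, 0.0))
D = (D + D.T) / 2                      # enforce exact symmetry
np.fill_diagonal(D, 0.0)               # zero diagonal by definition

# Vectorization: upper triangle for downstream RDM comparison
rdm_vec = D[np.triu_indices(n_stim, k=1)]
```

The vectorized RDM has $n(n-1)/2$ entries (28 here), which is the representation typically passed to RSA comparison routines.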
Computation can be specialized:
- Cross-validated estimators for unbiased distance (e.g., crossnobis): suppresses measurement-noise bias (Diedrichsen et al., 2020, Kriegeskorte et al., 2016).
- Dynamic RDMs: Calculated at multiple time points across trials (e.g., MEG/fMRI chronometry) to capture representational dynamics (Lin et al., 2019).
- Geometric imputation for incomplete RDMs: missing entries inferred using triangle inequalities and anchor-based median-of-bounds estimation (Moerel et al., 31 May 2025).
- Stochastic RDMs: For stochastic representations, generalize distances via Wasserstein or energy distances between distributions over activations (Duong et al., 2022).
3. Role in Representational Similarity Analysis (RSA) and Model Comparison
RDMs underpin representational similarity analysis, functioning as the core second-order statistic for quantitative comparison:
- RSA workflow: Compute RDMs for each representational space (brain region, model layer, system) and compare vectorized RDMs using similarity metrics (Pearson/Spearman correlation, cosine similarity, WUC) (McClure et al., 2015, Blanchard et al., 2018, Diedrichsen et al., 2020).
- Measurement invariance: RDM comparisons are robust to reordering or rescaling of channels; they depend only on representational geometry.
- Noise ceilings and statistical power: Estimation of attainable similarity between RDMs is bounded by measurement noise; advanced inference leverages cross-validation and whitening of error covariance (Diedrichsen et al., 2020).
- Multi-model explanations: Weighted RSA attempts to explain a target RDM as a convex combination of several candidate model RDMs (Bersch et al., 2022).
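The core RSA comparison step, sketched with a rank-based (Spearman) similarity, which is a common choice because it does not assume a linear relationship between the two RDMs:

```python
import numpy as np

def upper_tri(rdm):
    # vectorize an RDM for comparison
    return rdm[np.triu_indices_from(rdm, k=1)]

def spearman(a, b):
    # Spearman rank correlation between two vectorized RDMs
    # (assumes no ties, as is typical for continuous dissimilarities)
    ra = np.argsort(np.argsort(a)).astype(float)
    rb = np.argsort(np.argsort(b)).astype(float)
    ra -= ra.mean()
    rb -= rb.mean()
    return float(ra @ rb / np.sqrt((ra @ ra) * (rb @ rb)))
```

Because only ranks matter, any monotone transform of one RDM's dissimilarities leaves the Spearman comparison unchanged, which is why it is preferred when the mapping between model and brain dissimilarities is not assumed linear.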
4. Extensions: Geometric, Topological, and Stochastic RDMs
RDM frameworks have been generalized to address RDM sensitivity and specificity limitations:
- Geo-topological matrices (RGTMs): Transform the RDM via a piecewise-linear function to compress noise-sensitive (small) and idiosyncratic (large) distances, focusing on the representational "neighborhood graph" (topology) rather than fine metric geometry. Varying transformation parameters interpolates between pure metric (geometry) and adjacency (topology) representations (Lin et al., 2023).
- Representational geodesic-distance matrices (RGDMs): Compute shortest-path distances on the induced neighborhood graph, emphasizing topological connectivity among stimuli.
- Stochastic RDMs: For systems with trial-to-trial variability, RDMs are calculated using distributions (e.g., mean and covariance) of responses, with the primary distance metrics generalized to Wasserstein or energy distances. This accounts for representational uncertainty and noise structure beyond trial-averaged differences (Duong et al., 2022).
These extensions enhance robustness to intersubject variance, measurement noise, or functional irrelevance of global metric features, and enable the probing of representational topology alongside geometry.
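The geodesic-distance idea can be sketched compactly: threshold the RDM into a neighborhood graph, then take shortest-path distances. This is an illustrative reconstruction (the `radius` neighborhood rule is an assumption, not the exact parameterization of Lin et al., 2023):

```python
import numpy as np

def geodesic_rdm(D, radius):
    """RGDM sketch: keep only edges with dissimilarity <= radius, then
    compute shortest-path (geodesic) distances on the neighborhood graph
    via the Floyd-Warshall recurrence."""
    n = D.shape[0]
    G = np.where(D <= radius, D, np.inf)   # drop long-range edges
    np.fill_diagonal(G, 0.0)
    for k in range(n):                     # Floyd-Warshall shortest paths
        G = np.minimum(G, G[:, [k]] + G[[k], :])
    return G
```

For stimuli whose direct distance exceeds the radius, the geodesic distance is the length of the shortest chain of short hops, so the RGDM emphasizes connectivity structure over global metric geometry.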
5. RDMs in Deep Learning and Cognitive Neuroscience
RDMs are widely deployed both in artificial neural network analysis and human/animal cognitive neuroscience:
- Neural networks: RDMs are computed for internal layers across batches to track representational evolution, guide transfer learning (e.g., representational distance learning), or benchmark against neural data (McClure et al., 2015, Jacob et al., 2020).
- Transfer learning: Auxiliary RDM-matching losses encourage a student network to align its intermediate representations with those of a teacher model or biological system, with demonstrated improvements in generalization, class clustering, and metric structure (McClure et al., 2015).
- Cognitive neuroscience: RDMs extracted from voxel or neural data—typically using crossnobis (to offset measurement noise)—enable channel-invariant comparison of representational geometries across subjects, species, and models (Kriegeskorte et al., 2016).
- Toolboxes: Applications include Net2Brain, which supports constructing, normalizing, and comparing hundreds of DNN/brain RDMs with both standard and weighted RSA and region- or searchlight-based analyses (Bersch et al., 2022).
RDM-based comparisons reveal the match between computational models and empirical neural data, inform architecture search, and can be directly correlated with behavioral and task performance (Blanchard et al., 2018).
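The RDM-matching auxiliary loss used in representational distance learning can be sketched as follows; this is a simplified numpy formulation (mean squared error between vectorized Euclidean RDMs), not the exact training objective of McClure et al. (2015):

```python
import numpy as np

def rdm_vec(X):
    # Euclidean RDM of (stimuli x units) activations, upper triangle
    sq = np.sum(X ** 2, axis=1)
    D = np.sqrt(np.maximum(sq[:, None] + sq[None, :] - 2 * X @ X.T, 0.0))
    return D[np.triu_indices_from(D, k=1)]

def rdm_matching_loss(student_acts, teacher_acts):
    # Penalize mismatch between student and teacher representational
    # geometries; in practice added as an auxiliary term to the task loss.
    s, t = rdm_vec(student_acts), rdm_vec(teacher_acts)
    return float(np.mean((s - t) ** 2))
```

Because the loss depends only on pairwise distances, the student is free to realize the teacher's geometry with entirely different units, which is precisely the channel-independence property that makes RDMs suitable as transfer targets.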
6. Visualization and Dynamic Analysis of RDMs
Visualization frameworks facilitate intuitive and rigorous exploration of RDM-encoded geometry:
- RDM heatmaps: Block structure in the heatmap reveals category clustering and class separation.
- Multidimensional Scaling (MDS): MDS produces low-dimensional embeddings that best preserve RDM distances; Procrustes alignment (pMDS) enables the comparison and animation of the evolution of representational geometries over time, forming "RDM movies" (Lin et al., 2019).
- Dynamic representational trajectories: Category divergence and convergence in aligned MDS space reveal temporal dynamics, hierarchical separation, and recurrent neural processing in biological systems.
These methods provide quantitative and qualitative access to the trial-level evolution and category structure of representations.
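A minimal classical (Torgerson) MDS embedding from an RDM, given here as an illustrative baseline; the pMDS approach of Lin et al. (2019) additionally applies Procrustes alignment across time points:

```python
import numpy as np

def classical_mds(D, k=2):
    """Embed an n x n distance matrix into k dimensions via classical MDS."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J            # double-centered Gram matrix
    w, V = np.linalg.eigh(B)               # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:k]          # keep the top-k components
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))
```

When the RDM comes from points that genuinely live in $k$ dimensions, classical MDS recovers the configuration exactly up to rotation and translation; for higher-dimensional or non-Euclidean RDMs it yields the best low-rank approximation of the double-centered Gram matrix.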
7. Limitations, Technical Challenges, and Ongoing Developments
While RDMs distill relevant aspects of representational geometry, several challenges and limitations arise:
- Noise bias and dependencies: Measurement noise systematically inflates dissimilarities; unbiased, cross-validated estimators are preferred. Dependencies among pairwise distances (covariance structure) undermine the validity of naive similarity metrics, motivating whitening approaches (e.g., WUC) (Diedrichsen et al., 2020, Kriegeskorte et al., 2016).
- Incomplete RDMs: In tasks relying on behavioral pairwise judgments or high-dimensional time-varying data, RDMs may be sparse; geometric reconstruction algorithms fill missing entries via triangle-based median-of-bounds estimation, subject to Euclidean embedding assumptions (Moerel et al., 31 May 2025).
- Metric selection: Choice of distance metric (Euclidean, correlation, Mahalanobis, probabilistic) determines invariances and sensitivity; topological or stochastic RDM variants introduce further trade-offs between robustness and discriminative power (Lin et al., 2023, Duong et al., 2022).
- Inter-stimulus dependency: Global geometry encoded in high-dimensional RDMs depends on the set of stimuli; inclusion/removal of stimuli can alter shortest-path distances or overall structure (Lin et al., 2023).
- Interpretability: In stochastic metric/stochastic shape frameworks, the interplay between mean-based and covariance-based differences can complicate direct interpretability, yet yields richer sensitivity to underlying representational organization (Duong et al., 2022).
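The triangle-based imputation of incomplete RDMs can be sketched as follows. This is inspired by, but not identical to, the median-of-bounds method of Moerel et al.: for each missing entry, the triangle inequality through every fully observed anchor yields a lower and upper bound, and the estimate is taken as the median of the bound midpoints:

```python
import numpy as np

def impute_missing(D):
    """Fill NaN entries of a symmetric RDM using triangle-inequality
    bounds through anchor stimuli (illustrative median-of-midpoints rule)."""
    D = D.copy()
    n = D.shape[0]
    for i, j in np.argwhere(np.isnan(D)):
        if i >= j:
            continue                        # handle each pair once
        mids = []
        for k in range(n):
            if k in (i, j) or np.isnan(D[i, k]) or np.isnan(D[k, j]):
                continue                    # anchor needs both legs observed
            lo = abs(D[i, k] - D[k, j])     # triangle lower bound
            hi = D[i, k] + D[k, j]          # triangle upper bound
            mids.append((lo + hi) / 2)
        if mids:
            D[i, j] = D[j, i] = float(np.median(mids))
    return D
```

The imputed value is guaranteed to lie within the triangle bounds of every anchor used, but it is only an estimate; as the source notes, validity rests on the RDM being (approximately) Euclidean-embeddable.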
Continuing research investigates the optimal matching of metric, topological, and probabilistic summary statistics to scientific objectives in model evaluation, brain-to-model comparison, and dynamic neural analyses.