Necessity of a 3D spatial latent representation for mental rotation
Determine whether the rotation actions in the proposed mechanistic model of mental rotation must operate on the 3D spatial latent representation produced by the Equivariant Neural Renderer, or whether applying them directly to the neuro-symbolic sequence representation generated by the Vision Symbolic Model is enough to yield accurate similarity judgments. In other words, establish whether the 3D latent space is necessary for solving the Shepard–Metzler mental rotation task within this architecture.
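To make the two alternatives concrete, here is a toy sketch. It is not the paper's model. It assumes a hypothetical symbolic encoding in which a Shepard–Metzler-style object is a sequence of unit-step tokens ("+x", "-y", ...), and it stands in for the 3D latent with plain cube-centre coordinates. The names STEPS, rotate_symbolic, to_points, rotate_spatial and same_shape_spatial are all invented for illustration. The sketch shows one rotation applied either by relabelling tokens or by rotating points in space.

```python
import numpy as np

# Hypothetical symbolic vocabulary (an assumption, not the paper's encoding):
# each token is a unit step from one cube to the next along an axis.
STEPS = {
    "+x": (1, 0, 0), "-x": (-1, 0, 0),
    "+y": (0, 1, 0), "-y": (0, -1, 0),
    "+z": (0, 0, 1), "-z": (0, 0, -1),
}
VEC_TO_TOKEN = {v: k for k, v in STEPS.items()}

# 90-degree rotation about the z axis.
ROT_Z_90 = np.array([[0, -1, 0],
                     [1,  0, 0],
                     [0,  0, 1]])


def rotate_symbolic(tokens, rot):
    """Rotate a step sequence by relabelling each token; no 3D grid is built."""
    out = []
    for t in tokens:
        v = rot @ np.array(STEPS[t])
        out.append(VEC_TO_TOKEN[tuple(int(c) for c in v)])
    return out


def to_points(tokens):
    """Expand a step sequence into cube-centre coordinates, starting at the origin."""
    pts = [np.zeros(3, dtype=int)]
    for t in tokens:
        pts.append(pts[-1] + np.array(STEPS[t]))
    return np.stack(pts)


def rotate_spatial(points, rot):
    """Rotate explicit 3D coordinates (a stand-in for acting on a spatial latent)."""
    return points @ rot.T


def same_shape_spatial(a, b):
    """Compare two point sets up to translation."""
    def canon(p):
        return sorted(map(tuple, (p - p.min(axis=0)).tolist()))
    return canon(a) == canon(b)


obj = ["+x", "+x", "+y", "+z", "+z"]
rotated_tokens = rotate_symbolic(obj, ROT_Z_90)
rotated_points = rotate_spatial(to_points(obj), ROT_Z_90)

# Both routes describe the same rotated object in this toy setting.
print(rotated_tokens)
print(same_shape_spatial(to_points(rotated_tokens), rotated_points))
```

In this toy, the two routes agree only because the rotation is a multiple of 90 degrees, so each step token maps onto another token. Whether a learned symbolic sequence supports the rotations needed for the actual task, without passing through a 3D latent, is exactly what the question above leaves open.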
References
We note that in our model the rotation actions could also have been directly applied to the symbolic representations, and we do not know whether the 3D latent space is fully needed.
— A Deep Learning Model of Mental Rotation Informed by Interactive VR Experiments
(2512.13517 - Khazoum et al., 15 Dec 2025) in Section 6 Discussion (Symbolic or Spatial representations?)