Towards Rotation-only Imaging Geometry: Rotation Estimation

Published 16 Nov 2025 in cs.CV | (2511.12415v1)

Abstract: Structure from Motion (SfM) is a critical task in computer vision, aiming to recover the 3D scene structure and camera motion from a sequence of 2D images. The recent pose-only imaging geometry decouples 3D coordinates from camera poses and demonstrates significantly better SfM performance through pose adjustment. Continuing the pose-only perspective, this paper explores the critical relationship between the scene structures, rotation and translation. Notably, the translation can be expressed in terms of rotation, allowing us to condense the imaging geometry representation onto the rotation manifold. A rotation-only optimization framework based on reprojection error is proposed for both two-view and multi-view scenarios. The experiment results demonstrate superior accuracy and robustness performance over the current state-of-the-art rotation estimation methods, even comparable to multiple bundle adjustment iteration results. Hopefully, this work contributes to even more accurate, efficient and reliable 3D visual computing.


Summary

The paper introduces a rotation-only formulation that analytically expresses translation as a function of rotation, reducing optimization to the rotation manifold.
It employs rigorous degeneracy detection and Levenberg-Marquardt optimization, achieving a 27.70% average improvement over the next-best multi-view rotation estimation method.
Experimental results on real and simulated data highlight superior robustness and efficiency compared to conventional full bundle adjustment approaches.

Rotation-only Imaging Geometry for Robust Rotation Estimation

Introduction and Motivation

This work addresses a foundational challenge in computer vision and geometric scene understanding: robust estimation of camera rotation from image sequences, particularly within Structure from Motion (SfM). Mainstream SfM approaches, including bundle adjustment (BA), tightly couple the estimation of rotation, translation, and 3D scene structure, which makes them sensitive to initialization and forces a high-dimensional optimization. Recent work on pose-only imaging geometry has shown that decoupling scene structure from pose can increase both the efficiency and the robustness of SfM pipelines. This paper takes the decoupling one step further by formulating a rotation-only imaging geometry: imaging is parametrized solely on the rotation manifold, and translation is expressed analytically as a function of rotation and observations. This removes the dependence on translation estimation for many core problems and enables robust, efficient, and accurate rotation estimation in both two-view and multi-view scenarios.

Figure 1: An illustration of reconstruction using the Lund dataset; views form a connectivity graph where nodes include pose and observation information and edges encode inter-view constraints.

Theoretical Framework

Rotation-only Parametrization

The central contribution is the derivation of an analytical representation in which translation is expressed in terms of observed points and rotations, so that rotations can be refined independently. By leveraging the pairwise pose-only (PPO) constraints, the geometry is condensed to a lower-dimensional subspace: the rotation manifold. Position information is marginalized, shifting the optimization to $\mathrm{SO}(3)$, which substantially reduces the parameter search space compared to conventional pose-plus-structure optimization.
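As a point of reference, the textbook two-view epipolar constraint already shows why the translation direction can be written in terms of the rotation. This is only a sketch under the assumed convention $\boldsymbol{X}_2 = R\boldsymbol{X}_1 + \boldsymbol{t}$ with bearing vectors $\boldsymbol{x}_{1,i}$ and $\boldsymbol{x}_{2,i}$; the symbols $N(R)$ and $\boldsymbol{n}_i$ are introduced here for illustration and are not the paper's PPO notation:

$$
\boldsymbol{x}_{2,i}^{\top}\,[\boldsymbol{t}]_{\times}\,R\,\boldsymbol{x}_{1,i} = 0
\quad\Longleftrightarrow\quad
\boldsymbol{t}^{\top}\boldsymbol{n}_i = 0,
\qquad
\boldsymbol{n}_i = (R\,\boldsymbol{x}_{1,i}) \times \boldsymbol{x}_{2,i}.
$$

Stacking the $\boldsymbol{n}_i^{\top}$ as rows gives a matrix that depends only on the rotation and the observations:

$$
N(R)\,\boldsymbol{t} = \boldsymbol{0},
\qquad
N(R) = \begin{bmatrix} \boldsymbol{n}_1^{\top} \\ \vdots \\ \boldsymbol{n}_m^{\top} \end{bmatrix}.
$$

When $N(R)$ has rank 2, its null vector fixes $\boldsymbol{t}$ up to scale and sign, so $\boldsymbol{t}$ becomes a function of $R$ rather than a free variable.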

Observability and Scene Structure Analysis

A rank analysis of the joint observation matrix characterizes the conditions under which translation is observable from a given configuration. Three cases are identified:

PR/B/I (Pure Rotation/Baseline/Infinity): Cameras have pure rotational motion or all scene points are at infinity or collinear with the baseline; translation is unobservable.
Holoplane: All points and both cameras are coplanar; translation indeterminacy exists within the plane.
RankRegular: Generic, fully spatial scenes; translation is uniquely determined up to scale.

This classification enables early detection and exclusion of configurations, such as the RotationSingular cases, that can cause traditional methods to fail (see Table 1 in the paper).
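The three cases can be illustrated with a small numerical rank test. The sketch below is not the paper's detector and does not reproduce its $v_{\mathrm{rs}}$ value; it assumes the convention $\boldsymbol{X}_2 = R\boldsymbol{X}_1 + \boldsymbol{t}$, stacks the vectors $(R\boldsymbol{x}_1) \times \boldsymbol{x}_2$, and counts singular values above a hypothetical tolerance `tol`. The function names and the synthetic data are made up for illustration.

```python
import numpy as np

def rotation_about_y(angle):
    """Rotation matrix for a rotation of `angle` radians about the y-axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def bearings(points):
    """Normalize 3D points (n, 3) to unit bearing vectors."""
    return points / np.linalg.norm(points, axis=1, keepdims=True)

def classify_scene(R, x1, x2, tol=1e-6):
    """Classify a two-view configuration by the numerical rank of the stacked normals.

    Row i of N is (R x1_i) x x2_i, which is orthogonal to the translation.
    `tol` is a hypothetical threshold; noisy data would need a larger value.
    """
    N = np.cross(x1 @ R.T, x2)
    s = np.linalg.svd(N, compute_uv=False)
    rank = int(np.sum(s > tol * np.sqrt(len(x1))))
    if rank == 0:
        return "PR/B/I"       # every normal vanishes: translation unobservable
    if rank == 1:
        return "Holoplane"    # normals are parallel: translation free within a plane
    return "RankRegular"      # translation direction fixed up to scale and sign

# Synthetic, noise-free data (all values are made up for illustration).
rng = np.random.default_rng(0)
R = rotation_about_y(0.1)
P = rng.uniform([-2, -2, 4], [2, 2, 8], size=(30, 3))   # points in the frame of view 1
t = np.array([1.0, 0.0, 0.2])

def second_view(points, translation):
    """Bearing vectors seen by view 2 under X2 = R X1 + t."""
    return bearings(points @ R.T + translation)

regular = classify_scene(R, bearings(P), second_view(P, t))
pure_rotation = classify_scene(R, bearings(P), second_view(P, np.zeros(3)))
P_flat = P * np.array([1.0, 0.0, 1.0])                   # flatten the points into the plane y = 0
coplanar = classify_scene(R, bearings(P_flat), second_view(P_flat, t))
print(regular, pure_rotation, coplanar)
```

In the flattened case both camera centers also lie in the plane $y = 0$, because the rotation is about the y-axis and the translation has no y-component.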

Figure 2: $v_{\mathrm{rs}}$ detection value for identifying degenerate or singular scene configurations based on translation observability.

Figure 3: Detection values of $v_{\mathrm{rs}}$ highlight separation between RotationSingular and regular scenes, enabling robust exclusion.

Rotation Manifold Reprojection Error

Building upon this, the reprojection error is recast purely on the rotation manifold. The authors derive analytically that, under generic (RankRegular) conditions, the residual is invariant to the scale and sign of the translation. In degenerate scenes (PR/B/I, Holoplane), the constraints automatically collapse to the rotation manifold. This property allows the rotation-induced reprojection error to be minimized directly, without translation variables.
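The scale and sign invariance can be checked numerically with a generic triangulate-then-project residual. The sketch below is a stand-in under the assumed convention $\boldsymbol{X}_2 = R\boldsymbol{X}_1 + \boldsymbol{t}$, not the paper's exact residual: it recovers the translation direction as the null vector of the stacked normals, triangulates a depth along each ray of view 1, and compares the projection in view 2 with the observation. All function names and data are hypothetical.

```python
import numpy as np

def exp_so3(w):
    """Rodrigues formula: rotation vector (3,) -> rotation matrix (3, 3)."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    k = w / theta
    K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

def translation_from_rotation(R, x1, x2):
    """Unit translation direction as the null vector of the stacked normals."""
    N = np.cross(x1 @ R.T, x2)        # row i: (R x1_i) x x2_i, orthogonal to t
    return np.linalg.svd(N)[2][-1]    # right singular vector of the smallest singular value

def reprojection_residuals(R, t, x1, x2):
    """Triangulate along the rays of view 1, project into view 2, and compare
    with the observations in normalized image coordinates.
    Assumes a non-degenerate scene (rays with nonzero parallax)."""
    a = x1 @ R.T                                             # rays of view 1 rotated into view 2
    ca, ct = np.cross(x2, a), np.cross(x2, t)
    z = -np.sum(ca * ct, axis=1) / np.sum(ca * ca, axis=1)   # least-squares depth per point
    X2 = z[:, None] * a + t                                  # reconstructed point in view 2
    return (X2[:, :2] / X2[:, 2:] - x2[:, :2] / x2[:, 2:]).ravel()

def rotation_only_residuals(R, x1, x2):
    """Residual as a function of R alone: t is recomputed from R."""
    return reprojection_residuals(R, translation_from_rotation(R, x1, x2), x1, x2)

# Synthetic, noise-free data (values made up for illustration).
rng = np.random.default_rng(0)
P = rng.uniform([-2, -2, 4], [2, 2, 8], size=(30, 3))
R_true = exp_so3(np.array([0.0, 0.1, 0.0]))
t_true = np.array([1.0, 0.0, 0.2])
x1 = P / np.linalg.norm(P, axis=1, keepdims=True)
X2 = P @ R_true.T + t_true
x2 = X2 / np.linalg.norm(X2, axis=1, keepdims=True)

R_off = exp_so3(np.array([0.0, 0.0, 0.03])) @ R_true         # rotation with a small roll error
t_off = translation_from_rotation(R_off, x1, x2)
r_exact = rotation_only_residuals(R_true, x1, x2)
r_off = reprojection_residuals(R_off, t_off, x1, x2)
r_off_rescaled = reprojection_residuals(R_off, -3.7 * t_off, x1, x2)
```

Rescaling or negating the translation rescales the triangulated depths by the same factor, so the projected point does not move; this is why `r_off` and `r_off_rescaled` agree.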

Figure 4: Generation of projected point $\boldsymbol{X}$ by reconstructing along the projection ray and projecting into the target view; this is central to the reprojection computation on the rotation manifold.

Figure 5: Generation mechanism of reprojection residuals $\boldsymbol{V}$ in the rotation-only framework, illustrating multi-view constraints aggregation.

Algorithmic Innovations

The proposed framework is realized concretely through a series of algorithms:

Degeneracy Detection: Fast algorithms to evaluate scene type and selectively exclude degenerate RotationSingular configurations.
Analytic Translation Calculation: Translation direction is extracted analytically (via matrix nullspace/eigenanalysis) from the joint observation matrix, parameterized by rotations and image points.
Levenberg-Marquardt Optimization: Both two-view and multi-view rotation estimation are realized via LM optimization solely over rotations on $\mathrm{SO}(3)$.
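The last two steps can be sketched together for the two-view case. The code below is a minimal illustration, not the authors' implementation: it assumes the convention $\boldsymbol{X}_2 = R\boldsymbol{X}_1 + \boldsymbol{t}$, recomputes the translation direction from the current rotation as a null vector, and runs a hand-written Levenberg-Marquardt loop with a forward-difference Jacobian and left-multiplicative updates on $\mathrm{SO}(3)$. The function names, damping schedule, and synthetic data are hypothetical choices.

```python
import numpy as np

def exp_so3(w):
    """Rodrigues formula: rotation vector (3,) -> rotation matrix (3, 3)."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    k = w / theta
    K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

def residuals(R, x1, x2):
    """Rotation-only residual: translation and depths are recomputed from R."""
    a = x1 @ R.T
    t = np.linalg.svd(np.cross(a, x2))[2][-1]                # null vector of the stacked normals
    ca, ct = np.cross(x2, a), np.cross(x2, t)
    z = -np.sum(ca * ct, axis=1) / np.sum(ca * ca, axis=1)   # least-squares depth per point
    X2 = z[:, None] * a + t
    return (X2[:, :2] / X2[:, 2:] - x2[:, :2] / x2[:, 2:]).ravel()

def refine_rotation(R0, x1, x2, iters=30, lam=1e-3, eps=1e-6):
    """Levenberg-Marquardt over SO(3) with updates R <- exp(step) R."""
    R, r = R0, residuals(R0, x1, x2)
    for _ in range(iters):
        J = np.empty((r.size, 3))
        for k in range(3):                                   # forward-difference Jacobian
            d = np.zeros(3)
            d[k] = eps
            J[:, k] = (residuals(exp_so3(d) @ R, x1, x2) - r) / eps
        H = J.T @ J
        damped = H + lam * np.diag(np.diag(H)) + 1e-12 * np.eye(3)
        step = np.linalg.solve(damped, -J.T @ r)
        R_new = exp_so3(step) @ R
        r_new = residuals(R_new, x1, x2)
        if r_new @ r_new < r @ r:                            # accept the step, reduce damping
            R, r, lam = R_new, r_new, lam * 0.1
        else:                                                # reject the step, increase damping
            lam *= 10.0
    return R

# Synthetic, noise-free data (values made up for illustration).
rng = np.random.default_rng(1)
P = rng.uniform([-3, -3, 3], [3, 3, 9], size=(40, 3))
R_true = exp_so3(np.array([0.05, -0.2, 0.1]))
t_true = np.array([1.0, 0.2, 0.1])
x1 = P / np.linalg.norm(P, axis=1, keepdims=True)
X2 = P @ R_true.T + t_true
x2 = X2 / np.linalg.norm(X2, axis=1, keepdims=True)

R0 = exp_so3(np.array([0.02, -0.03, 0.02])) @ R_true         # perturbed initial guess
R_est = refine_rotation(R0, x1, x2)
err_before = np.linalg.norm(R0 - R_true)
err_after = np.linalg.norm(R_est - R_true)
```

Only three parameters are updated per iteration; the translation never appears as an optimization variable. A production implementation would more likely use analytic Jacobians and a robust loss, which this sketch omits.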

The multi-view optimization leverages weighted reprojection consistency aggregated across the view graph, ensuring maximal utilization of geometric constraints with minimal parameter overhead.
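One way to picture the aggregation is to stack weighted two-view residuals over the edges of the view graph, with each relative rotation formed from the absolute rotations of its two views. The sketch below is an assumption-laden illustration, not the paper's GRRM cost: the per-edge weights (here the match count), the residual definition, and all names are hypothetical, and the convention is $\boldsymbol{X}_i = R_i\boldsymbol{X}_w + \boldsymbol{t}_i$.

```python
import numpy as np

def exp_so3(w):
    """Rodrigues formula: rotation vector (3,) -> rotation matrix (3, 3)."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    k = w / theta
    K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

def pair_residuals(R, x1, x2):
    """Two-view rotation-only residual; translation is recomputed from R."""
    a = x1 @ R.T
    t = np.linalg.svd(np.cross(a, x2))[2][-1]
    ca, ct = np.cross(x2, a), np.cross(x2, t)
    z = -np.sum(ca * ct, axis=1) / np.sum(ca * ca, axis=1)
    X2 = z[:, None] * a + t
    return (X2[:, :2] / X2[:, 2:] - x2[:, :2] / x2[:, 2:]).ravel()

def multiview_residuals(rotations, edges):
    """Stack weighted two-view residuals over the edges of a view graph.

    rotations: list of absolute rotations R_i (world to camera).
    edges: list of (i, j, x_i, x_j, weight); the weights are hypothetical.
    """
    blocks = []
    for i, j, xi, xj, w in edges:
        R_ij = rotations[j] @ rotations[i].T     # relative rotation from view i to view j
        blocks.append(np.sqrt(w) * pair_residuals(R_ij, xi, xj))
    return np.concatenate(blocks)

# Three synthetic, noise-free views of the same points (values made up).
rng = np.random.default_rng(2)
Xw = rng.uniform([-3, -3, 4], [3, 3, 9], size=(40, 3))
Rs = [exp_so3(np.array([0.0, 0.15 * (i - 1), 0.05 * i])) for i in range(3)]
ts = [np.array([0.8 * (i - 1), 0.1 * i, 0.0]) for i in range(3)]
obs = []
for R, t in zip(Rs, ts):
    Xc = Xw @ R.T + t
    obs.append(Xc / np.linalg.norm(Xc, axis=1, keepdims=True))
edges = [(i, j, obs[i], obs[j], float(len(Xw))) for i, j in [(0, 1), (1, 2), (0, 2)]]

r_true = multiview_residuals(Rs, edges)
Rs_bad = [Rs[0], exp_so3(np.array([0.0, 0.0, 0.03])) @ Rs[1], Rs[2]]   # roll error on view 1
r_bad = multiview_residuals(Rs_bad, edges)
G = exp_so3(np.array([0.3, -0.2, 0.5]))                                # common change of world frame
r_bad_gauge = multiview_residuals([R @ G for R in Rs_bad], edges)
```

Because only relative rotations enter the cost, applying the same rotation `G` to every view leaves the residuals unchanged. A solver built on such a cost would therefore need to fix one view, or otherwise remove this gauge freedom.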

Experimental Evaluation

Scene Identification

In extensive Monte Carlo simulations and on real-world data (Strecha, Lund), the degeneracy identification method achieves a 100% recognition rate, even under substantial noise (up to 10 pixels). This confirms the effectiveness of the low-rank and $v_{\mathrm{rs}}$ metrics in robustly excluding problematic configurations that cause classical estimators to fail or diverge.

Figure 6: Illustration of pure rotation scene; all cameras coincide spatially, exemplifying the PR/B/I degeneracy.

Two-view Rotation Estimation

Across various controlled settings (noise, point count, scene geometry) and on challenging simulated and real datasets, the rotation-only approach (termed TRRM in the paper) consistently outperforms both classical and state-of-the-art competitors, including bundle adjustment (BA), pose adjustment (PA), and the methods of Kneip et al. The reported accuracy gains are:

Average improvement over next-best method: 17.01%
Maintains highest accuracy as observation noise, scene depth, or rotation amplitude increases.

Figure 7: Noise robustness: TRRM and PA maintain low error, outperforming alternatives as observation noise increases.

Figure 8: Circular motion multi-view scenario; cameras arranged on a circle with points above, used for multi-view experiments.

Multi-view Rotation Estimation

The proposed global rotation-only manifold method (GRRM) is compared with leading rotation averaging and BA variants. The algorithm achieves:

27.70% average improvement over the next-best multi-view rotation estimation competitor.
Accuracy comparable to four full rounds of BA as implemented in the OpenMVG pipeline, despite only optimizing over rotations, and requiring substantially lower computational resources.

Figure 9: Circular motion multi-view experiment; illustrates the challenging spatial configuration and the effectiveness of the GRRM approach.

Detailed experiments further show that the method remains robust under sparse matching and dense point distributions and is not affected by a decrease in the number of matched views, outperforming BA and PA in these scenarios.

Implications and Future Directions

Theoretical Implications

The analytical decoupling of translation from rotation in imaging geometry provides deeper insight into the intrinsic structure of multiview geometry, challenging the traditional epipolar-centric paradigm. The work strengthens the theoretical foundation for understanding when and how translation contributes to image formation constraints, and when it can be marginalized without loss of accuracy.

Practical Impact

The rotation-only formulation leads directly to:

Lower-dimensional optimization for robust initialization and refinement.
Superior accuracy and robustness in difficult (singular/near-singular) or noisy scenes.
Greatly reduced computational cost compared to full BA, enabling scalability to very large-scale reconstruction and rapid updates in real-time applications (e.g., SLAM, robotics, AR/VR).

Future Developments

This work lays the foundation for several directions:

Extension to dynamic or partially calibrated scenarios (varying intrinsics or rolling shutter).
Joint integration with semantic and temporal priors to further stabilize in challenging environments.
Applicability to distributed or federated SLAM/SfM frameworks exploiting the reduced parameterization.

Conclusion

By formalizing and exploiting the rotation-only structure of imaging geometry, this research provides a powerful, theoretically sound, and empirically validated framework for robust, efficient rotation estimation in geometric vision. The framework simultaneously detects and handles scene degeneracies, analytically eliminates translation, and delivers superior accuracy, even rivaling computationally intensive full bundle adjustment. This unlocks practical and theoretical advances in 3D vision, with immediate benefits for large-scale and real-time vision-based applications.

Reference: "Towards Rotation-only Imaging Geometry: Rotation Estimation" (2511.12415)
