A Theory of the NEPv Approach for Optimization On the Stiefel Manifold
Abstract: The NEPv approach has been increasingly used for optimization on the Stiefel manifold arising from machine learning. Generally speaking, the approach first turns the first-order optimality condition, also known as the KKT condition, into a nonlinear eigenvalue problem with eigenvector dependency (NEPv) or a nonlinear polar decomposition with orthogonal factor dependency (NPDo), and then solves the nonlinear problem via some variation of the self-consistent-field (SCF) iteration. The difficulty, however, lies in designing a proper SCF iteration so that a maximizer is found at the end. Currently, each use of the approach is very much individualized, especially in the convergence analysis that shows whether or not the approach works. In this paper, a unifying framework is established, built upon some basic assumptions. If the basic assumptions are satisfied, global convergence to a stationary point is guaranteed, and the objective function increases monotonically during the SCF iterative process that leads to the stationary point. A notion of atomic functions is also proposed, which includes commonly used matrix traces of linear and quadratic forms as special cases. It is shown that the basic assumptions are satisfied by atomic functions and by convex compositions of atomic functions. Together they provide a large collection of objectives for which the NEPv/NPDo approach is guaranteed to work.
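To make the NPDo flavor of the approach concrete, here is a minimal sketch on an assumed toy objective, f(P) = tr(PᵀAP) + 2 tr(PᵀD) with A symmetric positive semidefinite, maximized over matrices P with orthonormal columns. This is not the paper's general framework or algorithm. It is only one simple instance in which the KKT condition reads H(P) = PΛ with H(P) = 2(AP + D) and Λ symmetric positive semidefinite, so that an SCF-style step replaces P by the orthogonal polar factor of H(P). Because this particular f is convex, each such step cannot decrease the objective. All function and variable names below are hypothetical and chosen for illustration.

```python
import numpy as np

def polar_orthogonal_factor(H):
    """Orthogonal polar factor of a tall matrix H, computed via a thin SVD."""
    U, _, Vt = np.linalg.svd(H, full_matrices=False)
    return U @ Vt

def objective(P, A, D):
    """Assumed toy objective f(P) = tr(P^T A P) + 2 tr(P^T D)."""
    return np.trace(P.T @ A @ P) + 2.0 * np.trace(P.T @ D)

def scf_npdo(A, D, P0, max_iter=500, tol=1e-10):
    """SCF-style iteration: P <- orthogonal polar factor of H(P) = 2(AP + D)."""
    P = P0
    history = [objective(P, A, D)]
    for _ in range(max_iter):
        H = 2.0 * (A @ P + D)            # Euclidean gradient of f at P
        P = polar_orthogonal_factor(H)   # maximizes tr(P^T H) over the Stiefel manifold
        history.append(objective(P, A, D))
        # Stop once the objective stagnates (relative increase below tol).
        if history[-1] - history[-2] <= tol * max(1.0, abs(history[-1])):
            break
    return P, history

# Small random test problem (assumed data, for illustration only).
rng = np.random.default_rng(0)
n, k = 8, 3
M = rng.standard_normal((n, n))
A = M @ M.T                              # symmetric positive semidefinite
D = rng.standard_normal((n, k))
P0, _ = np.linalg.qr(rng.standard_normal((n, k)))  # random orthonormal start
P, history = scf_npdo(A, D, P0)
```

The list `history` records the objective value at each iterate, so the monotonic increase can be inspected directly. For objectives that are not convex in P, this plain iteration carries no such guarantee, and that is the situation the paper's basic assumptions are meant to address.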