Linear Distance Metric Learning with Noisy Labels
Abstract: In linear distance metric learning, we are given data in one Euclidean metric space and the goal is to find an appropriate linear map to another Euclidean metric space that respects certain distance conditions as faithfully as possible. In this paper, we formalize a simple and elegant method that reduces the problem to a general continuous convex loss optimization problem, and for different noise models we derive the corresponding loss functions. We show that even when the data is noisy, the ground truth linear metric can be learned to any precision given access to enough samples, and we provide a corresponding sample complexity bound. Moreover, we present an effective way to truncate the learned model to a low-rank model that provably maintains accuracy both in the loss function and in the parameters; these are the first results of this type. Several experimental observations on synthetic and real data sets support and inform our theoretical results.
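The abstract outlines a two-stage pipeline: learn a linear metric by minimizing a convex loss over noisy pair labels, then truncate it to low rank. Below is a minimal, hypothetical sketch of that pipeline, not the authors' exact formulation: it assumes similar/dissimilar pair labels under a logistic noise model, parameterizes the squared distance by a positive semidefinite matrix M, runs projected gradient descent on a logistic loss, and truncates M by keeping its top eigenpairs. All function names and hyperparameters here are illustrative.

```python
# Hypothetical sketch (not the authors' exact method): learn a PSD matrix M with
# squared distance d_M(x, y) = (x - y)^T M (x - y) from noisy pair labels by
# minimizing a convex (logistic) loss, then truncate M to a rank-k linear map L.
import numpy as np

def pair_features(X_a, X_b):
    """Outer products (x - y)(x - y)^T, so that d_M = <M, D> is linear in M."""
    diffs = X_a - X_b                                # (n, d)
    return np.einsum('ni,nj->nij', diffs, diffs)     # (n, d, d)

def fit_metric(X_a, X_b, labels, n_iters=500, lr=0.05):
    """Projected gradient descent on a logistic loss.

    labels: +1 for dissimilar pairs, -1 for similar pairs (illustrative convention).
    After each step, M is projected back onto the PSD cone so the learned
    bilinear form remains a valid (pseudo-)metric.
    """
    n, d = X_a.shape
    D = pair_features(X_a, X_b)
    M, b = np.eye(d), 0.0                            # metric matrix and threshold
    for _ in range(n_iters):
        margins = labels * (np.einsum('ij,nij->n', M, D) - b)
        w = -labels / (1.0 + np.exp(margins))        # d(logistic loss)/d(distance)
        grad_M = np.einsum('n,nij->ij', w, D) / n
        grad_b = -w.mean()
        M -= lr * grad_M
        b -= lr * grad_b
        evals, evecs = np.linalg.eigh(M)             # PSD projection
        M = (evecs * np.clip(evals, 0.0, None)) @ evecs.T
    return M, b

def truncate_rank(M, k):
    """Keep the top-k eigenpairs of M; return a rank-k map L with M ~= L^T L."""
    evals, evecs = np.linalg.eigh(M)
    idx = np.argsort(evals)[::-1][:k]
    return np.sqrt(np.clip(evals[idx], 0.0, None))[:, None] * evecs[:, idx].T
```

Projecting onto the PSD cone after each step keeps d_M a valid pseudo-metric, and the returned L maps inputs into a k-dimensional Euclidean space, mirroring the low-rank truncation the abstract describes.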