Stochastic gradient descent estimation of generalized matrix factorization models with application to single-cell RNA sequencing data

Published 29 Dec 2024 in stat.CO and stat.ML | (2412.20509v1)

Abstract: Single-cell RNA sequencing allows the quantification of gene expression at the individual cell level, enabling the study of cellular heterogeneity and gene expression dynamics. Dimensionality reduction is a common preprocessing step that simplifies the visualization, clustering, and phenotypic characterization of samples. This step, often performed using principal component analysis or closely related methods, is challenging because of the size and complexity of the data. In this work, we present a generalized matrix factorization model that assumes a general exponential dispersion family distribution, and we show that many of the approaches proposed in the single-cell dimensionality reduction literature can be seen as special cases of this model. Furthermore, we propose a scalable adaptive stochastic gradient descent algorithm that estimates the model efficiently, enabling the analysis of millions of cells. We also introduce a novel warm-start initialization method, designed to accelerate convergence and increase the precision of the final estimates. Moreover, we discuss strategies for dealing with missing values and for model selection. We benchmark the proposed algorithm against state-of-the-art methods through extensive numerical experiments and showcase its use in real-world biological applications. The proposed method systematically outperforms existing methods for both generalized and non-negative matrix factorization, running faster while maintaining, or even improving, matrix reconstruction fidelity and accuracy in biological signal extraction. Finally, all the methods discussed here are implemented in an efficient open-source R package, sgdGMF, available at github/CristianCastiglione/sgdGMF.
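To make the idea in the abstract concrete, the following is a minimal sketch of one special case: a Poisson matrix factorization with log link, E[Y_ij] = exp(u_i' v_j), fitted by minibatch stochastic gradient descent over rows. It is not the sgdGMF implementation and does not reproduce the paper's algorithm. The function names (poisson_deviance, fit_poisson_gmf), the AdaGrad-style step sizes, the random initialization, and all default settings are assumptions chosen for illustration; the warm-start initialization, missing-value handling, and model selection mentioned in the abstract are omitted.

```python
import numpy as np


def poisson_deviance(Y, U, V):
    """Mean Poisson deviance of Y under the mean matrix exp(U V^T)."""
    mu = np.exp(np.clip(U @ V.T, -20, 20))
    safe_Y = np.where(Y > 0, Y, 1.0)  # avoids log(0); y*log(y) is 0 at y = 0
    term = np.where(Y > 0, Y * np.log(safe_Y / mu), 0.0)
    return 2.0 * np.mean(term - (Y - mu))


def fit_poisson_gmf(Y, rank=2, n_epochs=50, batch_size=32, lr=0.05, seed=0):
    """Illustrative row-minibatch SGD for a Poisson log-link factorization.

    Assumption: AdaGrad-style per-coordinate step sizes stand in for the
    adaptive scheme of the paper, which is not reproduced here.
    """
    rng = np.random.default_rng(seed)
    n, m = Y.shape
    U = 0.1 * rng.standard_normal((n, rank))  # row (cell) factors
    V = 0.1 * rng.standard_normal((m, rank))  # column (gene) factors
    G_U = np.zeros_like(U)  # running sums of squared gradients
    G_V = np.zeros_like(V)
    eps = 1e-8

    for _ in range(n_epochs):
        perm = rng.permutation(n)
        for start in range(0, n, batch_size):
            rows = perm[start:start + batch_size]
            # Mean under the log link, clipped for numerical stability.
            mu = np.exp(np.clip(U[rows] @ V.T, -20, 20))
            # Derivative of the Poisson negative log-likelihood w.r.t. eta.
            resid = mu - Y[rows]
            grad_U = resid @ V
            grad_V = resid.T @ U[rows]
            G_U[rows] += grad_U ** 2
            G_V += grad_V ** 2
            U[rows] -= lr * grad_U / (np.sqrt(G_U[rows]) + eps)
            V -= lr * grad_V / (np.sqrt(G_V) + eps)
    return U, V


if __name__ == "__main__":
    # Simulated counts from a rank-2 Poisson model (synthetic, not real data).
    rng = np.random.default_rng(1)
    U0 = 0.7 * rng.standard_normal((200, 2))
    V0 = 0.7 * rng.standard_normal((50, 2))
    Y = rng.poisson(np.exp(U0 @ V0.T)).astype(float)

    U, V = fit_poisson_gmf(Y, rank=2)
    null = poisson_deviance(Y, np.zeros_like(U), np.zeros_like(V))
    print("null deviance:", null, "fitted deviance:", poisson_deviance(Y, U, V))
```

Only one minibatch of rows is touched per update, so the cost of a step does not depend on the total number of cells; this is the property that makes stochastic gradient approaches attractive at the scale the abstract describes. Other exponential dispersion family members would change only the link and the residual term in this sketch.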
