Constrained least squares simplicial-simplicial regression
Abstract: Simplicial-simplicial regression refers to the regression setting where both the response and predictor variables lie within the simplex space, i.e. they are compositional. For this setting, constrained least squares, where the regression coefficients themselves lie within the simplex, is proposed. The model is transformation-free, although adopting a power transformation is straightforward; it can treat more than one compositional dataset as predictors and offers the possibility of weights among the simplicial predictors. Among the model's advantages are its ability to treat zeros in a natural way and a highly computationally efficient algorithm for estimating its coefficients. Resampling-based hypothesis testing procedures are employed for inference, such as testing linear independence and equality of the regression coefficients to some pre-specified values. The strategy behind the formulation of the new model is related to an existing methodology of the same spirit, showcasing how other similar models can be employed as well. Finally, the performance of the proposed technique and its comparison to the existing methodology are assessed using simulation studies and real data examples.
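To make the idea of simplex-constrained coefficients concrete, here is a minimal sketch in plain Python. It assumes the coefficient matrix B has one row per predictor component, with each row constrained to the simplex, so that the fitted values XB are themselves compositions. The abstract does not spell out the exact form of the constraint or the estimation algorithm, so this reading, the function names, and the toy data are all illustrative assumptions. The solver used here is ordinary projected gradient descent, swapped in for simplicity; it is not claimed to be the paper's algorithm.

```python
def matmul(A, B):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def project_to_simplex(v):
    """Euclidean projection of a vector onto {w : w >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for i, ui in enumerate(u, start=1):
        cumulative += ui
        t = (cumulative - 1.0) / i
        if ui - t > 0:  # keep the threshold from the last index that passes
            theta = t
    return [max(x - theta, 0.0) for x in v]


def fit_simplicial_cls(X, Y, n_iter=500):
    """Minimise 0.5 * ||Y - X B||_F^2 with every row of B on the simplex.

    Assumption: rows of B are constrained (illustrative reading of the abstract).
    """
    p, d = len(X[0]), len(Y[0])
    B = [[1.0 / d] * d for _ in range(p)]  # start from uniform rows
    Xt = [list(col) for col in zip(*X)]
    # trace(X^T X) bounds the largest eigenvalue, so 1/trace is a safe step size
    step = 1.0 / sum(x * x for row in X for x in row)
    for _ in range(n_iter):
        fitted = matmul(X, B)
        resid = [[f - y for f, y in zip(fr, yr)] for fr, yr in zip(fitted, Y)]
        grad = matmul(Xt, resid)  # gradient X^T (X B - Y)
        B = [project_to_simplex([b - step * g for b, g in zip(br, gr)])
             for br, gr in zip(B, grad)]
    return B


# Toy compositional predictors (rows sum to 1) and a hypothetical true B.
X = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8],
     [0.4, 0.3, 0.3], [0.2, 0.5, 0.3], [0.3, 0.2, 0.5]]
B_true = [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3], [0.2, 0.2, 0.6]]
Y = matmul(X, B_true)  # noise-free compositional responses
B_hat = fit_simplicial_cls(X, Y)
```

Because each row of X and each row of B sums to one with non-negative entries, every fitted row of XB is again a composition, and zeros in X or Y need no special handling. On this noise-free toy data the estimate should approach B_true; with real data a dedicated quadratic programming solver would likely be faster than this simple iteration.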