Scalable Higher-Order Tensor Product Spline Models
Abstract: In the current era of vast data and transparent machine learning, it is essential for techniques to operate at a large scale while providing a clear mathematical understanding of their internal workings. Although interpretable semi-parametric regression methods that account for non-linearity in the data already exist for large-scale applications, the complexity of these models is still often limited. One of the main challenges is the absence of interactions in these models, which are left out for the sake of better interpretability but also because of impractical computational costs. To overcome this limitation, we propose a new approach that uses a factorization method to derive a highly scalable higher-order tensor product spline model. Our method allows for the incorporation of all (higher-order) interactions of non-linear feature effects while keeping computational costs proportional to those of a model without interactions. We further develop a meaningful penalization scheme and examine the induced optimization problem. We conclude by evaluating the predictive and estimation performance of our method.
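The abstract does not spell out the factorization itself. As a rough illustration of why a factorized interaction term can cost about as much as a model without interactions, the sketch below uses the standard factorization-machine identity for pairwise interactions. This is an assumption about the general type of factorization, not the paper's exact model: the scalars `z` stand in for per-feature terms (in a spline model these would be basis evaluations), and the names `pairwise_naive`, `pairwise_factorized`, `V`, `p`, and `k` are hypothetical.

```python
import itertools
import random


def pairwise_naive(z, V):
    """Sum over all pairs i < j of <V[i], V[j]> * z[i] * z[j]; cost O(p^2 * k)."""
    total = 0.0
    for i, j in itertools.combinations(range(len(z)), 2):
        dot = sum(a * b for a, b in zip(V[i], V[j]))
        total += dot * z[i] * z[j]
    return total


def pairwise_factorized(z, V):
    """Same quantity via 0.5 * ((sum a_i)^2 - sum a_i^2) per factor; cost O(p * k)."""
    k = len(V[0])
    total = 0.0
    for f in range(k):
        # a_i = V[i][f] * z[i]; one pass over the p terms per latent factor
        s = sum(V[i][f] * z[i] for i in range(len(z)))
        s_sq = sum((V[i][f] * z[i]) ** 2 for i in range(len(z)))
        total += 0.5 * (s * s - s_sq)
    return total


random.seed(0)
p, k = 6, 3  # hypothetical sizes: p feature terms, k latent factors
z = [random.gauss(0, 1) for _ in range(p)]
V = [[random.gauss(0, 1) for _ in range(k)] for _ in range(p)]

# Both routes should agree up to floating-point error.
print(abs(pairwise_naive(z, V) - pairwise_factorized(z, V)) < 1e-9)
```

The factorized version touches each of the `p` terms once per latent factor, so its cost grows linearly in `p` rather than quadratically. That is the sense in which interaction terms can become roughly as cheap as main effects under this kind of factorization.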