Output-Constrained Decision Trees
Abstract: Incorporating domain-specific constraints into machine learning models is essential for generating predictions that are both accurate and feasible in real-world applications. This paper introduces new methods for training Output-Constrained Regression Trees (OCRT), addressing the limitations of traditional decision trees in constrained multi-target regression tasks. We propose three approaches: M-OCRT, which uses split-based mixed integer programming to enforce constraints; E-OCRT, which employs an exhaustive search for optimal splits and solves constrained prediction problems at each decision node; and EP-OCRT, which applies post-hoc constrained optimization to tree predictions. To illustrate their potential use in ensemble learning, we also introduce a random forest framework that operates under convex feasible sets. We validate the proposed methods through a computational study on both synthetic and industry-driven hierarchical time series datasets. Our results demonstrate that imposing constraints during decision tree training yields predictions that are both accurate and feasible.
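To make the post-hoc idea behind EP-OCRT concrete, the following is a minimal sketch and not the authors' implementation. It assumes the feasible set is a single affine constraint of the kind found in hierarchical time series (a total must equal the sum of its parts) and that the repair minimizes squared distance to the unconstrained prediction, in which case the constrained optimization reduces to a closed-form Euclidean projection. The function name `project_onto_hyperplane` and the example values are hypothetical.

```python
from typing import List, Sequence


def project_onto_hyperplane(
    y: Sequence[float], a: Sequence[float], b: float
) -> List[float]:
    """Euclidean projection of y onto the set {z : a . z = b}.

    Illustrative assumption: the feasible set is one affine constraint,
    so the closest feasible point has a closed form.
    """
    norm_sq = sum(ai * ai for ai in a)
    if norm_sq == 0:
        raise ValueError("constraint vector a must be nonzero")
    # How far the unconstrained prediction violates the constraint.
    residual = sum(ai * yi for ai, yi in zip(a, y)) - b
    # Move along a by exactly the amount that removes the violation.
    return [yi - ai * residual / norm_sq for ai, yi in zip(a, y)]


# Hypothetical unconstrained leaf prediction for (total, part_1, part_2).
leaf_prediction = [10.0, 4.0, 3.0]

# Hierarchical constraint total = part_1 + part_2, written as a . y = 0.
a = [1.0, -1.0, -1.0]

repaired = project_onto_hyperplane(leaf_prediction, a, 0.0)
print(repaired)
```

For general convex feasible sets there is usually no closed form, and the projection step would be handed to a numerical solver instead.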