Exploring Loss Design Techniques For Decision Tree Robustness To Label Noise

Published 27 May 2024 in cs.LG, cs.AI, and stat.ML (arXiv:2405.17672v1)

Abstract: In the real world, data is often noisy, affecting not only the quality of features but also the accuracy of labels. Current research on mitigating label errors stems primarily from advances in deep learning, and a gap exists in exploring interpretable models, particularly those rooted in decision trees. In this study, we investigate whether ideas from deep learning loss design can be applied to improve the robustness of decision trees. In particular, we show that loss correction and symmetric losses, both standard approaches, are not effective. We argue that other directions need to be explored to improve the robustness of decision trees to label noise.
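The two loss-design ideas the abstract evaluates can be made concrete in a few lines. Below is a minimal, illustrative sketch (not code from the paper) of both: the symmetric-loss property, which MAE satisfies and cross-entropy does not, and backward loss correction in the style of Patrini et al., assuming a known symmetric flip rate `rho`; all function names and the binary setup are my own.

```python
import numpy as np

# --- Symmetric losses ---------------------------------------------------
# A binary loss ell(p, y) is "symmetric" if ell(p, 0) + ell(p, 1) is a
# constant independent of the prediction p. MAE has this property;
# cross-entropy does not.

def mae(p, y):
    """Mean absolute error on a predicted class-1 probability p."""
    return abs(y - p)

def cross_entropy(p, y, eps=1e-12):
    """Binary cross-entropy on a predicted class-1 probability p."""
    return -np.log(p + eps) if y == 1 else -np.log(1 - p + eps)

# --- Backward loss correction -------------------------------------------
def backward_corrected(loss, p, y, rho):
    """Backward correction under symmetric binary label noise with flip
    rate rho < 0.5: apply the inverse of the noise transition matrix T to
    the vector of per-class losses, so that the corrected loss's
    expectation under the noisy label equals the clean loss."""
    ell = np.array([loss(p, 0), loss(p, 1)], dtype=float)
    T = np.array([[1 - rho, rho],
                  [rho, 1 - rho]])
    return (np.linalg.inv(T) @ ell)[y]
```

The unbiasedness of the backward correction is easy to verify numerically: averaging the corrected loss over the noisy-label distribution recovers the clean loss, since `T @ inv(T) = I`. The paper's point is that this guarantee, developed for gradient-trained deep networks, does not translate into robust split selection for decision trees.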
