Can Uncertainty Quantification Enable Better Learning-based Index Tuning?
Abstract: Index tuning is crucial for optimizing database performance by selecting optimal indexes based on workload. The key to this process lies in an accurate and efficient benefit estimator. Traditional methods relying on what-if tools often suffer from inefficiency and inaccuracy. In contrast, learning-based models provide a promising alternative but face challenges such as instability, lack of interpretability, and complex management. To overcome these limitations, we adopt a novel approach: quantifying the uncertainty in learning-based models' results, thereby combining the strengths of both traditional and learning-based methods for reliable index tuning. We propose Beauty, the first uncertainty-aware framework that enhances learning-based models with uncertainty quantification and uses what-if tools as a complementary mechanism to improve reliability and reduce management complexity. Specifically, we introduce a novel method that combines AutoEncoder and Monte Carlo Dropout to jointly quantify uncertainty, tailored to the characteristics of benefit estimation tasks. In experiments involving sixteen models, our approach outperformed existing uncertainty quantification methods in the majority of cases. We also conducted index tuning tests on six datasets. By applying the Beauty framework, we eliminated worst-case scenarios and more than tripled the occurrence of best-case scenarios.
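The abstract's idea of pairing two uncertainty signals with a what-if fallback can be illustrated with a minimal sketch. This is not the authors' implementation: the tiny two-layer network, the linear autoencoder fitted by SVD (a stand-in for a trained AutoEncoder), the thresholds `tau_model` and `tau_data`, and the `what_if` callback are all hypothetical names chosen for illustration. The sketch assumes that the spread of Monte Carlo Dropout predictions reflects model uncertainty and that reconstruction error reflects how unfamiliar an input is.

```python
import numpy as np

def mc_dropout_stats(x, w1, w2, rng, p=0.2, n_samples=50):
    """Mean and std of a tiny MLP's output over stochastic forward passes."""
    preds = np.empty(n_samples)
    for i in range(n_samples):
        h = np.maximum(x @ w1, 0.0)          # hidden layer with ReLU
        keep = rng.random(h.shape) >= p      # dropout stays on at inference
        preds[i] = (h * keep / (1.0 - p)) @ w2
    return preds.mean(), preds.std()

def fit_linear_autoencoder(train_x, k):
    """Linear autoencoder via truncated SVD (stand-in for a trained AE)."""
    mean = train_x.mean(axis=0)
    _, _, vt = np.linalg.svd(train_x - mean, full_matrices=False)
    return mean, vt[:k]

def reconstruction_error(x, mean, components):
    """Distance between x and its encode-then-decode reconstruction."""
    z = (x - mean) @ components.T            # encode
    x_hat = z @ components + mean            # decode
    return float(np.linalg.norm(x - x_hat))

def estimate_benefit(x, w1, w2, ae, what_if, rng, tau_model, tau_data, p=0.2):
    """Use the learned estimate only when both uncertainty signals are low."""
    mu, sigma = mc_dropout_stats(x, w1, w2, rng, p=p)
    err = reconstruction_error(x, *ae)
    if sigma > tau_model or err > tau_data:  # hypothetical thresholds
        return what_if(x), "what-if"         # fall back to the what-if tool
    return mu, "model"
```

In this sketch, an input that the autoencoder reconstructs poorly, or on which the dropout passes disagree, is routed to the what-if callback; everything else uses the learned estimate. How the two signals are actually combined and calibrated in Beauty is not specified in the abstract.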